SEQanswers

Vinn 04-09-2017 04:07 PM

Please help with assembly 2x250 bp with 350 insert
 
Hi folks,

I have 2x250 bp data with a 350 bp insert for a fungal genome assembly. I got very good results from SPAdes, but later discovered that SPAdes recommends an insert of 550-700 bp for 2x250 bp sequencing.

I have tried ABySS, Velvet and CLC so far, but none gave results as good as SPAdes. Do you have any opinions or suggestions on what else I should try? Or would it be okay to just use SPAdes?

Thanks in advance and have a great weekend!

Ola 04-10-2017 03:45 AM

DISCOVAR de novo was designed specifically for 2x250 bp reads, so you could give it a try. Of course it is recommended to have inserts longer than the combined length of both reads (2 x 250 = 500 bp in your case), but that goes for all assemblers and doesn't mean your assembly is not valid. The best way to improve the assembly would likely be to add long or linked reads, though.

GenoMax 04-10-2017 05:22 AM

@Vinn: Have you tried merging your reads (since they must overlap in the middle) and then assembling them as a single-end dataset?
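
A quick way to see why these reads must overlap, sketched in Python. It assumes every fragment is exactly the nominal insert size, which real libraries are not (they have a size distribution), and the function name is made up for illustration:

```python
def expected_overlap(read_len: int, insert_size: int) -> int:
    """Bases covered by both R1 and R2 for a fragment of insert_size bp."""
    # R1 covers read_len bases from one end of the fragment and R2 covers
    # read_len bases from the other end, so anything beyond the fragment
    # length is covered twice. The overlap can never exceed the fragment.
    return max(0, min(insert_size, 2 * read_len - insert_size))


print(expected_overlap(250, 350))  # 2x250 bp reads, 350 bp insert
print(expected_overlap(250, 550))  # 2x250 bp reads, 550 bp insert
```

With a 350 bp insert the two 250 bp reads share about 150 bp, so most pairs should merge into a single read of roughly 350 bp. With a 550 bp insert there is no overlap and nothing to merge.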

Brian Bushnell 04-10-2017 09:47 AM

Quote:

Originally Posted by Vinn (Post 206100)
I have 2x250 bp data with a 350 bp insert for a fungal genome assembly. I got very good results from SPAdes, but later discovered that SPAdes recommends an insert of 550-700 bp for 2x250 bp sequencing.

I think that means "If you want to use SPAdes for 2x250bp reads, we recommend you target your libraries at 550-700bp" rather than "If you have a library outside of 550-700bp, don't use SPAdes". Once you have the library, it's too late to change the insert size, but SPAdes is very flexible with insert sizes.

As GenoMax mentioned, you might try merging the reads first; I have found that to improve SPAdes assemblies.

Vinn 04-10-2017 01:06 PM

Dear Genomax,

Thanks for your reply and for the suggestion. I was thinking about that too, but since I just got another PE library (with a different insert size), I am not sure whether SPAdes can handle one single-end library and one paired-end library together?

Have a great Easter holiday!

Vinn 04-10-2017 01:12 PM

Dear Ola,

Thanks for your reply and for your suggestion. I just received another library with another insert size and will try using it to improve the one I have. :)

Have a great Easter holiday!

Vinn 04-10-2017 01:17 PM

Dear Brian,

Thanks for your reply and for the suggestion. I will try merging and reassembling again.
Happy Easter holiday! :)

Brian Bushnell 04-10-2017 03:10 PM

Hi Vinn,

SPAdes can handle one paired and one single-ended set of reads. I recommend that anyway when merging reads from a single library, because not all the reads will merge.
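
The merge-then-assemble workflow could look roughly like the sketch below, which only builds and prints the two command lines rather than running them. The file names and the helper functions are placeholders; the options used are the basic BBMerge `in1=`/`in2=`/`out=`/`outu1=`/`outu2=` parameters and the SPAdes `-1`/`-2`/`-s`/`-o` options. Passing the merged reads with `-s` is one way to do it under the "paired plus single-ended" description above, not necessarily the only one.

```python
import shlex


def bbmerge_cmd(r1: str, r2: str, prefix: str) -> list[str]:
    """BBMerge call that writes merged reads plus the pairs that did not merge."""
    return [
        "bbmerge.sh",
        f"in1={r1}",
        f"in2={r2}",
        f"out={prefix}_merged.fq.gz",         # successfully merged reads
        f"outu1={prefix}_unmerged_R1.fq.gz",  # pairs that did not merge
        f"outu2={prefix}_unmerged_R2.fq.gz",
    ]


def spades_cmd(prefix: str, outdir: str) -> list[str]:
    """SPAdes call: unmerged pairs as paired-end, merged reads as single-end."""
    return [
        "spades.py",
        "-1", f"{prefix}_unmerged_R1.fq.gz",
        "-2", f"{prefix}_unmerged_R2.fq.gz",
        "-s", f"{prefix}_merged.fq.gz",
        "-o", outdir,
    ]


# Placeholder file names; substitute your own.
for cmd in (bbmerge_cmd("lib_R1.fq.gz", "lib_R2.fq.gz", "lib"),
            spades_cmd("lib", "spades_merged")):
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to check the file names before handing them to `subprocess.run`.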

Vinn 04-11-2017 03:36 AM

Hi Brian,

Thank you very much; I will try as you suggested. :) Anyway, I can't stop wondering: what if I trim both R1 and R2 reads to 150 bp using bbduk (ftr=149, since the position is zero-based) and use them as a 2x150 PE library?

Brian Bushnell 04-12-2017 04:39 PM

Hi Vinn,

You could certainly do that, but unless your sequence quality is very bad at the ends, it won't give you a better assembly; it will mainly just reduce your sequence volume. In my testing, SPAdes produces the best assemblies when you merge reads (if they are overlapping) and feed it both the merged and unmerged reads.

Remember that SPAdes supports kmers up to 127bp. With 150bp reads, the kmer depth at k=127 will be quite low, whereas with 250bp reads (or merged 350+bp reads) it will be much higher, potentially resulting in a superior 127-mer assembly.
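
The kmer depth point can be made concrete with the usual approximation that a read of length L contains L - k + 1 kmers, so kmer depth is about base depth x (L - k + 1) / L. A small Python sketch, assuming error-free reads of uniform length (the function name is made up for illustration):

```python
def kmer_depth_fraction(read_len: int, k: int = 127) -> float:
    """Fraction of base coverage that survives as k-mer coverage."""
    if read_len < k:
        return 0.0  # read is too short to contain a single k-mer
    # A read of length L yields L - k + 1 overlapping k-mers.
    return (read_len - k + 1) / read_len


for label, length in [("trimmed to 150 bp", 150),
                      ("untrimmed 250 bp", 250),
                      ("merged ~350 bp", 350)]:
    print(f"{label}: {kmer_depth_fraction(length):.2f} x base depth")
```

At k=127 this gives about 0.16 for 150bp reads, 0.50 for 250bp reads and 0.64 for 350bp merged reads. Trimming also throws away bases, so the base depth it is multiplied by drops as well.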


All times are GMT -8.