SEQanswers

Old 04-09-2017, 04:07 PM   #1
Vinn
Member
 
Location: Sweden

Join Date: Nov 2014
Posts: 17
Please help with assembly 2x250 bp with 350 insert

Hi folks,

I have got data generated from 2x250 bp with 350 bp insert for fungal genome assembly. I got very good results from SPAdes, but later discovered that SPAdes recommends an insert of 550-700 bp for 2x250 bp sequencing.

I have tried ABySS, Velvet and CLC so far, but they did not give results as good as SPAdes. Do you have any opinions or suggestions on what else I should try? Or would it be okay to just use SPAdes?

Thanks in advance and have a great weekend!
Old 04-10-2017, 03:45 AM   #2
Ola
Member
 
Location: Sweden

Join Date: Aug 2011
Posts: 30

DISCOVAR de novo was designed specifically for 2x250 bp reads, so you could give it a try. Of course, it is recommended to have inserts longer than the combined length of the two reads (500 bp in your case), but that goes for all assemblers and doesn't mean your assembly is not valid. The best way to improve the assembly would likely be to add long or linked reads, though.
Old 04-10-2017, 05:22 AM   #3
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,800

@Vinn: Have you tried merging your reads (they must overlap in the middle) and then assembling them as a single-end dataset?
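
The overlap mentioned here follows directly from the read length and the insert size. A minimal sketch, assuming "insert" means the full fragment length without adapters and that every fragment has exactly the nominal size (real libraries have a size distribution); the function name is made up for illustration:

```python
def expected_overlap(read_len: int, insert_size: int) -> int:
    """Number of bases covered by both mates of a pair.

    Assumes adapter-free fragments of exactly `insert_size` bases.
    If the fragment is shorter than a read, only the fragment itself
    can be sequenced, so the usable read length is capped at the insert.
    """
    usable = min(read_len, insert_size)
    return max(0, 2 * usable - insert_size)


# Values from the thread: 2x250 bp reads on a 350 bp insert.
print(expected_overlap(250, 350))  # 150

# At the low end of the 550-700 bp range the mates would not overlap.
print(expected_overlap(250, 550))  # 0
```

With roughly 150 bp of nominal overlap, most pairs from this library should be mergeable, which is what makes the single-end approach an option at all.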
Old 04-10-2017, 09:47 AM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Quote:
Originally Posted by Vinn View Post
I have got data generated from 2x250 bp with 350 bp insert for fungal genome assembly. I got very good results from SPAdes, but later discovered that SPAdes recommends an insert of 550-700 bp for 2x250 bp sequencing.
I think that means "If you want to use SPAdes for 2x250bp reads, we recommend you target your libraries for 550-700bp" rather than "If you have a library outside of 550-700bp, don't use SPAdes". Once you have the library, it's too late to change the insert size, but SPAdes is very flexible with insert sizes.

As Genomax mentioned, you might try merging the reads first; I have found that to improve SPAdes assemblies.
Old 04-10-2017, 01:06 PM   #5
Vinn
Member
 
Location: Sweden

Join Date: Nov 2014
Posts: 17

Dear Genomax,

Thanks for your reply and for the suggestion. I was thinking about that too, but since I just got another PE library (with another insert size), I am not sure whether SPAdes can handle one single-end dataset and one paired-end dataset together.

Have a great Easter holiday!
Old 04-10-2017, 01:12 PM   #6
Vinn
Member
 
Location: Sweden

Join Date: Nov 2014
Posts: 17

Dear Ola,

Thanks for your reply and for your suggestion. I just received another library with another insert size and will try using it to improve the one I have.

Have a great Easter holiday!
Old 04-10-2017, 01:17 PM   #7
Vinn
Member
 
Location: Sweden

Join Date: Nov 2014
Posts: 17

Dear Brian,

Thanks for your reply and for the suggestion. I will try merging and reassembling again.
Happy Easter holiday!
Old 04-10-2017, 03:10 PM   #8
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Hi Vinn,

SPAdes can handle one paired and one single-ended set of reads. I recommend that anyway when merging reads from a single library, because not all the reads will merge.
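
The merge-then-assemble workflow described above can be sketched as two command lines. This is only an illustration: all file names and the output directory are hypothetical, the commands are built but not executed, and the options used (`in1`/`in2`/`out`/`outu1`/`outu2` for BBMerge; `-1`, `-2`, `-s`, `-o` for SPAdes) should be checked against the versions installed locally.

```python
def bbmerge_command(r1, r2, merged, unmerged_r1, unmerged_r2):
    """Build a BBMerge call that writes merged reads plus the pairs that failed to merge."""
    return [
        "bbmerge.sh",
        f"in1={r1}",
        f"in2={r2}",
        f"out={merged}",          # successfully merged reads (single-end)
        f"outu1={unmerged_r1}",   # mate 1 of pairs that did not merge
        f"outu2={unmerged_r2}",   # mate 2 of pairs that did not merge
    ]


def spades_command(unmerged_r1, unmerged_r2, merged, outdir):
    """Build a SPAdes call using the unmerged pairs as paired-end and the merged reads as unpaired."""
    return [
        "spades.py",
        "-1", unmerged_r1,
        "-2", unmerged_r2,
        "-s", merged,
        "-o", outdir,
    ]


# Hypothetical file names, for illustration only.
merge = bbmerge_command("R1.fq.gz", "R2.fq.gz",
                        "merged.fq.gz", "unmerged_R1.fq.gz", "unmerged_R2.fq.gz")
assemble = spades_command("unmerged_R1.fq.gz", "unmerged_R2.fq.gz",
                          "merged.fq.gz", "spades_out")

print(" ".join(merge))
print(" ".join(assemble))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the tools are on the `PATH`. A second paired-end library would need SPAdes' per-library options instead of the plain `-1`/`-2` shortcuts.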
Old 04-11-2017, 03:36 AM   #9
Vinn
Member
 
Location: Sweden

Join Date: Nov 2014
Posts: 17

Hi Brian,

Thank you very much; I will try as you suggested. Still, I can't stop wondering: what if I trim both R1 and R2 reads to 150 bp using bbduk (ftr=150) and use them as a 150 bp paired-end library?
Old 04-12-2017, 04:39 PM   #10
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Hi Vinn,

You could certainly do that, but unless your sequence quality is very bad at the ends, it won't give you a better assembly; it will mainly just reduce your sequence volume. In my testing, SPAdes produces the best assemblies when you merge reads (if they are overlapping) and feed it both the merged and unmerged reads. Remember that SPAdes supports kmers up to 127bp; with 150bp reads, the kmer depth at k=127 will be quite low, whereas with 250bp reads (or merged 350+bp reads) it will be much higher, potentially resulting in a superior 127-mer assembly.
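
The k-mer depth argument can be checked with simple arithmetic: a read of length L contains L - k + 1 k-mers. A rough sketch, assuming error-free reads of exactly the stated lengths and a fragment of exactly 350 bp (the function name is illustrative):

```python
def kmers_per_read(read_len: int, k: int) -> int:
    """Number of k-mers contained in a single read of length `read_len`."""
    return max(0, read_len - k + 1)


K = 127  # largest k-mer size mentioned in the post

# 127-mers contributed by one sequenced fragment under each strategy.
trimmed_pair = 2 * kmers_per_read(150, K)   # both mates trimmed to 150 bp
full_pair = 2 * kmers_per_read(250, K)      # untrimmed 2x250 bp pair
merged_read = kmers_per_read(350, K)        # pair merged into one 350 bp read

print(trimmed_pair, full_pair, merged_read)  # 48 248 224
```

Under these assumptions, trimming to 150 bp leaves about a fifth of the 127-mers per fragment that the untrimmed or merged reads provide. The untrimmed pair's count includes 127-mers from the overlapping region twice, so the merged figure is the number of distinct 127-mers per fragment.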

Tags
genome assembly, next gen sequencing data, scaffolding, spades, velvetg

All times are GMT -8.