SEQanswers

Old 01-21-2010, 05:48 AM   #1
454andSolid
Junior Member
 
Location: USA

Join Date: May 2009
Posts: 8
Question Insert size important?

Hi all! First post here, maybe some of you can shed some light on this:

We are working on a de novo sequencing and assembly project for a eukaryotic genome of ~120 Mbp. So far we have sequenced a couple of fragment libraries with 454 Titanium. We now want to sequence some paired-end libraries, but the sequencing facility is having problems getting the Titanium 20 kbp inserts (and the 8 kbp ones) to work. 3 kbp inserts seem to work fine, so the question is:

Is there a big difference between 20 kbp and 3 kbp inserts when it comes to the assembly (L50, N50, etc.)?

I have been asking around, and the answer I get is that "of course 20 kbp is better". What I am interested in is how much better it would be (i.e. whether it is worth waiting x months for). These things are difficult to estimate, but maybe somebody has experience from similar projects...?

thanks
/Jakub

Last edited by 454andSolid; 01-26-2010 at 05:29 AM.
Old 01-21-2010, 10:31 AM   #2
flxlex
Moderator
 
Location: Oslo, Norway

Join Date: Nov 2008
Posts: 415
Default

It depends on the length of the repeats in your genome. Repeats get collapsed into single contigs (which consequently show higher read depth), and these break up the non-repeated parts of the assembly. Paired-end reads whose halves lie further apart than the repeat is long allow you to order and orient the non-repeated contigs, and occasionally to place copies of the repeats as well.

So, you want the longest paired-end distance to be somewhat larger than the longest repeat contigs. How do you find this out? Look at the lengths of the high-depth, i.e. repeated, contigs, as written in the 454ContigGraph file (first part, where all the contigs are listed). It is very helpful to plot length versus depth; then you can match the paired-end distances to the lengths of the repeated contigs.
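
As a rough illustration of that check, here is a minimal Python sketch. It assumes the first part of the contig graph file is tab-separated with the columns contig number, contig name, length and average depth, and that rows of the later sections do not start with a number; please verify this against your own file. The function names and the `depth_factor` cutoff of 1.8 are hypothetical choices for this example, not anything defined by the assembler.

```python
from statistics import median


def parse_contig_table(lines):
    """Collect (name, length, depth) from the contig listing.

    Assumed row layout: number <TAB> name <TAB> length <TAB> average depth.
    Rows whose first field is not a number (e.g. edge rows) are skipped.
    """
    contigs = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        contigs.append((fields[1], int(fields[2]), float(fields[3])))
    return contigs


def repeat_candidates(contigs, depth_factor=1.8):
    """Return contigs whose depth is at least depth_factor times the median.

    The unweighted median over contigs stands in for single-copy depth;
    this is a simplification and the cutoff is an arbitrary assumption.
    """
    if not contigs:
        return []
    typical_depth = median(depth for _, _, depth in contigs)
    return [c for c in contigs if c[2] >= depth_factor * typical_depth]


def longest_repeat_length(contigs, depth_factor=1.8):
    """Length of the longest high-depth contig, or 0 if there is none."""
    repeats = repeat_candidates(contigs, depth_factor)
    return max((length for _, length, _ in repeats), default=0)
```

Reading the file with `open(path)` and passing the handle to `parse_contig_table` should work, since the function only iterates over lines. The value returned by `longest_repeat_length` is then a lower bound to compare against the available paired-end distances; the (length, depth) pairs can also be fed to any plotting tool for the length-versus-depth plot.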

Hope this helps, and hope you find out you don't need 8 kb or 20 kb...

flxlex
Old 01-22-2010, 01:29 AM   #3
454andSolid
Junior Member
 
Location: USA

Join Date: May 2009
Posts: 8
Default

Thanks! This was exactly what I was looking for. Now I just have to make sure that the contigs with high depth are not mtDNA etc...
Old 12-27-2017, 01:33 AM   #4
raman91
Junior Member
 
Location: Singapore

Join Date: May 2016
Posts: 6
Default

Hi,

I am visualizing alignments using IGV. What exactly do you mean by "insert"? Is the insert a large unsequenced segment sitting between the two reads, one that escapes both sequencing and shearing during library prep? And how can the existence of a large insert, such as 3 Mb or 17 Mb, be validated, given that it cannot be confirmed by PCR even with a high-fidelity Taq (the maximum for genomic DNA PCR is 1-2 kb for Taq and up to 30 kb for HF Taq)?

Regards

Tags
genome assembly, insert size, paired end
