I am about to have DNA sequenced on a HiSeq and I expect about 30-fold coverage of a 1x10^9 bp genome with 100 bp PE reads. I am unsure what fragment size I should use to get the best likely assembly from these PE reads. I am aware that the best results would come from PE reads from several libraries of varying sizes, but I can only afford to sequence one library at this time. For now I hope to obtain contigs that average at least 2,000 to 10,000 bp, so that single genes would likely fall within a contig. The most likely problem in assembling a contig that spans a gene would be STRs in introns. I was thinking that a 1000 bp library should span most such STRs. Any suggestions would be appreciated.
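For reference, the arithmetic behind those numbers can be sketched as below. This is only a back-of-the-envelope illustration: the function names are made up for this post, the fragment sizes in the loop are candidate values from this discussion, and it assumes every read is full length and every pair maps. "Span coverage" here means the number of fragments covering an average base, counting the unsequenced middle of each fragment.

```python
def read_pairs_needed(genome_size_bp, fold_coverage, read_length_bp):
    """Read pairs whose sequenced bases add up to the target fold coverage."""
    total_bases = genome_size_bp * fold_coverage
    # Each pair contributes two reads of read_length_bp bases.
    return total_bases // (2 * read_length_bp)


def span_coverage(n_pairs, fragment_size_bp, genome_size_bp):
    """Average number of fragments (insert included) spanning a given base."""
    return n_pairs * fragment_size_bp / genome_size_bp


genome = 1_000_000_000          # 1x10^9 bp genome
pairs = read_pairs_needed(genome, fold_coverage=30, read_length_bp=100)
print(f"read pairs: {pairs:,}")

# Candidate fragment sizes (illustrative values only).
for fragment in (400, 500, 1000):
    print(fragment, "bp fragments ->", span_coverage(pairs, fragment, genome), "x span coverage")
```

Sequence coverage stays at 30x whatever the fragment size; only the span coverage changes, which is the quantity that matters for bridging repeats such as STRs.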
-
From the SGA paper: SGA can assemble 35X human reads into 10 kbp contigs using reads from a single library with an average insert size of ~400 bp. Don't go above a 500 bp insert size. If I am right, the throughput and quality of Illumina sequencing degrade significantly beyond that.
-
Originally posted by nickloman: We routinely do 500-600 base fragments and it works well. I think I read on another thread that 800 bases is where performance falls off a cliff, not tested that high ourselves.
-
Originally posted by lkral: I am about to have DNA sequenced on a HiSeq and I expect about a 30 fold coverage of a 1x10^9 bp genome with 100 bp PE reads. I am unsure of the size of the fragments I should use to get the best likely assembly from these PE reads. I am aware that the best results would be obtained by having PE reads from several libraries of varying sizes but I can only afford to sequence one library at this time. Currently I would hope to obtain contigs that at least average 2,000 to 10,000 bp so single genes would likely be within a contig. The most likely problem in assembling a contig that spans a gene would be STRs in introns. I was thinking that a 1000 bp library should span across most such STRs. Any suggestions would be appreciated.
It's hard to say whether you will see contigs of the size you mentioned, since again it all depends on the complexity of your organism's genome.
-
My view is that libraries with a large insert size mainly help scaffolding, but do not do much for contigs. For example, SGA assembles reads with a ~400 bp insert into 10 kb contigs. ALLPATHS-LG assembles reads from a variety of insert sizes into ~20 kb contigs. The contig N50 is not that different, especially given that ALLPATHS-LG uses three times as much data, which costs considerably more. The scaffold N50 of ALLPATHS-LG is far better.
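In case the N50 figures are unfamiliar: N50 is the length of the contig (or scaffold) at which the running total, taken from longest to shortest, first reaches half of the assembly size. A minimal sketch of that calculation follows; the function name and the example lengths are made up for illustration and do not come from either assembler.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half the total bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input


# Hypothetical contig lengths in bp, purely for illustration.
contigs = [10000, 8000, 5000, 2000, 1000]
print(n50(contigs))
```

The same function applies to scaffold lengths, which is how a contig N50 and a scaffold N50 can differ so much for one assembly.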
-
The current project is phase I, where all I need is contigs large enough to contain a gene or part of a gene. If contigs are smaller than a gene, I can align them to orthologs from other fish species to assemble those genes. In phase II, in about a year or so, I hope to build longer scaffolds by aligning to long Oxford Nanopore generated sequences (I trust these nanopores will work as advertised).