**westerman** · 02-23-2012, 08:23 AM

Is it better to go for overlapping reads or not?

You have done your homework quite thoroughly. I think that you have the answer in your simulation: those 460 bp fragments (non-overlapping) are much better than the 180-220 bp fragments in the metrics that matter -- CEGs found and N50 scaffold. The issue of using FLASHed reads in blasts is, in my mind, a side issue.

A question. If you are interested in genes then perhaps a transcriptome instead of a whole genome project would be better? At least it would give deeper coverage.

As for the assembler, I prefer ABySS but that is just because I am used to it. Your choices are good. A longer kmer size might improve the overall assembly.

Once again, thumbs up (in the positive sense) for doing your background work before doing the actual sequencing.
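
For readers unfamiliar with the N50 scaffold metric mentioned above, here is a minimal sketch of how it is commonly computed: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the example lengths are illustrative assumptions, not values from the simulation discussed in this thread.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold (or contig) lengths.

    N50 is the length L such that scaffolds of length >= L
    together contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths, for illustration only.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
example_n50 = n50(example_lengths)
```

A higher N50 indicates that more of the assembly sits in long scaffolds, which is why it is used here alongside the number of CEGs found to compare fragment sizes.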

**Ole** · 02-25-2012, 05:14 AM

> You have done your homework quite thoroughly. I think that you have the answer in your simulation: those 460 bp fragments (non-overlapping) are much better than the 180-220 bp fragments in the metrics that matter -- CEGs found and N50 scaffold. The issue of using FLASHed reads in blasts is, in my mind, a side issue.

I'm coming to the same conclusion, but I'm running some assemblies at slightly higher coverage (15x and 20x) to see what happens then, and changing the parameters a bit. At that coverage I can error correct with Quake, for example.

> A question. If you are interested in genes then perhaps a transcriptome instead of a whole genome project would be better? At least it would give deeper coverage.

That's a good idea, but then you can't be sure about the presence/absence of genes, can you? You have to get the right tissue under the right conditions at the right time, and that can be hard.

> As for the assembler, I prefer ABySS but that is just because I am used to it. Your choices are good. A longer kmer size might improve the overall assembly.

I'll look into that. For now I'm running SOAPdenovo on all the different setups, and will select some promising setups to run with Celera. On Twitter last night, we had some discussion about this and SGA came up as a promising assembler too. OLC assemblers might be the best for this case.

> Once again, thumbs up (in the positive sense) for doing your background work before doing the actual sequencing.

Thank you. The better the groundwork, the easier the actual work is.

I'll come back with some more thoughts and examples of setups and results when I have them.
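
As a rough illustration of what the 15x and 20x coverage targets mentioned above mean in terms of sequencing, the sketch below estimates how many read pairs are needed for a given coverage. Coverage is taken as total sequenced bases divided by genome size. The genome size and read length used here are hypothetical placeholders, since neither is stated in this thread, and the function name is made up for the example.

```python
import math


def read_pairs_needed(coverage, genome_size, read_length):
    """Estimate the paired-end read pairs needed for a target coverage.

    coverage = (number of pairs * 2 * read_length) / genome_size
    """
    total_bases = coverage * genome_size
    bases_per_pair = 2 * read_length
    return math.ceil(total_bases / bases_per_pair)


# Hypothetical values: a 1 Gbp genome and 2 x 100 bp reads.
GENOME_SIZE = 1_000_000_000
READ_LENGTH = 100

pairs_15x = read_pairs_needed(15, GENOME_SIZE, READ_LENGTH)
pairs_20x = read_pairs_needed(20, GENOME_SIZE, READ_LENGTH)
```

This ignores reads lost to quality filtering or error correction, so real runs would need some margin on top of the estimate.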

**NextGenSeq** · 02-29-2012, 10:21 AM

You may want to investigate PASHA. De novo assembly of short reads needs a ton of RAM.

http://sites.google.com/site/yongchaosoftware/pasha

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3167803/

Low coverage sequencing, which strategy?
