**Bukowski** · 04-21-2010, 02:25 AM

Originally posted by 454andSolid

Hi all,

We have been running de novo assembly of a eukaryotic genome, using 454 titanium together with gsAssembler. When we compare our assembly with cloned cDNA fragments (sequenced with Sanger) we find some homopolymer errors. So we were wondering:

- Are there any reports on how common these errors are (especially in coding regions)?

- How have people dealt with these problems? We were thinking about running Illumina or SOLiD (which would give us 50-100x coverage) and using these data to correct the homopolymer run errors. Do you know of any programs or papers that might help?

thanks
/Jakub

I have to say that, at the time of answering, I've been looking for ways to use SOLiD data to correct 454 homopolymer errors and have come up short. I know some people are working on this, but with the NGS workflow focused on resequencing and SNP detection, finishing de novo 454 assemblies with additional data, especially from SOLiD runs, seems to be a sadly neglected area.

I'd be delighted to hear otherwise from someone.

**colindaven** · 04-21-2010, 03:15 AM

There are a couple of other threads on this forum about this. Several papers are out there too; a PubMed search should turn up some good information.

Using solexa to correct 454 homopolymer errors - SEQanswers

http://seqanswers.com/forums/showthread.php?t=3635&highlight=homopolymer

As far as I know, the only implemented script is the one mentioned here by Torst.

**Torst** · 04-24-2010, 07:59 PM

Originally posted by 454andSolid

We have been running de novo assembly of a eukaryotic genome, using 454 titanium together with gsAssembler. When we compare our assembly with cloned cDNA fragments (sequenced with Sanger) we find some homopolymer errors. So we were wondering:
- Are there any reports on how common these errors are (especially in coding regions)?
- How have people dealt with these problems? We were thinking about running Illumina or SOLiD (which would give us 50-100x coverage) and using these data to correct the homopolymer run errors. Do you know of any programs or papers that might help?

Homopolymer errors can occur wherever the true sequence has about three or more of the same base in a row. How badly coding regions are affected depends on how many such runs they contain, so it's genome dependent. In bacteria, which are coding-dense, this means most homopolymer errors land inside genes and cause frame-shifts :-(
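To get a rough feel for how exposed a given sequence is, you can simply list its homopolymer runs. This is only a minimal sketch: the function name `homopolymer_runs` and the default threshold of three bases are illustrative assumptions taken from the rule of thumb above, not part of any tool mentioned in this thread.

```python
import re

def homopolymer_runs(seq, min_len=3):
    """Return (start, base, length) for every run of min_len or more identical bases.

    Positions are 0-based. The min_len=3 default is an assumption based on the
    rule of thumb that runs of about three or more bases are error-prone.
    """
    seq = seq.upper()
    # One base followed by at least (min_len - 1) copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0))) for m in pattern.finditer(seq)]

runs = homopolymer_runs("ACGTTTTACCCGAAG")
for start, base, length in runs:
    print(start, base, length)
```

Running this over predicted coding sequences versus the rest of the assembly would give a crude per-genome estimate of where the risky sites are; any single-base miscall in a run inside a gene shifts the reading frame downstream of it.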

We use Illumina and SOLiD short reads to correct 454 scaffolds produced by gsAssembler/Newbler. We don't correct the reads themselves, rather the contigs or scaffolds that are assembled by gsAssembler.

As colindaven said, I explain in this thread http://seqanswers.com/forums/showthread.php?t=3635 how our software Nesoni could be used for this purpose. The key is using a read mapper that is good at detecting indels; detecting SNPs is not much use for fixing homopolymer errors.
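For illustration only, here is roughly what the last step looks like once you have a list of confident indel calls against a contig. This is a hedged sketch and not how Nesoni works internally: the function `apply_indels` and the `(position, ref, alt)` tuple format are assumptions made up for this example, loosely modelled on how indels are usually reported.

```python
def apply_indels(contig, calls):
    """Apply indel corrections to a contig sequence.

    calls is a list of (pos, ref, alt) tuples (hypothetical format):
    pos is 0-based, ref is the contig sequence at pos, alt is the
    replacement supported by the short reads. Calls must not overlap.
    """
    seq = contig
    # Work from right to left so earlier coordinates stay valid
    # after each insertion or deletion.
    for pos, ref, alt in sorted(calls, reverse=True):
        if seq[pos:pos + len(ref)] != ref:
            raise ValueError("call at %d does not match contig sequence" % pos)
        seq = seq[:pos] + alt + seq[pos + len(ref):]
    return seq

# The 454 contig undercalled a T run and overcalled a C run.
contig = "ACGTTTACCCCGA"
calls = [(3, "TTT", "TTTT"), (7, "CCCC", "CCC")]
print(apply_indels(contig, calls))
```

Checking the reference bases before each edit is a cheap guard against applying calls made against a different version of the assembly.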

**454andSolid** · 05-02-2010, 08:40 AM

I will try using Nesoni with our transcriptome data.

Thanks for the advice!

correcting homopolymer run errors

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News