**DerekS** · 07-01-2012, 08:16 PM

Hi,

I've had similar problems with variable number tandem repeat (VNTR) regions in a high G+C bacterium. I've been using BWA as an aligner and SolSNP as a SNP/variant caller. For de novo assembly, I've had some success combining different sequencing technologies (454 and PE Illumina) in a hybrid assembly with MIRA. The longer 454 reads can span some of these regions, although plenty of sections still cannot be assembled.

For mapping, have you tried any aligners besides BWA? I was thinking of trying BFAST to see if it copes with these regions any better. Failing that, can you increase the stringency of the SAMtools SNP caller? At the very least, you should be able to remove the low-confidence SNPs from the analysis.
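As a rough illustration of the last suggestion, low-confidence calls can be dropped from a VCF by thresholding the QUAL column. This is only a minimal sketch: the function name `filter_vcf_lines` and the cutoff of 30 are assumptions chosen for illustration, not values recommended by SAMtools or anyone in this thread, and a sensible cutoff will depend on coverage and the caller used.

```python
def filter_vcf_lines(lines, min_qual=30.0):
    """Yield VCF header lines plus records with QUAL >= min_qual.

    min_qual=30.0 is an arbitrary illustrative cutoff (assumption).
    Records with a missing QUAL (".") are dropped.
    """
    for line in lines:
        if line.startswith("#"):
            # Keep meta-information and column header lines untouched.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        qual = fields[5]  # VCF columns: CHROM POS ID REF ALT QUAL ...
        if qual == ".":
            continue
        if float(qual) >= min_qual:
            yield line


if __name__ == "__main__":
    example = [
        "##fileformat=VCFv4.1\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "chr1\t100\t.\tA\tT\t50\t.\tDP=20\n",
        "chr1\t200\t.\tG\tC\t10\t.\tDP=5\n",
    ]
    for kept in filter_vcf_lines(example):
        print(kept, end="")
```

The same pattern works on a real file by passing an open file handle instead of a list.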

**Sheila** · 08-30-2012, 02:18 AM

Originally posted by Genomics101

Greetings,

I'm looking for some advice on how to improve my assembly and variant analysis using 100bp Illumina reads in genes with low-complexity regions (imperfect repeat sequences).

I am working on comparative genomics with a number of very AT-rich genomes (about 80% AT, in a variety of Plasmodium species). I am also doing some population genetics and need an accurate set of SNPs (and indels would be nice, too).

Mapping, de novo assembly, and SNP/indel calling all have problems with low-complexity regions (I am using Velvet for de novo assembly, BWA for mapping, and SAMtools/bcftools for variant analysis). Velvet gets them right about 50% of the time (checked with Sanger sequencing); BWA can't map these regions at all.

I tried masking the regions in the genome using DUST, but it only finds small regions; these are easier to find using the protein sequence.

Any advice on how to mask these regions or (even better) include them in the analyses and get them right would be appreciated.

Have a look at:

http://www.cbrc.jp/tantan/

Regards,

S.
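To make the masking idea discussed in this thread concrete, here is a crude sketch of composition-based soft masking: sliding windows whose Shannon entropy falls below a threshold are converted to lowercase. This is not the algorithm used by tantan or DUST; the function names, the 20 bp window, and the 1.0-bit threshold are all assumptions for illustration. A purely compositional measure like this will miss tandem repeats with balanced base composition, and in an 80% AT genome the threshold would need tuning.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (bits per base) of the base composition of seq."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def soft_mask_low_complexity(seq, window=20, min_entropy=1.0):
    """Lowercase every base covered by a window with entropy < min_entropy.

    window and min_entropy are illustrative defaults (assumptions),
    not values taken from any published masking tool.
    """
    seq = seq.upper()
    mask = [False] * len(seq)
    # Slide one base at a time; a sequence shorter than the window
    # is evaluated as a single chunk.
    for start in range(0, max(len(seq) - window + 1, 1)):
        chunk = seq[start:start + window]
        if chunk and shannon_entropy(chunk) < min_entropy:
            for i in range(start, start + len(chunk)):
                mask[i] = True
    return "".join(b.lower() if m else b for b, m in zip(seq, mask))


if __name__ == "__main__":
    example = "ACGT" * 5 + "A" * 30 + "ACGT" * 5
    print(soft_mask_low_complexity(example))
```

Soft masking (lowercase) rather than hard masking (replacing with N) keeps the sequence available, so the regions can still be included in later analyses if a tool handles them correctly.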

Better accuracy in assembling, and SNP calling in, low-complexity sequence regions?