03-10-2010, 06:21 AM   #1
Junior Member
Location: sweden

Join Date: Jan 2010
Posts: 8
"beginner in alignment" question

I am analyzing Roche 454 sequence data. The sequencing did not cover the whole genome, only the exons of around 100 genes. When I first started the analysis, I used the target sequence (the exons of those 100 genes) as my reference rather than the whole genome; after all, this target region is what was tiled and sequenced on the Roche platform. But with that reference I get a lot of consequential SNPs, most of which are probably false positives.
When I align against the whole genome instead, I get a reasonably low number of SNPs.
I checked some specific regions where the number of SNPs differs greatly between the two alignments. It turns out that some of the reads mapping to such a region in the first alignment do not map there in the whole-genome alignment, but to some other region outside the target sequence.
So my question is:
Is it general practice to align against the whole genome in any given NGS project, or do I need more stringent alignment parameters when aligning against a specific region only?
By the way, I am using bwa and ssaha2 for the alignment step.
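
The check described above (which reads align in the target-only run but land elsewhere against the whole genome) could be sketched roughly as follows. This is only an illustration: the function names, the data layout and the toy reads are all made up, and it assumes the read IDs and primary alignment positions have already been pulled out of the two alignment outputs.

```python
from typing import Dict, List, Tuple

# (chromosome, 1-based start position) of a read's primary alignment
Placement = Tuple[str, int]
# (chromosome, start, end), 1-based inclusive target interval
Interval = Tuple[str, int, int]


def in_targets(placement: Placement, targets: List[Interval]) -> bool:
    """Return True if the alignment start falls inside any target interval."""
    chrom, pos = placement
    return any(c == chrom and start <= pos <= end for c, start, end in targets)


def relocated_reads(
    genome_hits: Dict[str, Placement],
    target_only_ids: List[str],
    targets: List[Interval],
) -> List[str]:
    """Reads that aligned in the target-only run but map off-target
    (or not at all) when the whole genome is the reference."""
    moved = []
    for read_id in target_only_ids:
        hit = genome_hits.get(read_id)
        if hit is None or not in_targets(hit, targets):
            moved.append(read_id)
    return moved


# Toy data, invented for illustration only (hypothetical coordinates).
targets = [("chr1", 1000, 1500), ("chr2", 200, 400)]
target_only_ids = ["r1", "r2", "r3"]          # reads aligned in the target-only run
genome_hits = {"r1": ("chr1", 1200),          # still on target
               "r2": ("chr7", 55000)}         # moved off target; r3 is unaligned

print(relocated_reads(genome_hits, target_only_ids, targets))  # ['r2', 'r3']
```

Only the read IDs are needed from the target-only run, since its coordinates are relative to the target sequences rather than to the genome.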

menenuh
03-10-2010, 09:57 AM   #2
Senior Member
Location: USA

Join Date: Jan 2008
Posts: 482

I noticed similar behavior with Illumina capture reads. The capture is not very specific, and there is always a lot of 'off-target' sequencing. When a restricted reference sequence is used, many of those off-target reads are forced to align to the target regions, producing false variants.

The best approach is to map the reads against the whole genome.
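
After mapping against the whole genome, one would typically keep only the alignments that overlap the target regions before calling SNPs. A minimal sketch of that filtering step, working directly on SAM-format text lines, might look like this. The function names, the toy reads and the `min_mapq=20` cutoff are assumptions for illustration, not recommended settings; mapping-quality scales also differ between aligners, so the threshold would need checking for bwa and ssaha2 output.

```python
import re
from typing import Iterable, Iterator, List, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), 1-based inclusive

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def reference_span(pos: int, cigar: str) -> Tuple[int, int]:
    """Return the 1-based inclusive reference interval covered by an alignment."""
    length = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos, pos + max(length, 1) - 1


def on_target(sam_lines: Iterable[str], targets: List[Interval],
              min_mapq: int = 20) -> Iterator[str]:
    """Yield mapped SAM records with MAPQ >= min_mapq that overlap a target."""
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        f = line.rstrip("\n").split("\t")
        flag, chrom, pos, mapq, cigar = int(f[1]), f[2], int(f[3]), int(f[4]), f[5]
        if flag & 0x4 or mapq < min_mapq:  # unmapped, or ambiguous placement
            continue
        start, end = reference_span(pos, cigar)
        if any(c == chrom and start <= t_end and end >= t_start
               for c, t_start, t_end in targets):
            yield line


# Toy records, invented for illustration only.
targets = [("chr1", 1000, 1500)]
sam_lines = [
    "@SQ\tSN:chr1\tLN:100000",
    "r1\t0\tchr1\t1200\t60\t50M\t*\t0\t0\t*\t*",   # on target, confident
    "r2\t0\tchr7\t55000\t60\t50M\t*\t0\t0\t*\t*",  # off target
    "r3\t0\tchr1\t1300\t3\t50M\t*\t0\t0\t*\t*",    # on target, low MAPQ
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",            # unmapped
]
kept = [line.split("\t")[0] for line in on_target(sam_lines, targets)]
print(kept)  # ['r1']
```

The point of filtering after the whole-genome alignment, rather than restricting the reference, is that off-target reads get the chance to find their true location first, so they are dropped here instead of being forced onto the targets.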
bioinfosm

alignment, bwa, roche 454, ssaha2
