**Ben Langmead** · 02-14-2010, 06:11 AM

Hi keebs,

Bowtie is not expecting to receive FASTQ files formatted the way BFAST formats them (though I should probably fix it so that it handles BFAST-style too, since you're the second person who's reported this problem). I have been working from Galaxy's and BWA's conversion tools, both of which chop off the primer base (which doesn't have a quality in BFAST's output) and the first color (which does), and print only the qualities that correspond to the remaining colors.

Hope that helps - I will try to make Bowtie do the right thing with the BFAST tool's output in the future though.

Ben
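The trimming Ben describes above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual Galaxy or BWA conversion script; the function name `trim_bfast_record` and the example read are hypothetical. It assumes a BFAST-style record whose sequence line holds the primer base followed by N colors, and whose quality line holds N qualities (none for the primer).

```python
def trim_bfast_record(seq, qual):
    """Drop the primer base and the first color from a BFAST-style
    color-space record, and drop the quality of that first color.

    Assumes seq = primer base + N colors and qual = N qualities.
    """
    if len(seq) != len(qual) + 1:
        raise ValueError("expected one more sequence character than qualities")
    trimmed_seq = seq[2:]    # skip primer base and first color
    trimmed_qual = qual[1:]  # skip the quality of the first color
    return trimmed_seq, trimmed_qual


# Hypothetical read: primer 'T', colors 0120, four color qualities.
print(trim_bfast_record("T0120", "IHGF"))
```

After trimming, the sequence and quality strings have the same length, which is what the sketch assumes a Bowtie-compatible record needs.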

**keebs42** · 02-15-2010, 11:21 AM

To recap, the problem is that for every read base that doesn't match the reference, Bowtie is reporting a 0 base quality, regardless of the original color-space quality of the read. Bowtie shouldn't be altering the base-quality of a read regardless of the reference sequence, correct? The problem still isn't fixed by using the BWA csfasta to fastq conversion script. I've used PerM to successfully align the read, maintaining the base quality. Both aligners mapped the read to the same genomic position (NCBI36 is the reference).

Here is the sequence and base quality pulled from the SAM file created by Bowtie (using BWA solid2fastq):

Code:

TCGGAAGCCGAGCCTGTGACTGCACCGGCACTGAAGCTCCCTGTGTG	
\Z_XW]^b][__\UURI!!%SX_`V<5OXWD@HLOHR\ZP=E[XP@,

Here is the same read pulled from the SAM file created by PerM (operates directly on csfasta files, so no fastq conversion is necessary):

Code:

TCGGAAGCCGAGCCTGTGACTGCACCGGCACTGAAGCTCCCTGTGTG     
\Z_XW]^b][__\UURIKKMSX_`V<5OXWD@HLOHR\ZP=E[XP@,

Note the difference in the qualities at the three bases, GAC. The 'A' is the heterozygous position. It's clear Bowtie is reporting a different base quality string than PerM only at these sites (!!% vs KKM). Given that the original quality string of the color-space read (in the _QV.qual file) is more similar to the PerM version, I wonder if this is a bug in Bowtie.

Again, this is a data-set-wide issue, not something happening with just one read.
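To make the comparison above concrete, here is a small Python sketch that decodes the two quality strings from this post and lists the positions where they disagree. It assumes the SAM qualities are Phred+33 encoded (the SAM convention); the helper names are made up for illustration.

```python
def phred33(qual):
    """Decode a Phred+33 quality string into integer scores."""
    return [ord(c) - 33 for c in qual]


def quality_diffs(seq, qual_a, qual_b):
    """Return (index, base, score_a, score_b) wherever the two strings differ."""
    pairs = zip(phred33(qual_a), phred33(qual_b))
    return [(i, seq[i], a, b) for i, (a, b) in enumerate(pairs) if a != b]


SEQ = "TCGGAAGCCGAGCCTGTGACTGCACCGGCACTGAAGCTCCCTGTGTG"
# Raw strings, because the quality strings contain backslashes.
BOWTIE_QUAL = r"\Z_XW]^b][__\UURI!!%SX_`V<5OXWD@HLOHR\ZP=E[XP@,"
PERM_QUAL = r"\Z_XW]^b][__\UURIKKMSX_`V<5OXWD@HLOHR\ZP=E[XP@,"

for index, base, q_bowtie, q_perm in quality_diffs(SEQ, BOWTIE_QUAL, PERM_QUAL):
    print(index, base, q_bowtie, q_perm)
```

For this read the only differences are at 0-based positions 17 to 19 (G, A, C), where Bowtie reports 0, 0, 4 and PerM reports 42, 42, 44.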

**Ben Langmead** · 02-15-2010, 01:18 PM

Originally posted by keebs42:

Note the difference in the qualities at the three bases, GAC. The 'A' is the heterozygous position. It's clear Bowtie is reporting a different base quality string than PerM only at these sites (!!% vs KKM). Given that the original quality string of the color-space read (in the _QV.qual file) is more similar to the PerM version, I wonder if this is a bug in Bowtie.

OK, sorry for not having fully understood the question. There are two parts to the answer, one of which *does* involve a bug in Bowtie, so many thanks for pointing this out.

1. Bowtie reports alignments in nucleotide space, and thus employs a decoding scheme to decide which nucleotides correspond to the colors in the read. The decoding scheme also calculates quality scores for the decoded nucleotides. The decoded quality score for a given nucleotide is a function of the two covering colors from the original read and whether they "match" the decoded nucleotides. If one or both colors don't "match" (i.e. if the SOLiD software called the color incorrectly), then we intentionally assign the nucleotide a low quality score (often close to 0). This is sensible because that nucleotide call is actually *refuted* by one or both of the covering color calls, so we can't have much confidence in it even if it is corroborated by the other color. The exact calculation is described in the manual and is borrowed directly from the BWA paper.

2. There is a bug in Bowtie whereby some colors are incorrectly penalized when, in fact, the covering colors are *correct* with respect to the decoded nucleotides. This happens when the decoded nucleotide is a SNP. Your example may be one of those instances (I can't tell without seeing the colors), in which case you're 100% right that it's due to a bug. I will have a fix for this very soon; sorry for the inconvenience,

Ben
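The decoding rule in point 1 above can be sketched roughly as follows. This is an assumption-laden illustration, not Bowtie's code: the specific arithmetic (sum of the two color qualities when both match, difference floored at 0 when exactly one matches, 0 when neither matches) is an assumed BWA-style rule, and the Bowtie manual remains the authority on the exact calculation. The only part taken as given is the standard SOLiD encoding, where a color is the XOR of the 2-bit codes of its two adjacent nucleotides. All function names are hypothetical.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def expected_color(left, right):
    """SOLiD color for two adjacent nucleotides (XOR of their 2-bit codes)."""
    return CODE[left] ^ CODE[right]


def decoded_quality(q_left, q_right, left_ok, right_ok):
    """Quality of one decoded nucleotide from its two covering colors.

    Assumed rule (not verified against Bowtie's source):
      both colors match   -> q_left + q_right
      exactly one matches -> max(q_match - q_mismatch, 0)
      neither matches     -> 0
    """
    if left_ok and right_ok:
        return q_left + q_right
    if left_ok or right_ok:
        q_good, q_bad = (q_left, q_right) if left_ok else (q_right, q_left)
        return max(q_good - q_bad, 0)
    return 0


def decode_qualities(nucs, colors, quals):
    """Qualities for the interior nucleotides nucs[1:-1].

    nucs has len(colors) + 1 characters; colors[i] covers nucs[i], nucs[i + 1].
    """
    ok = [expected_color(nucs[i], nucs[i + 1]) == c for i, c in enumerate(colors)]
    return [decoded_quality(quals[i - 1], quals[i], ok[i - 1], ok[i])
            for i in range(1, len(colors))]


# All colors agree with the decoded nucleotides ACGT.
print(decode_qualities("ACGT", [1, 3, 1], [20, 30, 10]))
# The middle color was miscalled, so both nucleotides it covers are penalized.
print(decode_qualities("ACGT", [1, 0, 1], [20, 30, 10]))
```

Under this assumed rule, a single confidently called but mismatching color drags both nucleotides it covers toward 0, which is the kind of drop reported earlier in the thread.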

**Kevin_Johnson** · 02-19-2010, 07:37 PM

This is an example of where it's better to use PerM's strategy of mapping in color space...

**keebs42** · 03-06-2010, 01:14 PM

Just wanted to report that the latest release of Bowtie, 0.12.3, fixes the bug I mentioned in this thread. In addition, I'm now getting a much higher percentage of color-space reads to align with Bowtie than with PerM (30-60% vs 20-40%), even with the --best flag.

Also the -Q option combined with -f removes the need to convert to fastq from csfasta, making it as easy to use as PerM with SOLiD reads.

Thanks for the work, Ben!
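For anyone wanting to try the `-f` plus `-Q` route mentioned above, here is a rough Python sketch that assembles such a command line. The index and file names are placeholders, and the helper name is made up; `-C` (color space), `-f` (FASTA/CSFASTA input), `-Q` (the matching quality file), `--best` and `-S` (SAM output) are Bowtie 1 options, but check `bowtie --help` for the version you have installed before relying on this.

```python
import shlex


def bowtie_colorspace_cmd(index, csfasta, qual, out_sam, best=True):
    """Build a Bowtie 1 command for SOLiD reads given as .csfasta + _QV.qual.

    All paths are supplied by the caller; nothing is executed here.
    """
    cmd = ["bowtie", "-C", "-f", "-Q", qual, "-S"]
    if best:
        cmd.append("--best")
    cmd += [index, csfasta, out_sam]
    return cmd


# Placeholder names, for illustration only.
cmd = bowtie_colorspace_cmd("my_cs_index", "reads.csfasta", "reads_QV.qual", "out.sam")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the placeholder paths are replaced with real ones.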

**sorrychen** · 03-13-2010, 05:51 PM

Hello, this is Yang-Ho from USC, PerM's author. Sorry for not visiting the site for a long long long time.

May I ask which parameters you used for Bowtie and PerM? With --seed F4 -v (many), PerM is fully sensitive to four mismatches and partially sensitive to (many) mismatches, as many as you want, e.g. 6 to 10, although those mappings may not be very trustworthy.

Note that "full" means that, end to end, ALL alignments within the mismatch threshold should be found. If you don't want the low-quality end, you can "trim" it.

Bowtie, Color-space reads, and confusing base qualities at variable sites