08-17-2011, 11:51 PM   #1
Junior Member
Location: Denmark

Join Date: Aug 2011
Posts: 1
Bowtie. Different outputs from equivalent(?) inputs.

Hello all

I'm currently trying to align & annotate lots of short sequences to the human genome (from Ensembl) using Bowtie (and R).

When the query sequences are given on the command line (with -c) as a comma-separated list, I cannot get Bowtie to yield the same result as when I use a self-created FASTQ file. I suspect the cause is the default (Phred) read quality I chose for the FASTQ file. Clearly, if Bowtie is given sequences on the command line, it must assume some default read qualities, but what is the default value? I cannot find the answer in the Bowtie manual, but I suspect that it is Phred quality 40 (corresponding to ASCII character "h"(?)), since this quality is used with other commands.

Using "h" as the default read quality, however, does not give exactly the same results. Where am I taking the wrong turn?
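
To make the encoding question concrete: the character that stands for a given Phred score depends on the ASCII offset the file is read with. The sketch below is plain Python arithmetic, not anything Bowtie provides. The two offsets (33 and 64) are the common FASTQ conventions; which one Bowtie applies to a given input is exactly the open question here, so nothing below should be read as a statement about Bowtie's defaults.

```python
def phred_to_char(q: int, offset: int) -> str:
    """Return the ASCII character encoding Phred score q under the given offset."""
    return chr(q + offset)


def char_to_phred(c: str, offset: int) -> int:
    """Return the Phred score that character c encodes under the given offset."""
    return ord(c) - offset


if __name__ == "__main__":
    # Compare the two common FASTQ offsets for Phred 40 and for the character "h".
    for offset in (33, 64):
        print(f"offset {offset}: Q40 -> {phred_to_char(40, offset)!r}, "
              f"'h' -> Q{char_to_phred('h', offset)}")
```

So "h" means Phred 40 only when the file is read with offset 64. Read with offset 33, Phred 40 would be "I" and "h" would decode to 71.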

Minimal example: Running

bowtie -a --fullref Homo_sapiens.GRCh37.63.cdna.all TestFASTQ.fq test1.txt
bowtie -c -a --fullref Homo_sapiens.GRCh37.63.cdna.all AAATTGCTCTTAGCATA test2.txt

where the TestFASTQ.fq is simply


does not give the same results.
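
For reference, a one-read FASTQ file with a constant quality character could be produced roughly as follows. This is a sketch under assumptions: the contents of the original TestFASTQ.fq are not shown above, so the read name `read1` is a placeholder, and the quality character "h" is simply the value discussed in this post.

```python
def fastq_record(name: str, seq: str, qual_char: str) -> str:
    """Build one four-line FASTQ record with the same quality character at every base."""
    if len(qual_char) != 1:
        raise ValueError("qual_char must be a single character")
    return f"@{name}\n{seq}\n+\n{qual_char * len(seq)}\n"


if __name__ == "__main__":
    # "read1" is a placeholder read name, not the one used in the original file.
    record = fastq_record("read1", "AAATTGCTCTTAGCATA", "h")
    with open("TestFASTQ.fq", "w") as handle:
        handle.write(record)
    print(record, end="")
```

Swapping the last argument makes it easy to try a different quality character and rerun the first command.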

The output from my R script (which filters and formats the Bowtie output) is

> genes1
[1] "ENSG00000135829" "ENSG00000135829" "ENSG00000135829" "ENSG00000151789"
[5] "ENSG00000127081" "ENSG00000122042" "ENSG00000162894" "ENSG00000187699"
[9] "ENSG00000187699" "ENSG00000231890" "ENSG00000182749" "ENSG00000233124"
[13] "ENSG00000228002" "ENSG00000101040" "ENSG00000101040" "ENSG00000112773"
[17] "ENSG00000112773" "ENSG00000112773"
> genes2
[1] "ENSG00000135829" "ENSG00000135829" "ENSG00000135829" "ENSG00000151789"
[5] "ENSG00000127081" "ENSG00000122042" "ENSG00000162894" "ENSG00000231890"
[9] "ENSG00000187699" "ENSG00000187699" "ENSG00000182749" "ENSG00000233124"
[13] "ENSG00000228002" "ENSG00000101040" "ENSG00000101040" "ENSG00000112773"
[17] "ENSG00000112773" "ENSG00000112773"

(EDIT: The two vectors above differ at positions 8 and 10)
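
A quick way to check whether the two vectors differ in content or only in order is to compare them both position by position and as multisets. The sketch below uses Python rather than R, with the two vectors copied from the output above; the function name is just illustrative.

```python
from collections import Counter

genes1 = [
    "ENSG00000135829", "ENSG00000135829", "ENSG00000135829", "ENSG00000151789",
    "ENSG00000127081", "ENSG00000122042", "ENSG00000162894", "ENSG00000187699",
    "ENSG00000187699", "ENSG00000231890", "ENSG00000182749", "ENSG00000233124",
    "ENSG00000228002", "ENSG00000101040", "ENSG00000101040", "ENSG00000112773",
    "ENSG00000112773", "ENSG00000112773",
]
genes2 = [
    "ENSG00000135829", "ENSG00000135829", "ENSG00000135829", "ENSG00000151789",
    "ENSG00000127081", "ENSG00000122042", "ENSG00000162894", "ENSG00000231890",
    "ENSG00000187699", "ENSG00000187699", "ENSG00000182749", "ENSG00000233124",
    "ENSG00000228002", "ENSG00000101040", "ENSG00000101040", "ENSG00000112773",
    "ENSG00000112773", "ENSG00000112773",
]


def differing_positions(a, b):
    """Return 1-based positions (as R would index them) where the lists differ."""
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


if __name__ == "__main__":
    print(differing_positions(genes1, genes2))   # [8, 10]
    print(Counter(genes1) == Counter(genes2))    # True
```

At the level of these gene IDs, both runs contain the same entries the same number of times; only the order of ENSG00000187699 and ENSG00000231890 differs.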

Can anyone help me?


PS: Does anyone know how to make Bowtie return gene symbols, i.e. get DHX9 for ENSG00000135829 and so on?
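
One possible workaround is to translate the IDs after alignment with a lookup table, rather than asking Bowtie for symbols. The sketch below assumes a two-column, tab-separated table of gene ID and symbol (for instance, one exported from an annotation source such as Ensembl BioMart); the function names and the table layout are hypothetical, and only the DHX9 pair comes from this post.

```python
def load_symbol_table(tsv_text: str) -> dict:
    """Parse tab-separated 'gene_id<TAB>symbol' lines into a dict."""
    table = {}
    for line in tsv_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip rows without a symbol column
        table[parts[0]] = parts[1]
    return table


def to_symbols(gene_ids, table):
    """Map each gene ID to its symbol, leaving unknown IDs unchanged."""
    return [table.get(g, g) for g in gene_ids]


if __name__ == "__main__":
    # Only this pair comes from the post; a real table would have one row per gene.
    table = load_symbol_table("ENSG00000135829\tDHX9\n")
    print(to_symbols(["ENSG00000135829", "ENSG00000151789"], table))
```

IDs missing from the table are passed through unchanged, so gaps in the annotation stay visible instead of being silently dropped.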

Last edited by AEB; 08-18-2011 at 01:41 AM.