07-11-2014, 06:16 AM   #5
Senior Member
Location: China

Join Date: Feb 2009
Posts: 116

Originally Posted by kmcarr
Tophat has done nothing to change the encoding. Quality scores in BAM files are stored as numeric values, not ASCII characters like they are in FASTQ files. When you view the contents of a BAM file with samtools view it will convert those numbers to ASCII characters for display and it will always use the Sanger Phred+33 encoding for display. It is only important that when you are using FASTQ files as input you correctly identify what encoding method is used in that FASTQ, then the correct q-score numbers will be stored in the BAM and everything downstream will work fine.
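As a rough illustration of the conversion described in the quote, the sketch below turns a FASTQ quality string into numeric scores and back into Phred+33 characters for display. The function names `decode_quals` and `encode_quals` are made up for this example and are not part of samtools or Tophat; it only assumes that a quality character is the score plus a fixed offset (33 or 64).

```python
def decode_quals(qual_string, offset=33):
    """Convert a quality string to numeric Phred scores by removing the offset."""
    return [ord(c) - offset for c in qual_string]


def encode_quals(scores, offset=33):
    """Convert numeric Phred scores to ASCII characters (Phred+33 by default)."""
    return "".join(chr(q + offset) for q in scores)


# A Phred+64 FASTQ quality string, read with the correct offset.
scores = decode_quals("hhhh", offset=64)
print(scores)                # [40, 40, 40, 40]

# The same scores shown the way a Phred+33 display would print them.
print(encode_quals(scores))  # IIII
```

If the wrong offset is given when reading the FASTQ, the stored numbers are shifted by the difference between the two offsets, and that shift carries through to everything downstream.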
Thanks to kmcarr and blancha for the replies.

One more question:
When I use a FASTQ file with base qualities like this:
the output BAM looks like this:
FCC22UBACXX:2:2112:11384:88500#ACACGCGG 0       AF024514.1      14      50      49M     *       0       0       ATATTGCTTCTATTTCGGTTTTGTTCAAGCGTTGACCGTTGCAGGCGCT     %$%(((((*****++++"(*+++*+++++++++++++++++++++++++       AS:i:-4 XN:i:0  XM:i:2  XO:i:0  XG:i:0  NM:i:MD:Z:17A1A29     YT:Z:UU NH:i:1
FCC22UBACXX:2:1313:19247:88511#ACACGCGG 0       AF024514.1      15      50      49M     *       0       0       TATTGCTTCTATTTCGATTTTGTTCAAGCGTTGACCGTTGCAGGCGCTT     $$$(((((&((&(*))'%(**+&(*++*&)''')&)+++%()))'&()%       AS:i:-2 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:MD:Z:18A30       YT:Z:UU NH:i:1
Is this normal? I used this BAM file to call SNPs and none were found, although several sites can be detected manually. Is this caused by the base qualities?
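For reference, here is a small sketch that decodes the quality string of the first read above, assuming the Phred+33 display encoding mentioned in the quoted reply. The variable names are only for illustration.

```python
# Quality string of the first read, copied from the samtools view output above.
qual = '%$%(((((*****++++"(*+++*+++++++++++++++++++++++++'

# samtools view displays qualities as Phred+33, so subtract 33 from each character.
scores = [ord(c) - 33 for c in qual]

print(min(scores), max(scores))  # 1 10
```

Every base in this read decodes to a score of 10 or lower. If the SNP caller ignores bases below a minimum base quality that is higher than this (an assumption, since the caller and its settings are not given here), these reads would contribute nothing to the calls. Scores this low could also mean the FASTQ offset was misidentified at the alignment step, but that cannot be confirmed without seeing the input FASTQ.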