mpileup base-quality filter does not seem to work

david.tamborero

Member

Join Date: Feb 2011

Posts: 60

12-29-2011, 11:35 AM

Hello,

I am trying to detect somatic mutations in tumor-normal samples (Illumina paired-end reads). As a first approach, I am doing the following:

Code:

- bfast alignment (match + localalign + postprocess) for each sample
- picard remove_duplicates for each sample
- samtools mpileup for each sample
- varscan for each tumor-normal sample pair

I want the mpileup step to ignore low-quality reads and bases, so I use the -q/-Q arguments. However, this does not seem to work, and after digging through the data I am now totally confused.

I inspect the BAM file in IGV, which shows the base and read qualities at each position. The pileup file is generated by the following mpileup command:

Code:

/samtools-0.1.18/samtools mpileup -f ref.fa -B -q 1 -Q 30 -SD sample.bam > sample.pileup

What I observe is that the pileup summary includes fewer reads than are available in the BAM file. However, the reads that remain do not seem to follow the -q 1 -Q 30 criteria (for instance, the pileup includes read bases whose quality is much lower than 30, according to the BAM file). Note that I disabled the BAQ calculation (-B) to keep things simpler. Moreover, the base qualities reported in most of the pileup entries are systematically lower than 30, e.g.:

Code:

chr1 115323009 T 18 ,$.,,,,,,,,,,c,,,,, ==?=@;>?=77##;#>>;
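To make the qualities in that entry easier to read, here is a minimal Python sketch that decodes the last column into Phred scores. It assumes the standard Phred+33 ASCII encoding and the six-column pileup layout shown above; the helper name `phred33` is made up for illustration.

```python
def phred33(quals):
    """Convert a pileup quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - 33 for ch in quals]

# The pileup entry quoted above (real pileup files are tab-separated;
# split() without arguments handles either tabs or spaces).
line = "chr1 115323009 T 18 ,$.,,,,,,,,,,c,,,,, ==?=@;>?=77##;#>>;"
chrom, pos, ref, depth, bases, quals = line.split()

scores = phred33(quals)
below_30 = sum(1 for s in scores if s < 30)

print(scores)
print(f"{below_30} of {len(scores)} bases are below Q30")
```

Under these assumptions, 15 of the 18 reported bases decode to a quality below 30, which matches what I see in IGV.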

Even more confusing for me, when I run VarScan, which is supposed to just summarize the pileup data, the reported number of reads supporting each allele does not match the corresponding pileup entry. For instance, for the position in the previous example, VarScan says that only eight reads support the 'T' allele.

I have found many threads about whether or not to use the BAQ calculation, but nothing about problems with the -q/-Q criteria or with the VarScan statistics. This should be trivial, so I guess I am missing something silly, but any help would be really appreciated.
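As a cross-check on allele counts, the following Python sketch tallies the reference-matching bases of the example entry at a few minimum base qualities. It is only a rough illustration under my own assumptions (Phred+33 qualities, arbitrary example cutoffs of 0, 15 and 30, hypothetical helper names `read_calls` and `ref_support`); it does not claim to reproduce how VarScan counts supporting reads.

```python
import re

def read_calls(bases):
    """Return one call per read, skipping ^X read-start, $ read-end and indel annotations."""
    calls = []
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":            # read start: the next character encodes mapping quality
            i += 2
        elif ch == "$":          # read end marker
            i += 1
        elif ch in "+-":         # indel: +N or -N followed by N inserted/deleted bases
            m = re.match(r"\d+", bases[i + 1:])
            i += 1 + len(m.group()) + int(m.group())
        else:                    # an actual base call ('.', ',', ACGTN, acgtn, '*')
            calls.append(ch)
            i += 1
    return calls

def ref_support(bases, quals, min_qual):
    """Count reference matches ('.' or ',') whose Phred+33 quality is >= min_qual."""
    return sum(1 for call, q in zip(read_calls(bases), quals)
               if call in ".," and ord(q) - 33 >= min_qual)

bases = ",$.,,,,,,,,,,c,,,,,"
quals = "==?=@;>?=77##;#>>;"
for cutoff in (0, 15, 30):       # example cutoffs only, not tool defaults
    print(cutoff, ref_support(bases, quals, cutoff))
```

Comparing these tallies with the number a downstream tool reports can hint at which quality threshold, if any, that tool applied on top of the pileup.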

thanks a lot!
david
Tags: mpileup, quality scores, varscan
