mpileup base-quality filter does not seem to work

david.tamborero

Member

Join Date: Feb 2011

Posts: 60

12-29-2011, 11:35 AM

Hello,

I am trying to detect somatic mutations in tumor-normal samples (Illumina paired-end reads). As a first approach, I am doing the following:

Code:

- bfast alignment (match + localalign + postprocess) for each sample
- picard remove_duplicates for each sample
- samtools mpileup for each sample
- varscan for each tumor-normal sample pair

I want to exclude reads/bases with 'low' quality at the mpileup step, so I use the -q/-Q arguments. However, this does not seem to work, and after digging through the data I am now totally confused.

I inspect the BAM file with IGV, which shows the base/read qualities at each position. The pileup file is generated with the following mpileup command:

Code:

/samtools-0.1.18/samtools mpileup -f ref.fa -B -q 1 -Q 30 -SD sample.bam > sample.pileup

What I observe is that fewer reads are included in the pileup than are available in the BAM file. The point, though, is that the selection does not seem to follow the -q 1 -Q 30 criteria (for instance, the pileup includes read bases whose quality is much lower than 30 according to the BAM file). Note that I disabled the BAQ calculation (-B) to keep everything clearer. Moreover, the base qualities reported in most of the pileup entries are systematically lower than 30, e.g.:

Code:

chr1 115323009 T 18 ,$.,,,,,,,,,,c,,,,, ==?=@;>?=77##;#>>;
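For anyone who wants to check an entry like this by hand, here is a minimal Python sketch that decodes the quality column and counts the reference-matching bases at or above a threshold. It assumes the usual Phred+33 ASCII encoding of pileup qualities; the helper names (decode_quals, split_bases) are made up for illustration and are not part of samtools or VarScan.

```python
import re

def decode_quals(qual_string, offset=33):
    """Convert an ASCII quality string to Phred scores (assumes Phred+33)."""
    return [ord(c) - offset for c in qual_string]

def split_bases(bases):
    """Return one base symbol per read from a pileup bases column.

    Skips read-start markers (^ plus a mapping-quality character),
    read-end markers ($) and indel annotations (+N... / -N...).
    """
    calls = []
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":            # read start: next char encodes mapping quality
            i += 2
        elif c == "$":          # read end marker
            i += 1
        elif c in "+-":         # indel: sign, length digits, then that many bases
            length = re.match(r"\d+", bases[i + 1:]).group()
            i += 1 + len(length) + int(length)
        else:                   # '.', ',', a mismatching base, or '*'
            calls.append(c)
            i += 1
    return calls

# The entry quoted above (columns 5 and 6 of the pileup line).
bases = ",$.,,,,,,,,,,c,,,,,"
quals = "==?=@;>?=77##;#>>;"

calls = split_bases(bases)
scores = decode_quals(quals)

# Reference-matching bases ('.' forward strand, ',' reverse strand) with Q >= 30.
ref_q30 = sum(1 for b, q in zip(calls, scores) if b in ".," and q >= 30)

print(len(calls), scores)
print("reference bases with Q >= 30:", ref_q30)
```

Under that Phred+33 assumption, '=' decodes to 28, '?' to 30, '@' to 31 and '#' to 2, so only three of the 18 bases in this entry reach Q30.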

Even more confusing to me, when I run VarScan, which is supposed to just summarize the pileup data, the reported number of reads supporting each allele does not match the corresponding pileup entry. For instance, for the position in the previous example, VarScan says that only eight reads support the 'T' allele.

I've found many threads about whether or not to use the BAQ calculation, but nothing about problems with the -q/-Q criteria or with the VarScan statistics. It should be trivial, so I guess I am missing something silly, but any help would be really appreciated.

thanks a lot!
david
Tags: mpileup, quality scores, varscan
