SEQanswers

Go Back   SEQanswers > Bioinformatics > Bioinformatics



Similar Threads
Thread Thread Starter Forum Replies Last Post
0 depth in pileup files from BAM files ? jjlaisnoopy Bioinformatics 3 04-22-2012 07:38 AM
Pileup from BAM or SAM nilmot13 General 7 02-03-2012 02:54 AM
Extract perfectly mapped reads from SAM/BAM file Graham Etherington Bioinformatics 2 07-21-2011 07:27 AM
NEw to Chip-seq and have .bam/.sam/.bam.bai files... then what? NGS newbie Bioinformatics 11 05-25-2011 07:48 AM
Getting pileup consensus from BAM files using Bio::DB::Sam ragowthaman Bioinformatics 2 08-03-2010 08:21 AM

Reply
 
Thread Tools
Old 05-01-2012, 08:00 AM   #1
liu_xt005
Member
 
Location: Iowa City, IA

Join Date: Jun 2011
Posts: 24
Question Pileup / extract information from BAM/SAM files

I am aiming to extract the full information at each site: read depth, # ref reads, # variant reads (for all non-reference alleles), strand information, etc. An old thread started by nilmot13 suggested "genomeCoverageBed", but the output seems to contain only simple read depth.

My problem started from an observation that SAMtools and GATK generate very different REF/ALT read depth at some site from the same BAM file, while BAMview and IGV both tend to support GATK counts at the site.
liu_xt005 is offline   Reply With Quote
Old 09-28-2012, 06:45 AM   #2
clk
Junior Member
 
Location: Montreal

Join Date: Nov 2010
Posts: 4
Default

Hi,
looking for an answer to the same question I came across your post... Did you ever find an answer to this question? The discrepancies I see between IGV and samtools mpileup are huge in my data, which is RNAseq data...


Thanks!
clk is offline   Reply With Quote
Old 10-01-2012, 11:56 AM   #3
liu_xt005
Member
 
Location: Iowa City, IA

Join Date: Jun 2011
Posts: 24
Default

Sorry I did not find a good solution.
But my study showed that SAMtools tend to keep longer tails for indels than GATK and others. For example, SAMtools gives TAAAA:TAAA (REF:ALT), while GATK gives TA:T.
liu_xt005 is offline   Reply With Quote
Old 10-01-2012, 12:07 PM   #4
clk
Junior Member
 
Location: Montreal

Join Date: Nov 2010
Posts: 4
Default

Thanks for responding. I finally found a solution to this, so I'll post it here in case is useful for others.

I found out that samtools filters reads before including them in the pileup; it reads the flag field in the bam file and discards reads that
a) are not paired
b) not properly mapped
c) mate is not mapped
d) alignment is not primary
e) reads fail quality control of vendor
f) is marked as PCR duplicates.

If the filters (a) and (c) are not desired, you can use the parameter -A.
In addition, samtools performs realignment unless the parameter -B is used, and discards low quality reads unless -Q0 is used. Finally, it stops at a certain number of reads unless the -d parameter is invoqued.


I really needed a good quantification of the reads at each position, so I needed to make sure I could trust the pileup (or vcf) files generated by samtools. So I wrote a small script that parses the bam file, reading the flag field, and removes specific reads from the alignment. In this way, I was finally able to produce an alignment that gave me the exact same counts with IGV and samtools pileup.

It would be really great if all these little details were more clear in the documentation, but in the end, the filtering criteria used by samtools was adequate for my needs. Except for the "anomalous read pairs" parameter (-A), which not very appropriate for RNA-seq data.

Hope that helps somebody!
clk is offline   Reply With Quote
Old 01-19-2015, 09:14 AM   #5
zyxue
Junior Member
 
Location: Canada

Join Date: May 2014
Posts: 3
Default

Hi clk, where did you find the information? Did you analyze the source code for samtools mpileup? Thanks!
zyxue is offline   Reply With Quote
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 05:26 AM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2021, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO