SEQanswers

Old 10-20-2010, 08:15 AM   #1
unagaswamy
Member
 
Location: Texas

Join Date: May 2010
Posts: 13
Extracting base coverage and quality values from BAM files

Experts,
I would like to extract base coverage and quality values for non-SNP positions from a BAM file. How can I do this?
Thanks,
-Uma
Old 10-20-2010, 08:24 AM   #2
wenhuang
Member
 
Location: Raleigh, NC

Join Date: Feb 2010
Posts: 30

pileup in samtools may be what you need.
Old 10-20-2010, 10:36 AM   #3
bioinfosm
Senior Member
 
Location: USA

Join Date: Jan 2008
Posts: 482

Yes, pileup is the samtools command that displays coverage and quality information.
__________________
--
bioinfosm
Old 10-21-2010, 12:49 PM   #4
unagaswamy
Member
 
Location: Texas

Join Date: May 2010
Posts: 13

Thanks for the info.
Here is what I have tried so far to view the file for a given sample:
samtools view filename.bam chr:start-end

For non SNP positions, I am getting the base coverage using the following command:
samtools view filename.bam chr:start-end |wc -l

Sometimes the actual position is missing when I grep for it, and only positions within ~50 bases of it are printed. My question is: how can I trust the wc -l value for a position if that position itself is missing from the BAM file?
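
A note on why the position may not show up in grep: each SAM/BAM record lists only the leftmost aligned position of the read, and samtools view with a region prints every read that overlaps that region. A read can therefore cover a base without that coordinate appearing anywhere in its record. The Python sketch below illustrates the difference between "reads covering a position" and "reads starting at a position". The read list and the depth_at helper are made up for illustration, and the sketch assumes ungapped alignments (no indels or clipping), which real data will not always satisfy.

```python
def depth_at(position, reads):
    """Count reads whose aligned span covers `position` (1-based).

    `reads` is a list of (leftmost_start, aligned_length) tuples.
    Assumes ungapped alignments; this is an illustration, not samtools logic.
    """
    return sum(1 for start, length in reads if start <= position < start + length)


# Hypothetical 50 bp reads, given as (leftmost start, length)
reads = [(100, 50), (120, 50), (149, 50), (150, 50), (200, 50)]

depth = depth_at(150, reads)                                   # reads covering base 150
starts_at = sum(1 for start, _ in reads if start == 150)       # reads whose POS is 150

print(depth, starts_at)  # 3 1
```

Here three reads cover base 150, but only one of them has 150 in its position column, which is the only one a grep for the coordinate would find.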

Last edited by unagaswamy; 10-21-2010 at 01:37 PM.
Old 10-25-2010, 11:03 AM   #5
unagaswamy
Member
 
Location: Texas

Join Date: May 2010
Posts: 13

Hi Experts,
Can anyone tell me how to extract the base quality from the CQ field of the BAM file? A colleague of mine mentioned that every two ASCII characters correspond to the SOLiD color space. Do I take the average of every overlapping pair, or the max? Any input in this regard is greatly appreciated!
Old 10-25-2010, 10:57 PM   #6
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285

Quote:
Originally Posted by unagaswamy
Hi Experts,
Can anyone tell me how to extract the base quality from the CQ field of the BAM file? A colleague of mine mentioned that every two ASCII characters correspond to the SOLiD color space. Do I take the average of every overlapping pair, or the max? Any input in this regard is greatly appreciated!
Is the base quality not already reported in the SAM record (it should be)? As to how to calculate the base qualities during alignment from color qualities, here is the formula MAQ/BWA/BFAST uses: https://sourceforge.net/apps/mediawi...apping_Quality.
Old 10-28-2010, 08:11 AM   #7
unagaswamy
Member
 
Location: Texas

Join Date: May 2010
Posts: 13

Hi nilshomer,
Thanks for the reply. We generate SAM records only for the SNP positions, and those do have the base quality. I am interested in the non-SNP positions as well, so I'm trying to calculate the base quality for these positions from the BAM file.
-Uma
Old 11-05-2010, 09:54 AM   #8
unagaswamy
Member
 
Location: Texas

Join Date: May 2010
Posts: 13

Hi All,
I generated a raw pileup for all reference positions using samtools. An example record for a non-SNP position is below:
chrY 11936864 G G 51 0 60 8 ,,.,,,.. "%`".B"`

This base has a coverage of 8. Which of these fields (51, 0, 60) represents the base quality in this case? I am still confused about ABI's SOLiD base quality!

Any input is greatly appreciated!
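
A note on reading a record like this one. The sketch below assumes the ten-column consensus pileup layout of older samtools releases: chromosome, position, reference base, consensus base, consensus quality, SNP quality, mapping quality of the covering reads, read depth, read bases, and per-read base qualities. Under that assumption, 51, 0 and 60 are site-level scores, and the per-base qualities sit in the last column as Phred+33 characters. The function name and dictionary keys are made up for illustration, so check the layout against the samtools documentation for your version.

```python
def parse_consensus_pileup(line):
    """Split one consensus-pileup line into named fields.

    Assumes the 10-column layout described in the note above and
    Phred+33 encoding for the base-quality column.
    """
    fields = line.split()
    if len(fields) != 10:
        raise ValueError("expected 10 columns, got %d" % len(fields))
    chrom, pos, ref, cons, cons_q, snp_q, map_q, depth, bases, quals = fields
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "consensus": cons,
        "consensus_qual": int(cons_q),
        "snp_qual": int(snp_q),
        "mapping_qual": int(map_q),
        "depth": int(depth),
        "read_bases": bases,
        # One character per read: ASCII code minus 33 gives the Phred score
        "base_quals": [ord(c) - 33 for c in quals],
    }


record = 'chrY 11936864 G G 51 0 60 8 ,,.,,,.. "%`".B"`'
rec = parse_consensus_pileup(record)

print(rec["depth"])       # 8
print(rec["base_quals"])  # [1, 4, 63, 1, 13, 33, 1, 63]
```

Read this way, the record carries eight base qualities, one per covering read, rather than a single base quality for the site.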
Old 11-10-2010, 07:41 AM   #9
Bruins
Member
 
Location: Groningen

Join Date: Feb 2010
Posts: 78

Did you see the SAM FAQ?
http://sourceforge.net/apps/mediawik...pileup_output.

Hope this helps.
Old 11-11-2010, 07:55 AM   #10
unagaswamy
Member
 
Location: Texas

Join Date: May 2010
Posts: 13

Hi Bruins,
Thanks for this very useful link. It definitely answers some of my questions.
-Uma