SEQanswers

06-28-2012, 08:15 AM   #1
empyrean
Member
 
Location: EU

Join Date: Sep 2010
Posts: 52
Extract base from bam file

I have a BAM file and I want to extract the following information.

For a read in a BAM file, can I extract the base at a given position? For example, if I ask for the base at the 100th position, it should output whether it is A/T/G/C. Can I get this using samtools or bamtools?
06-28-2012, 08:54 AM   #2
pbluescript
Senior Member
 
Location: Boston

Join Date: Nov 2009
Posts: 224

Try this:

Code:
samtools view file.bam | awk '{print substr($10, 100, 1)}'
07-02-2012, 10:46 AM   #3
empyrean
Member
 
Location: EU

Join Date: Sep 2010
Posts: 52

Thank you. It does exactly what I asked, but looking at the output I realized that I asked my question wrongly.

I should have made my question clearer, sorry about that. Once the reads are mapped to the reference, I want to give a position on the reference, say the 100th position, and get the base from every read that maps over that position, together with the read name. The solution above prints the 100th base of each read, which is not what I need: reads start at different places, and if there are indels in the reference or the reads, the offset within the read no longer matches the reference position. Is there any way to get what I need? I tried mpileup. It prints all the bases at that position but not the read names.
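
One way to get the read name next to the base is to walk each read's CIGAR string and translate the reference position into an offset in the read. The following is only a sketch under some assumptions: the function name base_at_ref_pos is made up for illustration, the input is SAM text on stdin (for example from samtools view), the reads have already been restricted to the chromosome of interest, and a read with a deletion over the position is reported as "*".

```shell
# Sketch: print "read_name<TAB>base" for every read covering a 1-based
# reference position. base_at_ref_pos is a hypothetical helper name.
# Usage: <SAM text on stdin> | base_at_ref_pos POSITION
base_at_ref_pos() {
  awk -F'\t' -v target="$1" '
    BEGIN { target += 0 }              # force numeric comparison
    /^@/ { next }                      # skip SAM header lines
    $6 == "*" || $10 == "*" { next }   # no CIGAR or no stored sequence
    {
      ref = $4 + 0    # reference position of the next reference-consuming base
      rd = 1          # index in SEQ of the next read-consuming base
      cigar = $6
      while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
        len = substr(cigar, 1, RLENGTH - 1) + 0
        op = substr(cigar, RLENGTH, 1)
        cigar = substr(cigar, RLENGTH + 1)
        if (op == "M" || op == "=" || op == "X") {
          # aligned block: consumes both read and reference
          if (target >= ref && target < ref + len) {
            print $1 "\t" substr($10, rd + target - ref, 1)
            next
          }
          ref += len; rd += len
        } else if (op == "I" || op == "S") {
          rd += len                    # insertion or soft clip: read only
        } else if (op == "D") {
          # deletion: the read has no base at these reference positions
          if (target >= ref && target < ref + len) {
            print $1 "\t*"
            next
          }
          ref += len
        } else if (op == "N") {
          ref += len                   # skipped region: reference only
        }
        # H and P consume neither the read nor the reference
      }
    }
  '
}
```

With an indexed BAM this could be used as: samtools view in.bam chr1:100-100 | base_at_ref_pos 100. The region filter only selects reads overlapping the position; the CIGAR walk is what picks the right base inside each read.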
07-02-2012, 03:15 PM   #4
SeekAnswers
Member
 
Location: USA

Join Date: Mar 2012
Posts: 21

Something like this?

Code:
# needs an indexed BAM; prints the read name and the full read sequence
samtools view in.bam 'chr1:start_pos-end_pos' | awk '{print $1"\t"$10}'
07-03-2012, 04:47 AM   #5
pbluescript
Senior Member
 
Location: Boston

Join Date: Nov 2009
Posts: 224

You should look into bedtools. You can use intersectBed to get the reads that overlap any defined interval, including single-base positions from a VCF file.
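
For a single position given as a plain number rather than a VCF record, the interval has to be written in BED coordinates, which are 0-based and half-open, so 1-based position 100 becomes start 99, end 100. A small sketch of that conversion follows; pos_to_bed, site.bed and overlapping.bam are made-up names, and the intersectBed line in the comment is an assumed invocation that should be checked against the installed bedtools version.

```shell
# Sketch: turn a 1-based position into a one-base BED record
# (BED is 0-based, half-open). pos_to_bed is a hypothetical helper name.
pos_to_bed() {
  # $1 = chromosome name, $2 = 1-based position
  printf '%s\t%d\t%d\n' "$1" "$(( $2 - 1 ))" "$2"
}

# Assumed usage, not taken from the thread:
#   pos_to_bed chr1 100 > site.bed
#   intersectBed -abam in.bam -b site.bed > overlapping.bam
```

Note that this only selects the reads overlapping the position. Pulling out the base each read carries at that position still needs a CIGAR-aware step.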