SEQanswers

10-12-2011, 12:01 PM   #1
empyrean
Member
 
Location: EU

Join Date: Sep 2010
Posts: 52
Getting Reads from Bam file

Hello everyone.

I have a BAM file of approximately 50 GB from HiSeq paired-end data. I aligned it using BWA, and now I am interested in getting the reads at a particular location, for example chromosome 1 from 100000 to 101000. I would like to make a subset of the BAM file covering that range. How can I do that?

Thank you for your help!!
10-12-2011, 01:21 PM   #2
iansealy
Member
 
Location: Hitchin, UK

Join Date: Oct 2010
Posts: 15

Personally, I'd use samtools view to extract the region of interest into another BAM file. If you then want the reads themselves, you could use Picard's SamToFastq.

Cheers,
Ian
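
A rough sketch of the second step, for anyone who wants something to start from. It assumes a Picard release distributed as a single picard.jar (older releases shipped a separate SamToFastq.jar), and the file names and the bam_to_fastq helper are made up for illustration. The region BAM itself would come from samtools view. Reads whose mate maps outside the region end up unpaired in the subset, and SamToFastq may complain about them.

```shell
#!/bin/sh
# Sketch only: the picard.jar location and all file names are placeholders.

# Convert a BAM file to a pair of FASTQ files with Picard's SamToFastq.
# $1 = input BAM, $2 = prefix for the two output FASTQ files.
bam_to_fastq() {
    in_bam=$1
    prefix=$2
    # Set PICARD_JAR to override the assumed jar location.
    java -jar "${PICARD_JAR:-picard.jar}" SamToFastq \
        I="$in_bam" \
        FASTQ="${prefix}_1.fastq" \
        SECOND_END_FASTQ="${prefix}_2.fastq"
}

# Only run when the extracted BAM is actually present.
if [ -e region.bam ]; then
    bam_to_fastq region.bam region
fi
```

With region.bam in the current directory this would write region_1.fastq and region_2.fastq.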
10-12-2011, 01:31 PM   #3
empyrean
Member
 
Location: EU

Join Date: Sep 2010
Posts: 52

How can I specify the parameters for samtools view to extract the reads in that region?

I see the options below in samtools.

Quote:
Usage: samtools view [options] <in.bam>|<in.sam> [region1 [...]]

Options:
  -b        output BAM
  -h        print header for the SAM output
  -H        print header only (no alignments)
  -S        input is SAM
  -u        uncompressed BAM output (force -b)
  -1        fast compression (force -b)
  -x        output FLAG in HEX (samtools-C specific)
  -X        output FLAG in string (samtools-C specific)
  -c        print only the count of matching records
  -L FILE   output alignments overlapping the input BED FILE [null]
  -t FILE   list of reference names and lengths (force -S) [null]
  -T FILE   reference sequence file (force -S) [null]
  -o FILE   output file name [stdout]
  -R FILE   list of read groups to be outputted [null]
  -f INT    required flag, 0 for unset [0]
  -F INT    filtering flag, 0 for unset [0]
  -q INT    minimum mapping quality [0]
  -l STR    only output reads in library STR [null]
  -r STR    only output reads in read group STR [null]
  -s FLOAT  fraction of templates to subsample; integer part as seed [-1]
  -?        longer help
10-12-2011, 01:33 PM   #4
alpesh
Junior Member
 
Location: Iowa

Join Date: Oct 2011
Posts: 7

A region can be presented, for example, in the following format: ‘chr2’ (the whole chr2), ‘chr2:1000000’ (region starting from 1,000,000bp) or ‘chr2:1,000,000-2,000,000’ (region between 1,000,000 and 2,000,000bp including the end points). The coordinate is 1-based.

Example:

samtools view aln.sorted.bam chr2:20,100,000-20,200,000
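
For the region in the original question (chromosome 1, 100000 to 101000), something along these lines should write the subset to a new BAM file instead of printing SAM to the screen. Treat it as a sketch: the file names are placeholders, region and extract_region are just illustrative helper names, and the chromosome may be called 1 rather than chr1 depending on the reference used with BWA (samtools view -H shows the names in the header). Region queries need a coordinate-sorted, indexed BAM, hence the samtools index step.

```shell
#!/bin/sh
# Sketch only: file names and the chromosome name are placeholders.

# Build a samtools region string (1-based, end points included).
# $1 = reference name, $2 = start, $3 = end.
region() {
    printf '%s:%s-%s\n' "$1" "$2" "$3"
}

# Write the alignments overlapping a region to a new BAM file.
# $1 = sorted input BAM, $2 = output BAM, $3 = region string.
extract_region() {
    in_bam=$1
    out_bam=$2
    reg=$3
    # Create the .bai index if it is not there yet.
    [ -e "$in_bam.bai" ] || samtools index "$in_bam"
    # -b writes BAM instead of SAM; -o names the output file.
    samtools view -b -o "$out_bam" "$in_bam" "$reg"
}

# Only run when the input BAM is actually present.
if [ -e aln.sorted.bam ]; then
    extract_region aln.sorted.bam region.bam "$(region chr1 100000 101000)"
fi
```

Adding -c to a plain samtools view call with the same region prints only the number of matching records, which is a quick way to check the subset before writing it out.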
10-12-2011, 01:57 PM   #5
empyrean
Member
 
Location: EU

Join Date: Sep 2010
Posts: 52

Thank you, it worked.
All times are GMT -8.

