SEQanswers

Old 01-10-2010, 06:47 AM   #1
zhenshao
Junior Member
 
Location: boston

Join Date: Jan 2010
Posts: 2
How to transform BAM format to .TXT or .BED?

Dear all,

I downloaded a file in .BAM format and want to transform it into .BED format. What can I do? Thanks a lot!

Zhen
Old 01-10-2010, 05:53 PM   #2
quinlana
Senior Member
 
Location: Charlottesville

Join Date: Sep 2008
Posts: 119

Hi,
I just finished a new version of BEDTools, which has a C++ utility called bamToBed. This tool converts BAM alignments to BED or BEDPE format (see the BEDTools documentation). For example:

1. Convert BAM alignments to BED format.
Code:
$ bamToBed -i reads.bam > reads.bed
2. Convert BAM alignments to BED format using edit distance (NM) as the BED “score”. Default is mapping quality.
Code:
$ bamToBed -i reads.bam -ed > reads.bed
3. Convert BAM alignments to BEDPE format.
Code:
$ bamToBed -i reads.bam -bedpe > reads.bedpe
Heng Li also posted a nice example of how to create a BAMToBED utility using the SamTools code base.


You might also be interested in two other utilities in BEDTools that now support BAM input and output. intersectBed now accepts BAM files as input and compares each alignment (each end separately if paired-end) to a BED file. One can create a new BAM file from those alignments that do or do not overlap the BED features in question. pairToBed does the same thing, but requires that the BAM file be paired. This tool is a bit more sophisticated in that one can require the "span" of the aligned pair to overlap, as well as either/both/neither/xor/notboth ends of the pair.

For example:

1. Retain only paired-end BAM alignments where neither end overlaps simple sequence repeats.
Code:
$ pairToBed -abam reads.bam -b SSRs.bed -type neither > reads.noSSRs.bam
2. Retain only paired-end BAM alignments where both ends overlap segmental duplications.
Code:
$ pairToBed -abam reads.bam -b segdups.bed -type both > reads.segdups.bam
3. Retain only paired-end BAM alignments where neither or one and only one end overlaps segmental duplications.
Code:
$ pairToBed -abam reads.bam -b segdups.bed -type notboth > reads.notbothSegdups.bam

The BAM support is built upon Derek Barnett's nice C++ BAM API called BAMTools (http://sourceforge.net/projects/bamtools/). I'd encourage you to take a look at the new BEDTools manual for more details if you are interested.

Best,
Aaron
Old 01-10-2010, 08:58 PM   #3
dawe
Senior Member
 
Location: 45°30'25.22"N / 9°15'53.00"E

Join Date: Apr 2009
Posts: 258

Hi,

Quote:
Originally Posted by zhenshao View Post
Dear all,

I downloaded a file in .BAM format and want to transform it into .BED format. What can I do? Thanks a lot!

Zhen
I'm happy that BEDTools now includes a bam->bed conversion utility... BTW, you may still try this (at least for Illumina reads):

Code:
samtools view -F 0x0004 $filein | awk '{OFS="\t"; if (and($2, 16)) print $3,$4,$4+length($10),$1,$5,"-"; else print $3,$4,$4+length($10),$1,$5,"+" }'
d
Old 01-11-2010, 10:31 AM   #4
quinlana
Senior Member
 
Location: Charlottesville

Join Date: Sep 2008
Posts: 119

Here is the sample code from Heng Li.

http://sourceforge.net/apps/mediawik...e=SAM_protocol
Protocol #4 describes his cut at bamToBed.

Aaron
Old 09-18-2013, 07:11 AM   #5
freeseek
Junior Member
 
Location: Cambridge MA

Join Date: Feb 2010
Posts: 8
bedtools bamtobed when large indels are present

I really like the bedtools bamtobed command, but in some cases reads span very large indels and you don't want the BED intervals to include them. The only information you need is in columns 3 (chromosome), 4 (base-pair start), and 6 (CIGAR) of each alignment record. Here is a simple awk script that should work (it really should be an option of bedtools bamtobed):
Code:
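samtools view in.bam |
  awk '{split($6, a, "[MIDNSHP=X]"); bp = $4 - 1; n = 0;  # BED start is 0-based
    for (i = 1; i <= length(a); i++) {
      n += 1 + length(a[i]); op = substr($6, n, 1);       # operation that follows the i-th length
      if (op == "M" || op == "=" || op == "X") {          # aligned block: emit one BED interval
        start = bp; bp += a[i]; print $3 "\t" start "\t" bp;
      }
      if (op == "D" || op == "N") bp += a[i];             # deletion or skipped region: advance only
    }
  }' > out.bed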
Old 06-19-2014, 02:29 AM   #6
mslider
Junior Member
 
Location: france

Join Date: Sep 2010
Posts: 24

Hi,

I get a strange result when comparing bamToBed with the awk command line:

samtools view -F 0x0004 464_J3_D1.bam | head -1
IP6FNQC01CAO42 0 gi|2281652|gb|AF004394.1| 18 40 5S304M7841N12M1D38M3S * 0 0 TTAACTCCCAGAAAAGACAAGATATCCTTGATCTGTGGGTCTACCACACGCAAGGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGATCAGATATCCACTGACCTTTGGATGGTGCTTCAAGCTAGTACCAGTGGAGCCAGAGAAGGTAGAAGAGGCCAATGAAGGAGAGAACAACAGCCTGTTACACCCTATGAGCCTGCATGGGATGGAGGACCCGGAGAAGGAAGTGTTAATGTGGCGGTTTGACAGCAGCCTAGCATTTCATCACATGGCCCGAGAGCTGCATCCGGAGCACTACAAGAACCAACAAGAAAGAATGAACAAGAATTATTAGAATTGGATAAATGGGACA 433146444?8.//153FFFFFFIIIIGGIIIIIII:::=IIIGGIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIGIIIIIIIGGG888GGI666GIIIIIIIIIIIIIGFEEGGIIIII===GGGGGGGGGGGGGGGGGGGGGGGGGGGGG@@@@GGGDDE@>>AACCGDDDDDDDDE<<<>DCDFIIIIEDFFDDDDFDDFFDDDFFECC221;C>>>B?>888EGC>>>@CGGGC>>>BBBBB>>>::333<>;;;>>BBBGDDDDCCCDDDDDCCBAAAABCDBBBBBBAAA4444@@BB?==A???A?<<444;40..../588633579<<../0009<<<<<::=988:////25 MD:Z:17G17T8A45G46T9C2A14A14T21A6A5T3A6GA8GCA4AA10C41T13G4^A27A5G4 NH:i:1 HI:i:1 NM:i:26 SM:i:40 XQ:i:40 X2:i:0 XS:A:?

bamToBed gives me this result:
gi|2281652|gb|AF004394.1| 17 8213 IP6FNQC01CAO42 40 +

and
awk '{OFS="\t"; if (and($2, 16)) print $3,$4,$4+length($10),$1,$5,"-"; else print $3,$4,$4+length($10),$1,$5,"+" }'

gives me:
gi|2281652|gb|AF004394.1| 18 380 IP6FNQC01CAO42 40 +


Why does the bamToBed result end at 8213?

Thank you
Old 06-19-2014, 04:33 AM   #7
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

The read is spliced (note 7841N in the CIGAR string), so bamToBed is correct.
Old 06-19-2014, 05:08 AM   #8
mslider
Junior Member
 
Location: france

Join Date: Sep 2010
Posts: 24

Yes, but what does the second coordinate, 8213, mean?
Old 06-19-2014, 05:23 AM   #9
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

The second value would be the end of where the read aligns.
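
For the read in post #6 this can be checked by hand. SAM positions are 1-based and BED starts are 0-based, so the start is 18 - 1 = 17. The end is the start plus the lengths of the CIGAR operations that consume reference bases (M, D, N, =, X): 17 + 304 + 7841 + 12 + 1 + 38 = 8213. Below is a minimal awk sketch of the same arithmetic. The function name cigar_to_bed_coords is made up for illustration and is not part of samtools or BEDTools.

```shell
# Hypothetical helper: print the 0-based BED start and end for a SAM POS and CIGAR.
cigar_to_bed_coords() {
  printf '%s\t%s\n' "$1" "$2" |
    awk -F'\t' '{
      span = 0; rest = $2
      # Walk the CIGAR one <length><operation> pair at a time.
      while (match(rest, /^[0-9]+[MIDNSHP=X]/)) {
        len = substr(rest, 1, RLENGTH - 1) + 0
        op  = substr(rest, RLENGTH, 1)
        # Only M, D, N, = and X consume reference bases.
        if (op ~ /[MDN=X]/) span += len
        rest = substr(rest, RLENGTH + 1)
      }
      start = $1 - 1            # SAM POS is 1-based, BED start is 0-based
      print start "\t" (start + span)
    }'
}

# POS and CIGAR copied from the read in post #6.
cigar_to_bed_coords 18 '5S304M7841N12M1D38M3S'
```

The soft-clipped bases (5S, 3S) are part of the read sequence but not of the alignment on the reference, which is why adding length($10) to the start gives a different end.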
Old 06-19-2014, 05:31 AM   #10
mslider
Junior Member
 
Location: france

Join Date: Sep 2010
Posts: 24

Okay, I understand.
Is there a way to calculate the length of the spliced region?
Old 06-19-2014, 05:45 AM   #11
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Have a read through the SAM specification.
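
In the SAM specification, the N operation in the CIGAR string marks a region skipped on the reference (an intron, for spliced reads), so summing the N lengths gives the spliced length. A minimal sketch follows; the helper name spliced_length is hypothetical and not part of any tool.

```shell
# Hypothetical helper: sum the lengths of all N (skipped region) operations in a CIGAR.
spliced_length() {
  printf '%s\n' "$1" |
    awk '{
      total = 0; rest = $0
      # Find each run of digits that is immediately followed by N.
      while (match(rest, /[0-9]+N/)) {
        total += substr(rest, RSTART, RLENGTH - 1)
        rest = substr(rest, RSTART + RLENGTH)
      }
      print total
    }'
}

# CIGAR copied from the read in post #6; prints 7841.
spliced_length '5S304M7841N12M1D38M3S'
```

To apply it to a whole file, the CIGAR is column 6 of the samtools view output.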
Old 11-14-2015, 01:08 PM   #12
seqprone
Junior Member
 
Location: USA

Join Date: Oct 2015
Posts: 8

Is there a way to extract a range, say [chr3, a, b], from a BAM file into BED format?
Old 11-14-2015, 05:03 PM   #13
blancha
Senior Member
 
Location: Montreal

Join Date: May 2013
Posts: 367

How about samtools view to select the region of interest, and then bedtools bamtobed to convert the BAM output to BED format?

You can pipe the output of samtools view directly to bamtobed.
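
A minimal sketch of that pipeline, assuming the BAM file is coordinate-sorted and indexed (samtools index), since samtools view needs the index for region queries, and assuming the conversion step is bedtools bamtobed reading from stdin. The wrapper name bam_region_to_bed, the file names, and the coordinates are placeholders.

```shell
# Hypothetical wrapper: bam_region_to_bed <in.bam> <region>
# Writes BED lines for the alignments overlapping <region> to stdout.
bam_region_to_bed() {
  # -b keeps the output in BAM format so that bamtobed can read it from the pipe.
  samtools view -b "$1" "$2" | bedtools bamtobed -i stdin
}

# Example call (placeholder file and coordinates):
#   bam_region_to_bed in.bam chr3:1000-2000 > region.bed
```

Note that the region query returns every alignment overlapping the range, so some BED intervals may extend past its ends.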

Last edited by blancha; 11-14-2015 at 05:33 PM.
All times are GMT -8.