SEQanswers

Old 12-14-2009, 01:11 PM   #1
HTS
Member
 
Location: Toronto

Join Date: Nov 2009
Posts: 24
How to combine junctions.bed files produced by TopHat

Hi,

Basically, I would like to double-check before writing my own script. Since I need to pool samples with different read lengths, I have to run TopHat separately on each of them (using the -j option wherever appropriate). I already know how to merge the resulting .sam files with samtools and generate a combined coverage.wig file. Before putting effort into combining the junctions.bed files, I would like to know:

1. If there are tools/scripts to do this already, either within TopHat or outside.

2. If none, I would like to confirm whether the score field of the junctions.bed file is simply the number of uniquely mapping reads that are aligned to the junction, or whether multireads are also counted/weighted.

Please feel free to share your knowledge/experience/comments. Thanks a lot!

-- Leo
Old 12-14-2009, 01:47 PM   #2
DrD2009
Member
 
Location: Kansas City

Join Date: Oct 2009
Posts: 88

You could use Galaxy (http://main.g2.bx.psu.edu/). Upload the files (probably as file format tabular) you want to combine and then use the 'Concatenate queries' tool found under Text Manipulation. There's probably a short script out there to combine them, but this should work for what you want to do too.

-Brandon
Old 12-14-2009, 01:57 PM   #3
Cole Trapnell
Senior Member
 
Location: Boston, MA

Join Date: Nov 2008
Posts: 212

TopHat comes with an (undocumented, I realize) script called bed_to_juncs. Running it on a TopHat BED file will produce a .juncs-format file. To merge these, you can simply cat them together, sort, and pipe to uniq.
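
For readers who want to see what the convert, cat, sort, uniq route amounts to, here is a minimal Python sketch. It is not the bed_to_juncs script itself: the function names are hypothetical, and the coordinate arithmetic (left = chromStart + first block size - 1, right = chromStart + second block start) is an assumption about how the BED12 blocks map to .juncs positions, so check it against your TopHat version. Note that it keeps only chrom, left, right and strand, so scores are dropped.

```python
def bed_line_to_junc(line):
    """Convert one BED12 junction line to (chrom, left, right, strand).

    Returns None for blank lines and track/browser/comment lines.
    The coordinate arithmetic is an assumption, not TopHat's verified code.
    """
    if not line.strip() or line.startswith(("track", "browser", "#")):
        return None
    cols = line.rstrip("\n").split("\t")
    chrom, start, strand = cols[0], int(cols[1]), cols[5]
    block_sizes = [int(x) for x in cols[10].strip(",").split(",")]
    block_starts = [int(x) for x in cols[11].strip(",").split(",")]
    left = start + block_sizes[0] - 1   # last base of the left exon block
    right = start + block_starts[1]     # first base of the right exon block
    return (chrom, left, right, strand)


def merge_juncs(line_sources):
    """Equivalent of `cat | sort | uniq` on the converted junctions."""
    juncs = set()
    for lines in line_sources:
        for line in lines:
            junc = bed_line_to_junc(line)
            if junc is not None:
                juncs.add(junc)
    return sorted(juncs)
```

Each element of `line_sources` can be an open file handle, e.g. `merge_juncs([open("a.bed"), open("b.bed")])` with hypothetical file names.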
Old 12-14-2009, 02:08 PM   #4
HTS
Member
 
Location: Toronto

Join Date: Nov 2009
Posts: 24

Quote:
Originally Posted by Cole Trapnell
TopHat comes with an (undocumented, I realize) script called bed_to_juncs. Running it on a TopHat BED file will produce a .juncs-format file. To merge these, you can simply cat them together, sort, and pipe to uniq.
Hi Cole,

I am aware of the bed_to_juncs script, but when converting BED files to .juncs files it doesn't preserve the score and overhang information that I would like to combine. BTW, could you confirm whether my understanding of the score field in BED files is correct? Thanks a lot!

-- Leo

Last edited by HTS; 12-14-2009 at 02:27 PM.
Old 12-14-2009, 02:32 PM   #5
HTS
Member
 
Location: Toronto

Join Date: Nov 2009
Posts: 24

Quote:
Originally Posted by DrD2009
You could use Galaxy (http://main.g2.bx.psu.edu/). Upload the files (probably as file format tabular) you want to combine and then use the 'Concatenate queries' tool found under Text Manipulation. There's probably a short script out there to combine them, but this should work for what you want to do too.

-Brandon
Thanks for your reply, Brandon! I guess I didn't make myself clear but I need something considerably more sophisticated than concatenating multiple text files, especially considering what each field of the junctions.bed file means.
Old 12-14-2009, 02:44 PM   #6
DrD2009
Member
 
Location: Kansas City

Join Date: Oct 2009
Posts: 88

I'm new and just thought I'd try helping. Apparently it isn't the BED file format I am familiar with.
Old 06-23-2011, 10:37 AM   #7
fongchun
Member
 
Location: Vancouver, BC

Join Date: May 2011
Posts: 55

Did you end up writing your own script that merges the junctions.bed files from different samples while preserving the score? If yes, would you be willing to share that script, since this is exactly what I need to do?
Old 09-02-2011, 03:18 PM   #8
roryk
Member
 
Location: boston

Join Date: Aug 2010
Posts: 15

Quote:
Originally Posted by fongchun
Did you end up writing your own script that merges the junctions.bed files from different samples while preserving the score? If yes, would you be willing to share that script, since this is exactly what I need to do?
https://github.com/roryk/seqscripts has a bunch of little gluey type scripts to work with splice junctions, one of them does what you want. Download it and do this from the command line:

cat junctions_file1.bed junction_file_2.bed junction_file_3.bed| tbed2juncs | combineJuncs > combined.juncs

combined.juncs will be a BED file containing all of the junctions from the input junction files, with each junction's score being the sum of its scores across those files. The names of the junctions are the locations of the two nucleotides at the edges of the exons that are joined together.
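
In case the repository moves, the idea can be sketched in a few lines of Python. This is not the code from seqscripts, just a rough illustration under stated assumptions: input lines are TopHat-style BED12 with two blocks, the junction position is taken as chromStart + first block size - 1 and chromStart + second block start (an assumption about the format), and the function names and the `chrom:left-right` naming are hypothetical. Keying on the junction position rather than on chromStart/chromEnd matters, because the same junction can appear with different overhangs in different samples.

```python
from collections import defaultdict


def junction_key(cols):
    """Return (chrom, left, right, strand) for one split BED12 line."""
    start = int(cols[1])
    sizes = [int(x) for x in cols[10].strip(",").split(",")]
    starts = [int(x) for x in cols[11].strip(",").split(",")]
    # Assumed mapping from BED12 blocks to the two exon-edge positions.
    return (cols[0], start + sizes[0] - 1, start + starts[1], cols[5])


def combine_junction_scores(lines):
    """Sum the score column (field 5) per junction across all input lines.

    Returns tab-separated rows: chrom, left, right, name, summed score, strand.
    Overhang information is not carried over in this sketch.
    """
    totals = defaultdict(int)
    for line in lines:
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue  # skip blank lines and header lines
        cols = line.rstrip("\n").split("\t")
        totals[junction_key(cols)] += int(cols[4])
    rows = []
    for (chrom, left, right, strand), score in sorted(totals.items()):
        name = f"{chrom}:{left}-{right}"
        rows.append("\t".join([chrom, str(left), str(right), name, str(score), strand]))
    return rows
```

To use it on files, chain the inputs together, e.g. `combine_junction_scores(itertools.chain(open("a.bed"), open("b.bed")))` with hypothetical file names, and write the returned rows out one per line.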
Old 05-03-2015, 03:33 AM   #9
pengchy
Senior Member
 
Location: China

Join Date: Feb 2009
Posts: 116

Quote:
Originally Posted by roryk
https://github.com/roryk/seqscripts has a bunch of little gluey type scripts to work with splice junctions, one of them does what you want. Download it and do this from the command line:

cat junctions_file1.bed junction_file_2.bed junction_file_3.bed| tbed2juncs | combineJuncs > combined.juncs

combined.juncs will be a BED file containing all of the junctions from the input junction files, with each junction's score being the sum of its scores across those files. The names of the junctions are the locations of the two nucleotides at the edges of the exons that are joined together.
Great work! Thank you, roryk.