04-05-2011, 01:16 PM   #1
Junior Member
Location: St Louis, Missouri, USA

Join Date: Jul 2010
Posts: 2
How to make sense of TopHat's output file 'junctions.bed'

This is an excerpt from junctions.bed, TopHat's output file, generated using paired-end reads. Can somebody suggest how to make sense of the two BED blocks? Both BED blocks have the same coordinates. Also, how should the scores be interpreted? (They apparently represent the number of alignments spanning each junction.)

chr20 9353709 9360718 JUNC00000552 2 + 9353709 9360718 255,0,0 2 42,18 0,6991
chr20 9365023 9368124 JUNC00000553 1 + 9365023 9368124 255,0,0 2 35,15 0,3086
chr20 9368172 9370544 JUNC00000554 2 + 9368172 9370544 255,0,0 2 31,19 0,2353
chr20 9371222 9374262 JUNC00000555 7 + 9371222 9374262 255,0,0 2 40,28 0,3012
chr20 9374285 9376179 JUNC00000556 1 + 9374285 9376179 255,0,0 2 40,10 0,1884
chr20 9376224 9382178 JUNC00000557 5 + 9376224 9382178 255,0,0 2 41,42 0,5912
chr20 9385955 9388573 JUNC00000558 1 + 9385955 9388573 255,0,0 2 40,10 0,2608
chr20 9388666 9389312 JUNC00000559 4 + 9388666 9389312 255,0,0 2 39,33 0,613
chr20 9389328 9389741 JUNC00000560 6 + 9389328 9389741 255,0,0 2 36,38 0,375
chr20 9389783 9391703 JUNC00000561 3 + 9389783 9391703 255,0,0 2 45,20 0,1900
Gaurav Singhal
gsinghal
02-28-2012, 06:50 PM   #2
Junior Member
Location: Melbourne

Join Date: Feb 2012
Posts: 2
Explanation of junctions.bed

[seqname] [start] [end] [id] [score] [strand] [thickStart] [thickEnd] [r,g,b] [block_count] [block_sizes] [block_locations]
"start" is the start position of the leftmost read that contains the junction.
"end" is the end position of the rightmost read that contains the junction.
"id" is the junction's ID, e.g. JUNC0001.
"score" is the number of reads that contain the junction.
"strand" is either + or -.
"thickStart" and "thickEnd" don't seem to have any effect on display for a junctions track. TopHat sets them as equal to start and end respectively.
"r","g" and "b" are the red, green, and blue values. They affect the colour of the display.
"block_count", "block_sizes" and "block_locations":
The block_count will always be 2. The two blocks specify the regions on either side of the junction. "block_sizes" tells you how large each region is, and "block_locations" tells you, relative to the "start" being 0, where the two blocks occur. Therefore, the first block_location will always be zero.

[block1][      junction      ][block2]
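The field layout above can be turned into junction coordinates with a few lines of Python. This is only a sketch, not anything shipped with TopHat: the `Junction` type and `parse_junction` function are hypothetical names made up for illustration. It assumes standard BED conventions (0-based start, end-exclusive coordinates) and that each block is the read overhang flanking the junction rather than a whole exon. Under those assumptions the gap starts where block 1 ends and stops where block 2 begins.

```python
from typing import NamedTuple


class Junction(NamedTuple):
    """Hypothetical container for one junctions.bed row (illustration only)."""
    chrom: str
    intron_start: int  # 0-based position of the first base in the gap
    intron_end: int    # 0-based, exclusive (first base of block 2)
    score: int         # number of reads/alignments spanning the junction
    strand: str


def parse_junction(line: str) -> Junction:
    """Parse one whitespace-separated junctions.bed line into a Junction."""
    fields = line.split()
    chrom = fields[0]
    start = int(fields[1])
    score = int(fields[4])
    strand = fields[5]
    # Block lists are comma-separated; tolerate a trailing comma.
    block_sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
    block_starts = [int(x) for x in fields[11].rstrip(",").split(",")]
    # The gap begins where block 1 ends and ends where block 2 begins.
    intron_start = start + block_starts[0] + block_sizes[0]
    intron_end = start + block_starts[1]
    return Junction(chrom, intron_start, intron_end, score, strand)


# First row of the excerpt posted in #1.
row = "chr20 9353709 9360718 JUNC00000552 2 + 9353709 9360718 255,0,0 2 42,18 0,6991"
j = parse_junction(row)
print(j.chrom, j.intron_start, j.intron_end, j.score, j.strand)
```

For that row, block 1 covers 42 bases starting at 9353709, so the gap runs from 9353751 up to (but not including) 9360700, where block 2 starts. The first block_location is 0 because block 1 begins at "start"; it ends at the junction.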
Alex124
08-29-2012, 12:26 PM   #3
Senior Member
Location: Mexico

Join Date: Mar 2011
Posts: 137


I don't quite understand the block_sizes and block_locations fields. My understanding, though I think I'm wrong, is that the block_sizes field indicates the sizes of the 2 exons a,b (blocks) joined by the splice junction?

And the block_locations field would indicate the positions, relative to the junction (feature) start position, where the 2 exons a,b each begin? But this really makes no sense to me, since [as the first value of this field is 0] it would mean that the first exon starts right where the splice junction begins, which is actually where it (the exon) ends.

Thanks for sharing your knowledge,

carmeyeii
08-29-2012, 08:27 PM   #4
Junior Member
Location: Melbourne

Join Date: Feb 2012
Posts: 2
Try IGV

The easiest way to understand this output is to load it into IGV, the Broad Institute's Integrative Genomics Viewer. You can then compare the values with what shows on the screen, try changing them to see what effect that has, etc.


Alex124
09-03-2012, 07:49 AM   #5
Junior Member
Location: beijing

Join Date: Apr 2012
Posts: 9
Use cufflinks or cuffdiff to get the gene expression value?

I used TopHat, Cufflinks and Cuffdiff to analyze my mRNA sequencing data, and I am confused about the gene expression values. We have 7 samples in my experiment. I can use Cufflinks to produce every gene's expression value (FPKM) in each stage, and I can also get the genes' expression values by running Cuffdiff on all 7 samples together. But the expression values produced by Cufflinks and Cuffdiff are not the same, so could you give me some guidance on that? Thank you.
xiongdianguang

junctions, output, paired end reads, rna-seq, tophat
