SEQanswers

05-12-2010, 10:01 PM   #1
Maria_Lu
Junior Member
 
Location: Shanghai

Join Date: May 2010
Posts: 4
TopHat: the results confused me

I used TopHat to find exon-exon junctions.

But the results in the output 'junctions.bed' files confused me.
I separated the RNA-seq data into two datasets, one 76bp*2 and the other 40bp*2, then ran TopHat on each individually.

However, the two 'junctions.bed' files gave different results.
Here are examples from the two 'junctions.bed' files.
One reports the following junctions:
chromosome12 12302 12721 JUNC00000002 5 -
chromosome12 33389 34997 JUNC00000003 6 +
chromosome12 33688 34964 JUNC00000004 2 +
chromosome12 35474 35675 JUNC00000005 5 +
chromosome12 35718 35949 JUNC00000006 9 +

The other reports the following junctions:
chromosome12 12303 12723 JUNC00000005 26 -
chromosome12 33679 34982 JUNC00000007 3 +
chromosome12 35490 35674 JUNC00000008 6 +
chromosome12 35711 35949 JUNC00000009 7 +

The junction locations in the two output files were similar but not identical.
When I extracted the sequences at these positions, no GT-AG was found.

Does anybody know how to explain it?
05-14-2010, 06:38 PM   #2
lifeng.tian
Member
 
Location: Philadelphia

Join Date: Jul 2009
Posts: 16

Maria,

Since you did not give the complete junctions.bed lines, here are two lines
from my own TopHat analysis:

75nt run
chr20 251862 256677 JUNC00000001 11 - 251862 256677 255,0,0 2 46,69 0,4746


50nt run
chr20 251879 257723 JUNC00000001 2 - 251879 257723 255,0,0 2 29,39 0,5805


Here is the link to the BED format at UCSC: http://genome.ucsc.edu/FAQ/FAQformat.html#format1

Columns 2 and 3 give the span of the whole feature, including the flanking
blocks covered by reads on each side, so they are not the junction itself.
You can calculate the (start,end) of the junction as follows.
E.g., for the 75nt run, look at columns 2, 10, 11 and 12:
col 2: feature start 251862
col 10: block count 2
col 11: block sizes 46,69
col 12: block starts 0,4746
The splice junction coordinates are then:
start: col2 + first item of col11 + 1
end: col2 + second item of col12
i.e.,
start: 251862+46+1=251909, end: 251862+4746=256608
chr20:251909-256608


For the 50nt run, the same calculation gives:
start: 251879+29+1=251909, end: 251879+5805=257684
chr20:251909-257684
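
The calculation above can be sketched in a few lines of Python. This is only an illustration, not part of TopHat: the function name junction_from_bed is made up for the example, and it assumes a whitespace-separated 12-column BED line with exactly two blocks, as in the two lines quoted earlier.

```python
def junction_from_bed(line):
    """Return (chrom, start, end) of the junction in one junctions.bed line.

    Hypothetical helper: assumes 12 whitespace-separated BED columns
    and exactly two blocks (the flanking regions on either side).
    """
    fields = line.split()
    chrom = fields[0]
    feature_start = int(fields[1])  # col 2
    # col 11 and col 12 are comma-separated lists; tolerate a trailing comma
    block_sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
    block_starts = [int(x) for x in fields[11].rstrip(",").split(",")]
    start = feature_start + block_sizes[0] + 1  # col2 + first block size + 1
    end = feature_start + block_starts[1]       # col2 + second block start
    return chrom, start, end


run_75 = "chr20 251862 256677 JUNC00000001 11 - 251862 256677 255,0,0 2 46,69 0,4746"
run_50 = "chr20 251879 257723 JUNC00000001 2 - 251879 257723 255,0,0 2 29,39 0,5805"

for line in (run_75, run_50):
    chrom, start, end = junction_from_bed(line)
    print(f"{chrom}:{start}-{end}")
# prints:
# chr20:251909-256608
# chr20:251909-257684
```

To process a whole file, the same function could be applied to each line of junctions.bed, skipping any line that begins with "track".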


So in this case TopHat finds two different junctions: they share the same
start but end at different positions. You can also paste the BED lines
directly into a UCSC custom track and visualize them.

Hope it helps.

Lifeng
05-14-2010, 06:54 PM   #3
Maria_Lu
Junior Member
 
Location: Shanghai

Join Date: May 2010
Posts: 4

Hi, Lifeng,

Thank you so much for your kindness.
I'll recalculate these junctions as you suggested.