SEQanswers

Old 07-31-2011, 05:40 PM   #1
emilyjia2000
Member
 
Location: usa

Join Date: May 2011
Posts: 59
Default tophat-fusion-post result empty

Hello,

Does anyone use tophat-fusion-post? When I run it, the output is empty. I put the tophat-fusion outputs, the blast database and the annotation files in the same directory. The command line I used is:
tophat-fusion-post -p 8 --num-fusion-reads 1 --num-fusion-pairs 2 --num-fusion-both 5 hg18

I don't know what's wrong with the process. I would really appreciate any help.

Thanks,
Old 08-04-2011, 09:44 AM   #2
ahmetz
Member
 
Location: new york

Join Date: Jun 2011
Posts: 23
Default

I'm having the same issue. After I ran tophat-fusion, I set up the folder structure shown on the website, with the blast binaries and sequences. When I run tophat-fusion-post, it starts by extracting 23-mers around fusions and mapping them using bowtie. The fusion_seq.bwtout, fusion_seq.fa and fusion_seq.map files are generated. After tophat-fusion-post filters the fusions, it writes 0 fusions to potential_fusion.txt. Does anyone have any idea why this might be? My filtering criterion is a minimum of 1 fusion read, and I know there are fusions with more than 1 read in the fusions.out file from tophat-fusion. I tried to follow the Python code but got lost after a while, since I'm new to this. Any help is greatly appreciated. Thanks.
Old 08-05-2011, 04:51 PM   #3
Jon_Keats
Senior Member
 
Location: Phoenix, AZ

Join Date: Mar 2010
Posts: 279
Default

I've got the same problem. It extracts and maps the 23-mers without a problem, but no potential fusions are identified. If you follow their "awk filtering" suggestion on the fusions.out file, the first line is a known IgH translocation in this sample with over 100 reads spanning the break, 100 pairs flanking, etc. I think I'll try to simulate a BCR-ABL translocation and see if that gets mapped and survives filtering. It would be nice if there were a small test dataset to validate each install, like they have for bowtie and tophat.
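
For anyone who wants to repeat that kind of manual filtering without awk, here is a rough Python sketch. It is not the filter that tophat-fusion-post applies internally. It assumes that each data line in fusions.out is whitespace-separated and that the fifth and sixth fields hold the number of spanning reads and supporting mate pairs; check that against the manual for your version. The function name `filter_fusions` and the `reads_col` / `pairs_col` parameters are made up for this example.

```python
def filter_fusions(lines, min_reads=1, min_pairs=2, reads_col=4, pairs_col=5):
    """Return the lines whose read and pair support meet the thresholds.

    Assumption: fields are whitespace-separated and the 0-based columns
    reads_col and pairs_col hold integer support counts. Adjust both if
    your fusions.out layout differs.
    """
    kept = []
    for line in lines:
        fields = line.split()
        # Skip lines that are too short to contain both count columns.
        if len(fields) <= max(reads_col, pairs_col):
            continue
        try:
            reads = int(fields[reads_col])
            pairs = int(fields[pairs_col])
        except ValueError:
            # Not a data line (for example a sequence or comment line).
            continue
        if reads >= min_reads and pairs >= min_pairs:
            kept.append(line)
    return kept
```

Calling `filter_fusions(open("tophat_MCF7_1/fusions.out"))` and getting a non-empty list while potential_fusion.txt is empty would at least suggest that the candidates are being lost in the post-processing step rather than in the alignment.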
Old 08-08-2011, 07:14 AM   #4
ahmetz
Member
 
Location: new york

Join Date: Jun 2011
Posts: 23
Default

I actually wanted to use tophat-fusion-post for the way it presents the results, like giving you gene names along with chromosome coordinates and showing the reads around the fusion point. When I filter with a threshold of 10 reads, I get about 20 fusions, and going around trying to figure out the genes manually seems a bit tedious given that I have a lab meeting next week. Oh well.
Old 08-08-2011, 07:20 AM   #5
Jon_Keats
Senior Member
 
Location: Phoenix, AZ

Join Date: Mar 2010
Posts: 279
Default

This weekend I downloaded and ran the MCF7 dataset from the tophat-fusion website using the settings described in the manual. No luck: the potential_fusion.txt file is still empty. I've sent a question to the tophat-cufflinks email, so I'll post if I get any answers.
Old 08-08-2011, 09:34 AM   #6
ahmetz
Member
 
Location: new york

Join Date: Jun 2011
Posts: 23
Default

I hope you hear from them, Jon. Do post here if you do.
Old 08-11-2011, 04:22 PM   #7
Jon_Keats
Senior Member
 
Location: Phoenix, AZ

Join Date: Mar 2010
Posts: 279
Default

No response yet, but I have a feeling there is some odd compiling issue between systems. I've tried three things:
1) compiled from source on linux = tophat-fusion fails (never tried tophat-fusion-post)
2) tried pre-compiled linux binary = tophat-fusion works, tophat-fusion-post fails to identify any fusion even in MCF7 data from developer site
3) tried pre-compiled mac binary = tophat-fusion works, tophat-fusion-post identifies fusions but I get a blast error (looks to be memory limitation of my laptop).
Old 08-12-2011, 06:41 AM   #8
ahmetz
Member
 
Location: new york

Join Date: Jun 2011
Posts: 23
Default

When I compiled from source, tophat-fusion failed to compile altogether, for a reason I couldn't work out. I then used the precompiled version for Mac and got the error I listed above. I wonder if we'll hear from them.
Old 08-12-2011, 07:09 AM   #9
Jon_Keats
Senior Member
 
Location: Phoenix, AZ

Join Date: Mar 2010
Posts: 279
Default

I did forget to mention that I've tried compiling from source on a Mac and get more or less the same error that you saw:

Code:
-- tophat 1.2.0 Configuration Results --
  C compiler:          gcc -Wall -arch x86_64 -O3  -DNDEBUG
  C++ compiler:        g++ -Wall -arch x86_64 -O3  -DNDEBUG -I/Users/jkeats/local/include
  GCC version:         i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5664)
  Host System type:    i386-apple-darwin10.8.0
  Install prefix:      /Users/jkeats/local2
  Install eprefix:     ${prefix}

  See config.h for further configuration information.
  Email <cole@cs.umd.edu> with questions and bug reports.

jkeats-ML:tophatfusion-0.1.0 jkeats$ make
make  all-recursive
Making all in src
g++ -Wall -arch x86_64 -O3  -DNDEBUG -I/Users/jkeats/local/include -I./SeqAn-1.2 -Wall -arch x86_64 -O3  -DNDEBUG -I/Users/jkeats/local/include  -L/usr/lib -o prep_reads -L/Users/jkeats/local/lib prep_reads.o ../src/libtophat.a -lbam -lz -lz 
Undefined symbols:
  "_solToPhred", referenced from:
      charToPhred33(char, bool, bool)in prep_reads.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
make[2]: *** [prep_reads] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

Did you align to hg18 or hg19? The annotation files provided in the package are for hg19. For comparison, in my case I'm running the Mac binary on a Snow Leopard machine, aligning to the hg19 index provided on the bowtie site. At least when I tested the MCF7 dataset using the settings on the website, it worked apart from the blast error and generated reports with annotated breaks, including gene names, positions, exon numbers, etc.
Old 10-17-2011, 03:46 AM   #10
Jeckow
Junior Member
 
Location: Florence, Italy

Join Date: Sep 2009
Posts: 4
Default

I've got the same problem.
I contacted the authors but have not received a response.
Has anyone solved this yet?
Old 10-17-2011, 09:13 AM   #11
lshen
Member
 
Location: Toronto

Join Date: Jan 2008
Posts: 30
Default

No. I checked the code but could not figure out how the output can be empty. I ran it on Python 2.6.5 and 2.7.1; neither worked.

A bit frustrating, because I had already finished the tophat-fusion step for 12 samples with a big effort.

Plus, I could not use an existing genes.gtf at the tophat-fusion step. It worked if I did not provide an existing genes.gtf.
Old 10-17-2011, 09:45 AM   #12
LOH
Registered Vendor
 
Location: CA

Join Date: Jul 2010
Posts: 23
Default

Same problem here. Tried on Mac and Ubuntu (with Python 2.7.1). tophat-fusion-post did not work. Contacted the author but got no response.
Old 10-19-2011, 11:14 AM   #13
ahmetz
Member
 
Location: new york

Join Date: Jun 2011
Posts: 23
Default

I just tried tophat-fusion-post with freshly downloaded Mac binaries, and it actually worked. The only problem is that the HTML file doesn't show my fusions, and I think blast is not working, probably because of the way I unzipped the database. Would anyone care to comment on whether I have the right configuration for the database?

Old 10-02-2012, 10:21 AM   #14
tankman
Member
 
Location: usa

Join Date: Sep 2012
Posts: 22
Default tophat-fusion-post example

Hi ahmetz,

I'm still struggling to get the tophat-fusion-post example to work -- I also get no fusions even though all the fusions.out files in their MCF7 example certainly contain fusion candidates. In other words, I don't even get to the blast stage.

I'm using tophat 2.0.4 and tophat-fusion ran without issues.

Any help would be really appreciated.

thanks,
t
Old 10-03-2012, 06:43 AM   #15
Emilie
Member
 
Location: Toronto

Join Date: Nov 2010
Posts: 21
Default

I also used MCF7 to test TopHat-Fusion, and it worked for me.
I am using symlinks for the blast and indexes directories (it also works if the tophat_MCF7_1 files are symlinks).

The structure I am using to run tophat-fusion-post is the following:

TopHatFusion_MCF7
----blast
----blast_human # symbolic link to the blast directory
----indexes # hg19 indexes / bowtie1
----mcl
----refGene.txt
----ensGene.txt
----tophat_MCF7_1
--------accepted_hits.bam
--------deletions.bed
--------fusions.out
--------insertions.bed
--------junctions.bed
--------logs
--------prep_reads.info
--------unmapped.bam
----tophatfusion_out

The blast directory looks like:

blast
----human_genomic.00.nhd
----human_genomic.00.nhi
[...]
----nt.00.nhd
----nt.00.nhi
[...]
----other_genomic.00.nhd
----other_genomic.00.nhi
[...]

The command I am using:
tophat-fusion-post -p 1 --num-fusion-reads 1 --num-fusion-pairs 2 --num-fusion-both 5 indexes/hg19

I am using bowtie/0.12.7, samtools/0.1.18 and blast+/2.2.26
This works with both tophat-2.0.3 (tophat-2.0.3.Linux_x86_64.tar.gz) and tophat-2.0.4 (tophat-2.0.4.Linux_x86_64.tar.gz)

Hope this helps

Emilie
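
A quick way to compare your own working directory against the layout described above is a small check script. This is only a sketch: `check_layout` is a made-up helper, and the list of required entries (blast, indexes, refGene.txt, ensGene.txt, and at least one tophat_* directory containing fusions.out) is taken from the names in this post, not from the tophat-fusion-post source. Symlinks to directories count as directories here.

```python
import os


def check_layout(root):
    """Return a list of problems found under root (empty list if none).

    Assumption: the required names below mirror the layout posted in this
    thread; tophat-fusion-post itself may look for more or fewer entries.
    """
    problems = []
    # Directories (or symlinks to directories) expected next to the samples.
    for name in ("blast", "indexes"):
        if not os.path.isdir(os.path.join(root, name)):
            problems.append("missing directory: " + name)
    # Annotation files expected in the top-level directory.
    for name in ("refGene.txt", "ensGene.txt"):
        if not os.path.isfile(os.path.join(root, name)):
            problems.append("missing file: " + name)
    # At least one sample directory named tophat_* with a fusions.out inside.
    samples = [
        entry for entry in sorted(os.listdir(root))
        if entry.startswith("tophat_")
        and os.path.isfile(os.path.join(root, entry, "fusions.out"))
    ]
    if not samples:
        problems.append("no tophat_* directory containing fusions.out")
    return problems
```

Running `check_layout(".")` from the directory where tophat-fusion-post is launched should return an empty list for a layout like the one above.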
Old 10-03-2012, 06:53 AM   #16
tankman
Member
 
Location: usa

Join Date: Sep 2012
Posts: 22
Default tophat-fusion-post

Hi Emilie,

Thanks a lot for your answer. Was your initial alignment call something like

/packages/tophat/2.0.4/bin/tophat -p 64 -o tophat/tophat_MCF7_final2 --fusion-search --keep-fasta-order --bowtie1 --no-coverage-search -r 0 --mate-std-dev 80 --fusion-min-dist 100000 --fusion-anchor-length 13 --fusion-ignore-chromosomes chrM index/hg19 SRR064286_1.fastq SRR064286_2.fastq

for your MCF7 run?

How many fusions did you get?

I'm still getting nothing!

tophat-fusion-post -p 1 --num-fusion-reads 1 --num-fusion-pairs 2 --num-fusion-both 5 indexes/hg19
[Wed Oct 3 10:53:02 2012] Beginning TopHat-Fusion post-processing run (v2.0.3)
-----------------------------------------------
[Wed Oct 3 10:53:02 2012] Extracting 23-mer around fusions and mapping them using Bowtie
samples updated
[Wed Oct 3 10:53:24 2012] Filtering fusions
Processing: tophat_MCF7_final2/fusions.out
0 fusions are output in ./tophatfusion_out/potential_fusion.txt
[Wed Oct 3 10:53:32 2012] Blasting 50-mers around fusions
[Wed Oct 3 10:53:32 2012] Generating read distributions around fusions
[Wed Oct 3 10:53:32 2012] Reporting final fusion candidates in html format
num of fusions: 0
-----------------------------------------------
[Wed Oct 3 10:53:33 2012] Run complete [00:00:30 elapsed]


Old 10-03-2012, 02:33 PM   #17
Emilie
Member
 
Location: Toronto

Join Date: Nov 2010
Posts: 21
Default

Hi tankman,

My TopHat2 command:
tophat2 -o tophat_MCF7_1 -p 8 --fusion-search --keep-fasta-order --bowtie1 --no-coverage-search -r 0 --mate-std-dev 80 --fusion-min-dist 100000 --fusion-anchor-length 13 --fusion-ignore-chromosomes chrM indexes/hg19 SRR064286_1.fastq SRR064286_2.fastq

fusions.out: 70619 lines
potential_fusion.txt: 84 lines
result.txt: 13 lines

Do you get a similar number of lines in fusions.out?
Is your blast directory organized the same way?
Which files do you obtain when you run tophat-fusion-post?

Emilie
Old 10-03-2012, 02:42 PM   #18
Emilie
Member
 
Location: Toronto

Join Date: Nov 2010
Posts: 21
Default

The corresponding log file (for 2.0.3):

/bin/bash: module: line 1: syntax error: unexpected end of file
/bin/bash: error importing function definition for `module'
[Wed Jun 27 13:23:06 2012] Beginning TopHat-Fusion post-processing run (v2.0.3)
-----------------------------------------------
[Wed Jun 27 13:23:06 2012] Extracting 23-mer around fusions and mapping them using Bowtie
samples updated
[Wed Jun 27 13:24:23 2012] Filtering fusions
Processing: tophat_MCF7_1/fusions.out
14 fusions are output in ./tophatfusion_out/potential_fusion.txt
[Wed Jun 27 13:24:30 2012] Blasting 50-mers around fusions
1. RSBN1 exon7(114354330-114355069) AP4B1 intron5(114441423-114442523)
2. LRP1B exon89(142237963-142238101) PLXDC1 exon12(37265499-37265643)
3. ENSG00000233459 exon1(204499298-204500738) ZNF207 exon8(30692347-30692505)
4. ENSG00000250859 exon1(126847154-126848533) HNRNPK exon3(86585650-86585733)
5. FOXA1 exon1(38058755-38061915) ENSG00000139865 intron7(38184001-38194100)
6. ENSG00000224738 exon1(57183957-57184951) VMP1 exon11(57915654-57915757)
7. VMP1 exon12(57917127-57917951) RPS6KB1 exon4(57991994-57992063)
8. USP32 exon26(58342771-58342834) PPM1D intron1(58678247-58700879)
9. BCAS3 intron23(59161925-59445685) BCAS4 exon1(49411465-49411709)
10. BCAS3 exon24(59445686-59445854) BCAS4 exon1(49411465-49411709)
11. CARM1 exon2(11015625-11015751) SMARCA4 exon4(11096863-11097268)
12. ARFGEF2 exon1(47538273-47538546) SULF2 exon19(46365445-46365685)
13. SULF2 exon21(46414790-46415359) ENSG00000171940 intron4(52199707-52210643)
14. SULF2 exon21(46414790-46415359) ENSG00000171940 exon5(52210644-52210800)
[Wed Jun 27 13:55:46 2012] Generating read distributions around fusions
MCF7_1 (1-14)
chr1-chr1 114354329 114442495 rf
chr2-chr17 142237963 37265642 rr
chr2-chr17 204499953 30692348 rf
chr5-chr9 126847434 86585718 rr
chr14-chr14 38061534 38184710 rr
chr17-chr17 57184951 57915655 ff
chr17-chr17 57917128 57992063 rr
chr17-chr17 58342772 58679978 rr
chr17-chr20 59430948 49411709 rr
chr17-chr20 59445687 49411709 rr
chr19-chr19 11015626 11097268 rr
chr20-chr20 47538546 46365685 fr
chr20-chr20 46415148 52210294 rf
chr20-chr20 46415148 52210645 rf
[Wed Jun 27 14:08:48 2012] Reporting final fusion candidates in html format
num of fusions: 11
-----------------------------------------------
[Wed Jun 27 14:08:50 2012] Run complete [00:45:43 elapsed]


The resulting tophatfusion_out directory contains the following (no empty files or folders):

blast_genomic
blast_nt
check
fusion_seq.bwtout
fusion_seq.fa
fusion_seq.map
logs
potential_fusion.txt
result.html
result.txt
sample_list.txt
tmp

Last edited by Emilie; 10-03-2012 at 02:48 PM. Reason: added the content of the tophat-fusion-post output file
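
To compare runs without reading whole logs by eye, the two counts that matter in a log like the one above can be pulled out with a short script. This is a sketch that relies only on the two message formats visible in the logs posted in this thread ("N fusions are output in ..." and "num of fusions: N"); other versions may word them differently. `summarize_log` is a made-up name.

```python
import re

# Patterns copied from the log lines shown in this thread.
FILTERED_RE = re.compile(r"^(\d+) fusions are output in (\S+)")
FINAL_RE = re.compile(r"^num of fusions: (\d+)")


def summarize_log(text):
    """Return the fusion counts after filtering and in the final report.

    A value of None means the corresponding line was not found.
    """
    summary = {"filtered": None, "final": None}
    for raw in text.splitlines():
        line = raw.strip()
        match = FILTERED_RE.match(line)
        if match:
            summary["filtered"] = int(match.group(1))
            continue
        match = FINAL_RE.match(line)
        if match:
            summary["final"] = int(match.group(1))
    return summary
```

A result with `filtered` equal to 0 points at the filtering step, since nothing reaches the blast stage in that case.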
Old 10-03-2012, 05:57 PM   #19
tankman
Member
 
Location: usa

Join Date: Sep 2012
Posts: 22
Default

Hi Emilie,

My tophat 2.0.4 command was identical to yours; however, I obtained 74484 lines in my fusions.out file, so strangely somewhat more than you. My potential_fusion.txt and result.txt files are both empty!

I get these files in tophatfusion_out:

total 453248
-rw-rw-r--. 1 tankman01 tankman01a 51 Oct 2 13:31 sample_list.txt
-rw-rw-r--. 1 tankman01 tankman01a 37305758 Oct 2 13:31 fusion_seq.fa
-rw-rw-r--. 1 tankman01 tankman01a 355780646 Oct 2 13:32 fusion_seq.bwtout
-rw-rw-r--. 1 tankman01 tankman01a 69632205 Oct 2 13:32 fusion_seq.map
drwxrwxr-x. 2 tankman01 tankman01a 32768 Oct 2 13:32 tmp
-rw-rw-r--. 1 tankman01 tankman01a 0 Oct 2 13:33 potential_fusion.txt
drwxrwxr-x. 2 tankman01 tankman01a 32768 Oct 2 13:33 blast_genomic
drwxrwxr-x. 2 tankman01 tankman01a 32768 Oct 2 13:33 blast_nt
drwxrwxr-x. 2 tankman01 tankman01a 32768 Oct 2 13:33 check
-rw-rw-r--. 1 tankman01 tankman01a 0 Oct 2 13:33 result.txt
-rw-rw-r--. 1 tankman01 tankman01a 1120380 Oct 2 13:33 result.html
drwxrwxr-x. 2 tankman01 tankman01a 32768 Oct 2 13:33 logs
-rw-rw-r--. 1 tankman01 tankman01a 0 Oct 3 21:56 file
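
The zero-byte entries in a listing like this can also be picked out programmatically, which is handy when checking many samples. A minimal sketch, where `empty_files` is a made-up helper and nothing about it is specific to TopHat-Fusion:

```python
import os


def empty_files(directory):
    """Return the sorted names of zero-byte regular files in directory."""
    names = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # Subdirectories such as tmp or logs are ignored.
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            names.append(name)
    return sorted(names)
```

On the directory listed above, `empty_files("tophatfusion_out")` would be expected to report potential_fusion.txt and result.txt (plus the stray zero-byte "file").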

Old 10-04-2012, 06:30 AM   #20
tankman
Member
 
Location: usa

Join Date: Sep 2012
Posts: 22
Default fusions.out file might be different

Hi Emilie,

The only thing I can think of now is that my fusions.out file is somehow very different from yours. Is there a way we could exchange fusions.out files, so that you can run tophat-fusion-post on mine and I can run it on yours, to see if there's any difference in the result?

thanks
tm
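
Short of exchanging whole files, two fusions.out files can be compared by the fusion calls they contain. The sketch below assumes that each data line starts with four whitespace-separated fields (chromosome pair, two positions, orientation), as in the "chr1-chr1 114354329 114442495 rf" lines of the log earlier in the thread; lines that do not fit that shape are skipped. `fusion_keys` and `compare_fusions` are made-up names.

```python
def fusion_keys(lines):
    """Collect (chrom pair, pos1, pos2, orientation) tuples from data lines.

    Assumption: data lines begin with these four whitespace-separated
    fields; anything else is ignored.
    """
    keys = set()
    for line in lines:
        fields = line.split()
        if (len(fields) >= 4 and "-" in fields[0]
                and fields[1].isdigit() and fields[2].isdigit()):
            keys.add(tuple(fields[:4]))
    return keys


def compare_fusions(lines_a, lines_b):
    """Return the fusion calls unique to each input and those shared."""
    a = fusion_keys(lines_a)
    b = fusion_keys(lines_b)
    return {"only_a": a - b, "only_b": b - a, "shared": a & b}
```

If almost everything ends up in `shared`, the two alignments are probably not the source of the difference, and the post-processing setup is the more likely suspect.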

