
08-13-2012, 05:40 PM   #1
Location: New York, NY

Join Date: Sep 2011
Posts: 26
Merging tophat2 outputs for tophat-fusion

Hello Tophat users,

I'm having some trouble getting tophat-fusion-post to work with my data. Running a full lane of RNA-seq data resulted in failures at the joining segments step, so I split each lane in two and ran the halves separately (i.e., sample1.1 and sample1.2). This worked well, but now I have two separate outputs for every sample.

I attempted to cat together fusions.out from both outputs in a new directory (sample1.3) and then do a sort using the command below:

cat fusions.out | tr "-" "\t" | sed 's/chr/chr /g' | sort -n -b -k 2 -k 4 -k 5 -k 6 | sed 's/chr /chr/g' | sed 's/\t/-/' > fusions.out1 ; cp fusions.out1 fusions.out
This seems to work, except that it puts X and Y ahead of 1, which I didn't think would be a problem. However, when I run tophat-fusion-post I get an error message:

tophat-fusion-post -p 8 --num-fusion-reads 1 --num-fusion-pairs 2 --num-fusion-both 5 /bowtie/bowtie-0.12.8/hg19/hg19

[Mon Aug 13 16:41:08 2012] Beginning TopHat-Fusion post-processing run (v2.0.3)
[Mon Aug 13 16:41:08 2012] Extracting 23-mer around fusions and mapping them using Bowtie
        samples updated

[Mon Aug 13 17:07:46 2012] Filtering fusions
        Processing: tophat_sample01.1/fusions.out
        Processing: tophat_sample01.2/fusions.out
        Processing: tophat_sample01.3/fusions.out
Traceback (most recent call last):
  File "/tophat2/tophat-2.0.3.Linux_x86_64/tophat-fusion-post", line 2083, in <module>
  File "/tophat2/tophat-2.0.3.Linux_x86_64/tophat-fusion-post", line 2054, in main
    filter_fusion(bwt_idx_prefix, params)
  File "/tophat2/tophat-2.0.3.Linux_x86_64/tophat-fusion-post", line 695, in filter_fusion
    filter_fusion_impl(fusion_file, refGene_list, ensGene_list, seq_chr_dic, fusion_gene_list)
  File "/tophat2/tophat-2.0.3.Linux_x86_64/tophat-fusion-post", line 488, in filter_fusion_impl
    if abs(int(left)) + abs(int(right)) > 2000:
ValueError: invalid literal for int() with base 10: ''
Has anyone had to merge separated tophat outputs? Any ideas why I'm getting this error? Does tophat-fusion-post rely on any files besides fusions.out?
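
The last line of the traceback is ordinary Python behaviour: int() was handed an empty string, which suggests that one of the fields read from a fusions.out file came through empty. The sketch below only reproduces the message. The function name total_distance and the sample values are made up for illustration and are not taken from tophat-fusion-post; only the arithmetic mirrors the line quoted in the traceback.

```python
def total_distance(left, right):
    # Same arithmetic as the line shown in the traceback.
    # The function name and arguments are hypothetical.
    return abs(int(left)) + abs(int(right))


# Well-formed numeric strings parse without trouble.
ok_value = total_distance("-15", "30")

# An empty string (for example, a field that went missing) does not.
try:
    total_distance("", "30")
    message = None
except ValueError as err:
    message = str(err)

print(ok_value)
print(message)
```

If the same message appears after hand-editing an input file, a sensible first step is to compare the edited file with the original and look for fields that have been split or emptied.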

Thanks in advance.
NKAkers
08-15-2012, 06:40 PM   #2
Location: New York, NY

Join Date: Sep 2011
Posts: 26

Sorry folks, this was just a dumb error on my part. I was inadvertently changing fusions.out when I sorted it. Here is better code for sorting, if anyone wants it.

cat fusions.out | sed 's/-/\t/' | sed 's/chr/chr /g' | sort -n -b -k 2 -k 4 -k 5 -k 6 | sed 's/chr /chr/g' | sed 's/\t/-/' > fusions.out1 ; cp fusions.out1 fusions.out ; rm fusions.out1
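
The difference between the two commands is the first step: tr "-" "\t" turns every hyphen on a line into a tab, while sed 's/-/\t/' (no g flag) touches only the first hyphen, which the final sed 's/\t/-/' then restores. The sketch below mimics both with Python string methods. The sample line is invented for illustration and is not a real fusions.out record; the helper names are hypothetical.

```python
def split_all_hyphens(line):
    # Mimics: tr "-" "\t"  (every hyphen becomes a tab)
    return line.replace("-", "\t")


def split_first_hyphen(line):
    # Mimics: sed 's/-/\t/'  (only the first hyphen becomes a tab)
    return line.replace("-", "\t", 1)


def rejoin_first_tab(line):
    # Mimics: sed 's/\t/-/'  (only the first tab becomes a hyphen)
    return line.replace("\t", "-", 1)


# Invented example: a hyphenated first column plus a negative number later on.
line = "chr1-chr2\t100\t200\t-5"

fixed = rejoin_first_tab(split_first_hyphen(line))
broken = rejoin_first_tab(split_all_hyphens(line))

print(fixed == line)       # the first-hyphen round trip leaves the line intact
print(broken.split("\t"))  # the tr version leaves an empty field behind
```

On this invented line the tr version strips the minus sign and leaves an empty field in its place, which is the kind of value that int() rejects. Whether real fusions.out records contain such hyphens is an assumption here; diffing the sorted file against a plain sort of the original is a cheap way to confirm that nothing but the line order changed.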
NKAkers
