I have been running TopHat-Fusion on a set of RNA-seq samples with known, confirmed fusions, in order to check the efficacy of the tool and of our pipeline. Four of the five fusions, each in a different sample, were detected successfully. The fifth was initially detected by TopHat (i.e. it is listed in fusions.out), but it was filtered out during post-processing. I am trying to figure out why.
I had a look at the source code of tophat-fusion-post, and unfortunately I cannot find any indication as to why this specific fusion was filtered out.
I have checked the specificity of the fusion site manually with BLAST, and the highest score I found was 110 (alignment length + percent identity), below the cut-off of 160.
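For anyone wanting to repeat this check, here is a minimal sketch of how the score described above (alignment length plus percent identity) could be computed from BLAST tabular output. It assumes the standard 12-column tabular format, where column 3 is percent identity and column 4 is alignment length. The hit lines and the helper name `best_specificity_score` are made up for illustration; they are not my real BLAST results, and this is not necessarily how tophat-fusion-post computes its score internally.

```python
CUTOFF = 160  # cut-off quoted in this post


def best_specificity_score(tabular_lines):
    """Return the highest (alignment length + percent identity) over all hits.

    Assumes standard 12-column BLAST tabular output:
    column 3 = percent identity, column 4 = alignment length.
    """
    best = 0.0
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        percent_identity = float(fields[2])
        alignment_length = int(fields[3])
        best = max(best, alignment_length + percent_identity)
    return best


# Hypothetical hits, for illustration only (not taken from a real run).
example_hits = [
    "fusion_site\tchr5\t90.00\t20\t2\t0\t1\t20\t1000\t1019\t0.5\t24.3",
    "fusion_site\tchr12\t100.00\t8\t0\t0\t30\t37\t2000\t2007\t9.1\t16.4",
]

score = best_specificity_score(example_hits)
print(score, "below cut-off" if score < CUTOFF else "at or above cut-off")
```

With these invented hits the best score is 20 + 90 = 110, which is below 160, matching the situation described here.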
The fusion is well anchored on both sides, and it also has a sufficient number of supporting reads, pairs, and spanning pairs, as seen in its record, taken from fusions.out:
Code:
chr5-chr12 149510224 12006494 fr 38 13 19 0 86 79 0.546399 @ 11 25 38 52 66 @ TTCCCCACTGTCAGGGTGGCTCTCACTTAGCTCCAGCACTCGGACAGGGA CTGCATGGAGAGAGCACTGAGTTAGGAGGCGGGAGGGTCAGGACAGTTAA @ CCTGGATTTGTCTAAACCTCAGGCAAGAAAAGAGAAACCTCTTCCAGTAC CTTCTTCATGGTTCTGATGCAGTATGACCTCCGGCTGTGTGTGTATAGAG @ 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 37 37 37 37 36 36 36 36 35 35 34 34 34 34 34 34 32 32 31 31 31 31 30 28 25 24 23 22 20 @ 38 38 38 38 38 38 38 38 38 38 38 38 38 38 37 37 37 37 37 37 37 36 35 35 35 35 35 35 34 33 33 33 32 30 30 27 26 24 23 23 23 22 22 21 21 21 21 20 20 19 @ 11:7 32:2 22:43 82:2 43:70 29:97 72:69 114:97 114:97 727:31 880:29 4046:105 5235:105
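To make the counts in that record easier to read, here is a rough sketch that parses the leading fields of a fusions.out line and compares them with the thresholds I passed to tophat-fusion-post. It assumes that columns 5, 6 and 7 are spanning reads, supporting mate pairs, and mate pairs with one end spanning the fusion, and it assumes each threshold is a simple per-column minimum. How tophat-fusion-post actually combines these counts is exactly what I have not been able to confirm, so the names `FusionRecord`, `parse_fusion_line` and `passes_count_thresholds` are hypothetical helpers, not part of TopHat.

```python
from typing import NamedTuple


class FusionRecord(NamedTuple):
    chrom_left: str
    chrom_right: str
    pos_left: int
    pos_right: int
    orientation: str
    spanning_reads: int    # column 5 (assumed meaning)
    supporting_pairs: int  # column 6 (assumed meaning)
    spanning_pairs: int    # column 7 (assumed meaning)


def parse_fusion_line(line: str) -> FusionRecord:
    """Parse the fields that precede the first '@' separator."""
    head = line.split("@", 1)[0].split()
    chrom_left, chrom_right = head[0].split("-", 1)
    return FusionRecord(
        chrom_left, chrom_right,
        int(head[1]), int(head[2]), head[3],
        int(head[4]), int(head[5]), int(head[6]),
    )


def passes_count_thresholds(rec, min_reads=1, min_pairs=2, min_both=5):
    """Assumed per-column minimums; defaults mirror the values used in
    this post for --num-fusion-reads, --num-fusion-pairs, --num-fusion-both."""
    return (rec.spanning_reads >= min_reads
            and rec.supporting_pairs >= min_pairs
            and rec.spanning_pairs >= min_both)


# Leading part of the record above (everything after the first '@' is ignored).
EXAMPLE = "chr5-chr12 149510224 12006494 fr 38 13 19 0 86 79 0.546399 @ 11 25 38 52 66 @"

rec = parse_fusion_line(EXAMPLE)
print(rec)
print(passes_count_thresholds(rec))  # True under the assumptions above
```

Under these assumptions the record has 38 spanning reads, 13 supporting pairs and 19 spanning pairs, so the read-count thresholds alone would not explain the filtering.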
There is no check log for this fusion in tophatfusion_out/check/.
Commands used:
TopHat:
Code:
tophat -o tophat_sample_282 -p 8 \
  --fusion-search --keep-fasta-order --bowtie1 --no-coverage-search \
  -r 0 --mate-std-dev 80 --max-intron-length 100000 \
  --fusion-min-dist 100000 --fusion-anchor-length 13 \
  --fusion-ignore-chromosomes chrM \
  /scratch/EXOME_DATA/RNASEQ/index_bowtie1/hg19 \
  /scratch/EXOME_DATA/RNASEQ/positive_control_WTCHG/tophat/sample_282/WTCHG_cat_282_1.fastq \
  /scratch/EXOME_DATA/RNASEQ/positive_control_WTCHG/tophat/sample_282/WTCHG_cat_282_2.fastq
Post processing:
Code:
tophat-fusion-post -p 8 --num-fusion-reads 1 --num-fusion-pairs 2 --num-fusion-both 5 /scratch/EXOME_DATA/RNASEQ/index_combined/hg19
Software versions:
Bowtie 0.12.8
TopHat 2.0.6
BLAST 2.2.25
SAMtools 0.1.18
I have emailed the creators of the Tuxedo suite, and will post here if I get a reply.
Is there something I am overlooking, such as a post-processing condition that this specific fusion does not meet? Any help would be greatly appreciated.