Hello,
I believe I have successfully run TopHat-Fusion on single-end RNA-seq data. In the results.html file, I noticed that several of the fusion alignments contain dashes in the sequence. What do these dashes signify?
Also, I noticed in my results (and in the TopHat-Fusion example results) that reads not spanning a fusion (on the left or right side) have few if any duplicate sequences, while many of the reads mapping across the fusion are duplicated (exactly the same sequence), sometimes as many as 6 times. These multiple copies seem to be counted in the number of spanning reads in the summary at the top of the file. What is the origin of these duplicates? Should they be collapsed and counted as a single read?
Thank you for the help!