
**Amelie** · 01-07-2013, 02:57 PM

Hi,

I'm no expert with tophat2 yet but here's what I think:

- Using both -G and --no-novel-juncs, tophat goes through the first 2 of its 3 possible steps: 1) mapping reads to the transcriptome and 2) mapping reads to the genome, but only if they can be aligned without splicing.

- Using -G and -T (I think --no-novel-juncs is unnecessary/meaningless in that case), it only performs step 1.

Now as to why it outputs a junctions file in the first case and not in the second case, I'm not sure.
My guess is that when -T is specified, since the junctions that would be contained in the output junctions files are those specified by the GTF file, it is just unnecessary to output a junctions file.
I think it is also unnecessary when --no-novel-juncs is specified, but tophat still seems to output one. Maybe you could check whether all the junctions output with the -G and --no-novel-juncs options are the same as those defined by your GTF file.
Anyway, the GTF file is equivalent to a junctions file with no novel junctions, so you don't really need an output junctions file, do you?

Hope that helps,

Amelie

**jb2** · 01-07-2013, 03:03 PM

Hi Amelie,

Thanks for the response. What you stated does appear to be the case. I have also checked that the splice junctions in the junctions.bed file returned by tophat2 with -G and --no-novel-juncs do in fact all correspond to junctions in the GTF file. I actually like having the junctions.bed file output because it gives me the opportunity to look at junction-specific behavior: how many junctions, and which ones, seem to have evidence of expression.
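For anyone who wants to repeat that check, here is a minimal Python sketch of one way to do it (not necessarily how it was done above). It assumes the GTF has `exon` records carrying a `transcript_id` attribute, and that each observed junction has already been reduced to a `(chrom, intron_start, intron_end, strand)` tuple in 0-based, half-open coordinates. The function name `gtf_introns` and the toy records are made up for illustration.

```python
import io
import re
from collections import defaultdict


def gtf_introns(handle):
    """Collect annotated introns as (chrom, start, end, strand), 0-based half-open."""
    exons = defaultdict(list)
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        # GTF coordinates are 1-based and inclusive.
        key = (fields[0], fields[6], match.group(1))
        exons[key].append((int(fields[3]), int(fields[4])))

    introns = set()
    for (chrom, strand, _transcript), spans in exons.items():
        spans.sort()
        for (_, left_end), (right_start, _) in zip(spans, spans[1:]):
            # The 1-based inclusive exon end equals the 0-based exclusive end,
            # so the intron runs from left_end to right_start - 1.
            introns.add((chrom, left_end, right_start - 1, strand))
    return introns


# Toy annotation: one transcript with two exons (hypothetical data).
toy_gtf = io.StringIO(
    'chr1\ttest\texon\t101\t120\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n'
    'chr1\ttest\texon\t271\t300\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n'
)
annotated = gtf_introns(toy_gtf)

# Junctions observed in a run, already converted to intron coordinates.
observed = {("chr1", 120, 270, "+"), ("chr1", 500, 900, "+")}
unannotated = observed - annotated
print(len(unannotated), "observed junction(s) not present in the GTF")
```

An empty `unannotated` set would mean every reported junction is defined by the annotation.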

Tophat run with -T is definitely much faster, but since I want the junctions.bed file, I'm going to stick to the -G --no-novel-juncs method I've used so far. I'd also point out that I don't think indels are returned with -T either, so that is something else missing, and I think most users would want them regardless of whether they are using Tophat to find novel splice junctions or not.

Cheers,

jb2

**Amelie** · 01-07-2013, 03:09 PM

The fact that indels are not returned with -T makes me wonder if they are allowed in step 1) when -T is not specified and tophat goes through all 3 steps. That would be desirable! I'll email tophat to check. Thanks for pointing that out!

**jb2** · 01-07-2013, 03:13 PM

Oops, my bad! I just noticed that it does return these things, just not in the place I originally checked. Doh! So now I'm going to go and explore what the difference might be between the junctions.bed file returned from -T and -G and the one returned from -G --no-novel-juncs. Sorry for the confusion and mistake on my part!

**Amelie** · 01-07-2013, 03:15 PM

Did you find indels files in both cases too then?

**jb2** · 01-07-2013, 03:23 PM

Yep, both insertions.bed and deletions.bed are there in both cases.

One thing I do notice, with -G --no-novel-juncs and -G -T, the resulting bam files do differ in size:

-G --no-novel-juncs

-rw-r--r-- 1 jb2 lgrc 2.4G Oct 24 10:41 s_1pair_accepted_hits.bam

-T -G

-rw-r--r-- 1 jb2 lgrc 2.0G Jan 3 14:33 s_1pair_accepted_hits.bam

Interesting to see what the differences might be.

**Amelie** · 01-07-2013, 03:25 PM

The difference should be the reads mapped in step 2) (mapping to the genome without splicing).

**jb2** · 01-07-2013, 03:58 PM

For the most part, both runs find evidence of expression for a large shared set of splice junctions (both -G -T and -G --no-novel-juncs show reads aligned to 175988 splice junctions).

With the -G and -T options, there are 322 splice junctions with aligned reads that are not found with -G --no-novel-juncs.

However, with -G --no-novel-juncs, there are 1007 splice junctions with aligned reads that have no reads aligning to them with -G -T.

Overall, they find mostly the same things; I'm just curious about what causes the differences. I would have expected -G --no-novel-juncs to find all the junctions that -G -T found, plus a few more because of the extra alignment steps it performs compared to -G -T, but in fact the -G -T tophat run finds some junctions not found with the -G --no-novel-juncs options.
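A comparison like this can be reproduced with a short Python sketch along the following lines. This is one possible approach under stated assumptions, not the exact method used for the numbers above. It assumes junctions.bed is BED12 with two blocks per junction and the score column holding the number of spanning alignments, as I understand the TopHat manual. Junctions are keyed on the intron coordinates rather than the BED start/end, because the flanking block lengths depend on read overhang and can differ between runs. The function names and toy lines are hypothetical.

```python
import io


def read_junctions(handle):
    """Map (chrom, intron_start, intron_end, strand) -> score from BED12 junction lines."""
    junctions = {}
    for line in handle:
        if not line.strip() or line.startswith(("track", "#")):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, score, strand = fields[0], int(fields[1]), int(fields[4]), fields[5]
        block_sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
        block_starts = [int(x) for x in fields[11].rstrip(",").split(",")]
        # The intron is the gap between the first block's end and the second block's start.
        intron_start = start + block_sizes[0]
        intron_end = start + block_starts[1]
        junctions[(chrom, intron_start, intron_end, strand)] = score
    return junctions


def compare_junctions(run_a, run_b):
    """Return (shared, only_in_a, only_in_b) as sets of junction keys."""
    keys_a, keys_b = set(run_a), set(run_b)
    return keys_a & keys_b, keys_a - keys_b, keys_b - keys_a


# Toy stand-ins for the two junctions.bed files (hypothetical data).
no_novel = read_junctions(io.StringIO(
    "chr1\t100\t300\tJUNC1\t5\t+\t100\t300\t255,0,0\t2\t20,30\t0,170\n"
    "chr2\t50\t150\tJUNC2\t2\t-\t50\t150\t255,0,0\t2\t10,10\t0,90\n"
))
transcriptome_only = read_junctions(io.StringIO(
    "track name=junctions\n"
    "chr1\t110\t290\tJUNC9\t3\t+\t110\t290\t255,0,0\t2\t10,20\t0,160\n"
    "chr3\t0\t100\tJUNC3\t1\t+\t0\t100\t255,0,0\t2\t25,25\t0,75\n"
))

shared, only_no_novel, only_t = compare_junctions(no_novel, transcriptome_only)
print(len(shared), len(only_no_novel), len(only_t))
```

Looking at the scores of the junctions unique to each run (for example `no_novel[key]`) could show whether the discrepancies are mostly junctions supported by only one or two reads.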

Maybe I'll hear back from the Tophat folks about this to see what their thoughts are on that?


Tophat 2.0.4 -T and -G versus -G --no-novel-juncs
