
**dpryan** · 02-03-2014, 12:00 PM

Including a GTF file can make a large difference (see "tophat2" vs. "tophat2 ann" at the bottom):

I recommend reading the whole paper; it's quite useful.

**dpryan** · 02-03-2014, 12:05 PM

I should add that in either case your alignment rate is exceedingly low. What sort of organism is this? Also, did you do any adapter trimming?

**id0** · 02-03-2014, 02:45 PM

To answer your question, this is mouse without adapter trimming.

Thanks for that informative paper. However, the difference between annotated and non-annotated TopHat there is a few percentage points. For me it's ~5% versus ~50%.

For comparison, I am getting over 80% with just regular genomic alignment with Bowtie, so the reads themselves are of reasonable quality.

**dpryan** · 02-04-2014, 02:05 AM

80% with mouse RNAseq is more what one would expect (I get >95% alignment with mouse RNAseq, though only ~85-90% map uniquely).

Are you using local alignment with bowtie? Also, keep in mind that tophat is less tolerant (by default) of mismatches than bowtie, so if you have a number of those (due to using a quite divergent strain, for example), then that might also cause these sorts of problems.

Maybe give STAR a try and see if that produces better results for you. I've been quite happy with it.
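For anyone repeating the comparison discussed here, the following is a minimal Python sketch of how TopHat2 runs with and without a GTF, plus one more mismatch-tolerant run, might be set up. The index prefix, file names, output directories, and the value 4 are hypothetical placeholders, not values from this thread. `-G`, `-o`, `--read-mismatches`, and `--read-edit-dist` are TopHat2 options, but check `tophat2 --help` for your version. The script only builds and prints the command lines; it does not run anything.

```python
import shlex


def tophat_cmd(index, reads, out_dir, gtf=None, mismatches=None):
    """Build a tophat2 command line as a list of arguments."""
    cmd = ["tophat2", "-o", out_dir]
    if gtf is not None:
        # Supply known transcripts so reads are first mapped to the annotation.
        cmd += ["-G", gtf]
    if mismatches is not None:
        # The edit distance must be at least as large as the mismatch limit,
        # so this sketch simply raises both to the same value.
        cmd += ["--read-mismatches", str(mismatches),
                "--read-edit-dist", str(mismatches)]
    cmd += [index, reads]
    return cmd


# Hypothetical inputs: replace with your own index prefix and files.
INDEX = "mm10_bowtie2_index"
READS = "sample.fastq.gz"
GTF = "genes.gtf"

commands = [
    tophat_cmd(INDEX, READS, "tophat_no_gtf"),
    tophat_cmd(INDEX, READS, "tophat_gtf", gtf=GTF),
    tophat_cmd(INDEX, READS, "tophat_gtf_relaxed", gtf=GTF, mismatches=4),
]

for cmd in commands:
    print(shlex.join(cmd))
```

Keeping every other option identical between runs means that any change in alignment rate can be attributed to the GTF or to the mismatch setting alone.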

**id0** · 02-04-2014, 09:33 AM

Based on what I've heard from other people, STAR will be much faster, but only marginally more accurate (if at all).

Regarding mismatches, that should not be affected by adding or removing a GTF. That variable is yielding ~5% versus ~50% alignment rate for me. I don't see how I can find any novel genes based off TopHat alignment if it is having so much difficulty finding known ones.

**dpryan** · 02-05-2014, 02:13 AM

True, though if the low alignment rate is due in part to the ends of many reads not mapping, then using an aligner that can do soft-clipping (e.g., STAR) might produce better results. Aside from that, I'd have to actually see and play around with your data a bit to be of any more help. I've never had these sorts of issues with mouse RNA.

**id0** · 02-06-2014, 06:27 AM

I ran the same sample with STAR. I generated two genome indexes, one with the GTF and one without, and ran the sample against both. I got more than twice the number of splices with the GTF, which makes sense to me. For uniquely mapped reads, I got a 64% alignment rate with the GTF and 63% without. Essentially identical, which is what I would expect from a good aligner.

I will have to evaluate STAR more thoroughly. Based on the literature and this forum, its main advantage is speed, which is not a concern for me, so I never bothered to test it for myself. At least for this one example, it seems to be far superior to TopHat in terms of alignment. I would also be far more confident in any novel genes detected from this alignment.
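The with-GTF versus without-GTF comparison above can be scripted. The sketch below assumes STAR's `Log.final.out` summary contains a line of the form `Uniquely mapped reads % | 64.00%`; verify this against your own log, since the layout may differ between versions. The two inline excerpts are mock text that reuses the 64% and 63% figures quoted in this thread, not real log files, and `uniquely_mapped_pct` is a hypothetical helper name.

```python
def uniquely_mapped_pct(log_text):
    """Return the 'Uniquely mapped reads %' value from a STAR summary log."""
    for line in log_text.splitlines():
        key, sep, value = line.partition("|")
        if sep and key.strip() == "Uniquely mapped reads %":
            return float(value.strip().rstrip("%"))
    raise ValueError("no 'Uniquely mapped reads %' line found")


# Mock excerpts in the assumed log layout (values taken from the post above).
with_gtf_log = "          Uniquely mapped reads % |\t64.00%\n"
without_gtf_log = "          Uniquely mapped reads % |\t63.00%\n"

with_gtf = uniquely_mapped_pct(with_gtf_log)
without_gtf = uniquely_mapped_pct(without_gtf_log)

print(f"with GTF:    {with_gtf:.2f}%")
print(f"without GTF: {without_gtf:.2f}%")
print(f"difference:  {with_gtf - without_gtf:.2f} percentage points")
```

With real runs, the text would come from each output directory's log instead, for example `pathlib.Path("with_gtf/Log.final.out").read_text()`, where the directory name is again a placeholder.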


TopHat with and without GTF
