SEQanswers

01-26-2016, 06:05 AM   #1
Sbamo
Junior Member
 
Location: Germany

Join Date: Jan 2016
Posts: 7
HISAT2 vs. TopHat2: Discrepancies between Hg19 & Hg38

Hello everyone,

I have around 150 RNA-Seq datasets prepared with the Lexogen SENSE mRNA-Seq Library Prep Kit for Illumina, as well as around 50 Illumina TruSeq samples. I aligned one sample with both HISAT2 and TopHat2 against both Hg19 and Hg38. I would like to run all samples with HISAT2 against Hg38 (since it is the newest reference), but with Hg19 I get a far better alignment rate:

Lexogen Sample:
HISAT2 (Hg19): Paired Rate = 82.68%, Overall Rate = 90.31%
HISAT2 (Hg38): Paired Rate = 73.87%, Overall Rate = 81.01%
TopHat2 (Hg19): Paired Rate = 74.74%, Overall Rate = 87.4%
TopHat2 (Hg38): Paired Rate = 77.68%, Overall Rate = 87.7%

Interestingly, TopHat2 does not seem to be negatively affected by the change of reference. On the contrary, it actually "likes" it.
It got even stranger when I ran a control sample prepared with the Illumina TruSeq kit and got the following results:

TruSeq Sample:
HISAT2 (Hg19): Paired Rate = 94.86%, Overall Rate = 97.22%
HISAT2 (Hg38): Paired Rate = 93.15%, Overall Rate = 95.47%
TopHat2 (Hg19): Paired Rate = 93.46%, Overall Rate = 96.50%
TopHat2 (Hg38): Paired Rate = 88.07%, Overall Rate = 97.00%

I can accept a difference in alignment rate of less than 5 percentage points as "random", but I cannot accept a drop from 90.31% to 81.01%. Has anyone tested HISAT2 on these two reference genomes and, if so, seen similar results? I have been struggling with this for a long time, so any help is greatly appreciated!
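To make the comparison concrete, here is a minimal Python sketch that tabulates the Hg19-to-Hg38 change in overall alignment rate for each sample and aligner, using only the numbers listed above. The 5-point cutoff is just my rule of thumb from the previous paragraph, and the names `overall`, `THRESHOLD` and `rate_changes` are made up for this illustration.

```python
# Overall alignment rates (%) copied from the results listed above.
overall = {
    ("Lexogen", "HISAT2"): {"Hg19": 90.31, "Hg38": 81.01},
    ("Lexogen", "TopHat2"): {"Hg19": 87.4, "Hg38": 87.7},
    ("TruSeq", "HISAT2"): {"Hg19": 97.22, "Hg38": 95.47},
    ("TruSeq", "TopHat2"): {"Hg19": 96.50, "Hg38": 97.00},
}

# Assumed cutoff (percentage points) below which a change is treated as noise.
THRESHOLD = 5.0


def rate_changes(rates, threshold=THRESHOLD):
    """Return (sample, aligner, Hg38 minus Hg19, exceeds_threshold) per run."""
    rows = []
    for (sample, aligner), by_ref in rates.items():
        delta = round(by_ref["Hg38"] - by_ref["Hg19"], 2)
        rows.append((sample, aligner, delta, abs(delta) > threshold))
    return rows


for sample, aligner, delta, flagged in rate_changes(overall):
    note = "  <-- exceeds threshold" if flagged else ""
    print(f"{sample:8s} {aligner:8s} {delta:+6.2f}{note}")
```

With these numbers, only the Lexogen sample aligned with HISAT2 crosses the cutoff; the other three combinations change by less than 2 points in either direction.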

Additional Info:
- The whole analysis was run on the Galaxy platform
- The Lexogen samples are strand-specific (second strand) and the TruSeq samples are unstranded
- I tested 4 additional samples (2 Lexogen + 2 TruSeq) and got similar results
- The references were downloaded directly from UCSC

Thanks in advance!
Sbamo
Tags
hg19, hg38, hisat2, lexogen, reference annotation
