**kmcarr** · 06-19-2014, 04:11 AM

Originally posted by bbl

…..
What worries me is that only around 28% of the total uniquely mapped reads from tophat2 seem to have been properly counted by htseq-count. Could anyone explain what is going on here?

The samtools flagstat summary of the uniquely mapped reads from tophat2 is:
60657139 + 0 in total (QC-passed reads + QC-failed reads)
….
Bottom of htseq-count output:
no_feature 11321706
ambiguous 1434804
too_low_aQual 0
not_aligned 0
alignment_not_unique 13865867

The sum of all read counts by gene_id from htseq-count, computed with awk, is 17,068,532.

You are being confused by the difference between how TopHat reports alignments and how htseq-count reports counts. TopHat reports the number of reads aligned, counting each read of a pair individually, hence ~60M reads. htseq-count reports the number of fragments aligned to each gene, counting each read pair just once. So htseq-count is basing its counts on ~30M fragments; 17M uniquely aligned fragments out of ~30M works out to a ~57% unique alignment rate.
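To make the arithmetic concrete, here is a minimal Python sketch of the two ways of computing the rate. The function names `sum_gene_counts` and `assigned_rate` are made up for illustration and are not part of HTSeq or samtools; the numbers are the ones posted above. It assumes the flagstat total counts individual reads and that every fragment is a read pair. With the exact figures the per-fragment rate comes out at roughly 56%, in line with the ~57% estimated from the rounded values.

```python
# Special counters htseq-count appends after the per-gene rows (names taken
# from the output quoted above). Newer HTSeq releases prefix them with "__",
# so both spellings are handled here.
SPECIAL = {"no_feature", "ambiguous", "too_low_aQual",
           "not_aligned", "alignment_not_unique"}


def sum_gene_counts(lines):
    """Sum per-gene counts from htseq-count output lines, skipping special counters."""
    total = 0
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # ignore blank or malformed lines
        name, count = fields
        if name.lstrip("_") in SPECIAL:
            continue
        total += int(count)
    return total


def assigned_rate(assigned, flagstat_total, paired=True):
    """Fraction counted by htseq-count; for paired data the denominator is
    fragments (read pairs), i.e. half of the per-read flagstat total."""
    denominator = flagstat_total / 2 if paired else flagstat_total
    return assigned / denominator


flagstat_total = 60657139  # reads, from samtools flagstat in the thread
assigned = 17068532        # sum of per-gene counts reported in the thread

print(f"per read:     {assigned_rate(assigned, flagstat_total, paired=False):.1%}")
print(f"per fragment: {assigned_rate(assigned, flagstat_total, paired=True):.1%}")
```

The first line reproduces the worrying ~28%; the second uses the same numerator with fragments as the denominator, which is the comparison that matches what htseq-count is actually counting.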

**bbl** · 06-19-2014, 05:57 AM

Thanks kmcarr, my ignorance: I didn't remember that htseq-count counts each read pair only once. Nevertheless, is it reasonable to have only 57% of mapped read pairs for DE analysis? That still seems like quite a large loss to me.

**Brian Bushnell** · 06-19-2014, 10:00 AM

That depends on your genome completeness, read quality, and the presence of contaminants. Did you do any quality control? Some basic steps like adapter trimming, quality trimming, and removal of common contaminants/spike-ins (phiX, etc.) can greatly improve the mapping rate. Also, BBMap will map a higher percentage of reads than TopHat, especially if the data is low quality.

What is wrong with merely 28% tophat2 mapped reads are counted by HTSeq-count
