SylvainL 03-16-2015 03:51 AM

Hi all,

I am using featureCounts (for the first time) to get gene-level counts in an RNA-seq experiment. It is mouse RNA-seq, and polyA RNA was extracted.

A quick look at the results shows that up to 15% of reads are unassigned because they do not overlap any feature.

Do you normally have the same stats?


dpryan 03-16-2015 04:20 AM

I just checked one dataset (mouse liver), where ~5% of reads had no feature, ~3% were ambiguous (overlapping more than one feature; it was a non-stranded protocol), and ~19% were not uniquely mapped. ~75% were counted if we exclude unmapped reads. A mouse hippocampus dataset that I just checked looks similar. Obviously, if you include unmapped reads then these percentages would go down a bit.

So, that unassigned percentage seems a bit high, but perhaps there's a good biological reason for that (e.g., in tumors or sperm).
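For anyone who wants to compute the same kind of breakdown, here is a minimal sketch in Python. It assumes the tab-separated `.summary` file that featureCounts writes next to its count table (a header line, then one `Status<TAB>count` row per category). The exact category names can differ between versions, and the function name `summary_percentages` and the numbers in `EXAMPLE` are made up for illustration; they are not from any real dataset.

```python
import io


def summary_percentages(handle, exclude=("Unassigned_Unmapped",)):
    """Return {status: percent} for the first sample column of a
    featureCounts-style .summary file, leaving `exclude` out of the total."""
    counts = {}
    next(handle)  # skip the header line ("Status<TAB>sample.bam")
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # ignore blank or malformed lines
        status, value = fields[0], int(fields[1])
        if status not in exclude:
            counts[status] = value
    total = sum(counts.values())
    if total == 0:
        return {}
    # Report only categories that actually contain reads
    return {s: 100.0 * n / total for s, n in counts.items() if n > 0}


# Made-up counts, for illustration only (assumed file layout)
EXAMPLE = (
    "Status\tsample.bam\n"
    "Assigned\t750\n"
    "Unassigned_Ambiguity\t30\n"
    "Unassigned_MultiMapping\t170\n"
    "Unassigned_NoFeatures\t50\n"
    "Unassigned_Unmapped\t100\n"
)

pct = summary_percentages(io.StringIO(EXAMPLE))
for status, p in pct.items():
    print(f"{status}\t{p:.1f}%")
```

With a real file you would pass `open("counts.txt.summary")` (a hypothetical path) instead of the `io.StringIO` wrapper. Passing a longer `exclude` tuple drops further categories from the denominator, which matters when comparing numbers between people who filtered their reads differently before counting.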

GenoMax 03-16-2015 04:31 AM

Post #29. Response from Wei Shi (author of featureCounts):

SylvainL 03-16-2015 04:32 AM

Hi dpryan,

I already removed the non-uniquely mapped reads, so yes, it's a bit lower if we consider all the mapped reads... and yes, it's sperm... :)

So I should not worry :)

Thanks a lot

dpryan 03-16-2015 04:35 AM

Yeah, sperm will inherently be different, since you have depletion of mRNAs. My sperm datasets aren't polyA enriched; they're ribo-depleted, so I don't have any numbers for comparison.

