Old 02-18-2017, 08:07 AM   #1
achamess
Junior Member
 
Location: North Carolina, USA

Join Date: Apr 2016
Posts: 5
How to use HTSeq to count RNA-Seq overlap w/ whole genes (intron+exon) vs. exon only

I recently performed RNA-seq using the Smart-seq2 protocol on RNA from neuronal nuclei.


My sample was a pool of sorted neuronal nuclei from mouse. After aligning with STAR, I found that about 20% of the reads are exonic and 60-70% are intronic. Even though I polyA-selected, there is still a lot of retained intron, which is consistent with a recent paper that did single-nucleus RNA-seq in the human brain: https://www.ncbi.nlm.nih.gov/pubmed/27339989

But the question is, what to do with the intronic reads?

One of my goals is differential gene expression. I want to take my RNA-seq reads and count their overlap with the whole gene (intron + exon), rather than with exons only.

How would I do this? What annotation file should I use for HTSeq? I have a GTF from UCSC, but it only contains exon and CDS features, with no entries for the whole gene. Do I need to make a custom GTF using the transcription start and stop sites?
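
To make the custom-GTF idea concrete, here is a rough sketch of one way it might be done, not a tested pipeline. It assumes the exon lines carry a usable gene_id attribute (in some UCSC exports gene_id simply repeats the transcript ID, in which case this would give per-transcript spans instead). The function names gene_spans and gene_gtf_lines and the "custom" source label are made up for illustration.

```python
import re
from collections import OrderedDict

# Assumption: attributes follow the usual GTF form, e.g. gene_id "Xyz";
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def gene_spans(gtf_lines):
    """Collapse exon records into one (start, end) span per gene.

    Genes are keyed by (gene_id, chromosome, strand) so that an ID that
    appears on more than one chromosome or strand is not merged into a
    single, meaningless interval.
    """
    spans = OrderedDict()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon features define the gene body here
        match = GENE_ID_RE.search(fields[8])
        if match is None:
            continue  # no gene_id attribute on this line
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        key = (match.group(1), chrom, strand)
        if key in spans:
            old_start, old_end = spans[key]
            spans[key] = (min(old_start, start), max(old_end, end))
        else:
            spans[key] = (start, end)
    return spans


def gene_gtf_lines(spans, source="custom"):
    """Yield one GTF line with feature type 'gene' per collapsed span."""
    for (gene_id, chrom, strand), (start, end) in spans.items():
        attrs = 'gene_id "%s";' % gene_id
        yield "\t".join(
            [chrom, source, "gene", str(start), str(end), ".", strand, ".", attrs]
        )
```

If the output were written to a file (say genes_only.gtf, a hypothetical name), htseq-count could then be pointed at the new feature type with its -t and -i options, something like: htseq-count -f bam -t gene -i gene_id aligned.bam genes_only.gtf. Reads falling anywhere between the first exon start and the last exon end of a gene, introns included, would then be counted for that gene.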