#1 |
Member
Location: La Jolla Join Date: Nov 2013
Posts: 17
Hi all,
I am using htseq-count to count reads in my BAM files, with the NCBI mouse annotation file. Which feature type (the third column of the GFF3 file) should I use? My understanding of how htseq-count works is that if I choose 'exon', it counts only the reads mapping to exons and sums them per gene; if I choose 'gene', it counts all reads mapping to either the introns or the exons of that gene. In theory, for RNA-seq I should choose 'exon' and ignore reads mapping to introns. In my samples, I know that the gene Ext1 is knocked out. I tried both choices and used DESeq2 for the differential expression analysis. In the results where I chose the 'gene' feature, Ext1 was the most significant gene. In the results for 'exon', however, more than 200 genes are more significant than Ext1. So now I am confused: it seems the 'gene' feature works better than 'exon'? Has anyone run into this situation before? Thanks.
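To make the difference between the two feature types concrete, here is a minimal toy sketch in Python. It is not how htseq-count is implemented: the real tool also deals with strandedness, ambiguous reads and overlap modes. The gene model, coordinates and reads below are all made up for illustration.

```python
# Toy model only: a single hypothetical gene with two exons and one intron.
# Coordinates are half-open (start, end) and invented for this example.
GENE = {
    "name": "geneA",
    "start": 100,
    "end": 1000,
    "exons": [(100, 300), (600, 1000)],
}

# Hypothetical aligned reads: three fall in exons, two in the intron.
READS = [(120, 170), (250, 300), (400, 450), (450, 500), (700, 750)]


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def count_reads(reads, gene, feature_type):
    """Count reads overlapping the chosen feature type of one gene."""
    if feature_type == "exon":
        intervals = gene["exons"]
    elif feature_type == "gene":
        intervals = [(gene["start"], gene["end"])]
    else:
        raise ValueError(f"unsupported feature type: {feature_type}")
    return sum(
        any(overlaps(r_start, r_end, s, e) for s, e in intervals)
        for r_start, r_end in reads
    )


exon_count = count_reads(READS, GENE, "exon")
gene_count = count_reads(READS, GENE, "gene")
print(f"exon: {exon_count}  gene: {gene_count}")
```

In this toy case the 'gene' count is higher only because the two intronic reads are included, which is the extra signal (or noise) that separates the two choices.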
#2 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
You'll want to use 'exon'; otherwise you increase the noise from pre-mRNAs. I'm not sure why you would be disturbed to find other genes with more significant p-values than the knocked-out gene. Several things affect how well the variance can be estimated, and the signal level (in this case, how highly expressed a gene is) is one of them. Since you're knocking the gene out, its expression should be very low in some of your samples (I'm assuming a complete knockout rather than just a hemizygous one), so you have less power there to begin with. If the knockout causes some other decently expressed gene to drastically increase its expression, then that other gene will probably have a smaller p-value. That's not really a problem.
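As a rough illustration of why signal level matters, the sketch below prints the relative noise of a count under a plain Poisson model. This is an assumption made for simplicity: DESeq2 itself fits a negative binomial model, which adds biological dispersion on top of this, so treat the numbers as a lower bound on noise and not as what DESeq2 computes. The function name `poisson_cv` is made up for this example.

```python
import math


def poisson_cv(mean_count):
    """Coefficient of variation (sd / mean) of a Poisson-distributed count."""
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    # For a Poisson variable the variance equals the mean, so sd = sqrt(mean).
    return math.sqrt(mean_count) / mean_count


# Relative noise shrinks as the expected count grows.
for mean_count in (5, 50, 500):
    print(f"mean={mean_count:>4}  CV={poisson_cv(mean_count):.3f}")
```

A gene with an expected count of about 5 reads has roughly ten times the relative noise of one with about 500, so the same fold change is harder to call significant for the lowly expressed gene.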
#3 |
Member
Location: La Jolla Join Date: Nov 2013
Posts: 17
Quote:
#4 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
Sure, and for the reasons that I mentioned. Remember that the p-value just relates to the believability of the finding given the data. Further, changes in biology are typically non-linear, so a 50% decrease in the knocked-out gene could easily lead to larger changes in others. Not to mention that a hemizygous deletion will often not halve the expression.