Hi, I am using cuffdiff on single-end Illumina data. I naturally get a lot more significantly differentially expressed genes if I lower my -c threshold from 500 to 300, for example. When I use the default of 500, much of my data comes out as NOTEST. What is an acceptable -c value to use to still find significantly differentially expressed genes?
-
I'm not going to claim to be an expert on just how to manipulate the code Cuffdiff uses, but it seems to me 500 is very high unless you have some ridiculous coverage. I had the same issues even with relatively large amounts of total RNA (5-10 ug) used and with genes I know have good expression levels from other experiments. So I cut the threshold down to 250. The P-values for most called differences were still way below 0.05. I know you run into alpha error inflation, because you're running these tests 20,000 times or more, depending on which genome you're working with, but you have to balance those false positives against the false negatives you get from having the cutoff too high.
Anyway, I'm betting you're going to have to do replicates somehow regardless, so I'd rather set the cutoff too low for RNA-seq, then shrink that list down with whatever kind of validation you're doing.
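To make the alpha inflation point concrete, here is a minimal sketch of a Benjamini-Hochberg correction in plain Python. This is not Cuffdiff's own code, just an illustration of how raw P-values below 0.05 can stop being significant once you account for the number of tests. The function name and the P-values are made up for the example.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Made-up p-values for illustration only (not real data).
p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)
significant = [qi <= 0.05 for qi in q]
```

With these five made-up values, the third and fourth tests have raw P-values under 0.05 but adjusted values just above it, so they drop off the list. With 20,000 genes the effect is much stronger.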
-
Thank you very much for your response. I have actually changed my -c option to 0, seeing that genes with fewer reads can still be differentially expressed. Can anyone comment on this approach, or on whether I am getting a lot of false positives?
Thanks again!
-
-c 0
I have also set this parameter to zero and lowered my FDR to reduce false discovery. I believe the key to analyzing this data is to realize that there are no absolutes and that we are creating a model and fitting the data as best as possible to this model. Be aware of your assumptions and the pitfalls of those assumptions and get into the data and work with it. Patterns will emerge and from that you can develop testable hypotheses.
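One way to "get into the data" after a run like this is to tally how many genes were actually tested and how many pass a stricter cutoff. The sketch below is only an illustration: it assumes a tab-delimited table with `gene`, `status`, `p_value` and `q_value` columns, similar to what Cuffdiff writes, but the exact column names differ between versions, so check your own file header. The rows, the `summarize` helper and the 0.01 cutoff are all made up for the example.

```python
import csv
import io
from collections import Counter

# Made-up rows in a Cuffdiff-style layout (column names are an assumption).
SAMPLE = (
    "gene\tstatus\tp_value\tq_value\n"
    "geneA\tOK\t0.0001\t0.002\n"
    "geneB\tNOTEST\t1\t1\n"
    "geneC\tOK\t0.03\t0.08\n"
    "geneD\tOK\t0.0004\t0.004\n"
)


def summarize(handle, q_cutoff=0.01):
    """Count test statuses and list tested genes with q_value <= q_cutoff."""
    status_counts = Counter()
    hits = []
    for row in csv.DictReader(handle, delimiter="\t"):
        status_counts[row["status"]] += 1
        # Only genes that were actually tested can be called significant.
        if row["status"] == "OK" and float(row["q_value"]) <= q_cutoff:
            hits.append(row["gene"])
    return status_counts, hits


counts, hits = summarize(io.StringIO(SAMPLE))
```

To try it on real output, pass an open file handle instead of the `io.StringIO` sample. Comparing the NOTEST count and the hit list across different -c settings shows how much the threshold is driving your results.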