Hi, I am using cuffdiff on single-end Illumina data. Naturally, I get many more significantly differentially expressed genes if I lower my -c threshold from 500 to 300, for example. When I use the default of 500, much of my data comes out as NOTEST. What is an acceptable -c value to use to still find significantly differentially expressed genes?
-
I'm not going to claim to be an expert on how to manipulate the code Cuffdiff uses, but it seems to me that 500 is very high unless you have some ridiculous coverage. I had the same issues even with relatively large amounts of total RNA (5-10 ug) and with genes I know have good expression levels from other experiments. So I cut the threshold down to 250. The P-values for most called differences were still way below 0.05. I know you run into alpha error inflation, because you're running these tests 20,000 times or more, depending on which genome you're working with, but you have to balance those false positives against the false negatives you get from having the cutoff too high.
Anyway, I'm betting you're going to have to do replicates somehow regardless, so I'd rather set the cutoff too low for RNA-seq, then shrink that list down with whatever kind of validation you're doing.
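To put a number on the alpha inflation point: at a per-test cutoff of 0.05, 20,000 tests with no real differences would still be expected to give about 1,000 calls by chance. Here is a minimal Python sketch of one common way to correct for that, the Benjamini-Hochberg procedure. This is only an illustration with made-up P-values; I am not claiming it is exactly what Cuffdiff does internally, and the function name is my own.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # keeping a running minimum so adjusted values never decrease with rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up P-values for six hypothetical genes (illustration only).
p_values = [0.0001, 0.004, 0.03, 0.04, 0.2, 0.6]
q_values = benjamini_hochberg(p_values)

raw_calls = sum(p < 0.05 for p in p_values)
adjusted_calls = sum(q < 0.05 for q in q_values)
print("called at raw p < 0.05:", raw_calls)
print("called at adjusted p < 0.05:", adjusted_calls)
```

With these toy numbers, fewer genes pass after adjustment than at the raw cutoff, which is the trade-off between false positives and false negatives described above.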
-
Thank you very much for your response. I have actually changed my -c option to 0, seeing that genes with fewer reads can still be differentially expressed. Can anyone comment on this approach, or on whether I am getting a lot of false positives?
Thanks again!
-
-c 0
I have also set this parameter to zero and lowered my FDR to reduce false discoveries. I believe the key to analyzing this data is to realize that there are no absolutes: we are creating a model and fitting the data to it as best we can. Be aware of your assumptions and their pitfalls, then get into the data and work with it. Patterns will emerge, and from those you can develop testable hypotheses.
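One way to "get into the data" after changing -c is to count how many tests come back OK versus NOTEST and how many genes pass your FDR cutoff. Below is a rough Python sketch of that. The column names (gene_id, status, q_value) are my assumption about a tab-delimited results table and may differ between Cuffdiff versions, the rows are made up, and `summarize` is a hypothetical helper, so check it against your own output file before relying on it.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only. The header names are assumptions
# about a tab-delimited results table, not a guaranteed Cuffdiff format.
SAMPLE = (
    "gene_id\tstatus\tp_value\tq_value\n"
    "geneA\tOK\t0.0001\t0.001\n"
    "geneB\tOK\t0.03\t0.08\n"
    "geneC\tNOTEST\t1\t1\n"
    "geneD\tOK\t0.002\t0.01\n"
    "geneE\tNOTEST\t1\t1\n"
)


def summarize(handle, fdr=0.05):
    """Count test statuses and list tested genes at or below the FDR cutoff."""
    rows = list(csv.DictReader(handle, delimiter="\t"))
    status_counts = Counter(row["status"] for row in rows)
    significant = [
        row["gene_id"]
        for row in rows
        if row["status"] == "OK" and float(row["q_value"]) <= fdr
    ]
    return status_counts, significant


status_counts, significant = summarize(io.StringIO(SAMPLE), fdr=0.05)
print("status counts:", dict(status_counts))
print("genes passing FDR:", significant)
```

Running the same summary on the results from each -c setting (swap `io.StringIO(SAMPLE)` for an open file handle) makes it easy to see how many tests move out of NOTEST and how many of those survive the stricter FDR.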