SEQanswers

12-20-2017, 06:36 AM   #1
Vassen
Junior Member
 
Location: greece

Join Date: Dec 2017
Posts: 2
Beginner question for Differential Expression Analysis

Hello,

I am a beginner in analyzing data from an RNA-seq experiment. I was not the one who performed the bioinformatics analysis (I am more of a bench scientist), so what I have in my hands is an Excel file of results. I am a bit confused, though, about how to retrieve my DE genes from it.
I have read what p and q values represent. As I understand it, setting an FDR threshold is a 'safe' way to limit how many of the differences reported as significant are actually false positives.

What confuses me is how to choose the FDR threshold. If I understand correctly, the level of 0.05 does not apply to all experiments.

Could you please refer me to some further reading, or perhaps provide me with some tips, so that I proceed correctly with my analysis?

I apologize if this is a very basic question. I appreciate your help.

Regards
Vassen
01-11-2018, 02:07 PM   #2
sdriscoll
I like code
 
Location: San Diego, CA, USA

Join Date: Sep 2009
Posts: 437

The raw p-values in your results are still what they are: per-gene values that tell you, given the dispersion model fit to that gene's expression in each condition, how unlikely a difference at least this large would be if the gene were NOT differentially expressed. Statistical reality, however, shows us that when we repeatedly run a statistical test between two groups of values that DO come from the same distribution (say, split 20 values with a mean of 10 and stdev of 5 into two random groups), about 5% of those tests will return a p-value below 0.05. So, given the large number of genes we are testing, we have to expect a substantial number of false positives (type I errors) among the genes that pass a raw p-value cutoff, and that is what the adjusted p-values are meant to control.
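
The 5% figure is easy to check with a small simulation. The sketch below is only an illustration, not what any DE package does internally: it uses a plain Student's t-test on normal data rather than a count-based dispersion model, the 5% critical value for 18 degrees of freedom is hard-coded from a standard t table, and the figure of 20,000 genes is an assumed round number.

```python
import random

# Two-sided 5% critical value of Student's t with 18 degrees of freedom
# (two groups of 10, pooled variance), taken from a standard t table.
T_CRIT_DF18 = 2.101


def mean_var(x):
    """Sample mean and unbiased sample variance."""
    m = sum(x) / len(x)
    v = sum((xi - m) ** 2 for xi in x) / (len(x) - 1)
    return m, v


def pooled_t(a, b):
    """Student's two-sample t statistic with pooled variance."""
    ma, va = mean_var(a)
    mb, vb = mean_var(b)
    na, nb = len(a), len(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (ma - mb) / (pooled * (1 / na + 1 / nb)) ** 0.5


def false_positive_rate(n_tests=2000, seed=1):
    """Fraction of tests called significant when there is NO real difference."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        # 20 values from one distribution (mean 10, stdev 5) ...
        values = [rng.gauss(10, 5) for _ in range(20)]
        # ... split into two groups; the draws are random, so any split is random
        a, b = values[:10], values[10:]
        if abs(pooled_t(a, b)) > T_CRIT_DF18:
            hits += 1
    return hits / n_tests


rate = false_positive_rate()
print(f"fraction of null tests with p < 0.05: {rate:.3f}")
# Assumed gene count, purely for scale.
print(f"false positives expected among 20000 unchanged genes: {rate * 20000:.0f}")
```

The fraction should land close to 0.05, which on the scale of a whole transcriptome means on the order of a thousand genes passing a raw p < 0.05 cutoff by chance alone.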

In practice I think of the p-value and q-value (adjusted p-value, FDR, etc.) differently in different situations. If our goal is a candidate-type approach, which means we'll be running additional experiments to verify the RNA-seq result for each gene, we may use the raw p-values to get a broader list of candidates. If we have a phenotype and we want to report the number of genes affected or the percentage of genes enriched vs depleted, we'll use the adjusted p-values, since that is a more general claim.

Sometimes our experiment may yield zero significant genes by the adjusted p-values even though we know there's a phenotype. In those cases we may proceed with genes significant by raw p-value and keep in mind that we must proceed cautiously. We wouldn't do that if we were going straight into a figure with that result - we'd of course try to confirm if any of those genes appear to be different via other methods.

Finally, keep in mind that a raw p-value cutoff will likely let through many false positives (type I errors), while an adjusted p-value cutoff will likely miss some real differences (type II errors), especially with few replicates. A larger sample size helps mostly with the second problem: more replicates give more power, so more of the real differences survive the adjustment. Of course, with higher and higher sample sizes you'll also get significance calls for features with smaller and smaller effect sizes, and you'll have to start thinking in terms of "what is a significant effect?". I can't answer that one.
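
A rough simulation of the sample size point, under the same simplifying assumptions as a plain t-test on normal data (stdev 5 in both groups). The group sizes and shifts are arbitrary choices for illustration, and the critical values are hard-coded from a standard t table for each degrees of freedom.

```python
import random


def mean_var(x):
    """Sample mean and unbiased sample variance."""
    m = sum(x) / len(x)
    v = sum((xi - m) ** 2 for xi in x) / (len(x) - 1)
    return m, v


def pooled_t(a, b):
    """Student's two-sample t statistic with pooled variance."""
    ma, va = mean_var(a)
    mb, vb = mean_var(b)
    na, nb = len(a), len(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (ma - mb) / (pooled * (1 / na + 1 / nb)) ** 0.5


def detection_rate(n_per_group, shift, t_crit, n_sims=500, seed=1):
    """Fraction of simulations in which a REAL shift in the mean is called significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        a = [rng.gauss(10, 5) for _ in range(n_per_group)]
        b = [rng.gauss(10 + shift, 5) for _ in range(n_per_group)]
        if abs(pooled_t(a, b)) > t_crit:
            hits += 1
    return hits / n_sims


# (n per group, true shift in the mean, two-sided 5% critical t for df = 2n - 2)
scenarios = [
    (3, 5.0, 2.776),    # few replicates, large effect (1 stdev)
    (20, 5.0, 2.024),   # more replicates, same large effect
    (500, 1.0, 1.962),  # many replicates, small effect (0.2 stdev)
]
rates = {}
for n, shift, t_crit in scenarios:
    rates[(n, shift)] = detection_rate(n, shift, t_crit)
    print(f"n={n:>3} per group, shift={shift}: detected in {rates[(n, shift)]:.0%} of runs")
```

With three replicates most real one-stdev differences are missed, while with hundreds of samples even a 0.2-stdev shift is usually called significant, which is where the question of what counts as a meaningful effect comes in.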
__________________
/* Shawn Driscoll, Gene Expression Laboratory, Pfaff
Salk Institute for Biological Studies, La Jolla, CA, USA */
01-16-2018, 01:37 PM   #3
Vassen
Junior Member
 
Location: greece

Join Date: Dec 2017
Posts: 2

Many thanks sdriscoll!!

Cheers
Vassen
Tags
beginner, fdr, p & q values, rna seq
