01-27-2014, 10:07 AM   #16
Simon Anders
Senior Member
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

Originally Posted by sindrle:
You run DESeq2, you pick out 10 genes you want to look at including p values.
Say 6 genes have p < 0.05.
You then use p.adjust in R.
What FDR do you choose and why?
Which n do you set?
If you pick the ten genes a priori, i.e., in a manner that is independent of the outcome, then you can run p.adjust on the p values from these 10 genes only.

By a choice "a priori", I mean that you knew before doing the analysis that these genes are worth looking at and others are not. If, however, you have chosen these ten genes precisely because their expression data in this very experiment looked so interesting that you want them to be in your result list, then you need to run p.adjust on all genes.
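To make the difference concrete, here is a minimal sketch in Python (not R, so that it is self-contained). The function bh_adjust is a hypothetical helper written for this example; it is meant to follow the same Benjamini-Hochberg recipe that p.adjust(p, method = "BH") uses. The data are an idealised assumption, not real DESeq2 output: 1000 genes with no true effect, whose p values are spread evenly over (0, 1).

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    n = len(pvalues)
    # Walk from the largest p value down to the smallest.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for k, idx in enumerate(order):
        rank = n - k  # rank of this p value when sorted ascending (1 = smallest)
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Assumption: an idealised null experiment, 1000 genes, none truly changed.
n_genes = 1000
null_p = [(i + 0.5) / n_genes for i in range(n_genes)]  # sorted ascending

# Post hoc choice: pick the ten genes that look most interesting.
top10 = null_p[:10]

# Wrong for a post hoc choice: adjust over the ten picked genes only.
adj_subset = bh_adjust(top10)

# Right for a post hoc choice: adjust over all genes, then look at the ten.
adj_all = bh_adjust(null_p)[:10]

hits_subset = sum(p < 0.05 for p in adj_subset)
hits_all = sum(p < 0.05 for p in adj_all)
print(hits_subset, hits_all)  # prints "10 0"
```

In this toy setting, adjusting over the ten cherry-picked genes alone calls all ten significant at an FDR of 0.05 even though nothing is going on, while adjusting over all 1000 genes calls none. Had the ten genes been fixed before looking at the data, their p values would not be the ten smallest by construction, and adjusting over just those ten would be legitimate.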

If the genes were chosen a priori, you only ever wanted to look at these ten, so your test only has to rule out the null hypothesis that the seemingly interesting signal in precisely these genes arose by chance alone. If they were chosen after looking at the data, you have to rule out a different null hypothesis: that somewhere among the many genes in your data, some of which will show strong signals merely due to chance fluctuations, there happen to be ten that look extreme enough to appear interesting. This is much more likely to happen when it may be any ten genes rather than a fixed set of ten, so the signal has to be stronger to convince us that it is not mere chance. Hence the more stringent multiple-testing adjustment.
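A rough back-of-the-envelope calculation shows how large the gap between "a fixed set" and "any genes" can be. The numbers are assumptions chosen for illustration: independent tests, no true effects, a per-gene threshold of p < 0.001, and 1000 genes in total. It also simplifies the argument to "at least one gene looks striking" rather than ten.

```python
# Assumed, illustrative numbers (not taken from any real data set).
alpha = 0.001   # per-gene threshold for a "striking" p value
n_fixed = 10    # genes chosen before looking at the data
n_all = 1000    # all genes tested in the experiment

# Chance that at least one null gene falls below alpha, assuming independence.
p_fixed = 1 - (1 - alpha) ** n_fixed
p_any = 1 - (1 - alpha) ** n_all

print(f"fixed set of {n_fixed}: {p_fixed:.3f}")
print(f"any of {n_all}: {p_any:.3f}")
```

Under these assumptions the chance is about 1% for the fixed set and about 63% when any gene may qualify, which is why a signal found by searching all genes has to clear a higher bar.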