SEQanswers

Old 08-30-2011, 01:28 AM   #1
ocs
Member
 
Location: Berlin, Germany

Join Date: May 2011
Posts: 27
Default Cuffdiff producing q-values > 1

Hello all,

I use TopHat and Cufflinks to process some human RNA-seq data with an annotation GTF file for known genes (so my pipeline is TopHat -> Cuffdiff).

In the resulting gene_expr.diff file I obtain q-values greater than 1:

Code:
> range(expr$q_value)
[1] 0.00000 1.08198
I thought a q-value was a probability (an FDR-corrected p-value), so it should not be greater than 1.

It's clear to me that correcting the p-values can produce values greater than 1, but I thought they were capped at 1 in those cases. The values confuse me: if the q-value were no longer a probability I would expect much higher values than 1.08, yet 1.08 seems too big to be a rounding error.

Maybe I have misunderstood something. Could someone explain this to me?

Thanks in advance,
Oliver
Old 08-30-2011, 10:18 AM   #2
shilez
Member
 
Location: Rockville, MD

Join Date: Jun 2010
Posts: 10
Default

I also get q-values above 1, and I would like to know how the q-value is generated.
Old 08-30-2011, 05:47 PM   #3
severin
Genome Informatics Facility
 
Location: Iowa @isugif

Join Date: Sep 2009
Posts: 105
Default fdr

Benjamini, Y. & Hochberg, Y., 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1), 289-300.

The equation only does the correction; it is up to the user to choose an appropriate cutoff, e.g. 0.05.

There may be a more recent paper too, but this is the original.

This paper should clear it up.
Old 08-31-2011, 04:30 AM   #4
ocs
Member
 
Location: Berlin, Germany

Join Date: May 2011
Posts: 27
Default

Hello severin, thanks for the paper. I looked at it, but they always correct the cutoff for the p-value and not the p-value itself. They don't even mention the q-value. So the paper may be useful for background, but I can't see that it answers my initial question.
ocs is offline   Reply With Quote
Old 08-31-2011, 06:26 AM   #5
severin
Genome Informatics Facility
 
Location: Iowa @isugif

Join Date: Sep 2009
Posts: 105
Default R script for fdr

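Here is an R script for false discovery rate based on that paper.

Code:
# False discovery rate
bh.fdr <- function(p) {
  # Computes q-values (BH-adjusted p-values) using Benjamini and
  # Hochberg's (1995) approach for controlling FDR.
  # Courtesy of Dan Nettleton.
  m <- length(p)              # number of tests
  k <- seq_len(m)             # ranks 1..m of the sorted p-values
  ord <- order(p)             # indices that sort p in ascending order
  p[ord] <- (p[ord] * m) / k  # raw adjustment: p_(k) * m / k
  qval <- p
  # Running minimum from the largest p-value downwards: makes the q-values
  # monotone in p and keeps them at or below the largest p-value (<= 1).
  qval[ord] <- rev(cummin(rev(qval[ord])))
  return(qval)
}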
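As a sanity check, the script above can be compared with R's built-in p.adjust(..., method = "BH"). This is only a sketch: the p-values are made up for illustration, and the function body is a compact copy of the script so that the block runs on its own.

```r
# Compact copy of the bh.fdr function from the post, so this block is self-contained
bh.fdr <- function(p) {
  m <- length(p)
  ord <- order(p)
  p[ord] <- (p[ord] * m) / seq_len(m)   # scale sorted p-values by m / rank
  p[ord] <- rev(cummin(rev(p[ord])))    # running minimum from the largest p down
  p
}

# Hypothetical p-values (not from any real Cuffdiff run), unsorted on purpose
pvals <- c(0.20, 0.001, 0.04, 0.90, 0.03, 0.65)

q_script  <- bh.fdr(pvals)
q_builtin <- p.adjust(pvals, method = "BH")

print(round(q_script, 4))
print(all.equal(q_script, q_builtin))  # do the two agree within tolerance?
print(max(q_script) <= 1)              # is every q-value at most 1?
```

If the two agree on your own p-value column as well, p.adjust is probably the simpler thing to use in practice.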
Old 08-31-2011, 06:49 AM   #6
shilez
Member
 
Location: Rockville, MD

Join Date: Jun 2010
Posts: 10
Default

Thank you, severin!
Now I can generate q-values from the Cufflinks p-value output. I think there is a bug in Cufflinks when it generates q-values: possibly it omits the step qval[ord]=rev(cummin(rev(qval[ord]))), which would result in q-values > 1.
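The effect of leaving out that step can be sketched in a few lines of R. This only illustrates the hypothesis; it says nothing about what Cuffdiff actually does internally, and the p-values are invented for the example.

```r
# Hypothetical p-values; the real Cuffdiff p-values are not available here
pvals <- c(0.0004, 0.012, 0.35, 0.81, 0.93, 0.97)
m     <- length(pvals)
ord   <- order(pvals)

# Step 1 only: scale each sorted p-value by m / rank
raw <- numeric(m)
raw[ord] <- pvals[ord] * m / seq_len(m)

# Step 2: running minimum taken from the largest p-value downwards
adj <- raw
adj[ord] <- rev(cummin(rev(raw[ord])))

print(range(raw))  # upper end exceeds 1 for these inputs
print(range(adj))  # upper end equals the largest p-value, so at most 1
```

Without the second step, large p-values with a middling rank get multiplied by a factor m / rank that is greater than 1, which is enough to push them slightly above 1.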
Old 09-01-2011, 12:47 AM   #7
ocs
Member
 
Location: Berlin, Germany

Join Date: May 2011
Posts: 27
Default

Thanks for that script, severin.

I have a question about the use of the cummin function. Here it ensures that no q-value exceeds 1 (that's nice), but it also means that once a q-value has been reached, the following values can't be larger: each is set to the lowest value seen so far (this actually happens in reversed order). So some q-values are not the appropriate q-value for the corresponding p-value, and the q-values are always ascending. At first glance this does not seem legitimate. For example, if I have these q-values (in p-value order):
Code:
0.01, 0.02, 0.06, 0.04, 0.07
this procedure would result in
Code:
0.01, 0.02, 0.04, 0.04, 0.07

With a cutoff of 0.05, the third value in the first list would not be accepted, but the fourth would. In the second list the third value is accepted, although by its own correction it would not be. This is what confuses me: what happens to q-values that are lower than a preceding value? Why is it okay to accept them anyway? I looked at the paper, but it's not clear to me.
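One way to look at this question is to compare the cummin-based q-values with the step-up rule from the Benjamini and Hochberg paper: find the largest rank k with p_(k) <= (k / m) * alpha and reject every hypothesis up to that rank, including ones that missed their own threshold. The sketch below uses hypothetical sorted p-values, chosen so that the raw scaled values dip in a similar way to the example above; the variable names are illustrative only.

```r
alpha <- 0.05
# Hypothetical p-values, already sorted in ascending order
p <- c(0.002, 0.008, 0.033, 0.034, 0.070)
m <- length(p)
k <- seq_len(m)

raw <- p * m / k              # scaled values; the fourth is lower than the third
q   <- rev(cummin(rev(raw)))  # monotone q-values after the cummin step

# Step-up rule: largest k whose p-value is below its own threshold,
# then reject hypotheses 1..k regardless of their individual thresholds.
passes <- p <= k / m * alpha
k_max  <- if (any(passes)) max(which(passes)) else 0
rejected_stepup <- k <= k_max

# Thresholding the monotone q-values at alpha
rejected_q <- q <= alpha

print(rbind(raw = raw, q = q))
print(identical(rejected_stepup, rejected_q))
```

For these inputs the third hypothesis misses its own threshold but is still rejected by the step-up rule, because the fourth one passes. Taking the running minimum is a way of encoding that rule as a per-test value that can be compared directly with alpha.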
Old 08-15-2012, 10:26 AM   #8
ypark28
Junior Member
 
Location: Baltimore

Join Date: Apr 2012
Posts: 1
Default

Hope this helps:
http://brainder.org/2011/09/05/fdr-c...sted-p-values/

Tags
cuffdiff, q-value
