05-31-2013, 10:32 AM  #1
Member
Location: CA Join Date: Jan 2011
Posts: 28

cuffdiff (v 2.1.1) p-values
I have RNA-seq data which I analyzed using the latest versions of tophat-cufflinks-cuffdiff (which include the new dispersion methods).
The authors mention that, "Because the test is now based on explicit sampling from the beta negative binomial, users will not see values less than 10^-5 by default". I understand this is the lowest possible p-value one will see, which in turn produces a uniform adjusted p-value (0.00801514) for 45 of my genes that had a p-value equal to or smaller than 10^-5. Is it legitimate to see the same p-value for hundreds of genes when it is NOT equal to or less than 10^-5? Thanks
06-03-2013, 01:18 PM  #2
Junior Member
Location: Honolulu Join Date: Jun 2013
Posts: 2

I was wondering about the same issue.
It looks as if the genes are sorted into discrete p/q bins, and each group of 45 or so genes (in my case) appears to share the same p/q values. Janos
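For what it's worth, identical q-values follow directly from identical p-values under a Benjamini-Hochberg style adjustment. Below is a minimal sketch, assuming that is the kind of FDR correction behind the q column. The function name and the toy p-values are made up for illustration and are not taken from cuffdiff.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    toy_p = [0.01, 0.01, 0.01, 0.04, 0.5]  # three tied p-values
    for p, q in zip(toy_p, benjamini_hochberg(toy_p)):
        print(p, q)
```

In this sketch every gene in a block of tied p-values ends up with the same adjusted value, because the tie with the highest rank sets the minimum for the whole block.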
06-03-2013, 05:01 PM  #3
Junior Member
Location: South Australia Join Date: Apr 2013
Posts: 4

As of version 2.1, cuffdiff uses a sampling method to calculate the p-values. As I understand it, the new method draws about 10000 simulations under the null hypothesis of no change in expression, and defines the p-value as the proportion of those 10000 simulations that were more extreme than what was observed. As a consequence, the p-values you get out are slightly 'discretised', and many genes may have the same p-value (the p-values can only be fractions out of 10000).
Hope that answers your question. 
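To illustrate the discretisation, here is a minimal sketch of a sampling-based p-value. It is not cuffdiff's code: a standard normal stands in for the actual null distribution, and the function names, gene count and draw count are assumptions chosen to keep it fast.

```python
import random


def empirical_p_value(observed, null_draws):
    """Fraction of null draws at least as extreme as the observed statistic."""
    extreme = sum(1 for d in null_draws if abs(d) >= abs(observed))
    return extreme / len(null_draws)


def simulate_p_values(n_genes=500, n_draws=200, seed=0):
    """Compute one sampling-based p-value per simulated gene."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_genes):
        # Stand-in test statistic for a gene with no real change.
        observed = rng.gauss(0.0, 1.0)
        # Fresh draws from the (assumed) null distribution.
        null_draws = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
        p_values.append(empirical_p_value(observed, null_draws))
    return p_values


if __name__ == "__main__":
    ps = simulate_p_values()
    print("genes:", len(ps), "distinct p-values:", len(set(ps)))
```

With n_draws draws, only n_draws + 1 distinct p-values are possible (0/n_draws up to n_draws/n_draws), so once there are more genes than that, ties are unavoidable. An observed value more extreme than every draw gives 0 in this sketch; how a real tool reports that case is not modelled here.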
06-03-2013, 05:23 PM  #4
Junior Member
Location: Honolulu Join Date: Jun 2013
Posts: 2

Yes, that answered my question, thank you nicrob!

06-03-2013, 09:25 PM  #5
Member
Location: CA Join Date: Jan 2011
Posts: 28

Thank you both for the insight.

