**NicoBxl** · 05-12-2011, 10:17 PM

Nobody knows?

**Simon Anders** · 05-12-2011, 10:28 PM

DESeq and edgeR both offer ANODEV tests.
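For readers unfamiliar with the term, an analysis-of-deviance (ANODEV) test compares a reduced model to a full model by the drop in deviance. The sketch below illustrates the idea for a single gene with base R's `glm`. It assumes a Poisson family purely for simplicity (DESeq and edgeR use negative binomial models), and the counts are made up for illustration.

```r
# Toy counts for ONE gene across 4 groups of 5 samples (made-up numbers).
counts <- c(12, 15,  9, 14, 11,
            30, 28, 35, 31, 27,
            13, 10, 12, 16, 14,
            50, 47, 55, 52, 49)
groups <- factor(rep(c("A", "B", "C", "D"), each = 5))

# Reduced model: one common mean. Full model: one mean per group.
# Assumption: Poisson family for simplicity, not the NB model DESeq fits.
fit0 <- glm(counts ~ 1,      family = poisson())
fit1 <- glm(counts ~ groups, family = poisson())

# Analysis of deviance: the drop in deviance is compared to a chi-squared
# distribution whose degrees of freedom equal the number of extra coefficients.
dev_drop <- deviance(fit0) - deviance(fit1)
df_drop  <- df.residual(fit0) - df.residual(fit1)
pval     <- pchisq(dev_drop, df = df_drop, lower.tail = FALSE)

# The same comparison via the built-in helper.
print(anova(fit0, fit1, test = "Chisq"))
```

A small p-value here says that at least one group mean differs from the others; it does not say which one.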

**NicoBxl** · 05-12-2011, 11:12 PM

What's the function to do this?

I have a data frame A with 20 samples (4 groups of 5 samples).

Code:

groups <- c(rep("A",5),rep("B",5),rep("C",5),rep("D",5))
cds <- newCountDataSet(A,groups)
cds <- estimateSizeFactors(cds)
cds <- estimateVarianceFunctions(cds)

res <- nbinomTest(cds,condA="A",condB="B") # two group test

**Simon Anders** · 05-13-2011, 01:29 AM

Sorry, I need to add this to the vignette. In your case, this should work:

Code:

groups <- c( rep("A",5), rep("B",5), rep("C",5), rep("D",5) )
cds <- newCountDataSet( A, groups )
cds <- estimateSizeFactors( cds )
cds <- estimateVarianceFunctions( cds, "pooled" )

fit0 <- nbinomFitGLM( cds, count ~ 1 )
fit1 <- nbinomFitGLM( cds, count ~ condition )
pvals <- nbinomGLMTest( fit1, fit0 )

Note that this is a good deal slower than the standard test.

If you have more than one factor, pass a data frame to 'newCountDataSet' instead of 'groups'. You can then write more interesting model formulae than the one in this example, using the data frame's column names instead of 'condition'.

Simon
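As a rough illustration of the multi-factor case described above, the base R sketch below builds a design data frame and shows which coefficient a full-versus-reduced comparison would test. The sample names, factor names (`condition`, `libType`) and the `countTable` object mentioned in the comments are hypothetical, and the DESeq calls are shown only as comments using the function names from this thread.

```r
# Hypothetical design: 8 samples described by two factors.
design <- data.frame(
  row.names = paste0("sample", 1:8),
  condition = factor(rep(c("control", "treated"), each = 4)),
  libType   = factor(rep(c("single", "paired"), times = 4))
)

# Design matrices for the full and the reduced model formula.
full    <- model.matrix(~ libType + condition, data = design)
reduced <- model.matrix(~ libType,             data = design)

# Columns present in the full model only are what the comparison tests.
tested <- setdiff(colnames(full), colnames(reduced))
print(tested)

# With DESeq the corresponding calls would look roughly like this
# (not run here; 'countTable' is a hypothetical count matrix):
#   cds   <- newCountDataSet(countTable, design)
#   fit1  <- nbinomFitGLM(cds, count ~ libType + condition)
#   fit0  <- nbinomFitGLM(cds, count ~ libType)
#   pvals <- nbinomGLMTest(fit1, fit0)
```

Putting the nuisance factor in both formulae means the test asks only whether `condition` explains anything beyond `libType`.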

**NicoBxl** · 05-13-2011, 01:31 AM

ok I'll try this.

Thanks Simon

**RSK** · 05-16-2011, 08:41 AM

Dear Simon,
Is there any place where I can find more information about performing ANODEV analysis using DESeq?
Alternatively, when do you think the vignette will include this information?
Thanks!
RSK

**chenhao392** · 05-16-2011, 02:50 PM

Dear Simon,

I am currently working with RNA-seq datasets from 5 conditions without replicates. I was lucky to come across this code. However, I found that I can't use the "pooled" method in estimateVarianceFunctions, and alternatively I can't get through nbinomFitGLM after using the "blind" method in estimateVarianceFunctions. Both error messages are attached below. Yet the nbinomFitGLM function description says:

Use this function to estimate coefficients and calculate deviance
from a GLM for each gene. The GLM uses the ‘nbkd.sf’ family, with
the dispersion estimate according to ‘getVarianceFunction(cds)’.
Note that this requires that the variance functions were estimated
with method "pooled" or "blind".

When I tried method="pooled" in estimateVarianceFunctions:

Code:

Error in rowSums(sapply(tapply((1:ncol(counts))[replicated_sample], factor(conditions[replicated_sample]),  : 
  'x' must be an array of at least two dimensions

When I used method="blind" instead:

Code:

Error in nbinomFitGLM(cds, count ~ 1) : 
  No pooled variance function found. Have you called 'estimateVarianceFunctions' with 'method="pooled"'?

my code is:

Code:

library(DESeq)
# Count table: genes in rows, one column per condition (no replicates)
countTable <- read.table("../data/5_condition.txt", row.names = 1, header = TRUE)
conds <- c("CFGU", "CFGW", "CFHG", "CFHH", "CFHI")
cds <- newCountDataSet(countTable, conds)
cds <- estimateSizeFactors(cds)
cds <- estimateVarianceFunctions(cds, method = "blind")
# Null model (one common mean) versus one mean per condition
fit0 <- nbinomFitGLM(cds, count ~ 1)  # the second error above is raised here
fit1 <- nbinomFitGLM(cds, count ~ condition)
result <- nbinomGLMTest(fit1, fit0)

Could you tell me what my mistake is here?
Thanks in advance,

Hao Chen
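A possible reading of the first error, offered as a sketch rather than a diagnosis: the traceback mentions `replicated_sample`, which suggests the "pooled" method works from conditions that have more than one sample. With one sample per condition there is nothing to pool. The base R snippet below shows this with made-up counts for a single gene; it does not call DESeq.

```r
# Hypothetical counts for one gene, one sample per condition.
conds  <- factor(c("CFGU", "CFGW", "CFHG", "CFHH", "CFHI"))
counts <- c(10, 14, 9, 20, 12)

# Number of samples per condition; every entry is 1 here.
n_per_cond <- table(conds)

# Conditions with more than one sample: none in this design.
replicated <- names(n_per_cond)[n_per_cond > 1]

# The sample variance of a single observation is undefined (NA in R),
# so no within-condition variance can be estimated.
within_var <- tapply(counts, conds, var)

print(n_per_cond)
print(replicated)
print(within_var)
```

The "blind" method sidesteps this by ignoring the condition labels when estimating variance, which is why it runs on unreplicated data even though the GLM fit in this version then asks for a pooled estimate.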

**yoyoq** · 11-08-2011, 01:56 PM

hi,

just for reference, the function is now called fitNbinomGLMs (it was nbinomFitGLM in the posts above).

**jaymiRNA** · 08-02-2012, 01:03 AM

Differential expression DESeq using GLM for timecourse experiment

Dear Members
I am currently working on NGS data and have performed some differential expression analysis using DESeq. At the moment I am trying to find differentially expressed miRNAs using the GLM functionality.
My experimental design consists of 4 samples (4 stages of developing flower) with 2 biological replicates each. I would like to identify the miRNAs that are differentially expressed across the developmental stages (Stage 1, 2, 3 and 4). However, I have some doubts about interpreting the final data. Here is the code I used and the output.

My Dataset:
> str(SSSA)
'data.frame': 662 obs. of 8 variables:
$ Flower1 : int 1 1 6 0 0 0 6709 0 3 0 ...
$ Flower1.1: int 1 1 1 0 0 0 2261 2 1 2 ...
$ Flower2 : int 0 4 49 0 0 0 7311 0 4 0 ...
$ Flower2.1: int 1 14 69 0 0 0 11211 0 1 0 ...
$ Flower3 : int 0 8 89 1 0 0 1978 0 0 1 ...
$ Flower3.1: int 0 4 72 0 0 0 5392 0 1 1 ...
$ Flower4 : int 0 1 67 0 0 0 4675 0 0 0 ...
$ Flower4.1: int 0 0 21 0 0 0 4629 0 1 5 ...

My code
groups <- c( rep("F1",2), rep("F2",2), rep("F3",2), rep("F4",2) )
cds <- newCountDataSet(SSSA , groups )
cds <- estimateSizeFactors( cds )
cds <- estimateDispersions( cds, "pooled" )
fit0 <- fitNbinomGLMs ( cds, count ~ 1 )
fit1 <- fitNbinomGLMs ( cds, count ~ groups )
pvals <- nbinomGLMTest( fit1, fit0 )
padjGLM <- p.adjust( pvals, method="BH" )
fit1.pval <- cbind(fit1, padjGLM)

My Result
> str(fit1.pval)
'data.frame': 662 obs. of 7 variables:
$ (Intercept): num 1.23 1.16 2.77 -32.46 NA ...
$ groupsF2 : num -3.17 1.07 2.17 -2.1 NA ...
$ groupsF3 : num -32.569 0.859 2.973 30.973 NA ...
$ groupsF4 : num -31.675 -2.008 2.835 -0.875 NA ...
$ deviance : num 0.746 1.895 2.041 0.234 NA ...
$ converged : logi TRUE TRUE TRUE TRUE NA NA ...
$ pval : num 0.28395 0.44052 0.00622 0.86603 NA ...

My doubts:
1. In my case, is it better to set the conditions as a data frame instead of a vector, i.e. “groups <- c( rep("F1",2), rep("F2",2), rep("F3",2), rep("F4",2) )”?
2. In the code, I calculated the object fit0 by putting the value 1 after the tilde, “fit0 <- fitNbinomGLMs ( cds, count ~ 1 )”, but it is not clear to me what count ~ 1 stands for.
3. My final result (above) shows 7 variables (including the adjusted p-value) in the data.frame, and I am not sure how to interpret it. In particular, having used ‘groups’ in the call for fit1, i.e. fit1 <- fitNbinomGLMs ( cds, count ~ groups ), I can't tell whether I really compared F1 against all the other stages (F2, F3, F4), as I intended, or whether I made some other comparison.

Could anyone help me with this?

Jay
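On the second and third questions, a small base R sketch may help show what the two formulae encode. It uses the third row of the `str(SSSA)` output above as a toy gene and, as an assumption for simplicity, fits a Poisson `glm` without size factors, so the numbers will not match DESeq's output (DESeq fits a negative binomial model and, as far as I know, reports coefficients on a log2 scale).

```r
# Third value of each column in the str(SSSA) output, one per sample.
counts <- c(6, 1, 49, 69, 89, 72, 67, 21)
groups <- factor(c(rep("F1", 2), rep("F2", 2), rep("F3", 2), rep("F4", 2)))

# count ~ 1: a single column of ones, i.e. one common mean for all samples.
X0 <- model.matrix(~ 1, data = data.frame(groups))

# count ~ groups: an intercept for the first level (F1) plus one
# offset column for each of the other stages.
X1 <- model.matrix(~ groups)
print(colnames(X1))

# Assumption: Poisson family, natural-log link, no size factors.
fit1 <- glm(counts ~ groups, family = poisson())

# exp(intercept) is the fitted F1 mean; exp(groupsF2) is the F2/F1 ratio.
f1_mean  <- unname(exp(coef(fit1)["(Intercept)"]))
f2_ratio <- unname(exp(coef(fit1)["groupsF2"]))
print(c(F1_mean = f1_mean, F2_over_F1 = f2_ratio))
```

Under this coding, each `groupsF*` coefficient is a contrast against F1, while comparing the `count ~ groups` fit to the `count ~ 1` fit is a single overall test of whether any stage differs, not a set of pairwise F1 comparisons.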

**jaymiRNA** · 08-03-2012, 12:04 AM

Nobody knows?

**Simon Anders** · 08-06-2012, 05:41 AM

Please do not cross-post. The reply is in the other thread.


DE analysis with multiple groups
