polsum 03-18-2012 12:05 PM

edgeR Normalization question
The edgeR manual gives us three examples of DE analysis, with each example normalizing the data in a different manner.

In case study 1, the normalization was done with calcNormFactors. In case study 2, however, only counts per million (cpm) were calculated and calcNormFactors was not used. In case study 3, both cpm and calcNormFactors were used. In this third example, we are essentially modifying the data twice before performing the DE analysis, which seems strange to me.

I basically want to perform miRNA DE analysis between two samples (each with 3 replicates). Which of the above methods is applicable for me?

Can anyone please educate me? Thanks in advance. :)

Gordon Smyth 05-07-2012 01:21 AM

Dear polsum,

The edgeR User's Guide contains five case studies, and all are normalized exactly the same way, except for the second which deals with deepSAGE data for which simple library size normalization was considered sufficient.

You are misunderstanding the role of cpm, which are used only as descriptive quantities for data exploration in edgeR, not as part of the formal normalization or differential expression process.

We obviously recommend, unless there are good reasons otherwise, that you use calcNormFactors() for an RNA-Seq analysis.
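
As a rough illustration of the distinction above (not taken from the edgeR User's Guide), the sketch below uses plain Python arithmetic rather than edgeR itself. The count matrix and the normalization factors are made-up values; in a real analysis the factors would come from calcNormFactors(). The helper names library_sizes and cpm_table are hypothetical. It shows that cpm are just a rescaled view of the counts, while a normalization factor only adjusts the library size used for scaling, so the counts themselves are never modified.

```python
# Toy count matrix: rows are features (e.g. miRNAs), columns are samples.
# All numbers are invented for illustration.
counts = [
    [10, 20],
    [30, 60],
    [60, 120],
]

# Hypothetical normalization factors, one per sample. In edgeR these would
# be estimated by calcNormFactors(); here they are simply assumed.
norm_factors = [1.25, 0.8]


def library_sizes(matrix):
    """Total count per sample (column sums)."""
    return [sum(column) for column in zip(*matrix)]


def cpm_table(matrix, sizes):
    """Counts per million: each count scaled by its sample's library size."""
    return [[c * 1e6 / n for c, n in zip(row, sizes)] for row in matrix]


lib_sizes = library_sizes(counts)

# Effective library size = library size * normalization factor.
effective_sizes = [n * f for n, f in zip(lib_sizes, norm_factors)]

# Descriptive quantities only: two views of the same unchanged counts.
cpm_simple = cpm_table(counts, lib_sizes)
cpm_effective = cpm_table(counts, effective_sizes)

for name, table in [("simple", cpm_simple), ("effective", cpm_effective)]:
    print(name, [[round(v) for v in row] for row in table])
```

In this sketch the counts list is identical before and after the calculation. That is the sense in which using calcNormFactors() and looking at cpm does not amount to transforming the data twice.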

Note that edgeR is a Bioconductor package, so the way to get a prompt answer is to post your question to the Bioconductor mailing list. See the section "How to get help" in the edgeR User's Guide.

It also seems that your version of edgeR must be very old. Could I suggest that you update?


