Hello all,
I am trying out different normalization methods for RNA-seq.
The newest release of edgeR offers (among others) a normalization method called "RLE", which is supposed to implement the same approach that DESeq uses. This is explained in the edgeR manual; see ?calcNormFactors.
I thus expected to obtain the same normalization factors with both packages.
I am using part of the MAQC-2 data set that is referenced in a number of papers on RNA-seq. There are 14 samples, of which 7 are brain and 7 are UHR.
Here's what I tried:
"countsMatrix" is a matrix of raw counts (one column for each sample)
conds <- c( rep("brain", 7), rep("UHR", 7))
a)
cds <- newCountDataSet( countsMatrix, conds )
cds <- estimateSizeFactors( cds )
sizeFactors(cds)
b)
d <- DGEList(counts=countsMatrix, group=conds, lib.size=colSums(countsMatrix))
d <- calcNormFactors(d, method="RLE")
d$samples$norm.factors
Results:
a)
1.1430 1.1597 1.1695 1.1707 1.1751 0.3293 1.1643 1.1489 1.1650 1.1781 1.1802 0.4877 1.1617 1.1596
b)
1.0546 1.0354 1.0167 1.0291 1.0178 0.7133 1.0330 1.0645 1.0705 1.0876 1.0837 0.7619 1.0796 1.0564
Might anyone have a suggestion why the resulting normalization factors are different?
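One thing that may be worth checking is whether the two packages report their factors on the same scale. Below is a minimal base-R sketch of the median-of-ratios calculation as I understand it, using made-up toy counts (not the MAQC-2 data). The function name median_of_ratios and the conversion step are my own assumptions for illustration, not code from either package. The conversion assumes that edgeR's factors are meant to multiply the library size and are rescaled so that they multiply to 1, which is how I read ?calcNormFactors.

```r
# Toy count matrix (made-up numbers, NOT the MAQC-2 data): 4 genes x 3 samples.
toy_counts <- matrix(
  c( 10,  20,  40,
    100, 210, 390,
      5,  10,  22,
     50,  95, 200),
  nrow = 4, byrow = TRUE
)

# Median-of-ratios size factors (hypothetical helper, my reading of the method):
# for each sample, take the median over genes of count / per-gene geometric mean.
median_of_ratios <- function(counts) {
  log_geo_means <- rowMeans(log(counts))
  apply(counts, 2, function(sample_counts) {
    # Skip genes with a zero count in any sample (non-finite log geometric mean).
    keep <- is.finite(log_geo_means) & sample_counts > 0
    exp(median(log(sample_counts[keep]) - log_geo_means[keep]))
  })
}

size_factors <- median_of_ratios(toy_counts)

# Assumed conversion to a library-size-relative scale:
# divide by library size, then rescale so the factors multiply to 1.
lib_size <- colSums(toy_counts)
rel_factors <- size_factors / lib_size
rel_factors <- rel_factors / exp(mean(log(rel_factors)))

# Effective library sizes; if the assumption holds, these are
# proportional to the size factors computed above.
eff_lib_size <- rel_factors * lib_size
```

If that assumption is right, the values in a) should be compared with d$samples$norm.factors * d$samples$lib.size after rescaling, rather than with the raw norm.factors in b).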