I'm using limma_3.14.1
Suppose I set up:
data<-c(2,2,3,3,5,6,7,8)
data<-rbind(data,data) # Two row matrix
batch<-factor(c("b1","b1","b2","b2","b1","b1","b2","b2"))
treat<-factor(c("a","a","a","a","control","control","b","b"))
treat<-relevel(treat,ref="control")
We can see from the first four data points, which all share the same treatment, that the batch effect of b2 vs b1 is +1. I then run:
removeBatchEffect(data,batch=batch,design=model.matrix(~treat))
This results in
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
data  2.5  2.5  2.5  2.5    5    6    7    8
data  2.5  2.5  2.5  2.5    5    6    7    8
That is, the batch effect has been corrected in the first four columns.
However, the last four columns, which also belong to batch b1 or b2, are not adjusted.
By contrast, if I run limma with the batch factor included in the design matrix to estimate the fold change between treatment "b" and the control (i.e. the data in the last four columns):
fit <- lmFit(data,design=model.matrix(~batch+treat))
fit <- eBayes(fit)
topTable(fit,coef="treatb")
I get:
row ID logFC AveExpr t P.Value adj.P.Val B
1 data 1 4.5 1.235397 0.2517299 0.2517299 -4.591222
2 data 1 4.5 1.235397 0.2517299 0.2517299 -4.591222
Here we see that limma has effectively reduced the log2(FC) of treatment b vs control from 2 to 1 because of the batch factor.
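As a cross-check that does not depend on limma, the same additive model can be fitted with base R's lm() on a single row of the data. This is only a minimal sketch: the names y, fit_lm, coefs and naive_fc are introduced here for illustration, and it reproduces the point estimates only, not the moderated statistics from eBayes.

```r
# Base R cross-check of the additive model ~batch+treat (limma not required)
y <- c(2, 2, 3, 3, 5, 6, 7, 8)
batch <- factor(c("b1", "b1", "b2", "b2", "b1", "b1", "b2", "b2"))
treat <- factor(c("a", "a", "a", "a", "control", "control", "b", "b"))
treat <- relevel(treat, ref = "control")

# Ordinary least squares fit with batch and treatment as additive factors
fit_lm <- lm(y ~ batch + treat)
coefs <- coef(fit_lm)
print(coefs)

# Unadjusted difference in group means between treatment b and control
naive_fc <- mean(y[treat == "b"]) - mean(y[treat == "control"])
print(naive_fc)
```

The batchb2 coefficient comes out as 1 and treatb as 1, while the unadjusted difference in means is 2. Note that in this design all "b" samples are in batch b2 and all control samples are in batch b1, so the batch effect is estimated from the "a" samples alone.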
So my question is: why does removeBatchEffect not act similarly and return rows reading 2.5, 2.5, 2.5, 2.5, 5.5, 6.5, 6.5, 7.5? Have I made an error, or does removeBatchEffect work in a subtly different way?
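To make explicit where the values I expected come from, here is a minimal base R sketch. It assumes the batch effect is estimated from the additive model and then removed symmetrically (half added to b1 samples, half subtracted from b2 samples); the names b2_effect, shift and expected are my own for illustration and are not part of limma.

```r
# Hand computation of the batch-adjusted values I expected to see
y <- c(2, 2, 3, 3, 5, 6, 7, 8)
batch <- factor(c("b1", "b1", "b2", "b2", "b1", "b1", "b2", "b2"))
treat <- factor(c("a", "a", "a", "a", "control", "control", "b", "b"))
treat <- relevel(treat, ref = "control")

# Batch effect of b2 relative to b1, estimated jointly with treatment
b2_effect <- unname(coef(lm(y ~ batch + treat))["batchb2"])

# Assumed centring: move each batch halfway towards the other
shift <- ifelse(batch == "b1", 0.5, -0.5) * b2_effect
expected <- y + shift
print(expected)
```

Under these assumptions every sample is shifted, including the control and "b" samples in the last four columns.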