
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
How to calculate coverage  arendon  Bioinformatics  53  08202015 07:23 AM 
RNAseq normalization: How to use TMM and rpkm() in EdgeR??  alisonewr  Bioinformatics  2  10272014 08:48 AM 
Calculate pvalue of SNP  cybercot  Bioinformatics  0  04202011 09:55 PM 
how calculate MAF ?  dmartinez  Bioinformatics  1  09232010 09:39 AM 
TMM nomalization in RNAseq  minghui  Bioinformatics  1  05142010 02:01 AM 

Thread Tools 
07152014, 01:06 PM  #1 
Junior Member
Location: US Join Date: Jun 2014
Posts: 8

calculate TMM normalization
Hello all,
I would like to calculate TMM normalization for identifying DE genes. I know that edgeR does all the normalization when you run the code but I would like to calculate TMM for one or two genes to manually to better understand the statistic behind it. I found this formula to do so ( below) but I am not sure what the component, factor means? (raw.counts)*10^7/(librarysize*factors) librarysize=total number of mapped reads. I would really appreciate any suggestions and help. Thanks ! 
07152014, 03:12 PM  #2 
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,479

TMM wouldn't have any meaning (i.e., you couldn't calculate it) for one or two genes. The whole purpose of these sorts of normalization methods is to use larger datasets.

07162014, 04:44 AM  #3 
Junior Member
Location: US Join Date: Jun 2014
Posts: 8

Thank you for the quick response and clarification. Would you mind providing some insight into how the calculation is done and what that component factor stands for in the equation.
Thanks! 
07162014, 05:44 AM  #4 
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,479

The algorithm for generating the normalization factor is described in the edgeR paper.

07172014, 06:48 AM  #5 
Junior Member
Location: US Join Date: Jun 2014
Posts: 8

Yes I got the function you use to calculate TMM from that paper. It uses R function, calcNormFactors().
However I was wondering if I could manually calculate it using the formula I mentioned earlier since I do not know how to use R much. Thanks! 
07172014, 09:45 AM  #6 
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,479

You can just look at the R code, which you'll need to learn anyway if you want to work in bioinformatics.
calcNormFactors: Code:
function (object, method = c("TMM", "RLE", "upperquartile", "none"), refColumn = NULL, logratioTrim = 0.3, sumTrim = 0.05, doWeighting = TRUE, Acutoff = 1e+10, p = 0.75) { if (is(object, "DGEList")) { x < as.matrix(object$counts) lib.size < object$samples$lib.size } else { x < as.matrix(object) lib.size < colSums(x) } method < match.arg(method) allzero < rowSums(x > 0) == 0 if (any(allzero)) x < x[!allzero, , drop = FALSE] if (nrow(x) == 0  ncol(x) == 1) method = "none" f < switch(method, TMM = { f75 < .calcFactorQuantile(data = x, lib.size = lib.size, p = 0.75) if (is.null(refColumn)) refColumn < which.min(abs(f75  mean(f75))) if (length(refColumn) == 0  refColumn < 1  refColumn > ncol(x)) refColumn < 1 f < rep(NA, ncol(x)) for (i in 1:ncol(x)) f[i] < .calcFactorWeighted(obs = x[, i], ref = x[, refColumn], libsize.obs = lib.size[i], libsize.ref = lib.size[refColumn], logratioTrim = logratioTrim, sumTrim = sumTrim, doWeighting = doWeighting, Acutoff = Acutoff) f }, RLE = .calcFactorRLE(x)/lib.size, upperquartile = .calcFactorQuantile(x, lib.size, p = p), none = rep(1, ncol(x))) f < f/exp(mean(log(f))) if (is(object, "DGEList")) { object$samples$norm.factors < f return(object) } else { return(f) } } Code:
function (data, lib.size, p = 0.75) { y < t(t(data)/lib.size) f < apply(y, 2, function(x) quantile(x, p = p)) } Code:
function (obs, ref, libsize.obs = NULL, libsize.ref = NULL, logratioTrim = 0.3, sumTrim = 0.05, doWeighting = TRUE, Acutoff = 1e+10) { if (all(obs == ref)) return(1) obs < as.numeric(obs) ref < as.numeric(ref) if (is.null(libsize.obs)) nO < sum(obs) else nO < libsize.obs if (is.null(libsize.ref)) nR < sum(ref) else nR < libsize.ref logR < log2((obs/nO)/(ref/nR)) absE < (log2(obs/nO) + log2(ref/nR))/2 v < (nO  obs)/nO/obs + (nR  ref)/nR/ref fin < is.finite(logR) & is.finite(absE) & (absE > Acutoff) logR < logR[fin] absE < absE[fin] v < v[fin] n < sum(fin) loL < floor(n * logratioTrim) + 1 hiL < n + 1  loL loS < floor(n * sumTrim) + 1 hiS < n + 1  loS keep < (rank(logR) >= loL & rank(logR) <= hiL) & (rank(absE) >= loS & rank(absE) <= hiS) if (doWeighting) 2^(sum(logR[keep]/v[keep], na.rm = TRUE)/sum(1/v[keep], na.rm = TRUE)) else 2^(mean(logR[keep], na.rm = TRUE)) } 
07252014, 07:34 PM  #7 
Junior Member
Location: US Join Date: Jun 2014
Posts: 8

Thank you very much. Appreciate it!

06132017, 03:13 PM  #8 
Junior Member
Location: Delhi Join Date: Nov 2010
Posts: 2

is it possible to calculate TMM and logFC value through tab delimited file.
geneid file1 file2 a 32 33 b 68 410 c 78 140 d 31 340 e 213 470 f 211 29 total number of reads file1 7538239 file2 3828901 total number of reads align file1 6583829 file2 3029328 
Thread Tools  

