   SEQanswers calculate TMM normalization
 User Name Remember Me? Password
 Register FAQ Members List Calendar Search Today's Posts Mark Forums Read  Similar Threads Thread Thread Starter Forum Replies Last Post arendon Bioinformatics 53 08-20-2015 07:23 AM alisonewr Bioinformatics 2 10-27-2014 08:48 AM cybercot Bioinformatics 0 04-20-2011 09:55 PM dmartinez Bioinformatics 1 09-23-2010 09:39 AM minghui Bioinformatics 1 05-14-2010 02:01 AM 07-15-2014, 01:06 PM #1 Nepa10 Junior Member   Location: US Join Date: Jun 2014 Posts: 8 calculate TMM normalization Hello all, I would like to calculate TMM normalization for identifying DE genes. I know that edgeR does all the normalization when you run the code but I would like to calculate TMM for one or two genes to manually to better understand the statistic behind it. I found this formula to do so ( below) but I am not sure what the component, factor means? (raw.counts)*10^7/(librarysize*factors) librarysize=total number of mapped reads. I would really appreciate any suggestions and help. Thanks !   07-15-2014, 03:12 PM #2 dpryan Devon Ryan   Location: Freiburg, Germany Join Date: Jul 2011 Posts: 3,480 TMM wouldn't have any meaning (i.e., you couldn't calculate it) for one or two genes. The whole purpose of these sorts of normalization methods is to use larger datasets.   07-16-2014, 04:44 AM #3 Nepa10 Junior Member   Location: US Join Date: Jun 2014 Posts: 8 Thank you for the quick response and clarification. Would you mind providing some insight into how the calculation is done and what that component factor stands for in the equation. Thanks!   07-16-2014, 05:44 AM #4 dpryan Devon Ryan   Location: Freiburg, Germany Join Date: Jul 2011 Posts: 3,480 The algorithm for generating the normalization factor is described in the edgeR paper.   07-17-2014, 06:48 AM #5 Nepa10 Junior Member   Location: US Join Date: Jun 2014 Posts: 8 Yes I got the function you use to calculate TMM from that paper. It uses R function, calcNormFactors(). However I was wondering if I could manually calculate it using the formula I mentioned earlier since I do not know how to use R much. Thanks!   07-17-2014, 09:45 AM #6 dpryan Devon Ryan   Location: Freiburg, Germany Join Date: Jul 2011 Posts: 3,480 You can just look at the R code, which you'll need to learn anyway if you want to work in bioinformatics. calcNormFactors: Code: function (object, method = c("TMM", "RLE", "upperquartile", "none"), refColumn = NULL, logratioTrim = 0.3, sumTrim = 0.05, doWeighting = TRUE, Acutoff = -1e+10, p = 0.75) { if (is(object, "DGEList")) { x <- as.matrix(object\$counts) lib.size <- object\$samples\$lib.size } else { x <- as.matrix(object) lib.size <- colSums(x) } method <- match.arg(method) allzero <- rowSums(x > 0) == 0 if (any(allzero)) x <- x[!allzero, , drop = FALSE] if (nrow(x) == 0 || ncol(x) == 1) method = "none" f <- switch(method, TMM = { f75 <- .calcFactorQuantile(data = x, lib.size = lib.size, p = 0.75) if (is.null(refColumn)) refColumn <- which.min(abs(f75 - mean(f75))) if (length(refColumn) == 0 | refColumn < 1 | refColumn > ncol(x)) refColumn <- 1 f <- rep(NA, ncol(x)) for (i in 1:ncol(x)) f[i] <- .calcFactorWeighted(obs = x[, i], ref = x[, refColumn], libsize.obs = lib.size[i], libsize.ref = lib.size[refColumn], logratioTrim = logratioTrim, sumTrim = sumTrim, doWeighting = doWeighting, Acutoff = Acutoff) f }, RLE = .calcFactorRLE(x)/lib.size, upperquartile = .calcFactorQuantile(x, lib.size, p = p), none = rep(1, ncol(x))) f <- f/exp(mean(log(f))) if (is(object, "DGEList")) { object\$samples\$norm.factors <- f return(object) } else { return(f) } } .calcFactorQuantile Code: function (data, lib.size, p = 0.75) { y <- t(t(data)/lib.size) f <- apply(y, 2, function(x) quantile(x, p = p)) } .calcFactorWeighted Code: function (obs, ref, libsize.obs = NULL, libsize.ref = NULL, logratioTrim = 0.3, sumTrim = 0.05, doWeighting = TRUE, Acutoff = -1e+10) { if (all(obs == ref)) return(1) obs <- as.numeric(obs) ref <- as.numeric(ref) if (is.null(libsize.obs)) nO <- sum(obs) else nO <- libsize.obs if (is.null(libsize.ref)) nR <- sum(ref) else nR <- libsize.ref logR <- log2((obs/nO)/(ref/nR)) absE <- (log2(obs/nO) + log2(ref/nR))/2 v <- (nO - obs)/nO/obs + (nR - ref)/nR/ref fin <- is.finite(logR) & is.finite(absE) & (absE > Acutoff) logR <- logR[fin] absE <- absE[fin] v <- v[fin] n <- sum(fin) loL <- floor(n * logratioTrim) + 1 hiL <- n + 1 - loL loS <- floor(n * sumTrim) + 1 hiS <- n + 1 - loS keep <- (rank(logR) >= loL & rank(logR) <= hiL) & (rank(absE) >= loS & rank(absE) <= hiS) if (doWeighting) 2^(sum(logR[keep]/v[keep], na.rm = TRUE)/sum(1/v[keep], na.rm = TRUE)) else 2^(mean(logR[keep], na.rm = TRUE)) }   07-25-2014, 07:34 PM #7 Nepa10 Junior Member   Location: US Join Date: Jun 2014 Posts: 8 Thank you very much. Appreciate it!   06-13-2017, 03:13 PM #8 yusufkhan Junior Member   Location: Delhi Join Date: Nov 2010 Posts: 2 is it possible to calculate TMM and logFC value through tab delimited file. geneid file1 file2 a 32 33 b 68 410 c 78 140 d 31 340 e 213 470 f 211 29 total number of reads file1 7538239 file2 3828901 total number of reads align file1 6583829 file2 3029328  Thread Tools Show Printable Version Email this Page Posting Rules You may not post new threads You may not post replies You may not post attachments You may not edit your posts BB code is On Smilies are On [IMG] code is On HTML code is Off Forum Rules

All times are GMT -8. The time now is 01:10 PM. 