![]() |
|
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
How to calculate coverage | arendon | Bioinformatics | 53 | 08-20-2015 08:23 AM |
RNA-seq normalization: How to use TMM and rpkm() in EdgeR?? | alisonewr | Bioinformatics | 2 | 10-27-2014 09:48 AM |
Calculate p-value of SNP | cybercot | Bioinformatics | 0 | 04-20-2011 10:55 PM |
how calculate MAF ? | dmartinez | Bioinformatics | 1 | 09-23-2010 10:39 AM |
TMM nomalization in RNA-seq | minghui | Bioinformatics | 1 | 05-14-2010 03:01 AM |
![]() |
|
Thread Tools |
![]() |
#1 |
Junior Member
Location: US Join Date: Jun 2014
Posts: 8
|
![]()
Hello all,
I would like to calculate TMM normalization for identifying DE genes. I know that edgeR does all the normalization when you run the code but I would like to calculate TMM for one or two genes to manually to better understand the statistic behind it. I found this formula to do so ( below) but I am not sure what the component, factor means? (raw.counts)*10^7/(librarysize*factors) librarysize=total number of mapped reads. I would really appreciate any suggestions and help. Thanks ! |
![]() |
![]() |
![]() |
#2 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
|
![]()
TMM wouldn't have any meaning (i.e., you couldn't calculate it) for one or two genes. The whole purpose of these sorts of normalization methods is to use larger datasets.
|
![]() |
![]() |
![]() |
#3 |
Junior Member
Location: US Join Date: Jun 2014
Posts: 8
|
![]()
Thank you for the quick response and clarification. Would you mind providing some insight into how the calculation is done and what that component factor stands for in the equation.
Thanks! |
![]() |
![]() |
![]() |
#4 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
|
![]()
The algorithm for generating the normalization factor is described in the edgeR paper.
|
![]() |
![]() |
![]() |
#5 |
Junior Member
Location: US Join Date: Jun 2014
Posts: 8
|
![]()
Yes I got the function you use to calculate TMM from that paper. It uses R function, calcNormFactors().
However I was wondering if I could manually calculate it using the formula I mentioned earlier since I do not know how to use R much. Thanks! |
![]() |
![]() |
![]() |
#6 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
|
![]()
You can just look at the R code, which you'll need to learn anyway if you want to work in bioinformatics.
calcNormFactors: Code:
function (object, method = c("TMM", "RLE", "upperquartile", "none"), refColumn = NULL, logratioTrim = 0.3, sumTrim = 0.05, doWeighting = TRUE, Acutoff = -1e+10, p = 0.75) { if (is(object, "DGEList")) { x <- as.matrix(object$counts) lib.size <- object$samples$lib.size } else { x <- as.matrix(object) lib.size <- colSums(x) } method <- match.arg(method) allzero <- rowSums(x > 0) == 0 if (any(allzero)) x <- x[!allzero, , drop = FALSE] if (nrow(x) == 0 || ncol(x) == 1) method = "none" f <- switch(method, TMM = { f75 <- .calcFactorQuantile(data = x, lib.size = lib.size, p = 0.75) if (is.null(refColumn)) refColumn <- which.min(abs(f75 - mean(f75))) if (length(refColumn) == 0 | refColumn < 1 | refColumn > ncol(x)) refColumn <- 1 f <- rep(NA, ncol(x)) for (i in 1:ncol(x)) f[i] <- .calcFactorWeighted(obs = x[, i], ref = x[, refColumn], libsize.obs = lib.size[i], libsize.ref = lib.size[refColumn], logratioTrim = logratioTrim, sumTrim = sumTrim, doWeighting = doWeighting, Acutoff = Acutoff) f }, RLE = .calcFactorRLE(x)/lib.size, upperquartile = .calcFactorQuantile(x, lib.size, p = p), none = rep(1, ncol(x))) f <- f/exp(mean(log(f))) if (is(object, "DGEList")) { object$samples$norm.factors <- f return(object) } else { return(f) } } Code:
function (data, lib.size, p = 0.75) { y <- t(t(data)/lib.size) f <- apply(y, 2, function(x) quantile(x, p = p)) } Code:
function (obs, ref, libsize.obs = NULL, libsize.ref = NULL, logratioTrim = 0.3, sumTrim = 0.05, doWeighting = TRUE, Acutoff = -1e+10) { if (all(obs == ref)) return(1) obs <- as.numeric(obs) ref <- as.numeric(ref) if (is.null(libsize.obs)) nO <- sum(obs) else nO <- libsize.obs if (is.null(libsize.ref)) nR <- sum(ref) else nR <- libsize.ref logR <- log2((obs/nO)/(ref/nR)) absE <- (log2(obs/nO) + log2(ref/nR))/2 v <- (nO - obs)/nO/obs + (nR - ref)/nR/ref fin <- is.finite(logR) & is.finite(absE) & (absE > Acutoff) logR <- logR[fin] absE <- absE[fin] v <- v[fin] n <- sum(fin) loL <- floor(n * logratioTrim) + 1 hiL <- n + 1 - loL loS <- floor(n * sumTrim) + 1 hiS <- n + 1 - loS keep <- (rank(logR) >= loL & rank(logR) <= hiL) & (rank(absE) >= loS & rank(absE) <= hiS) if (doWeighting) 2^(sum(logR[keep]/v[keep], na.rm = TRUE)/sum(1/v[keep], na.rm = TRUE)) else 2^(mean(logR[keep], na.rm = TRUE)) } |
![]() |
![]() |
![]() |
#7 |
Junior Member
Location: US Join Date: Jun 2014
Posts: 8
|
![]()
Thank you very much. Appreciate it!
|
![]() |
![]() |
![]() |
#8 |
Junior Member
Location: Delhi Join Date: Nov 2010
Posts: 2
|
![]()
is it possible to calculate TMM and logFC value through tab delimited file.
geneid file1 file2 a 32 33 b 68 410 c 78 140 d 31 340 e 213 470 f 211 29 total number of reads file1 7538239 file2 3828901 total number of reads align file1 6583829 file2 3029328 |
![]() |
![]() |
![]() |
Thread Tools | |
|
|