Seqanswers Leaderboard Ad

**dpryan** · 07-15-2014, 03:12 PM

TMM wouldn't have any meaning (i.e., you couldn't calculate it) for one or two genes. The whole purpose of these sorts of normalization methods is to use larger datasets.

**Nepa10** · 07-16-2014, 04:44 AM

Thank you for the quick response and clarification. Would you mind providing some insight into how the calculation is done and what that component factor stands for in the equation.

Thanks!

**dpryan** · 07-16-2014, 05:44 AM

The algorithm for generating the normalization factor is described in the edgeR paper.

**Nepa10** · 07-17-2014, 06:48 AM

Yes I got the function you use to calculate TMM from that paper. It uses R function, calcNormFactors().

However I was wondering if I could manually calculate it using the formula I mentioned earlier since I do not know how to use R much.

Thanks!

**dpryan** · 07-17-2014, 09:45 AM

You can just look at the R code, which you'll need to learn anyway if you want to work in bioinformatics.

calcNormFactors:

Code:

function (object, method = c("TMM", "RLE", "upperquartile", "none"), 
    refColumn = NULL, logratioTrim = 0.3, sumTrim = 0.05, doWeighting = TRUE, 
    Acutoff = -1e+10, p = 0.75) 
{
    if (is(object, "DGEList")) {
        x <- as.matrix(object$counts)
        lib.size <- object$samples$lib.size
    }
    else {
        x <- as.matrix(object)
        lib.size <- colSums(x)
    }
    method <- match.arg(method)
    allzero <- rowSums(x > 0) == 0
    if (any(allzero)) 
        x <- x[!allzero, , drop = FALSE]
    if (nrow(x) == 0 || ncol(x) == 1) 
        method = "none"
    f <- switch(method, TMM = {
        f75 <- .calcFactorQuantile(data = x, lib.size = lib.size, 
            p = 0.75)
        if (is.null(refColumn)) refColumn <- which.min(abs(f75 - 
            mean(f75)))
        if (length(refColumn) == 0 | refColumn < 1 | refColumn > 
            ncol(x)) refColumn <- 1
        f <- rep(NA, ncol(x))
        for (i in 1:ncol(x)) f[i] <- .calcFactorWeighted(obs = x[, 
            i], ref = x[, refColumn], libsize.obs = lib.size[i], 
            libsize.ref = lib.size[refColumn], logratioTrim = logratioTrim, 
            sumTrim = sumTrim, doWeighting = doWeighting, Acutoff = Acutoff)
        f
    }, RLE = .calcFactorRLE(x)/lib.size, upperquartile = .calcFactorQuantile(x, 
        lib.size, p = p), none = rep(1, ncol(x)))
    f <- f/exp(mean(log(f)))
    if (is(object, "DGEList")) {
        object$samples$norm.factors <- f
        return(object)
    }
    else {
        return(f)
    }
}

.calcFactorQuantile

Code:

function (data, lib.size, p = 0.75) 
{
    y <- t(t(data)/lib.size)
    f <- apply(y, 2, function(x) quantile(x, p = p))
}

.calcFactorWeighted

Code:

function (obs, ref, libsize.obs = NULL, libsize.ref = NULL, logratioTrim = 0.3, 
    sumTrim = 0.05, doWeighting = TRUE, Acutoff = -1e+10) 
{
    if (all(obs == ref)) 
        return(1)
    obs <- as.numeric(obs)
    ref <- as.numeric(ref)
    if (is.null(libsize.obs)) 
        nO <- sum(obs)
    else nO <- libsize.obs
    if (is.null(libsize.ref)) 
        nR <- sum(ref)
    else nR <- libsize.ref
    logR <- log2((obs/nO)/(ref/nR))
    absE <- (log2(obs/nO) + log2(ref/nR))/2
    v <- (nO - obs)/nO/obs + (nR - ref)/nR/ref
    fin <- is.finite(logR) & is.finite(absE) & (absE > Acutoff)
    logR <- logR[fin]
    absE <- absE[fin]
    v <- v[fin]
    n <- sum(fin)
    loL <- floor(n * logratioTrim) + 1
    hiL <- n + 1 - loL
    loS <- floor(n * sumTrim) + 1
    hiS <- n + 1 - loS
    keep <- (rank(logR) >= loL & rank(logR) <= hiL) & (rank(absE) >= 
        loS & rank(absE) <= hiS)
    if (doWeighting) 
        2^(sum(logR[keep]/v[keep], na.rm = TRUE)/sum(1/v[keep], 
            na.rm = TRUE))
    else 2^(mean(logR[keep], na.rm = TRUE))
}

**Nepa10** · 07-25-2014, 07:34 PM

Thank you very much. Appreciate it!

**yusufkhan** · 06-13-2017, 03:13 PM

is it possible to calculate TMM and logFC value through tab delimited file.

geneid file1 file2
a 32 33
b 68 410
c 78 140
d 31 340
e 213 470
f 211 29

total number of reads
file1 7538239
file2 3828901

total number of reads align
file1 6583829
file2 3029328

Topics	Statistics	Last Post
Cancer Metastasis: A Deep Dive into Cellular Plasticity by seqadmin Started by seqadmin, 04-11-2024, 12:08 PM	0 responses 30 views 0 likes	Last Post by seqadmin 04-11-2024, 12:08 PM
Proteogenomic Profiles Offer New Clues in Prostate Cancer by seqadmin Started by seqadmin, 04-10-2024, 10:19 PM	0 responses 32 views 0 likes	Last Post by seqadmin 04-10-2024, 10:19 PM
Novel Diagnostic Assay Enhances Ovarian Cancer Detection by seqadmin Started by seqadmin, 04-10-2024, 09:21 AM	0 responses 28 views 0 likes	Last Post by seqadmin 04-10-2024, 09:21 AM
Evolutionary Dynamics of Centromeres: A Comparative Genomic Analysis by seqadmin Started by seqadmin, 04-04-2024, 09:00 AM	0 responses 53 views 0 likes	Last Post by seqadmin 04-04-2024, 09:00 AM

Seqanswers Leaderboard Ad

Announcement

calculate TMM normalization

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News