**BENM** · 07-05-2011, 12:19 AM

Hi, do you mean experimental normalization or data normalization for quantification analysis?

If you mean cDNA library normalization, one approach is duplex-specific nuclease (DSN) treatment, which is based on the kinetics of cDNA reassociation. (See P. A. Zhulidov et al., A Method for the Preparation of Normalized cDNA Libraries Enriched with Full-Length Sequences. Russian Journal of Bioorganic Chemistry, Vol. 31, No. 2, 2005; and Irina Shagina et al., Normalization of genomic DNA using duplex-specific nuclease. BioTechniques 48:455-459, June 2010.)

If you mean the latter, there are two common measures for RNA-seq data normalization: RPKM (reads per kilobase per million mapped reads) and FPKM (fragments per kilobase per million mapped fragments), and a useful tool, Cufflinks. You can find more details in an earlier SEQanswers thread: RNA-seq and normalization numbers (http://seqanswers.com/forums/showthr...p?t=586&page=1)
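
As a rough illustration of the RPKM formula, here is a minimal base-R sketch. The function name `rpkm`, the gene names, and all the numbers are made up for the example. FPKM follows the same arithmetic with fragments in place of reads. Cufflinks does considerably more than this (it estimates transcript-level abundances), so treat this only as the bare formula.

```r
# Bare RPKM formula in base R; every value below is hypothetical.
rpkm <- function(counts, gene_length_bp, total_mapped_reads) {
  # reads per kilobase of gene length, per million mapped reads
  counts / (gene_length_bp / 1e3) / (total_mapped_reads / 1e6)
}

counts      <- c(geneA = 500, geneB = 500)     # hypothetical read counts
lengths_bp  <- c(geneA = 1000, geneB = 4000)   # hypothetical gene lengths in bp
total_reads <- 20e6                            # hypothetical number of mapped reads

rpkm_values <- rpkm(counts, lengths_bp, total_reads)
print(rpkm_values)  # geneA: 25, geneB: 6.25
```

The two genes have the same raw count, but the longer gene ends up with a quarter of the RPKM, which is the point of the per-kilobase scaling.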

**harshinamdar** · 07-05-2011, 02:30 AM

Hi BENM,
I meant the latter.
Thank you for the link to this old post. That's what I was looking for. Thanks.

**luoye** · 12-27-2012, 07:05 PM

Hi everyone,
I want to use the TMM method for normalization, but I have a question: how can I get the normalized counts after TMM? Thank you very much.

**chadn737** · 12-27-2012, 10:35 PM

You can use edgeR to get TMM normalization factors using calcNormFactors() in R.

What do you want to use the normalized data as input for?

**luoye** · 12-28-2012, 01:20 AM

Hi chadn737,
Thank you very much for your reply. What I mean is that when I use calcNormFactors() in edgeR for TMM normalization, I want to see the difference between the normalized data and the raw data. For example, in DESeq you get normalized counts by dividing the raw counts by the appropriate size factor. How can I get normalized counts like that in edgeR?
Thank you.

**chadn737** · 12-28-2012, 07:20 AM

Do the same thing with the normalization factors from edgeR. You can even feed DESeq the normalization factors from edgeR by using sizeFactors(cds) <- (normalization factors from edgeR).

**luoye** · 12-28-2012, 05:39 PM

Sorry, I don't understand what you mean. Can you give me some more detail?
Did you mean: cds = calcNormFactors(cds), then sizeFactors(cds)?
Thank you very much.

**chadn737** · 12-31-2012, 08:59 AM

When you first use DESeq, you combine a table of counts and a list of conditions to create a count data set:

Code:

library(DESeq)
cds <- newCountDataSet(countTable, conditions)

You can give the count data set your own size factors using

Code:

sizeFactors(cds) <- mySizeFactors  # placeholder: your own numeric vector, one value per sample

If you wanted to use TMM normalization factors from edgeR rather than the size factors given by DESeq, you can first compute them:

Code:

library(edgeR)
x <- calcNormFactors(as.matrix(countTable))

and then give this to the count data set:

Code:

sizeFactors(cds) <- x

**luoye** · 12-31-2012, 09:16 PM

Thank you very much. I did as you said, but the result is not what I expected.

**Shanrong** · 02-13-2013, 07:28 PM

size factors in DESeq and edgeR

Yes, both DESeq and edgeR have functions to normalize the data. However, it is wrong to assign the factors calculated in edgeR to DESeq as size factors, even though it looks conceptually fine at first sight. In DESeq, the size factor is used to 'transform' the raw counts onto a 'common' scale, and you can use the normalized counts for differential analysis. In edgeR, by contrast, the factor (norm.factors) adjusts the library size so that gene abundance (= counts / "effective library size", where "effective library size" = "library size" * "norm.factors") is comparable across samples.

To illustrate this point, see the example below.

Code:

# data
y <- x <- rep(1,100)
y[1] <- 101
xy <- data.frame(x=x,y=y)

# edgeR
library(edgeR)
edger <- DGEList(counts=xy)
edger <- calcNormFactors(edger)
edger$samples

# DESeq
library(DESeq)
deseq = newCountDataSet( xy, conditions=c("c1","c2") )
deseq = estimateSizeFactors( deseq )
sizeFactors( deseq )

> sizeFactors( deseq )
x y
1 1
> edger$samples
  group lib.size norm.factors
x     1      100       1.4142
y     1      200       0.7071
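
For anyone who wants to check the arithmetic without installing either package, here is a base-R sketch of the same toy data. It is only an approximation of what the packages do: the DESeq part re-implements the median-of-ratios idea by hand, and the edgeR part simply copies the norm.factors printed above rather than recomputing TMM. All variable names are made up for the illustration.

```r
# Same toy data as above: 100 genes, identical except gene 1 in sample y.
x <- rep(1, 100)
y <- x
y[1] <- 101
counts <- cbind(x = x, y = y)

# DESeq-style size factors (median-of-ratios idea): per-gene geometric mean
# across samples, then the per-sample median of count / geometric mean.
geo_mean <- exp(rowMeans(log(counts)))
deseq_like_sf <- apply(counts / geo_mean, 2, median)
deseq_normalized <- sweep(counts, 2, deseq_like_sf, "/")

# edgeR-style scaling with the norm.factors printed above (copied, not recomputed).
lib_size <- colSums(counts)                 # 100 and 200
norm_factors <- c(x = 1.4142, y = 0.7071)
eff_lib_size <- lib_size * norm_factors     # about 141.42 for both samples
cpm_like <- sweep(counts, 2, eff_lib_size, "/") * 1e6

# Misuse: dividing raw counts by norm.factors as if they were DESeq size factors.
misused <- sweep(counts, 2, norm_factors, "/")

print(deseq_like_sf)
print(eff_lib_size)
# Gene 2 has the same raw count (1) in both samples:
print(rbind(deseq = deseq_normalized[2, ],
            edgeR_cpm_like = cpm_like[2, ],
            misused = misused[2, ]))
```

In this sketch, gene 2 comes out equal in both samples under the DESeq-style and the edgeR-style scaling, but differs about two-fold in the "misused" row. If one did want to carry TMM over to DESeq, a quantity proportional to lib.size * norm.factors would be the closer analogue of a size factor, though that is my reading rather than something stated in the posts above.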

**chadn737** · 02-13-2013, 08:22 PM

Thank you for this. For my own work I have not done this, but in a project where I am a collaborator, the statistician in the group did use the edgeR normalized data as input to DESeq. I know it gives very different results, and I have avoided it in my own work because the DESeq size factors seemed to give more conservative results, and I prefer working with fewer genes that I am very confident in rather than more genes of lower confidence. I'll have to bring this up on the project that I am collaborating on.

**Marianna85** · 03-03-2013, 01:49 PM

Hi everyone,
I'm dealing with the two normalization methods, DESeq and edgeR.
I have two conditions and only one replicate per condition (I know, bad experimental design...) and I tried to normalize the raw counts.
With both normalization methods I obtain very different size factors:
- using DESeq, 0.095 for one library and 10.85 for the other.
- using edgeR, 0.14 and 7.2 respectively.

Obviously, dividing the raw counts by the corresponding size factor changes the counts dramatically, sometimes inverting the starting situation (an upregulated gene becomes downregulated).

Does it make sense?
Do you think it's correct to use these normalization methods despite the weird results?

Thank you all

**Jeremy** · 03-03-2013, 09:32 PM

It looks like you have a huge difference in read count between conditions (is that a 20-50 fold difference?). This is a problem because the normalization will significantly amplify the noise of the smaller sample, making the data (already unreliable without replicates) even less reliable.
But yes, you need to normalize. Is more sequencing an option?
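
To make the noise point concrete, here is a small base-R sketch. It assumes pure Poisson counting noise, which is a simplification; the size factor 0.095 is the DESeq value quoted above, and the raw count of 5 is a made-up example.

```r
# Assumption: counts follow a Poisson distribution, so SD = sqrt(count).
size_factor <- 0.095   # DESeq size factor quoted above for the smaller library
raw_count   <- 5       # hypothetical raw count for one gene

normalized     <- raw_count / size_factor   # value after normalization
sd_raw         <- sqrt(raw_count)           # Poisson SD of the raw count
sd_normalized  <- sd_raw / size_factor      # the SD is scaled by the same factor
sd_if_observed <- sqrt(normalized)          # SD if that many reads had really been sequenced

# How much noisier the scaled-up value is than a directly observed count of that size;
# algebraically this is 1 / sqrt(size_factor).
noise_inflation <- sd_normalized / sd_if_observed

print(c(normalized = normalized,
        sd_normalized = sd_normalized,
        sd_if_observed = sd_if_observed,
        noise_inflation = noise_inflation))
```

Under these assumptions, a normalized value of about 53 carries roughly three times the noise that a genuinely observed count of 53 would, which is why low counts in the smaller library deserve extra caution.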

**Marianna85** · 03-04-2013, 02:23 AM

Hi Jeremy,
Yes, I have a huge difference in read count between conditions: 36 vs 64 million total reads.
Unfortunately I do not have other options, and I want to do a simple differential expression analysis, maybe with only a few differentially expressed genes.
I know that, without replicates, it's difficult to do a DE analysis, and I don't want to reach false conclusions. I know I have to be very conservative to say something really reliable... but how?

In your opinion, can I discard some genes, for example those with low read counts, and normalize the remaining ones?

Thanks a lot

Marianna

RNA Seq normalization