SEQanswers: Bioinformatics

simon_seq 08-27-2015 08:24 AM

Using alternative normalization method in DESeq analysis of gene enrichment
I'm using DESeq to identify differentially expressed genes in a next-gen sequencing dataset. In my experiment, the normalization of read counts as implemented by DESeq may not perform as well as anticipated. DESeq uses a 'size factor' to bring count values from different samples onto a common scale. The size factor for a given library is defined as the median, over all targets, of the ratio of the observed count to the geometric mean of that target's counts across all samples.
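For readers who want to see the definition above spelled out, here is a minimal sketch of a median-of-ratios size factor in plain Python. It is not DESeq's own code; the function name `size_factors`, the toy count table, and the choice to skip targets with a zero count in any sample are assumptions made for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of rows (one per target), each row a list of raw counts,
    one entry per sample. Returns one size factor per sample.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        # Assumption: skip targets with a zero in any sample, because the
        # geometric mean would be zero and the ratio undefined.
        if min(row) <= 0:
            continue
        log_geo_mean = sum(math.log(k) for k in row) / n_samples
        for j, k in enumerate(row):
            # log(count / geometric mean) for this target in sample j
            log_ratios[j].append(math.log(k) - log_geo_mean)
    # The median of the log ratios, back-transformed, is the size factor.
    return [math.exp(median(r)) for r in log_ratios]


# Toy data (made up): the second sample is sequenced twice as deeply.
counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 7],  # skipped: contains a zero
]
factors = size_factors(counts)
```

In this toy table the second factor comes out twice the first, which matches the twofold difference in depth.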

For my dataset, upper-quartile scaling (linear scaling by the 75th percentile of the data, which often has low read counts) may improve performance. In my specific case it may even be possible to define a set of genes as an internal scaling standard.
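As a rough illustration of the upper-quartile idea, the sketch below computes one scaling factor per sample from the 75th percentile of its counts. The function name `upper_quartile_factors`, the optional `use_rows` argument (a hypothetical way to restrict the calculation to an internal standard set of genes), the toy data, and the rescaling to a geometric mean of 1 are all assumptions, not part of DESeq.

```python
import math
from statistics import quantiles


def upper_quartile_factors(counts, use_rows=None):
    """One scaling factor per sample from the 75th percentile of its counts.

    counts: list of rows (one per gene), each a list of raw counts per sample.
    use_rows: optional list of row indices (e.g. internal standard genes);
              hypothetical parameter, defaults to all rows.
    """
    rows = counts if use_rows is None else [counts[i] for i in use_rows]
    # Drop genes with no reads in any sample; they carry no information.
    rows = [row for row in rows if sum(row) > 0]
    n_samples = len(rows[0])
    raw = []
    for j in range(n_samples):
        column = [row[j] for row in rows]
        # quantiles(..., n=4) returns the three quartile cut points;
        # index 2 is the 75th percentile.
        raw.append(quantiles(column, n=4, method="inclusive")[2])
    # Rescale so the factors have geometric mean 1 (a convention, assumed here).
    log_mean = sum(math.log(f) for f in raw) / n_samples
    return [f / math.exp(log_mean) for f in raw]


# Toy data (made up): the second sample has three times the depth.
counts = [
    [0, 0],
    [1, 3],
    [2, 6],
    [4, 12],
    [8, 24],
    [100, 300],
]
uq = upper_quartile_factors(counts)
```

The factors are meant to be supplied to the model alongside the raw counts, not used to divide the counts beforehand.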

In order for the statistical test for differential gene expression to work correctly, am I allowed to use an alternative method for data normalization?

dpryan 08-27-2015 11:05 AM

Sure, see the CQN package for an example of a different normalization method that can be used with DESeq2, which you should be using rather than DESeq.

simon_seq 08-27-2015 11:54 AM

Thanks a lot for your quick answer!

May I ask what the formal constraints are when changing normalization methods, such that the basic concept of DESeq(2) still works (i.e., the assumption of a negative binomial distribution, Fisher's exact testing)? As long as I transform the data linearly and get count values out, am I fine to transform the data?

simon_seq 08-27-2015 11:59 AM

And to extend my question, would I be fine to apply a LOESS transformation? It appears to me that in some of my conditions, there are very active targets, which then cause a bias in reads. Thus, a LOESS transformation may be relevant. Here are some plots of my data, MA plots given on the right side of the diagonal.


dpryan 08-27-2015 12:11 PM

The trick is to not transform the data at all. Leave the data untouched and supply offsets and such to produce the needed normalization. Again, see how the CQN package and DESeq2 interact.
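To make the "leave the counts alone and supply offsets" idea concrete, here is a deliberately simplified sketch. It uses a Poisson model as a stand-in for the negative binomial GLM that DESeq2 actually fits, and the function name `fit_group_means`, the counts, and the normalization factors are all invented for illustration. The point is only that the factors enter the model for the mean (mu_j = s_j * q) while the counts stay as raw integers.

```python
import math


def fit_group_means(counts, norm_factors, groups):
    """Per-group maximum-likelihood estimate of the normalized mean q
    for one gene, under a Poisson model with mean mu_j = s_j * q_group(j).

    On the log scale this is log(mu_j) = log(s_j) + beta_group, so
    log(s_j) acts as an offset and the raw counts are never rescaled.
    """
    totals = {}
    for k, s, g in zip(counts, norm_factors, groups):
        k_sum, s_sum = totals.get(g, (0, 0.0))
        totals[g] = (k_sum + k, s_sum + s)
    # Poisson MLE with an offset: q = sum of counts / sum of factors.
    return {g: k_sum / s_sum for g, (k_sum, s_sum) in totals.items()}


# One gene, four samples (made-up numbers). Counts remain integers.
raw_counts = [10, 12, 40, 44]
norm_factors = [0.5, 0.6, 1.0, 1.1]  # could come from any normalization method
groups = ["A", "A", "B", "B"]

q = fit_group_means(raw_counts, norm_factors, groups)
log2_fold_change = math.log2(q["B"] / q["A"])
```

In DESeq2 itself the analogous step is to attach per-sample size factors, or a gene-by-sample matrix of normalization factors, to the dataset object before fitting, rather than editing the count matrix.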

All times are GMT -8.
