SEQanswers > Bioinformatics


08-06-2014, 06:06 PM   #1
Junior Member, New Jersey (joined Aug 2014, 2 posts)

DESeq2 - proper normalization for clustering?

Hello all,

While I know that there is no "right" way to perform clustering, I am wondering whether the rlog transformation in DESeq2 shrinks the data enough to be used for hierarchical clustering and subsequent analysis of time-series expression profiles using cutree().

I am wondering because, even with rlog, I still get a pretty wide range of values, and my clusters end up not being as "tight" as I want them to be. Would something like median centering help, or should I use a different normalization than rlog, such as standardization, so that all of my rows have mean = 0 and sd = 1?
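Concretely, this is the clustering step I mean. A minimal sketch, in Python/scipy rather than R, with synthetic data standing in for the rlog matrix; scipy's fcluster plays the role of cutree():

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
# synthetic stand-in for an rlog-transformed matrix: 100 genes x 6 time points
mat = rng.normal(loc=8.0, scale=2.0, size=(100, 6))

# hierarchical clustering of genes on Euclidean distances
d = pdist(mat, metric="euclidean")
tree = linkage(d, method="average")

# fcluster with criterion="maxclust" is the analogue of cutree(tree, k = 4)
clusters = fcluster(tree, t=4, criterion="maxclust")
print(len(clusters), len(set(clusters)))
```

The linkage method ("average" here) and the number of clusters are of course choices, which is part of why the clusters can come out looser or tighter.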
rutgers2015
08-09-2014, 08:59 PM   #2
Michael Love, Senior Member, Boston (joined Jul 2013, 333 posts)

You might also try the variance stabilizing transformation, but it could be that the within-group variance is just large in your dataset.

Note that centering alone will not affect the distances. We do not recommend scaling the rows to have constant variance, because we do not want the rows with noisy, low counts to contribute as much as the rows with high counts, where we believe there is more informative signal.
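One way to see why a common shift does not matter: subtracting the same offset (for example, the per-sample column means) from every row cancels in every pairwise difference, so Euclidean distances between rows are unchanged. A small numeric check (Python/numpy, synthetic data):

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)
mat = rng.normal(size=(50, 6))

# subtract the per-column means: the same offset is removed from every row
centered = mat - mat.mean(axis=0)

d_before = pdist(mat)
d_after = pdist(centered)
# (x - c) - (y - c) == x - y, so every pairwise distance is identical
print(np.allclose(d_before, d_after))  # True
```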
08-10-2014, 01:40 AM   #3
Wolfgang Huber, Senior Member, Heidelberg, Germany (joined Aug 2009, 109 posts)

Filtering out likely uninformative variables (e.g. those with low overall variance after rlog or VST) can also improve clustering. In the best case, such variables only add noise and uniformly increase all distances (by a central-limit-type argument); in other cases they may disproportionately pick up subtle underlying confounders (e.g. "batch effects"). One can also think of this as variable weighting.
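Such a filter can be sketched as follows (Python stand-in; the rlog/VST matrix is simulated, and the cutoff of 200 genes is an arbitrary illustration, not a recommendation):

```python
import numpy as np

rng = np.random.default_rng(2)
# stand-in for an rlog/VST-transformed matrix: 1000 genes x 6 samples
mat = rng.normal(loc=8.0, scale=1.0, size=(1000, 6))

# per-gene variance across samples
row_var = mat.var(axis=1)

# keep the 200 most variable genes before computing distances/clustering
top = np.argsort(row_var)[::-1][:200]
filtered = mat[top, :]
print(filtered.shape)  # (200, 6)
```

The cutoff (a fixed number of genes, or a variance quantile) is a judgment call; the point is only that near-constant rows are dropped before distances are computed.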
Wolfgang Huber