SEQanswers

Old 05-07-2013, 01:02 AM   #1
john_nl
Member
 
Location: UK

Join Date: Feb 2012
Posts: 13
Default DESeq Variance Stabilizing Transformation

Hello,

I am looking for some feedback regarding the use of the variance-stabilizing transformation (VST) methods found in the DESeq2 package. Hopefully one of the authors will respond and the comments will be of help to others.

For me, the purpose of applying this transformation is to generate moderated fold changes for clustering of genes (not samples, as in the vignette).

My data consists of a time series, where for each time point there is a "treated" sample and a "control" sample. Each sample (timepoint) consists of 4 biological replicates.

I performed the VST on the entire data set and plotted the per-gene standard deviation against the rank of the mean*, for the shifted logarithm log2(n + 1) (left) and the variance stabilizing transformation (right). The VST does not appear to have a pronounced effect.



However, if I set up a count data set that consists of the samples corresponding to one timepoint only (the first timepoint in the example below), perform the VST, and plot the standard deviation against the rank of the mean, the transformed values have a much better stabilized standard deviation.



So my questions are: Is there any way to obtain better variance-stabilized data when considering the entire time series? Or should I just perform the VST on a per-timepoint basis? After all, I will only be computing fold changes between treatment and control samples at the same timepoint.
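
For context, the per-timepoint fold changes described above could be computed from a transformed matrix roughly as follows. This is only a sketch under stated assumptions: the random matrix `vst_mat` stands in for `assay(vsd)`, and the sample table, the timepoint labels, and the choice of `k = 4` clusters are all hypothetical. Because the VST output is approximately on a log2 scale, differences of group means can be read as moderated log2 fold changes.

```r
set.seed(7)

# Hypothetical design: 3 timepoints x 2 groups x 4 replicates
timepoints <- c("t1", "t2", "t3")
samples <- expand.grid(rep = 1:4,
                       group = c("control", "treated"),
                       time = timepoints,
                       stringsAsFactors = FALSE)

# Stand-in for assay(vsd): 100 genes, one column per row of `samples`
vst_mat <- matrix(rnorm(100 * nrow(samples), mean = 8),
                  nrow = 100,
                  dimnames = list(paste0("gene", 1:100), NULL))

# Treated minus control, within each timepoint, on the transformed scale
lfc <- sapply(timepoints, function(tp) {
  treated <- samples$time == tp & samples$group == "treated"
  control <- samples$time == tp & samples$group == "control"
  rowMeans(vst_mat[, treated, drop = FALSE]) -
    rowMeans(vst_mat[, control, drop = FALSE])
})

# Cluster genes on their fold-change profiles across timepoints
gene_clusters <- cutree(hclust(dist(lfc)), k = 4)
```

The result `lfc` has one row per gene and one column per timepoint, which is the shape a gene-clustering step would typically expect.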

*The procedure was performed as per the DESeq2 manual:

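library(DESeq2)
# dds is the DESeqDataSet holding the raw counts for all samples
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
vsd <- varianceStabilizingTransformation(dds)
# Per-gene variance against the rank of the mean, before and after the VST
# (rowVars returns the variance; wrap it in sqrt() for the standard deviation)
par(mfrow = c(1, 2))
plot(rank(rowMeans(counts(dds))), genefilter::rowVars(log2(counts(dds) + 1)), main = "log2(x+1) transform")
plot(rank(rowMeans(assay(vsd))), genefilter::rowVars(assay(vsd)), main = "VST")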
Old 07-23-2013, 05:50 AM   #2
moritzhess
Member
 
Location: freiburg

Join Date: Apr 2010
Posts: 25
Default

As far as I know, you have to tell DESeq to treat all expression values as if they came from a single condition, by specifying method="blind" when estimating the dispersions.
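
For readers on DESeq2 rather than the original DESeq, here is a minimal sketch of the blind option. It assumes the DESeq2 interface, in which the `blind` argument of varianceStabilizingTransformation() plays the role of method="blind". The simulated data set from makeExampleDESeqDataSet() is only a stand-in for real counts, and the object names are illustrative.

```r
library(DESeq2)

set.seed(1)
# Simulated counts (assumption): 1000 genes, 8 samples in two conditions
dds <- makeExampleDESeqDataSet(n = 1000, m = 8)
dds <- estimateSizeFactors(dds)

# blind = TRUE: dispersions are re-estimated ignoring the design,
# i.e. all samples are treated as if they came from one condition
vsd_blind <- varianceStabilizingTransformation(dds, blind = TRUE)

# blind = FALSE: the design stored in dds is used for the dispersions
vsd_design <- varianceStabilizingTransformation(dds, blind = FALSE)

# Largest difference between the two transformed matrices
max_diff <- max(abs(assay(vsd_blind) - assay(vsd_design)))
```

On simulated data without true differential expression the two results should be close. On a time series with strong treatment effects they may differ more, so it seems worth checking both.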
Old 11-20-2013, 02:57 PM   #3
Him26
Member
 
Location: California US

Join Date: Aug 2011
Posts: 19
Default

I have a slightly unrelated question about the plot.
Why is the variance low for a low mean? Shouldn't it start high and decrease as the mean increases?
I have a similar data set, and even if I filter by requiring a higher cpm, the trend still persists.
Does anyone know why this is the case?
Old 11-28-2013, 07:18 AM   #4
Shawn_Li
Junior Member
 
Location: Winnipeg

Join Date: Oct 2012
Posts: 1
Default DESeq2 variance

I guess it all depends on the type of data. For my NGS bacterial 16S rRNA data, the SD increases as the mean increases.
Attached Files
File Type: pdf Rplot.pdf (1.18 MB, 37 views)
Old 12-09-2013, 06:08 AM   #5
Michael Love
Senior Member
 
Location: Boston

Join Date: Jul 2013
Posts: 333
Default

Hi John,

The VST helps to stabilize the variance over the mean, insofar as this can be captured by the parametric curve of dispersion over mean. You might also try the rlog transformation, which sometimes performs qualitatively better than the VST (for example, if the size factors vary a lot across samples).
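
A sketch of trying both transformations side by side, assuming DESeq2's rlog() function and simulated counts in place of real data. The unequal `sizeFactors` passed to makeExampleDESeqDataSet() are hypothetical values chosen to mimic samples with very different sequencing depths.

```r
library(DESeq2)

set.seed(1)
# Simulated counts with strongly varying size factors (assumption)
dds <- makeExampleDESeqDataSet(n = 1000, m = 8,
                               sizeFactors = rep(c(0.3, 1, 3, 1), 2))

vsd <- varianceStabilizingTransformation(dds, blind = TRUE)
rld <- rlog(dds, blind = TRUE)

# Per-gene standard deviation on each transformed scale
sd_vst  <- apply(assay(vsd), 1, sd)
sd_rlog <- apply(assay(rld), 1, sd)

# Same kind of plot as in the first post, one panel per transformation
par(mfrow = c(1, 2))
plot(rank(rowMeans(assay(vsd))), sd_vst, main = "VST",
     xlab = "rank(mean)", ylab = "sd")
plot(rank(rowMeans(assay(rld))), sd_rlog, main = "rlog",
     xlab = "rank(mean)", ylab = "sd")
```

A flatter cloud of points across the range of the mean indicates better variance stabilization.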
Old 04-28-2014, 01:11 AM   #6
ayana.rajagopal
Junior Member
 
Location: bangalore, india

Join Date: Jul 2013
Posts: 2
Default

Hi guys,
Is the VST functionality of DESeq still working? Most of the VST functions, including getVarianceStabilizedData(), seem to fail in R version 3.0.1. Please help.
Old 04-30-2014, 06:38 AM   #7
Michael Love
Senior Member
 
Location: Boston

Join Date: Jul 2013
Posts: 333
Default

Hi Ayana,

Can you post the code which you think is not working? Please include the full code, the R output and sessionInfo().

The VST and rlog are both implemented in DESeq2, which we suggest you use over DESeq.
Old 04-30-2014, 11:11 AM   #8
Wolfgang Huber
Senior Member
 
Location: Heidelberg, Germany

Join Date: Aug 2009
Posts: 109
Default

Quote:
Originally Posted by moritzhess
As far as I know, you have to tell DESeq to treat all expression values as if they came from a single condition, by specifying method="blind" when estimating the dispersions.
Yes. And depending on the data, there may not always be a variance stabilising transformation. In particular, the error model on which the transformation is based assumes that for most genes the variance is dominated by technical noise and natural biological variation between replicates, and that the effects of true differential expression affect only a minority of genes. If that is not the case, then the whole concept does not really work.

As Mike Love says, the variance stabilising transformation tends to be misled in cases where the size factors vary strongly between samples, and (at least) in these cases the rlog transformation is preferable.
__________________
Wolfgang Huber
EMBL

Last edited by Wolfgang Huber; 04-30-2014 at 11:16 AM.
Old 04-30-2014, 11:14 AM   #9
Wolfgang Huber
Senior Member
 
Location: Heidelberg, Germany

Join Date: Aug 2009
Posts: 109
Default

@Him26: Note that in John's plots the y-axis is on a log-scale.
If you do the same kind of plot with sd computed on the original scale of the counts, then you will indeed expect them to increase with the mean.
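
The difference between the two scales can be illustrated with a small simulation in base R. This is a sketch only: the negative binomial model, the dispersion of 0.1, and the range of true means are assumed values, not taken from any data set in this thread.

```r
set.seed(42)

n_genes   <- 2000
n_samples <- 8
mu   <- 2^runif(n_genes, 0, 12)  # hypothetical true means, 1 to 4096
disp <- 0.1                      # hypothetical NB dispersion

# Row i holds n_samples draws with mean mu[i] (matrix fills by column)
counts <- matrix(rnbinom(n_genes * n_samples,
                         mu = rep(mu, n_samples),
                         size = 1 / disp),
                 nrow = n_genes)

# Per-gene SD on the raw count scale and on the shifted-log scale
sd_raw <- apply(counts, 1, sd)
sd_log <- apply(log2(counts + 1), 1, sd)

# Rank correlation of each SD with the per-gene mean
mean_rank <- rank(rowMeans(counts))
cor_raw <- cor(mean_rank, sd_raw, method = "spearman")
cor_log <- cor(mean_rank, sd_log, method = "spearman")
```

Under these assumptions `cor_raw` comes out strongly positive, because the raw-scale SD grows with the mean, while `cor_log` is much lower, because the log transformation removes most of that growth.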