SEQanswers

Old 03-13-2013, 09:19 AM   #1
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994
Default DESeq2

Hi,

An announcement of interest to users of DESeq:

Mike Love, Wolfgang Huber and I have been updating the DESeq package. The result is the package DESeq2, which is now available from the Bioconductor development branch and scheduled to be included in the next Bioconductor release.

For several release cycles, the original package (DESeq) will be maintained at its current functionality, in order to not disrupt the workflows of DESeq users. For new projects, we recommend using DESeq2. Major innovations are:

* Base class: SummarizedExperiment (from the GenomicRanges package) is used as the superclass for storing the data, rather than eSet. This allows closer integration with upstream workflows involving GenomicRanges features, such as summarizeOverlaps, and facilitates downstream analyses of the genomic regions of interest.

* Simplified workflow: the wrapper function DESeq() performs all steps for a differential expression analysis. The individual steps are of course also accessible.

* More powerful statistics: incorporation of prior distributions into the estimation of dispersions and fold changes (empirical-Bayes shrinkage). The dispersion shrinkage improves power compared to the old DESeq. The fold change shrinkage helps moderate the otherwise large spread in log fold changes for genes with low counts, while it has negligible effect on genes with high counts; it may be particularly useful for visualisation, clustering, classification, ordination (PCA, MDS), similar to the variance-stabilizing transformation in the old DESeq. A Wald test for significance is provided as the default inference method, and the chi-squared test of the previous version is also available. A manuscript is in preparation.

* Normalization: it is possible to provide a matrix of sample- and gene-specific normalization factors, which allows the use of normalisation factors from Bioconductor packages such as cqn and EDASeq.

Examples of usage are provided in the vignette, and more details are available in the manual pages (specifically, the DESeq function and estimateDispersions function).
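
A minimal sketch of the simplified workflow described above, for readers who want to see its shape before opening the vignette. It assumes DESeq2 is installed and uses simulated counts from makeExampleDESeqDataSet() instead of real data; the sizes n and m are arbitrary choices, and the requireNamespace() guard is only there so the snippet does nothing when the package is missing. Function availability may differ between DESeq2 versions.

```r
# Sketch only: simulated data, not a real experiment.
res <- NULL
if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  # Simulated DESeqDataSet: 1000 genes, 6 samples, design ~ condition
  dds <- makeExampleDESeqDataSet(n = 1000, m = 6)
  # Wrapper call: size factors, dispersions, model fit and Wald test
  dds <- DESeq(dds)
  # Per-gene table with log2 fold changes, standard errors and p-values
  res <- results(dds)
  print(head(res))
}
```

The individual steps that DESeq() wraps (for example estimateSizeFactors() and estimateDispersions()) can also be called one by one, as the announcement notes.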

Enjoy -

Mike, Simon, Wolfgang.
Old 03-13-2013, 10:33 AM   #2
chadn737
Senior Member
 
Location: US

Join Date: Jan 2009
Posts: 392
Default

Exciting news, thanks Simon. You guys have created some of the best tools out there and I am excited to see what this offers.

PS. I notice that the vignette is as well written as your last and puts the details on a level that people like me can easily grasp. Thanks.

Last edited by chadn737; 03-13-2013 at 10:36 AM.
Old 03-13-2013, 10:38 AM   #3
turnersd
Senior Member
 
Location: Charlottesville, VA

Join Date: May 2011
Posts: 112
Default

Great news, thanks Simon, Mike, Wolfgang. Looking forward to going through the vignette. Are you publishing any comparisons with the original DESeq and/or other tools?
Old 03-27-2013, 01:09 AM   #4
EGrassi
Member
 
Location: Turin, Italy

Join Date: Oct 2010
Posts: 66
Default

Thanks! Looking forward to the new options.

Is it just me, or is the vignette linked from Bioconductor a 16 MB PDF with 10701 pages and a lot of repetition starting from page 6?
Old 03-27-2013, 01:15 AM   #5
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994
Default

Quote:
Originally Posted by EGrassi View Post
Is it just me, or is the vignette linked from Bioconductor a 16 MB PDF with 10701 pages and a lot of repetition starting from page 6?
Yes, that's a bug that Mike has already fixed. The corrected vignette should become available today or tomorrow.
Old 05-29-2013, 06:27 PM   #6
FLYINGDOLPHIN
Member
 
Location: washington dc

Join Date: Apr 2013
Posts: 11
Default

Hi,

I'd love to install DESeq2, but I have a problem installing it.


> source("http://bioconductor.org/biocLite.R")
BioC_mirror = http://www.bioconductor.org
Change using chooseBioCmirror().
> biocLite("DESeq2")
Using R version 2.11.1, biocinstall version 2.6.10.
Installing Bioconductor version 2.6 packages:
[1] "DESeq2"
Please wait...

Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
package ‘DESeq2’ is not available

I have no problem installing DESeq, though.

Thanks a lot!

Q
Old 05-29-2013, 06:29 PM   #7
chadn737
Senior Member
 
Location: US

Join Date: Jan 2009
Posts: 392
Default

I would first try making sure R is up to date.
Old 05-29-2013, 06:35 PM   #8
FLYINGDOLPHIN
Member
 
Location: washington dc

Join Date: Apr 2013
Posts: 11
Default

Oh, it is an R package from the core installation.
R version 2.11.1, biocinstall version 2.6.10.

Thanks.
Q
Old 05-29-2013, 10:51 PM   #9
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994
Default

If you have an old R, the biocLite script will pull matching old Bioconductor packages. DESeq2 is new, and while you could install it manually on an old R, the recommended way is to use R 3.0.0; biocLite will then find and install it automatically.
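
A small sketch of the check implied here, using only base R: compare the running R version against the 3.0.0 threshold named in the post before trying to install. The variable names are illustrative, and the snippet deliberately does not run the installer itself.

```r
# Minimum R version named in the post above
required <- "3.0.0"
r_version <- as.character(getRversion())
# TRUE when the running R meets the threshold
r_ok <- getRversion() >= required
if (r_ok) {
  message("R ", r_version, " meets the R >= ", required, " requirement.")
} else {
  message("R ", r_version, " is older than ", required, "; upgrade R first.")
}
```

On the R 2.11.1 installation shown in the log above, r_ok would be FALSE, which is consistent with the "package is not available" warning.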
Old 05-30-2013, 03:29 AM   #10
FLYINGDOLPHIN
Member
 
Location: washington dc

Join Date: Apr 2013
Posts: 11
Default

Thanks!
Q
Old 05-30-2013, 06:39 AM   #11
Heisman
Senior Member
 
Location: St. Louis

Join Date: Dec 2010
Posts: 535
Default

As said above, you guys do great things. Question: do you believe that this package would be good for determining differentially methylated regions from enrichment data (i.e., MBD-seq or MeDIP-seq)? I know there are some papers using DESeq for this purpose, so I imagine so, but I was curious if you have any specific thoughts or caveats in mind.
Old 05-30-2013, 09:57 AM   #12
FLYINGDOLPHIN
Member
 
Location: washington dc

Join Date: Apr 2013
Posts: 11
Default

Hi,

Is the DESeq development version better than the release version? The manual that I found on Bioconductor seems to be aimed at the development version, since "normalize=True" does not work. I am just wondering whether it is worthwhile to switch to the development version, and where to find it.

Thanks.
Q
Old 07-25-2013, 10:19 AM   #13
trpc
Junior Member
 
Location: china

Join Date: Oct 2010
Posts: 1
Default

Hi
This is my DESeqDataSet object
Quote:
> Genes
class: DESeqDataSet
dim: 21937 4
exptData(0):
assays(1): counts
rownames(21937): 0610007C21Rik 0610007L01Rik ... Zzef1 Zzz3
rowData metadata column names(0):
colnames(4): 13-5.bam 13-5-5.bam E15-2.sorted.bam E15-5.sorted.bam
colData names(2): fileName E13E15
Since I want to test only "13-5.bam" and "13-5-5.bam", I set colData(Genes)$E13E15 like this:
Quote:
> colData(Genes)$E13E15 <- factor(colData(Genes)$E13E15, levels=c("E13WT","E13KO"))
> colData(Genes)
DataFrame with 4 rows and 2 columns
fileName E13E15
<BamFileList> <factor>
13-5.bam ######## E13WT
13-5-5.bam ######## E13KO
E15-2.sorted.bam ######## NA
E15-5.sorted.bam ######## NA
However, when I run DESeq() on the DESeqDataSet object, it reports an error:
Quote:
> Genes<-DESeq(Genes)
estimating size factors
estimating dispersions
same number of samples and coefficients to fit, estimating dispersion by treating samples as replicates
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting generalized linear model

error: element-wise multiplication: incompatible matrix dimensions: 4x1 and 2x1

Error:
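
One plausible reading of this error, sketched in base R under stated assumptions: restricting the factor levels turns the two E15 samples into NA, so a design matrix built from that column has 2 rows while the count matrix still has 4 columns, which would match the "4x1 and 2x1" message. The E15 labels and the count values below are made up for illustration; only the file names and the two E13 levels come from the post.

```r
# Hypothetical sample table mirroring the post; the E15 labels are assumed
samples <- data.frame(
  row.names = c("13-5.bam", "13-5-5.bam", "E15-2.sorted.bam", "E15-5.sorted.bam"),
  E13E15 = c("E13WT", "E13KO", "E15WT", "E15KO"),
  stringsAsFactors = FALSE
)
# Restricting the levels turns the two E15 samples into NA
samples$E13E15 <- factor(samples$E13E15, levels = c("E13WT", "E13KO"))
# Under the default na.action (na.omit), NA rows are dropped from the design
design_rows <- nrow(model.matrix(~ E13E15, data = samples))
# Toy count matrix (made-up numbers), one column per sample
counts <- matrix(1:12, nrow = 3,
                 dimnames = list(paste0("gene", 1:3), rownames(samples)))
# Keep only samples with a defined condition, in both the table and the counts
keep <- !is.na(samples$E13E15)
samples_sub <- droplevels(samples[keep, , drop = FALSE])
counts_sub <- counts[, keep, drop = FALSE]
```

With a DESeqDataSet, the analogous step would presumably be to subset the columns of the object itself (something like Genes[, keep]) before calling DESeq(), so that samples and design agree. Note that two samples without replicates remain a weak basis for testing, as the "treating samples as replicates" message in the log suggests.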

Last edited by trpc; 07-25-2013 at 10:37 AM.
Old 08-13-2013, 07:24 AM   #14
zhang51
Junior Member
 
Location: OH

Join Date: Aug 2013
Posts: 3
Default

Dear Simon and everyone,

I have downloaded the DESeq2 package with no trouble. But when I tried to run your example R code, I ran into trouble even with the first line "dds <- DESeqDataSet(se = se, design = ~ condition)". The error message is "error in evaluating the argument 'x' in selecting a method for function 'assays': Error: object 'se' not found".

I have run "library(DESeq2)" at the very beginning. I wonder what else I should do to make it work?

I used R-3.0.1 for the installation.

Thanks in advance.

Jenny
Old 08-13-2013, 07:27 AM   #15
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994
Default

Which example code are you running, i.e., what documentation are you reading?
Old 08-13-2013, 07:33 AM   #16
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

Quote:
Originally Posted by zhang51 View Post
Dear Simon and everyone,

I have downloaded DESeq2 package with no trouble. But when I tried to run your example R code, I got trouble even with the first line "dds <- DESeqDataSet(se = se, design = ~ condition)". The error message is "error in evaluating the argument 'x' in selecting a method for function 'assays': Error: object 'se' not found".

I have run "library(DESeq2)" at the very beginning. I wonder what else I should do to make it work?
Assuming you're following the vignette, you missed a few lines, which would have set up "se".

Code:
library("parathyroidSE")
data("parathyroidGenesSE")
se <- parathyroidGenesSE
colnames(se) <- colData(se)$run
Edit: Just go to the next page of the vignette, I expect what you're looking at is the "Quick Start" section.

Last edited by dpryan; 08-13-2013 at 07:38 AM.
Old 08-13-2013, 10:11 AM   #17
zhang51
Junior Member
 
Location: OH

Join Date: Aug 2013
Posts: 3
Default

To Simon,

I'm running code in the file http://www.bioconductor.org/packages...doc/DESeq2.pdf

I encountered a similar problem when running the meanSdPlot function. It works only when normalized=FALSE. There is an error message when using normalized=TRUE. Please see below:

In is.na(sizeFactors(object)) :
is.na() applied to non-(list or vector) of type 'NULL'
Error in meanSdPlot(log2(counts(dds, normalized = TRUE)[notAllZero, ] + :
error in evaluating the argument 'x' in selecting a method for function 'meanSdPlot': Error in .local(object, ...) :
first calculate size factors, add normalizationFactors, or set normalized=FALSE

What would be the solution to this problem?

Thanks
Old 08-13-2013, 10:20 AM   #18
zhang51
Junior Member
 
Location: OH

Join Date: Aug 2013
Posts: 3
Default

I figured it out. I had not run the DESeq function before making this plot. Now it works.
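
For context on the "first calculate size factors" message: normalized counts are the raw counts divided by a per-sample size factor, so the size factors have to exist before normalized=TRUE can work. The base R sketch below illustrates the median-of-ratios idea behind DESeq-style size factors on made-up counts; it is an illustration of the concept, not the package's own code.

```r
# Toy counts: sample B is exactly twice sample A (made-up numbers)
counts <- cbind(A = c(10, 20, 30, 40), B = c(20, 40, 60, 80))
rownames(counts) <- paste0("gene", 1:4)
# Per-gene log geometric mean across samples
log_geo_means <- rowMeans(log(counts))
# Size factor per sample: median ratio of its counts to the geometric means
size_factors <- apply(counts, 2, function(col) {
  usable <- is.finite(log_geo_means) & col > 0
  exp(median((log(col) - log_geo_means)[usable]))
})
# Normalized counts: each column divided by its size factor
normalized <- sweep(counts, 2, size_factors, "/")
```

In this toy case the size factor of B is twice that of A, and the two normalized columns become identical.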

Thanks
Old 08-22-2013, 11:38 AM   #19
haggardd
Junior Member
 
Location: Oregon

Join Date: Apr 2013
Posts: 5
Default

Hello,

DESeq was great in that the nbinomTest() function allowed me to specify which conditions I wanted to run the test on. However, DESeq2 does not appear to have this functionality. I have data that was run through HTSeq, so I am using the DESeqDataSetFromHTSeqCount() function. I am analyzing data from two different exposure concentrations at different percentages of vehicle (i.e. 1% vehicle control, 10uM chemical in 1% vehicle, 0.1% vehicle control, and 1uM chemical in 0.1% vehicle). There are four conditions in total for this experiment, and I want to compare expression between pairs with the same percentage of vehicle (i.e. 1% vehicle control versus 10uM chemical in 1% vehicle). However, when I run DESeq(), the resultsNames() I get to choose from are only from four comparisons and not the pairwise comparisons I need. Is it possible in DESeq2, as it was in the original DESeq, to specify which conditions I want to test? Or would it be easier to separate the comparisons I want to analyze up front, instead of how I am doing it currently? Thoughts?

Thanks!
Old 08-22-2013, 10:25 PM   #20
Michael Love
Senior Member
 
Location: Boston

Join Date: Jul 2013
Posts: 333
Default

For making comparisons of multiple conditions (not only against the base level of a condition), we have recently implemented contrasts in the development branch. This allows one to fit a single model, then generate log2 fold change estimates, standard errors and tests of null hypotheses for other comparisons.

The functionality is described in section 3.2 of the vignette, 'Contrasts' and in the man page for ?results, for DESeq2 version 1.1.x which is paired with Bioc 2.13.

http://bioconductor.org/packages/2.1...doc/DESeq2.pdf

You can either try using the devel branch of Bioconductor...

http://bioconductor.org/developers/how-to/useDevel/

or wait for the release of this version on Oct 15, 2013:

http://bioconductor.org/developers/release-schedule/
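
To illustrate what a contrast does, here is a base R sketch that uses an ordinary linear model (lm) as a stand-in for the negative binomial GLM that DESeq2 fits. It mirrors the four-condition design from the question above; the condition labels and expression values are hypothetical. One model is fitted for all conditions, and a comparison between two non-base levels is then formed as a linear combination of the coefficients.

```r
# Hypothetical log2 expression for one gene, two samples per condition
condition <- factor(rep(c("veh1", "chem10", "veh0.1", "chem1"), each = 2),
                    levels = c("veh1", "chem10", "veh0.1", "chem1"))
log_expr <- c(5.0, 5.2, 6.1, 6.3, 4.9, 5.1, 7.0, 7.4)
# One model for all four conditions; coefficients are relative to base level veh1
fit <- lm(log_expr ~ condition)
# Contrast chem1 vs veh0.1: neither is the base level
# Coefficient order: (Intercept), chem10, veh0.1, chem1
cvec <- c(0, 0, -1, 1)
estimate <- sum(cvec * coef(fit))
# Standard error of the contrast from the coefficient covariance matrix
std_err <- sqrt(drop(t(cvec) %*% vcov(fit) %*% cvec))
```

Here the estimate equals the difference between the chem1 and veh0.1 group means. In DESeq2 versions with contrast support, the equivalent request goes through results(); the exact form of the argument depends on the version, so ?results is the place to check.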
All times are GMT -8.

