SEQanswers

Old 12-12-2011, 10:34 AM   #1
daler
Junior Member
 
Location: DC Metro area

Join Date: Feb 2011
Posts: 8
DESeq versions: mimic 1.4.1 with 1.6.1 settings?

I am unable to replicate DESeq v1.4.1 results using v1.6.1, even when using the settings that -- as far as I can tell from the docs -- should replicate the old behavior. Here's a self-contained working example . . . but it needs parallel installations of R 2.13.1 and R 2.14.0 in order to work.

First, I created data using only v1.6.1 and saved it to a file:

Code:
library(DESeq)
cds <- makeExampleCountDataSet()
write.table(counts(cds), file='example.counts')

In R 2.13.1 I ran DESeq v1.4.1:
Code:
library(DESeq)
x <- read.table('example.counts')
conds <- c('A', 'A','B','B','B')
cds <- newCountDataSet(x, conds)
cds <- estimateSizeFactors(cds)

cds <- estimateVarianceFunctions(cds, method='normal')

res <- nbinomTest(cds, 'A', 'B')
write.table(res, file='old.results', sep='\t', row.names=FALSE)

Then, over in R 2.14.0, I ran DESeq v1.6.1. Note that everything except the "estimateDispersions" line is the same:
Code:
library(DESeq)
x <- read.table('example.counts')
conds <- c('A', 'A','B','B','B')
cds <- newCountDataSet(x, conds)
cds <- estimateSizeFactors(cds)

cds <- estimateDispersions(cds, sharingMode='fit-only', 
                           fitType='local', method='per-condition')

res <- nbinomTest(cds, 'A', 'B')
write.table(res, file='new.results', sep='\t', row.names=FALSE)

When I compare new.results with old.results, the baseMeanA, baseMeanB, and fold change columns are identical.
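
One way to check which columns agree, rather than eyeballing the two files, is to compare them column by column. This is only a sketch: `columns_agree` is a hypothetical helper (not part of DESeq), and the two small data frames are made-up stand-ins for old.results and new.results.

```r
# Hypothetical helper: for every numeric column present in both tables,
# report TRUE if the values match within a tolerance, FALSE otherwise.
columns_agree <- function(old, new, tol = 1e-8) {
  shared <- intersect(names(old), names(new))
  is_num <- vapply(shared,
                   function(n) is.numeric(old[[n]]) && is.numeric(new[[n]]),
                   logical(1))
  numeric_cols <- shared[is_num]
  vapply(numeric_cols,
         function(n) isTRUE(all.equal(old[[n]], new[[n]], tolerance = tol)),
         logical(1))
}

# Made-up stand-ins for the two result files (illustrative values only)
old <- data.frame(id = c('g1', 'g2', 'g3'),
                  baseMeanA = c(10, 200, 35),
                  pval = c(0.020, 0.50, 0.0010))
new <- data.frame(id = c('g1', 'g2', 'g3'),
                  baseMeanA = c(10, 200, 35),
                  pval = c(0.024, 0.46, 0.0013))

agree <- columns_agree(old, new)
print(agree)
```

With the real files, `old` and `new` would instead be read from old.results and new.results with read.table(..., header=TRUE).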

However, the pval and padj columns differ; plotting the new p-values against the old ones gives two straight-ish lines on either side of the 1:1 line (see attached PNG):

Code:
new <- read.table('new.results', header=TRUE)
old <- read.table('old.results', header=TRUE)
plot(new$pval, old$pval)
abline(0, 1, col='red')  # 1:1 reference line
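
Because a linear-scale scatter plot hides the small p-values, it may also help to summarise the discrepancy as a ratio on the log scale. A rough sketch, assuming both tables carry `id` and `pval` columns; `pval_log_ratio` is a hypothetical helper and the values are toy numbers, not real DESeq output.

```r
# Hypothetical helper: match genes by id and return log10(new / old) p-values,
# skipping rows where either p-value is missing or zero.
pval_log_ratio <- function(old, new) {
  m <- merge(old[, c('id', 'pval')], new[, c('id', 'pval')],
             by = 'id', suffixes = c('.old', '.new'))
  ok <- is.finite(m$pval.old) & is.finite(m$pval.new) &
        m$pval.old > 0 & m$pval.new > 0
  log10(m$pval.new[ok] / m$pval.old[ok])
}

# Toy values for illustration only; note the rows are in a different order
old <- data.frame(id = c('g1', 'g2', 'g3'), pval = c(0.02, 0.5, 0.001))
new <- data.frame(id = c('g3', 'g1', 'g2'), pval = c(0.001, 0.02, 0.25))

ratios <- pval_log_ratio(old, new)
print(summary(ratios))
```

A value near 0 means the two versions agree for that gene; a value of about -0.3 means the new p-value is roughly half the old one.
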
What could be causing this discrepancy? Are there other parameters to estimateDispersions that I'm missing? Has something changed in nbinomTest between versions?

-ryan
Attached image: compare-deseq-version-pvals.png (PNG, 44.2 KB)
Old 12-12-2011, 12:47 PM   #2
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

Yes, we did change 'nbinomTest'. It used to employ an approximation that was usually only a few percent off (which does not matter for p values; one is only interested in the magnitude, after all), but in a few rare cases it gave drastically wrong results. Realizing that this approximation was not really necessary anyway, we removed it. See the end of the new vignette, by the way, for a summary of this and related changes.
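
For readers wondering what an exact test of this kind can look like, the sketch below is a toy negative binomial test conditioned on the total count, in the spirit of the test described in the DESeq paper. It is not DESeq's code and says nothing about what the removed approximation was; `nb_exact_pval` and its parameters are hypothetical names, and the means and dispersion are simply supplied by hand.

```r
# Toy exact test (NOT DESeq's implementation): given counts kA and kB,
# enumerate every split (a, kS - a) of the total kS = kA + kB and sum the
# probabilities of all splits that are no more likely than the observed one.
nb_exact_pval <- function(kA, kB, muA, muB, dispersion) {
  size <- 1 / dispersion            # dnbinom's 'size' is 1 / dispersion
  kS <- kA + kB
  a <- 0:kS
  probs <- dnbinom(a, mu = muA, size = size) *
           dnbinom(kS - a, mu = muB, size = size)
  observed <- probs[kA + 1]         # a starts at 0, so shift the index by 1
  sum(probs[probs <= observed]) / sum(probs)
}

# Hand-picked inputs for illustration only
p_balanced <- nb_exact_pval(kA = 10, kB = 10, muA = 10, muB = 10, dispersion = 0.1)
p_extreme  <- nb_exact_pval(kA = 0, kB = 100, muA = 50, muB = 50, dispersion = 0.1)
print(c(balanced = p_balanced, extreme = p_extreme))
```

The evenly split counts give a p value at the top of the range and the lopsided split gives a very small one, which is the order-of-magnitude behaviour that matters here.
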
Old 12-12-2011, 01:11 PM   #3
daler
Junior Member
 
Location: DC Metro area

Join Date: Feb 2011
Posts: 8

Ah, your explanation here plus the note in the vignette (which I somehow missed) clears it up -- thanks.

Tags
deseq
