**Simon Anders** · 10-26-2011, 07:58 AM

Sometimes, the two conditions have unequal variance (for example, knock-down samples might differ more strongly from each other than untreated control samples do, because knock-down efficiency is so hard to keep constant), and then "per-condition" can give more power. This is why it was the default. However, I realized recently that our way of avoiding outliers (see the discussion of 'sharingMode="maximum"' in the vignette) does not work as reliably as I had hoped when using "per-condition" estimation. This is why I changed the default to "pooled" and added a note about this to the help page. I have some ideas on how to improve this, but pending that, I recommend "pooled".
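To illustrate the difference between the two modes, here is a minimal sketch in base R. It is not DESeq's actual estimator (DESeq works with dispersions and shares information across genes); it only shows, for one gene with made-up normalized counts, how a per-condition variance differs from a pooled one. All variable names and numbers are hypothetical.

```r
# Hypothetical normalized counts for one gene, three replicates per condition.
control   <- c(100, 104, 96)   # replicates agree closely
knockdown <- c(40, 75, 20)     # knock-down efficiency varies between samples

# Per-condition: each condition gets its own variance estimate.
var_control   <- var(control)
var_knockdown <- var(knockdown)

# Pooled: one estimate built from the within-condition deviations of both
# conditions, weighted by their degrees of freedom.
n <- c(length(control), length(knockdown))
var_pooled <- ((n[1] - 1) * var_control + (n[2] - 1) * var_knockdown) / (sum(n) - 2)

print(c(control = var_control, knockdown = var_knockdown, pooled = var_pooled))
```

In this toy case the pooled value lies between the two per-condition values, so it is larger than what the well-behaved condition would get on its own and smaller than what the noisy condition would get.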

For fitType, both ways should give good results, and so far, the choice does not seem to make much of a difference. If you plot the dispersions against the means, as shown in the vignette, you can see which of the two fit types gives a fit that seems to follow the data more closely.
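As a rough sketch of this kind of comparison, the following base R code fits two curves to simulated per-gene dispersion estimates and overlays them on a dispersion-versus-mean plot. It does not call DESeq: the data are simulated, the gamma-family GLM of the form a0 + a1/mean is only meant to resemble the "parametric" fit, and `loess` is used as a stand-in for the local regression behind the "local" fit. All names and numbers are assumptions made for illustration.

```r
set.seed(1)

# Simulated per-gene means and noisy per-gene dispersion estimates (hypothetical).
n_genes <- 500
mu   <- 10^runif(n_genes, 0.5, 4)
disp <- (0.05 + 2 / mu) * rgamma(n_genes, shape = 4, rate = 4)

# Parametric-style fit: dispersion = a0 + a1 / mean, gamma-family GLM.
fit_par <- glm(disp ~ I(1 / mu), family = Gamma(link = "identity"),
               start = c(0.1, 1))

# Local-style fit: smooth log(dispersion) against log(mean).
fit_loc <- loess(log(disp) ~ log(mu))

pred_par <- fitted(fit_par)
pred_loc <- exp(fitted(fit_loc))

# Dispersion against mean on log-log axes, with both fitted curves.
o <- order(mu)
plot(mu, disp, log = "xy", pch = ".",
     xlab = "mean of normalized counts", ylab = "dispersion")
lines(mu[o], pred_par[o], col = "red")
lines(mu[o], pred_loc[o], col = "blue")
legend("topright", legend = c("parametric-style", "local-style"),
       col = c("red", "blue"), lty = 1)

# A crude numeric summary of how closely each curve follows the points.
rms_log_resid <- c(
  parametric = sqrt(mean((log(disp) - log(pred_par))^2)),
  local      = sqrt(mean((log(disp) - log(pred_loc))^2))
)
print(rms_log_resid)
```

With real data, the per-gene estimates and the fitted curve would come from the CountDataSet rather than from a simulation; the visual check is the same.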

**horizon** · 10-26-2011, 09:05 AM

Hi Simon,
thanks a lot for your answer.

However, I still don't understand how a single pooled empirical dispersion value ("pooled"), as opposed to a separate empirical dispersion value for each condition with biol. replicates ("per-condition"), is used in the subsequent calculation. Knowing this would help me understand in which cases I should expect more or fewer diffex genes.
In my case, with two different examples (each with 3 biol. repl. per condition), the pooled option reduced the number of diffex genes. Is this what you would have expected?

I'm sorry if this is already answered in the thread you mentioned (see the discussion of 'sharingMode="maximum"' in the vignette), which I unfortunately couldn't find (would be great if you could post the link).

Thanks lots!

**horizon** · 10-26-2011, 09:29 AM

Additionally, I have attached the "funnel" plots of the diffex results, with the respective number of identified genes, for two treatments with 3 biol. repl. each, using two different parameter settings for the estimateDispersions function:

cds.1 <- estimateDispersions( cds.1, sharingMode="maximum", method="per-condition", fitType="local" ); s="max"; m="per-cond"; f="local"
--> # diffex genes: 415

cds.1 <- estimateDispersions( cds.1, sharingMode="maximum", method="pooled", fitType="local" ); s="max"; m="pool"; f="local"
--> # diffex genes: 214

Is it to be expected that a lot of genes with "high" log2FCs and "high" mean expression are not identified as significant?!
Do these plots look "normal" to you?!

Thanks a lot!

**Simon Anders** · 10-26-2011, 10:28 AM

Originally posted by horizon

I'm sorry if this is already answered in the thread you mentioned (see the discussion of 'sharingMode="maximum"' in the vignette), which I unfortunately couldn't find (would be great if you could post the link).

I mean the vignette, not a thread. See pages 4 to 6 here.

Originally posted by horizon

Is it to be expected that a lot of genes with "high" log2FCs and "high" mean expression are not identified as significant?!
Do these plots look "normal" to you?!

Use the 'identify' function of R to get the gene IDs of some of those black points with high mean and high log FC, and then look at the individual normalized counts. I expect that you will find that they vary a lot from replicate to replicate, and this is why DESeq (at least the new version) does not call them as differentially expressed.
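A minimal, non-interactive sketch of this check in base R follows. The count matrix is made up; in a real session the normalized counts would come from the CountDataSet (for example via `counts(cds, normalized=TRUE)`), and the gene IDs from clicking points with `identify`. The gene names, sample names, and numbers below are hypothetical.

```r
# Hypothetical normalized counts: rows are genes, columns are replicates
# of two conditions A and B.
norm_counts <- matrix(
  c(900, 120, 150, 1500,  200, 4000,    # geneA: large fold change, noisy
    510, 490, 500, 2050, 1980, 2010),   # geneB: similar fold change, tight
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"),
                  c("A1", "A2", "A3", "B1", "B2", "B3"))
)

is_A <- grepl("^A", colnames(norm_counts))

# Coefficient of variation (sd / mean) per gene within one condition.
cv <- function(m) apply(m, 1, sd) / rowMeans(m)

per_gene <- data.frame(
  log2FC = log2(rowMeans(norm_counts[, !is_A]) / rowMeans(norm_counts[, is_A])),
  cv_A   = cv(norm_counts[, is_A]),
  cv_B   = cv(norm_counts[, !is_A])
)

print(norm_counts)
print(round(per_gene, 2))
```

Both genes show a log2 fold change above 2, but geneA's replicates scatter widely within each condition while geneB's agree, which is the pattern to look for in the points that are not called significant.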

**aggp11** · 03-29-2012, 11:08 AM

Simon,

I have a similar question. I have miRNA data and am looking for differentially expressed miRNAs, and I get many more DE miRNAs with the previous version of DESeq than with the newer version.

I would like to understand whether the newer version could be more conservative than the older version for data with a lower number of reads.

FYI, the size factors for our datasets are:

u_1 u_2 s_1 s_2
1.4265463 1.0675662 0.6081645 1.1458061

where u and s are the conditions and 1 and 2 are the replicates.

Thanks,
Praful

DESeq 1.5.30 - estimateDispersions