**Michael Love** · 10-13-2013, 01:41 PM

Ten genes is probably too few for some of the steps of DESeq(), for instance the estimation of the dispersion trend line.

One approach would be to estimate the dispersion using the full data set and then recycle these estimates to test the effect of genotype in the 10 genes.

Following your analysis of condition (treatment vs control):

# if rowIdx gives you the index of the 10 genes
ddsSub <- dds[rowIdx,]
design(ddsSub) <- formula(~ condition + genotype)
ddsSub <- nbinomWaldTest(ddsSub)
resSub <- results(ddsSub)

This should be a conservative approach, because the dispersion estimated using only condition should be larger than the dispersion estimated using condition + genotype (any variation explainable by genotype would be absorbed by the model instead of being counted as dispersion).
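The direction of that inequality can be seen in an ordinary linear-model analogue. The sketch below uses simulated log-scale expression and base R `lm()` rather than DESeq2's negative binomial dispersion estimates, so it is only an illustration under assumed values (the sample size, effect sizes, and variable names are all made up): leaving genotype out of the model can never reduce the residual sum of squares.

```r
# Illustration only: Gaussian analogue of the dispersion argument,
# using simulated data and assumed effect sizes (not DESeq2 output).
set.seed(1)
n <- 24
condition <- factor(rep(c("control", "treatment"), each = n / 2))
genotype  <- factor(rep(c("AA", "AB", "BB"), times = n / 3))

# Simulated log2 expression with a condition effect and a genotype effect
y <- 5 + 1.0 * (condition == "treatment") +
  0.8 * (as.integer(genotype) - 1) + rnorm(n, sd = 0.3)

fitReduced <- lm(y ~ condition)             # genotype effect left in the residuals
fitFull    <- lm(y ~ condition + genotype)  # genotype effect explained by the model

rssReduced <- sum(residuals(fitReduced)^2)
rssFull    <- sum(residuals(fitFull)^2)

# Nested least-squares fits: the reduced model's residual sum of squares
# is always at least as large as the full model's.
print(c(reduced = rssReduced, full = rssFull))
```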

Another question is which modeling approach to use for the genotype effect.

If you encode genotype as a numeric (0, 1, 2), then you are assuming that each additional copy has the same multiplicative effect: if AB doubles expression, you would expect BB to quadruple it.
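To make the arithmetic concrete: DESeq2 fits a log-linear model and reports coefficients as log2 fold changes, so a numeric genotype gets one coefficient that is added once per copy on the log2 scale. The snippet below assumes a hypothetical per-copy coefficient `beta` of 1 (a doubling); the value is illustrative, not an estimate from any data.

```r
# Hypothetical per-copy log2 fold change (assumed value for illustration)
beta <- 1

# Numeric genotype coding: number of copies of the B allele
copies <- c(AA = 0, AB = 1, BB = 2)

# On the log2 scale the effect is additive (beta * copies);
# back on the count scale it is multiplicative.
foldChange <- 2^(beta * copies)
print(foldChange)
```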

Another approach would be to encode genotype with two variables, allele1 and allele2, where having allele2 might have a different effect size than allele1. Then you would have:

design(ddsSub) <- formula(~ condition + allele1 + allele2)
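How `allele1` and `allele2` are built is not spelled out here. One possible reading, shown purely as an assumption, is a pair of 0/1 indicators: `allele1` marks samples with at least one copy of B, and `allele2` marks samples with two copies. The genotype vector and sample names below are hypothetical.

```r
# Hypothetical genotype calls for six samples
genotype <- c(s1 = "AA", s2 = "AB", s3 = "BB", s4 = "AB", s5 = "AA", s6 = "BB")

# Assumed encoding (one reading of the suggestion, not confirmed by the post):
#   allele1 = 1 if the sample has at least one B allele (AB or BB)
#   allele2 = 1 if the sample has two B alleles (BB only)
allele1 <- as.integer(genotype %in% c("AB", "BB"))
allele2 <- as.integer(genotype == "BB")

sampleInfo <- data.frame(genotype, allele1, allele2)
print(sampleInfo)

# With this coding the AB effect is the allele1 coefficient and the BB effect
# is allele1 + allele2, so BB is not forced to be twice AB on the log scale.
# In DESeq2 the columns would be added to the sample table before setting
# the design, e.g. colData(ddsSub)$allele1 <- allele1
```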

hopefully this helps,

Mike

**Michael Love** · 10-16-2013, 05:19 AM

I discussed this with Simon Anders, and he pointed out that, while conservative, this approach probably has no power to detect true differences, because in these situations the dispersion will be overestimated.

With only 10 genes, you might be left to estimate the dispersions per gene only, without shrinking towards a common mean, as you probably don't have a large enough set to get a sense of the distribution of dispersions. This would look like:

# dds is now the object limited to 10 genes
design(dds) <- formula(~ condition + genotype)
dds <- estimateSizeFactors(dds)
dds <- estimateDispersionsGeneEst(dds)
dispersions(dds) <- mcols(dds)$dispGeneEst
dds <- nbinomWaldTest(dds)

An alternative approach, if you have enough samples, would be to perform permutation tests on the relationship between gene expression and genotype. See section "6.3 Linear regression and estimation of FDR" of the Supplement of the Pickrell eQTL paper: http://www.ncbi.nlm.nih.gov/pubmed/20220758
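As a rough idea of what a permutation test looks like for one gene, the base R sketch below shuffles genotype labels across samples and compares the observed regression slope to the resulting null distribution. It is a generic permutation scheme on simulated data, not the specific procedure from the cited supplement; the sample size, effect size, and number of permutations are all assumptions.

```r
# Generic permutation test sketch on simulated data (assumed values throughout)
set.seed(42)
nSamples <- 60
genotype <- sample(0:2, nSamples, replace = TRUE)      # copies of the B allele
expr <- 6 + 0.5 * genotype + rnorm(nSamples, sd = 1)   # simulated log2 expression

# Test statistic: slope of expression on genotype
slope <- function(g, y) unname(coef(lm(y ~ g))[2])
observed <- slope(genotype, expr)

# Null distribution: break the genotype-expression link by shuffling genotype
nPerm <- 1000
nullSlopes <- replicate(nPerm, slope(sample(genotype), expr))

# Two-sided p-value, with +1 so the estimate is never exactly zero
pValue <- (sum(abs(nullSlopes) >= abs(observed)) + 1) / (nPerm + 1)
print(pValue)
```

For a set of 10 genes this would be repeated per gene, and the resulting p-values could then be adjusted for multiple testing, for example with `p.adjust()`.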

DESeq2-test a subset of genes
