**Michael Love** · 10-13-2013, 01:41 PM

Ten genes are probably too few for some of the steps of DESeq(), for instance the estimation of the dispersion trend line.

One approach would be to estimate the dispersion using the full data set and then recycle these estimates to test the effect of genotype in the 10 genes.

Following your analysis of condition (treatment vs. control):

# if rowIdx gives you the index of the 10 genes
ddsSub <- dds[rowIdx,]
design(ddsSub) <- formula(~ condition + genotype)
ddsSub <- nbinomWaldTest(ddsSub)
resSub <- results(ddsSub)

This should be a conservative approach, because the estimate of dispersion using only condition should be larger than an estimation using condition + genotype (because any variation explainable by genotype would be subtracted).

Another question is which modeling approach to use for the genotype effect.

If you encode genotype as a numeric (0, 1, 2), then you are assuming that if AB doubles expression, BB quadruples it.
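To spell out why a numeric coding implies this, here is a small sketch of the log-linear model. The symbols (q for the expected normalized count, g for the number of B alleles, the beta coefficients) are my own notation for illustration, not anything taken from the DESeq2 output:

```latex
\log_2 q = \beta_0 + \beta_{\text{cond}}\, x_{\text{cond}} + \beta_g\, g,
\qquad g \in \{0, 1, 2\}

\frac{q(g)}{q(0)} = 2^{\beta_g g}
\quad\Longrightarrow\quad
\frac{q(\text{AB})}{q(\text{AA})} = 2^{\beta_g},
\qquad
\frac{q(\text{BB})}{q(\text{AA})} = 2^{2\beta_g} = \left(2^{\beta_g}\right)^2
```

So with beta_g = 1, AB doubles expression relative to AA and BB quadruples it. Each additional copy of the allele multiplies expression by the same factor.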

Another approach would be to encode genotype with two variables, allele1 and allele2, where having allele2 might have a different effect size than allele1. Then you would have:

design(ddsSub) <- formula(~ condition + allele1 + allele2)
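As a rough sketch of how such columns could be built, here is one possible reading of the two variables: allele1 flags carrying at least one copy of B, and allele2 flags carrying a second copy. That coding, the genotype labels AA/AB/BB, and the example vector are all assumptions for illustration, in base R only:

```r
# Hypothetical genotype calls for six samples (illustrative data only)
genotype <- c("AA", "AB", "BB", "AB", "AA", "BB")

# Count copies of the B allele: AA = 0, AB = 1, BB = 2
nB <- unname(c(AA = 0, AB = 1, BB = 2)[genotype])

# Assumed coding: allele1 = at least one B copy, allele2 = two B copies
allele1 <- as.integer(nB >= 1)
allele2 <- as.integer(nB >= 2)

print(data.frame(genotype, nB, allele1, allele2))

# These vectors would then need to be added as columns of the sample
# table, e.g. colData(ddsSub)$allele1 <- allele1, before setting the design.
```

With this coding the coefficient for allele1 is the effect of the first copy and the coefficient for allele2 is the additional effect of the second copy, so the two are free to differ.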

hopefully this helps,

Mike

**Michael Love** · 10-16-2013, 05:19 AM

I discussed this with Simon Anders, who pointed out that, while conservative, this approach probably has no power to detect true differences, because in these situations the dispersion will be overestimated.

With only 10 genes, you might be left to estimate the dispersions per gene only, without shrinking towards a common mean, as you probably don't have a large enough set to get a sense of the distribution of dispersions. This would look like:

# dds is now the object limited to 10 genes
design(dds) <- formula(~ condition + genotype)
dds <- estimateSizeFactors(dds)
dds <- estimateDispersionsGeneEst(dds)
dispersions(dds) <- mcols(dds)$dispGeneEst
dds <- nbinomWaldTest(dds)

An alternative approach, if you have enough samples, would be to perform permutation tests on the relationship between gene expression and genotype. See section "6.3 Linear regression and estimation of FDR" of the Supplement of the Pickrell eQTL paper: http://www.ncbi.nlm.nih.gov/pubmed/20220758
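To give an idea of what a permutation test looks like for a single gene, here is a generic base-R sketch. It is not the procedure from the Pickrell supplement; the simulated data, the sample size, the choice of absolute correlation as the test statistic, and the number of permutations are all assumptions for illustration:

```r
set.seed(1)

# Simulated data for one gene (illustrative only):
# genotype coded as 0/1/2 copies of an allele, expression on a log-like scale
n <- 30
genotype <- sample(0:2, n, replace = TRUE)
expr <- rnorm(n) + 0.8 * genotype

# Observed statistic: absolute correlation between expression and genotype
obs <- abs(cor(expr, genotype))

# Null distribution: shuffle genotype labels across samples and recompute
nPerm <- 1000
permStat <- replicate(nPerm, abs(cor(expr, sample(genotype))))

# Permutation p-value, with +1 in numerator and denominator so it is never 0
pval <- (1 + sum(permStat >= obs)) / (1 + nPerm)
print(pval)
```

With real data you would replace expr with normalized (and probably transformed) counts for each gene, and with a second covariate such as condition you would want to permute within condition groups rather than across all samples.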

DESeq2-test a subset of genes

Comment

Comment

Latest Articles

ad_right_rmr

News