SEQanswers

Old 01-25-2017, 04:00 AM   #1
krausezuhause
Junior Member
 
Location: Netherlands

Join Date: May 2013
Posts: 5
DESeq2 LRT with multiple factors and batch effect

Dear all,
I have a question concerning a multi-factor analysis with a batch effect reflecting the day of the library preparation (2 dates). I am using the likelihood ratio test in DESeq2. My variable of interest is a continuous variable indicating how much a person is exposed. In addition, I want to control for the possible confounders sex, age and BMI.

My first question would be if the design makes sense:

Code:
dds <- DESeqDataSetFromMatrix(countData = MyCounts,
                                colData = MyData,
                                design = ~libbatch + sex + age + BMI + Exposure)

dds<-DESeq(dds,test= "LRT",full = design(dds),reduced = ~libbatch)

res<-results(dds,name = "Exposure" ,pAdjustMethod = "fdr")
Or would I need to do something like this:


Code:
dds <- DESeq(dds, test = "LRT", full = design(dds), reduced = ~libbatch + sex + age + BMI)
And the second question, concerning the results:
Why are there so many NAs among the adjusted p-values, and why are many of the adjusted p-values equal?

Quote:

res<-results(dds,name = "Expo.delta" ,pAdjustMethod = "fdr")
       baseMean     log2FoldChange  lfcSE        stat      pvalue        padj
gene1  13009.48564  -0.0005561894   0.001880162  25.89735  0.0002326616  0.01561069
gene2    163.28590  -0.0043968404   0.003520945  25.21172  0.0003119647  0.01561069
gene4     88.93107  -0.0074868961   0.006939026  26.88819  0.0001519605  0.01561069
gene5    121.15589  -0.0092059826   0.004727699  25.50741  0.0002749380  0.01561069
...    ...          ...             ...          ...       ...           ...
genex     24.22494  -0.0117729650   0.011910591  5.048386  0.5376224     NA
geney     23.03576   0.0070158920   0.010191693  5.260325  0.5108840     NA

Many thanks for your help!

Last edited by krausezuhause; 01-25-2017 at 04:09 AM.
Old 01-25-2017, 04:18 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479

Your reduced design should be "~libbatch + sex + age + BMI", though I'm curious why you explicitly want an LRT.

The NAs are probably due to independent filtering. I'd have to look up which method the "fdr" correction is using; I only ever use the standard BH method.
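Both observations can be reproduced in base R without DESeq2. The sketch below uses made-up p-values and mean counts (the names `g1` to `g8` and the cutoff of 50 are assumptions for illustration, not values from the thread or DESeq2's own threshold). It shows three things: in `stats::p.adjust`, "fdr" is an alias for "BH"; the BH step-up procedure takes a running minimum, which gives several genes the same adjusted value; and a p-value set to NA before adjustment stays NA afterwards, which is roughly what independent filtering on low mean counts does to padj.

```r
# Hypothetical p-values and mean counts for eight genes (illustration only).
pvalue   <- c(g1 = 0.00015, g2 = 0.00023, g3 = 0.00027, g4 = 0.00031,
              g5 = 0.2, g6 = 0.5, g7 = 0.6, g8 = 0.7)
baseMean <- c(g1 = 900, g2 = 150, g3 = 90, g4 = 120,
              g5 = 300, g6 = 60, g7 = 24, g8 = 23)

# "fdr" and "BH" are the same method in stats::p.adjust.
padj_fdr    <- p.adjust(pvalue, method = "fdr")
padj_bh     <- p.adjust(pvalue, method = "BH")
same_method <- isTRUE(all.equal(padj_fdr, padj_bh))

# BH scales the i-th smallest p-value by n / i and then takes a running
# minimum from the largest p-value downwards, so neighbouring genes can
# end up with an identical adjusted value.
print(padj_bh)

# Mimic independent filtering with a hand-picked cutoff (an assumption;
# DESeq2 chooses its threshold from the data): low-count genes are set
# to NA and are not counted in the multiple-testing correction.
keep          <- baseMean >= 50
p_filtered    <- ifelse(keep, pvalue, NA)
padj_filtered <- p.adjust(p_filtered, method = "BH")
print(padj_filtered)
```

In DESeq2 itself, calling `results()` with `independentFiltering = FALSE` turns the filtering off, which is one way to check whether it accounts for the NA values in padj.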
Old 01-25-2017, 04:36 AM   #3
krausezuhause
Junior Member
 
Location: Netherlands

Join Date: May 2013
Posts: 5

I did not know you could also correct for a batch effect using the Wald test.
What would the model look like then? The reduced model is ignored in a Wald test.
Old 01-25-2017, 04:41 AM   #4
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479

Your full design is the same for Wald and LRT; only the latter needs a reduced design.

Code:
dds <- DESeqDataSetFromMatrix(countData = MyCounts,
                                colData = MyData,
                                design = ~libbatch + sex + age + BMI + Exposure)
dds<-DESeq(dds)
res <- results(dds) # I think this will default to Exposure, being the last variable in the design
Old 01-25-2017, 04:44 AM   #5
krausezuhause
Junior Member
 
Location: Netherlands

Join Date: May 2013
Posts: 5

And what would be the interpretation of this model?

Quote:
dds <- DESeqDataSetFromMatrix(countData = MyCounts,
                                colData = MyData,
                                design = ~libbatch + sex + age + BMI + Exposure)

dds<-DESeq(dds,test= "LRT",full = design(dds),reduced = ~libbatch)
Does it only correct for the batch effect, while the other confounders are ignored?

Thanks for your reply!
Old 01-25-2017, 04:57 AM   #6
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479

? It's the same model; you're still correcting for the batch and accounting for changes in the confounders. You're just getting your p-value according to whether the log2FC of "Exposure" is different from 0 (Wald test) as opposed to whether the full model fits better than the reduced model (LRT).
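The distinction can be sketched for a single simulated gene in base R. This is only an analogy: a Poisson `glm` stands in for DESeq2's negative binomial GLM, and the data frame `d`, its values and the effect size are all invented for illustration. The column names mirror the design in the thread. Both tests use the same full model; the Wald test looks at the Exposure coefficient and its standard error, while the LRT compares the full fit against a fit without Exposure.

```r
set.seed(1)
n <- 40

# Hypothetical sample table with the covariates named as in the thread.
d <- data.frame(
  libbatch = factor(rep(c("day1", "day2"), each = n / 2)),
  sex      = factor(rep(c("F", "M"), times = n / 2)),
  age      = round(runif(n, 20, 60)),
  BMI      = rnorm(n, mean = 25, sd = 3),
  Exposure = runif(n, 0, 10)
)

# One simulated gene whose expected count rises slightly with Exposure.
d$count <- rpois(n, lambda = exp(4 + 0.03 * d$Exposure))

# Same full model for both tests (Poisson as a stand-in for DESeq2's NB GLM).
full <- glm(count ~ libbatch + sex + age + BMI + Exposure,
            family = poisson, data = d)

# Wald test: is the Exposure coefficient different from 0?
p_wald <- summary(full)$coefficients["Exposure", "Pr(>|z|)"]

# LRT: does the full model fit better than the model without Exposure?
reduced <- update(full, . ~ . - Exposure)
lrt     <- anova(reduced, full, test = "Chisq")
p_lrt   <- lrt[["Pr(>Chi)"]][2]

print(c(wald = p_wald, lrt = p_lrt))
```

When the reduced model drops only the one term of interest, the two p-values are usually close, which matches the point that the model is the same and only the way the p-value is obtained differs.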
Old 01-25-2017, 05:19 AM   #7
krausezuhause
Junior Member
 
Location: Netherlands

Join Date: May 2013
Posts: 5

So if I understand correctly now, with the LRT above I account for batch and confounders but cannot relate the significant genes to exposure?

I get about 300 significant hits with the LRT (reduced = ~libbatch) model, but 0 hits with the Wald test or the LRT (reduced = ~libbatch + sex + age + BMI).
Old 01-25-2017, 05:58 AM   #8
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479

Your confounders are masking any effect of exposure. Sorry your results didn't turn out better.
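One way to read the gap between roughly 300 hits and 0 hits: an LRT tests every term that is in the full formula but missing from the reduced one. With reduced = ~libbatch, that is sex, age, BMI and Exposure together, so a gene can be significant because of any of them. The base R sketch below illustrates this with one simulated gene that depends on sex only. As before, the Poisson `glm` is a stand-in for DESeq2's negative binomial GLM and all data are invented for illustration.

```r
set.seed(2)
n <- 40

# Hypothetical sample table with the covariates named as in the thread.
d <- data.frame(
  libbatch = factor(rep(c("day1", "day2"), each = n / 2)),
  sex      = factor(rep(c("F", "M"), times = n / 2)),
  age      = round(runif(n, 20, 60)),
  BMI      = rnorm(n, mean = 25, sd = 3),
  Exposure = runif(n, 0, 10)
)

# Simulated gene: the expected count depends on sex, not on Exposure.
d$count <- rpois(n, lambda = exp(4 + 0.7 * (d$sex == "M")))

full       <- glm(count ~ libbatch + sex + age + BMI + Exposure,
                  family = poisson, data = d)
batch_only <- glm(count ~ libbatch, family = poisson, data = d)
covariates <- glm(count ~ libbatch + sex + age + BMI,
                  family = poisson, data = d)

# reduced = ~libbatch: joint test of sex, age, BMI and Exposure (4 df).
joint    <- anova(batch_only, full, test = "Chisq")
df_joint <- joint[["Df"]][2]
p_joint  <- joint[["Pr(>Chi)"]][2]

# reduced = ~libbatch + sex + age + BMI: test of Exposure alone (1 df).
exposure    <- anova(covariates, full, test = "Chisq")
df_exposure <- exposure[["Df"]][2]
p_exposure  <- exposure[["Pr(>Chi)"]][2]

print(c(df_joint = df_joint, p_joint = p_joint,
        df_exposure = df_exposure, p_exposure = p_exposure))
```

The joint test flags this gene even though Exposure plays no part in how it was simulated, so hits from the batch-only reduced model cannot be attributed to exposure specifically.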
Old 01-25-2017, 06:03 AM   #9
krausezuhause
Junior Member
 
Location: Netherlands

Join Date: May 2013
Posts: 5

Many thanks for the explanation!

Tags
batch effect, deseq2, lrt, rnaseq
