SEQanswers


alittleboy 06-26-2013 06:11 PM

DEXSeq for multi-factor design
 
I am using DEXSeq for testing differential exon usage between two conditions: control and treatment. For each condition, I have 8 biological replicates (C1-C8, and T1-T8). The design is listed below.

      condition   subject
C1    control     1
C2    control     2
C3    control     3
C4    control     4
C5    control     5
C6    control     6
C7    control     7
C8    control     8
T1    treatment   1
T2    treatment   2
T3    treatment   3
T4    treatment   4
T5    treatment   5
T6    treatment   6
T7    treatment   7
T8    treatment   8
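
For reference, the design above could be encoded in R roughly as follows. This is only a sketch: the object name `design_df` is made up for illustration, and `subject` is stored as a factor so that R treats it as a grouping variable rather than a number.

```r
# Hypothetical name: design_df. Rows are samples C1..C8 and T1..T8.
design_df <- data.frame(
  condition = factor(rep(c("control", "treatment"), each = 8)),
  subject   = factor(rep(1:8, times = 2)),
  row.names = c(paste0("C", 1:8), paste0("T", 1:8))
)

# Cross-tabulate condition against subject: in a fully paired design,
# every subject contributes exactly one sample to each condition.
pairing <- table(design_df$condition, design_df$subject)
print(pairing)
```

If every cell of the cross-tabulation equals 1, each subject appears once per condition, which is what makes this a paired design.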


As the last column shows, 8 subjects are involved in the experiment. Subject 1 contributed both a control and a treatment sample, and the same holds for every other subject. This differs from the situation discussed in the DEXSeq vignette, where, for example:

design(pasillaExons)

gives:

               condition   type
treated1fb     treated     single-read
treated2fb     treated     paired-end
treated3fb     treated     paired-end
untreated1fb   untreated   single-read
untreated2fb   untreated   single-read
untreated3fb   untreated   paired-end
untreated4fb   untreated   paired-end


I think that in the pasilla example the biological replicates are all different individuals. So in my situation, to test for differential exon usage between treatment and control, can I:

(1) ignore the fact that each subject had both control and treatment? In this case, in my implementation, shall I write:

f_dispersion = count ~ sample + condition * exon
pExons = estimateDispersions(pExons, formula=f_dispersion)
pExons = fitDispersionFunction(pExons)
# Null model
f_0 = count ~ sample + condition
# Alternative model
f_1 = count ~ sample + condition * I(exon == exonID)
pExons = testForDEU(pExons, formula0 = f_0, formula1 = f_1)


(2) incorporate the subject as a covariate (with that column coded as a factor), and then analyze in the GLM framework? In this case, in my implementation, should I write:

f_dispersion = count ~ sample + (condition + subject) * exon
# Null model
f_0 = count ~ sample + subject * exon + condition
# Alternative model
f_1 = count ~ sample + subject * exon + condition * I(exon == exonID)


(3) I am not sure if including subject as a covariate is the best approach in my situation. Are there any other options that I can consider?

(4) I wrote the formulas for the null and alternative models exactly as in the vignette, but I am not sure they are what I should use in my R implementation.

Thank you so much ;-)

dpryan 06-27-2013 02:08 AM

You'll want option (2). This happened to be discussed recently on the Bioconductor email list, so have a look at that thread.
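
A rough way to see why a subject term is estimable with this layout is to check the rank of a sample-level design matrix. The sketch below uses only base R; `design_df` is a hypothetical name, and the model `~ subject + condition` is a simplification chosen for illustration. It does not reproduce DEXSeq's per-exon model, which also contains `sample` and `exon` terms.

```r
# Same layout as the design posted above; design_df is a hypothetical name.
design_df <- data.frame(
  condition = factor(rep(c("control", "treatment"), each = 8)),
  subject   = factor(rep(1:8, times = 2))
)

# Sample-level design matrix with subject as a blocking factor.
mm <- model.matrix(~ subject + condition, data = design_df)

# Full column rank means the subject and condition effects
# are not confounded with each other in this layout.
n_cols  <- ncol(mm)
mm_rank <- qr(mm)$rank
print(c(columns = n_cols, rank = mm_rank))
```

If the rank equals the number of columns, the condition effect can be estimated after accounting for subject-to-subject differences, which is the point of a paired analysis.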

alittleboy 06-27-2013 05:08 AM

Quote:

Originally Posted by dpryan (Post 108873)
You'll want option (2). This happened to be discussed recently on the Bioconductor email list, so have a look at that thread.

Hi @dpryan:

That's a really relevant post, and it's convenient to include the subject effect in the GLM setting ;-)

Given my design table above, are the following formulas correct?

f_dispersion = count ~ sample + (condition + subject) * exon
# Null model
f_0 = count ~ sample + subject * exon + condition
# Alternative model
f_1 = count ~ sample + subject * exon + condition * I(exon == exonID)
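
One property that can be checked without any data is that the null model is nested in the alternative, which a null-versus-alternative comparison of this kind relies on. The sketch below expands both formulas symbolically with base R's `terms()`; the helper variable names (`labels_0`, `labels_1`, `is_nested`, `extra_terms`) are made up for illustration. It only inspects the formulas and says nothing about whether DEXSeq accepts them.

```r
f_0 <- count ~ sample + subject * exon + condition
f_1 <- count ~ sample + subject * exon + condition * I(exon == exonID)

# terms() expands a formula symbolically; no data are needed.
labels_0 <- attr(terms(f_0), "term.labels")
labels_1 <- attr(terms(f_1), "term.labels")

# The null model is nested if all of its terms reappear in the alternative.
is_nested <- all(labels_0 %in% labels_1)

# Terms that only the alternative model contains.
extra_terms <- setdiff(labels_1, labels_0)
print(is_nested)
print(extra_terms)
```

The terms listed in `extra_terms` are what the alternative model adds: the indicator for the exon under test and its interaction with condition.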

Thanks!

dpryan 06-27-2013 05:24 AM

By my understanding, yes. Hopefully someone else will jump in if my understanding is wrong!

alittleboy 06-27-2013 10:28 AM

Quote:

Originally Posted by dpryan (Post 108892)
By my understanding, yes. Hopefully someone else will jump in if my understanding is wrong!

Hi @dpryan:

According to this post (pretty recent!), the formulas I wrote should be correct for both the dispersion estimation and testForDEU ;-)

Thanks!

dpryan 06-27-2013 10:43 AM

Confirmation is always good :)


All times are GMT -8.