I have an experiment where I am testing for the effect of two factors (age and diet) on gene expression. I have three biological replicates for each age*diet combination.
I would like to test for both main effects and an interaction term on gene expression using a negative binomial regression model, although I see that others prefer a Poisson model. I have been using Cufflinks and Cuffdiff because I like the idea of looking at differences in isoform abundance across treatments. However, as far as I can tell, the best analysis Cuffdiff offers is a pairwise comparison of FPKM between one condition and another. For a first look, I ran cuffdiff and indicated that I had three biological replicates. I entered the following to compare gene expression between diets A and B, for age A only (edited, of course, so people can see where I'm going):
> cuffdiff -p 8 -o outputdirectory ReferenceGTF -L DietA,DietB AgeADietARep1.bam,AgeADietARep2.bam,AgeADietARep3.bam AgeADietBRep1.bam,AgeADietBRep2.bam,AgeADietBRep3.bam
This gives me an output comparing, in pairwise fashion, whether Diet A has significantly different gene expression compared to Diet B, all at Age A. However, I would like to do a proper 2-factor analysis on these data.
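To make the 2-factor model I have in mind concrete, here is a minimal Python sketch of the design matrix for the 12 libraries. The column order (intercept, age_B, diet_B, age_B:diet_B) and the choice of Age A / Diet A as the baseline are my own assumptions for illustration, not anything Cuffdiff produces.

```python
from itertools import product

# Factor levels and replicate numbers, matching the library names in this post.
AGES = ["A", "B"]
DIETS = ["A", "B"]
REPS = [1, 2, 3]


def design_row(age, diet):
    """Return [intercept, age_B, diet_B, age_B:diet_B], with AgeA/DietA as baseline."""
    age_b = 1 if age == "B" else 0
    diet_b = 1 if diet == "B" else 0
    # The interaction column is the product of the two main-effect indicators.
    return [1, age_b, diet_b, age_b * diet_b]


# One row per library: 2 ages x 2 diets x 3 replicates = 12 libraries.
design = {
    f"Age{age}Diet{diet}Rep{rep}": design_row(age, diet)
    for age, diet, rep in product(AGES, DIETS, REPS)
}

for library, row in design.items():
    print(library, row)
```

The last column is what a pairwise Diet A vs Diet B comparison at a single age cannot estimate, since it needs all four age*diet groups in one model.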
So my question is the following: if I run a cuffdiff analysis that treats each library (every replicate of every age*diet combination) as its own condition by entering this:
> cuffdiff -p 8 -o outputdirectory ReferenceGTF -L AgeADietARep1,AgeADietARep2,AgeADietARep3,AgeADietBRep1,AgeADietBRep2,AgeADietBRep3,AgeBDietARep1,AgeBDietARep2,AgeBDietARep3,AgeBDietBRep1,AgeBDietBRep2,AgeBDietBRep3 AgeADietARep1.bam AgeADietARep2.bam AgeADietARep3.bam AgeADietBRep1.bam AgeADietBRep2.bam AgeADietBRep3.bam AgeBDietARep1.bam AgeBDietARep2.bam AgeBDietARep3.bam AgeBDietBRep1.bam AgeBDietBRep2.bam AgeBDietBRep3.bam
can I get an estimate of isoform abundance (FPKM) for each library from the cuffdiff output file 'genes.fpkm_tracking', and then feed those values into a downstream analysis that tests for significant effects of age, diet, or age*diet using a negative binomial regression? It appears (from the Cufflinks documentation) that the FPKMs in this 'genes.fpkm_tracking' file are normalized to account for, say, differences in library size and overdispersion. DESeq, by contrast, does not account for isoform abundance, which is the feature of Cufflinks that appeals to me.
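As a rough sketch of the reshaping step I am picturing, the Python below reads a tab-delimited fpkm_tracking table and emits one record per gene and library, with age, diet and replicate parsed from the library label. It assumes that the file has a `tracking_id` column and one `<label>_FPKM` column per library, and that labels follow the AgeXDietYRepN pattern used above. The function name `fpkm_long` and the two-library demo table are made up for illustration.

```python
import csv
import io
import re

# Assumed label pattern from this post, e.g. "AgeADietBRep2".
LABEL_RE = re.compile(
    r"^Age(?P<age>[A-Za-z0-9]+?)Diet(?P<diet>[A-Za-z0-9]+?)Rep(?P<rep>\d+)$"
)


def fpkm_long(handle):
    """Yield one dict per (gene, library) from a tab-delimited fpkm_tracking table."""
    reader = csv.DictReader(handle, delimiter="\t")
    for record in reader:
        for column, value in record.items():
            if not column.endswith("_FPKM"):
                continue  # skip annotation, confidence-interval and status columns
            match = LABEL_RE.match(column[: -len("_FPKM")])
            if match is None:
                continue  # label does not follow the assumed naming scheme
            yield {
                "gene": record["tracking_id"],
                "age": match["age"],
                "diet": match["diet"],
                "rep": int(match["rep"]),
                "fpkm": float(value),
            }


# Made-up two-library table, only to show the expected shape of the input.
demo = (
    "tracking_id\tgene_id\tAgeADietARep1_FPKM\tAgeADietARep1_status"
    "\tAgeBDietARep1_FPKM\tAgeBDietARep1_status\n"
    "geneX\tgeneX\t10.5\tOK\t21.0\tOK\n"
)
rows = list(fpkm_long(io.StringIO(demo)))
for row in rows:
    print(row)
```

This only reshapes the table into something a regression routine could take. Whether FPKMs are appropriate input for a negative binomial model is exactly the part I am unsure about.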
Thanks for any comments on this.
Vanessa