**dpryan** · 11-17-2015, 06:10 AM

Ah, yes, that's correct. I always forget that it uses a "." rather than ":".

**frymor** · 11-17-2015, 08:47 AM

Sorry to be so picky, but I still have some difficulty understanding it, even if it might not seem that way.

What would be the difference between the two models:

Code:

~condition + stimulation

and

Code:

~condition + stimulation + condition:stimulation

in terms of the question answered by the model?
I know what the user guide says about interactions, but to be honest, I'm not sure what it really means.

Is there a simpler way to put it?

My attempt would be this:
The first model tries to identify the differences between the two stimulation groups while accounting for the differences between the condition groups (this is just rephrasing the user guide). Does that mean I am, in a sense, discarding all changes due to condition and concentrating only on the effect caused by the stimulation?

The second model adds an interaction. According to the user guide, this means I am trying to calculate the differences in one factor (here, condition) depending on the second factor (here, stimulation)?
Does this mean that I'm trying to see the differences between WT and KO in my samples depending on the stimulation?

This doesn't sound very simplified, but maybe someone can do it better.

**dpryan** · 11-18-2015, 12:46 AM

No worries about being picky, this is important to get correct.

This boils down to, "so what's an interaction anyway?". Let's take your experiment as the example and consider the design:

Code:

~condition + stimulation

Let's just consider one example gene, called foo. Suppose in the WT unstimulated case, foo has a value of 1. Upon stimulation, that increases to 2. If you don't stimulate but instead knock out some other gene, the value instead goes to 3. That covers 3 of the 4 groups. What this model says is that in the case of both stimulating AND knocking out some other gene, we expect the resulting value to be 2*3=6 (the fold changes multiply).
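To make that arithmetic concrete, here is a minimal base-R sketch. It is not DESeq2 itself (which fits a negative binomial GLM to counts); it only assumes the toy values for foo given above and works on the log2 scale, where multiplicative fold changes become additive coefficients. The object names (`groups`, `beta`, `predicted`) are made up for illustration.

```r
# The four groups of the 2x2 design; the first level of each factor is the reference.
groups <- data.frame(
  condition   = factor(c("WT", "WT", "KO", "KO"), levels = c("WT", "KO")),
  stimulation = factor(c("no", "yes", "no", "yes"), levels = c("no", "yes"))
)

# Design matrix for the model without an interaction term.
X <- model.matrix(~ condition + stimulation, data = groups)

# Toy log2 coefficients for foo: baseline 1, knockout 3x, stimulation 2x.
beta <- c("(Intercept)" = log2(1), conditionKO = log2(3), stimulationyes = log2(2))

# Back on the original scale, the two fold changes multiply in the KO + stimulated group.
predicted <- as.vector(2^(X %*% beta[colnames(X)]))
names(predicted) <- paste(groups$condition, groups$stimulation, sep = "_")
print(predicted)
```

With these toy numbers the KO + stimulated prediction comes out at 6 (up to floating-point rounding), which is exactly the "no interaction" expectation.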

But suppose the effect due to knocking out some gene is dependent upon stimulation (e.g., the stimulation activates a pathway that we've partly inactivated due to the knockout). We then would have what's referred to as an interaction between the stimulation and the condition. So we would no longer expect the knockout and stimulated group to have a value of 6, but something else entirely. The resulting interaction coefficient tells you the fold change from what you would expect if the knockout and stimulation effect don't interfere with each other (or have a synergistic effect). It could well be the case that instead of this group having a value of 6 we instead see no change from the baseline WT unstimulated group, so the value is 1, meaning that the fit coefficient is 1/6. Interactions always describe the additional change on top of what would be expected if your various conditions acted independently.
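The 1/6 can be recovered the same way. The following is a rough base-R sketch under the same toy assumptions (group values 1, 2, 3 and an observed 1 in the KO + stimulated group); it solves the four-group system exactly on the log2 scale rather than running DESeq2, and the variable names are hypothetical. Note that base R names the interaction column with ":", whereas the DESeq2 coefficient names in this thread use ".".

```r
groups <- data.frame(
  condition   = factor(c("WT", "WT", "KO", "KO"), levels = c("WT", "KO")),
  stimulation = factor(c("no", "yes", "no", "yes"), levels = c("no", "yes"))
)

# Design matrix including the interaction term.
X <- model.matrix(~ condition + stimulation + condition:stimulation, data = groups)

# Toy values for foo: the KO + stimulated group is back at baseline instead of 6.
observed <- c(WT_no = 1, WT_yes = 2, KO_no = 3, KO_yes = 1)

# Four groups and four coefficients, so the system can be solved exactly.
beta <- setNames(solve(X, log2(observed)), colnames(X))

# Convert the log2 coefficients back to fold changes.
fold_changes <- 2^beta
print(fold_changes)
```

The last entry, `conditionKO:stimulationyes`, is the interaction fold change: observed (1) divided by the expectation without interaction (6).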

Hope that helps.

**frymor** · 11-18-2015, 01:25 AM

Yes, it helps a lot. I think I'm starting to understand it.

I have another question about the colData. Must the order of the samples there be the same as in the count table?
I am asking because I did the analysis mentioned above and found no DE genes. When I then changed the colData file to this:

Code:

name	condition	stimulation
Vav_KO_1	KO	no
Vav_KO_2	KO	no
Vav_KO_5	KO	no
Vav_KO_2_C	KO	yes
Vav_KO_4_C	KO	yes
Vav_KO_5_C	KO	yes
Vav_WT_1	wildtype	no
Vav_WT_2	wildtype	no
Vav_WT_4	wildtype	no
Vav_WT_4_C	wildtype	yes
Vav_WT_1_C	wildtype	yes
Vav_WT_2_C	wildtype	yes

I suddenly get 106 genes with an adjp<=0.1.

The sample names were not changed and are identical to the column names of the count table.
Now I also get a different last comparison possibility:

Code:

> resultsNames(dds)
[1] "Intercept"                        "condition_wildtype_vs_KO"        
[3] "stimulation_yes_vs_no"            "conditionwildtype.stimulationyes"

compared to "conditionwildtype.stimulationnone" from before.
Is there an explanation for this kind of change?

Thanks in advance

**dpryan** · 11-18-2015, 02:14 AM

That's...weird. I don't have a good explanation for that. Well, actually I don't know why you got "Curdlan" as part of a coefficient name before given that it wasn't in the original colData. The coefficient names at least make sense this time. Given that, my only guess is that something went wonky previously.
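One thing worth ruling out in a case like this is a mismatch between the row order of the colData and the column order of the count table. The DESeq2 documentation states that the two must be in the same order, and DESeq2 does not reorder them for you. Below is a minimal base-R sketch of such a check; the counts are made up and the object names `cts` and `coldata` are only placeholders.

```r
# Made-up counts for two genes (rows) and three samples (columns).
cts <- matrix(c(10, 20, 30, 5, 15, 25), nrow = 2, byrow = TRUE,
              dimnames = list(c("geneA", "geneB"),
                              c("Vav_WT_1", "Vav_KO_1", "Vav_WT_1_C")))

# Sample table listed in a different order than the count columns.
coldata <- data.frame(
  condition   = c("KO", "wildtype", "wildtype"),
  stimulation = c("no", "no", "yes"),
  row.names   = c("Vav_KO_1", "Vav_WT_1", "Vav_WT_1_C")
)

# Same set of samples, but are they in the same order?
same_set   <- setequal(rownames(coldata), colnames(cts))
same_order <- identical(rownames(coldata), colnames(cts))

# Reorder the sample table so its rows follow the count columns.
coldata <- coldata[colnames(cts), , drop = FALSE]
stopifnot(identical(rownames(coldata), colnames(cts)))
```

If `same_set` is TRUE but `same_order` is FALSE, the samples were being paired with the wrong condition and stimulation labels before the reordering step.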

**frymor** · 11-22-2015, 11:32 PM

Originally posted by dpryan:

No worries about being picky, this is important to get correct

But suppose the effect due to knocking out some gene is dependent upon stimulation (e.g., the stimulation activates a pathway that we've partly inactivated due to the knockout). We then would have what's referred to as an interaction between the stimulation and the condition. So we would no longer expect the knockout and stimulated group to have a value of 6, but something else entirely. The resulting interaction coefficient tells you the fold change from what you would expect if the knockout and stimulation effect don't interfere with each other (or have a synergistic effect). It could well be the case that instead of this group having a value of 6 we instead see no change from the baseline WT unstimulated group, so the value is 1, meaning that the fit coefficient is 1/6. Interactions always describe the additional change on top of what would be expected if your various conditions acted independently.

Hi Devon,

You mentioned that being picky can be a virtue, so I'm being picky again.

I was wondering about my first question, the comparison between the knock-out and the wildtype. I wanted to find the deregulated genes independently of the stimulation effect. What I did was a Wald test on the contrast KO vs. wildtype. But with this comparison, I have taken both the stimulated and the unstimulated samples in one go.
Does it make sense to do it this way, or is it better to take only the unstimulated samples?

I think that for the genes affected by the stimulation, the unstimulated and stimulated samples might cancel each other out, but it is a risk.
Taking all samples, though, gives me higher statistical power (six instead of only three samples per group).

But this might also give me genes changing due to the combination of knock-out and stimulation, which I don't want to have in the first question.

Which do you think is the better way of doing this analysis?

thanks
Assa

**dpryan** · 11-23-2015, 02:09 AM

Yes, you'll get slightly better results taking everything at once and using the contrast like you did. The reason for this is simply due to the additional samples being used in dispersion estimation. If you include the interaction term in the design then you won't have that affecting things.
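To see what the knockout coefficient represents under each design, here is a small illustration using ordinary least squares on the log2 toy values for foo from earlier in the thread. This is only an analogy: DESeq2 fits a negative binomial GLM to counts, and the factor levels here are set so that WT is the reference, whereas the fit in this thread had KO as the reference. All object names are made up.

```r
# One toy log2 value per group (original values 1, 2, 3, 1).
groups <- data.frame(
  condition   = factor(c("WT", "WT", "KO", "KO"), levels = c("WT", "KO")),
  stimulation = factor(c("no", "yes", "no", "yes"), levels = c("no", "yes")),
  log2_value  = log2(c(1, 2, 3, 1))
)

# Design without and with the interaction term.
fit_additive    <- lm(log2_value ~ condition + stimulation, data = groups)
fit_interaction <- lm(log2_value ~ condition * stimulation, data = groups)

# Without the interaction, the KO effect is averaged over both stimulation levels.
ko_additive <- coef(fit_additive)[["conditionKO"]]

# With the interaction, the KO coefficient is the KO-vs-WT effect in unstimulated samples only.
ko_interaction <- coef(fit_interaction)[["conditionKO"]]

print(c(additive = ko_additive, interaction = ko_interaction))
```

In the interaction design the `conditionKO` coefficient equals log2(3), the knockout effect in the unstimulated samples, while all samples still contribute to the fit.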

**EVELE** · 02-18-2019, 03:20 PM

Hi,
Sorry for resurrecting such an old thread, but I am quite confused.
I was thinking of doing a time-course analysis (as Assa did) with only 1 condition, 4 time points and 3 replicates for each time point. But what Ryan says here, "not to use replicates as a factor", contradicts the answer that Michael Love gave at this link 4 months ago:

Deseq2 design for one condition over 8 different time point

https://support.bioconductor.org/p/113630/

So should I include replicates as a factor in my model, or is it better to leave them out?

Thanks for your answer in advance,
Eva
