
**Michael Love** · 02-02-2015, 02:10 PM

Your initial design doesn't include season in either the full or the reduced model, and that term is pretty important (in our rnaseqGene example it is called 'time'). The season term keeps track of differences across seasons, while the interaction term keeps track of seasonal differences that are specific to individual standTypes.

re: the distinct sites, there is a useful trick for specifying such a design (originally from the edgeR user guide). I posted this on the Bioc support site here: https://support.bioconductor.org/p/62357/#62368 Take a look at that post to get the idea. Your model should take into account the 18 sites, by including a new variable 'nested.site' which keeps track of the 6 sites within each standType. This will be a column of numbers 1-6 identifying the different sites within each standType, and tracking these across all seasons.

You can then use a design (corrected):

Code:

~ season + standType + standType:nested.site + season:standType

The reduced model would remove only season:standType, in order to find genes where there are standType-specific differences across seasons.
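As a rough illustration of the full versus reduced comparison described above, the sketch below builds both model matrices in base R and lists the coefficients that the likelihood ratio test would evaluate. The sample table is hypothetical (3 stand types, 6 sites each, 5 seasons, placeholder level names), and the commented DESeq2 calls assume a count matrix `cts` that is not shown here.

```r
# Hypothetical sample table: 3 stand types x 6 sites each x 5 seasons = 90 samples.
# Level names are placeholders, not the poster's real data.
coldata <- expand.grid(
  nested.site = factor(paste0("site_", 1:6)),
  standType   = factor(c("beech", "oak", "spruce")),
  season      = factor(paste0("season", 1:5))
)

full    <- ~ season + standType + standType:nested.site + season:standType
reduced <- ~ season + standType + standType:nested.site

mm_full    <- model.matrix(full, coldata)
mm_reduced <- model.matrix(reduced, coldata)

# Coefficients present only in the full model: the season:standType terms
lrt_terms <- setdiff(colnames(mm_full), colnames(mm_reduced))
print(lrt_terms)

# With a count matrix `cts` (assumed, not shown), the test would be run roughly as:
# dds <- DESeq2::DESeqDataSetFromMatrix(cts, coldata, design = full)
# dds <- DESeq2::DESeq(dds, test = "LRT", reduced = reduced)
# res <- DESeq2::results(dds)
```

Genes with a small LRT p-value are those for which the season:standType coefficients improve the fit, i.e. the seasonal pattern differs between stand types.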

**Beebola** · 02-05-2015, 09:23 AM

Thanks for the reply and advice on model design!

I didn’t originally include season in my model because I did not set out to test specifically for a seasonal effect, and the communities group strongly by standType. I was thinking that leaving out the term would just mean that the variation caused by season would be attributed to the unexplained variation, but maybe this is not fully correct.

I ran the models as described above, using a data column with siteID (1-18) to supply the nested.site information, but ran into this error:

litDEnf_seas1 <- DESeqDataSet(litternDEf, ~ standType + season + season:siteID + season:standType)

Error in DESeqDataSet(litternDEf, ~standType + season + season:siteID + season:standType) :
the model matrix is not full rank, so the model cannot be fit as specified.
one or more variables or interaction terms in the design formula
are linear combinations of the others and must be removed

Shoko also notes this issue here: https://support.bioconductor.org/p/63134/ where it seemed to be a result of missing samples.

Quote: "You are encountering this error, because you have missing samples dispersed in the cross of treatment and time"

I cannot see any missing sample/treatment combinations but it is expected that a species may not be present in all of the 90 samples (or in this model, in all 18 sites or in all 5 seasons). Would species absences at a given site in a given season cause the error? The error only occurs when I add the interaction term season:siteID into the model.
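One way to separate the two possible causes raised here (missing cells versus the design itself) is to tabulate the sample layout and compare the rank of the model matrix with its number of columns. The sketch below uses a hypothetical balanced layout with placeholder level names, not the attached data, and it checks only the sample table, since the model matrix does not depend on the counts.

```r
# Hypothetical balanced layout: 18 sites (6 per stand type) sampled in 5 seasons.
coldata <- data.frame(
  siteID    = factor(rep(1:18, times = 5)),
  standType = factor(rep(rep(c("beech", "oak", "spruce"), each = 6), times = 5)),
  season    = factor(rep(paste0("season", 1:5), each = 18))
)

# Every season x site cell holds exactly one sample, so nothing is missing
cells <- table(coldata$season, coldata$siteID)
all_cells_filled <- all(cells == 1)

# Yet the design from the post still has more columns than it has rank
mm <- model.matrix(~ standType + season + season:siteID + season:standType, coldata)
n_cols  <- ncol(mm)
mm_rank <- qr(mm)$rank
cat("samples:", nrow(mm), " columns:", n_cols, " rank:", mm_rank, "\n")
```

In this toy layout the rank falls short of the number of columns even though no cell is empty, which suggests the error can come from the season:siteID term alone rather than from absent samples or absent species.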

Thanks in advance

**Michael Love** · 02-06-2015, 07:54 AM

Can you show the full column data, including the new siteID columns?

**Beebola** · 02-06-2015, 08:24 AM

Hi,
the data are attached in a .txt file.

Thanks,

Barbara

Attached Files

DESeq_litter_samdata.txt (5.0 KB, 41 views)

**Michael Love** · 02-06-2015, 08:37 AM

So you don't have the same problem as Shoko.

Go back and check the Bioc support link I posted.

You need to create a new column, nested.site, which should only be values "site_1" to "site_6". These keep track of the different sites within each stand type.

Here is a small example of how it should look:

spruce site_1 fall1
spruce site_2 fall1
spruce site_3 fall1
...
beech site_1 fall1
beech site_2 fall1
beech site_3 fall1
...
oak site_1 fall1
oak site_2 fall1
oak site_3 fall1

It's not a problem that spruce site_1 is not the same site as oak site_1, because we use an interaction term standType:nested.site, which produces unique coefficients in the model for each combination of stand and site.
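A minimal sketch of how such a column could be derived from a 1-18 site numbering, assuming six sites per stand type and placeholder level names (the data frame here is made up for illustration). It renumbers the sites within each stand type and then checks that the resulting design matrix is full rank.

```r
# Hypothetical column data: sites numbered 1-18, six per stand type, five seasons.
coldata <- data.frame(
  siteID    = rep(1:18, times = 5),
  standType = factor(rep(rep(c("beech", "oak", "spruce"), each = 6), times = 5)),
  season    = factor(rep(paste0("season", 1:5), each = 18))
)

# Renumber sites 1..6 within each stand type
site_rank <- ave(coldata$siteID, coldata$standType,
                 FUN = function(x) match(x, sort(unique(x))))
coldata$nested.site <- factor(paste0("site_", site_rank))

design <- ~ season + standType + standType:nested.site + season:standType
mm <- model.matrix(design, coldata)

# Full rank means the rank equals the number of columns
is_full_rank <- qr(mm)$rank == ncol(mm)
print(is_full_rank)
```

Because nested.site never enters the formula as a main effect, R expands standType:nested.site into separate site coefficients for each stand type, which is why reusing the labels site_1 to site_6 across stand types does not conflate different sites.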

**Beebola** · 02-06-2015, 09:15 AM

Okay, I see the issue. I just ran the model with the new column data and it ran without error. Thanks for the help and the prompt reply!

**lbragg** · 07-13-2015, 04:26 PM

In reply to Michael Love's post of 02-06-2015, 08:37 AM (creating the nested.site column):

I have a longitudinal study where individuals are nested within diet, and also there may be a batch processing effect (samples processed in two batches).

Would I make two of these new id columns, one for individuals within diet, and one for individuals within batch?

Thanks!

DESeq2 Analysis of Time Series data and model comparison
