Seqanswers Leaderboard Ad

**gamb** · 10-27-2014, 02:37 PM

Dang it! Good catch, my columns did not match up.

When I get another error I'll screenshot it and post it here. They seem to pop-up at will. I have a normal release of R and the errors occur at my lab computer + personal computer.

**Michael Love** · 10-27-2014, 03:15 PM

A safer way to reorder columns would be to build the DESeqDataSet with a design of ~ 1, then rearrange the columns of the DESeqDataSet, then add a new column anim.nested, then add the desired design. What do you mean "at will"? After running a particular function? At what points in the analysis to you get these?

**gamb** · 10-27-2014, 03:24 PM

What do you mean "at will"? After running a particular function? At what points in the analysis to you get these?

It's been pretty random when they happen- usually when I'm not running code (often just editing script files prior to running them) so I haven't been able to track down what particular thing is triggering them. All I do know is that they happen in sessions when I've loaded those packages. I think it might be tied into the phyloseq data structures.

**Michael Love** · 10-27-2014, 03:30 PM

Yeah, that won't be from DESeq2 then. And maybe ask the phyloseq maintainers if they have seen this on their github site, although their data structures are data.frame and matrix so I wouldn't know why there would be a problem there. If you are loading objects from memory at the beginning those could be corrupted.

**gamb** · 10-27-2014, 04:33 PM

So I got R to throw up this error (not a segfault...these don't happen too often) right after resultsNames(dds)

Code:

2014-10-27 19:31:47.892 R[5511:707] *** RController: caught ObjC exception while processing system events. Update to the latest GUI version and consider reporting this properly (see FAQ) if it persists and is not known. 
*** reason: *** -[__NSArrayM objectAtIndex:]: index 18446744073709551615 beyond bounds [0 .. 85]
*** name: NSRangeException, info: (null)
*** Version: R 3.1.1 () R.app R 3.1.1 GUI 1.65 Mavericks build(null)
Consider saving your work soon in case this develops into a problem.

**Michael Love** · 10-28-2014, 04:16 AM

I'm not sure about this, but I would recommend following the message and updating R to the latest GUI version.

**gamb** · 10-28-2014, 06:50 AM

Thats the thing... I am running the latest GUI

**Michael Love** · 10-28-2014, 08:55 AM

It might be a problem in .RData, which is loaded every time. Try deleting this file. (although doesn't explain why you get it on two separate computers)

**gtosser** · 12-03-2014, 01:46 PM

Originally posted by Michael Love View Post

hi gamb,

Firstly, as you are just starting to analyze, best to update to Bioconductor 3.0 and DESeq2 version 1.6 which was just released.

As the analysis is a bit complicated, I'd recommend using DESeq(dds, betaPrior=FALSE) as this simplifies the building of contrasts with interactions.

The design should be

Code:

~ tissue*chem1*chem2

...because you're interested in both the interactions of chem1 and chem2, and in the specific effects for each tissue.

The list of effects is given by

Code:

resultsNames(dds)

You can build up the contrasts using a list like below. I'll give some examples and see if you can extend to the others. Comparing the tissues alone is simple by specifying the name of the factor and the two levels to compare, e.g.:

Code:

results(dds, contrast=c("tissue","C","B"))

Then some basics about interaction terms: the effect of chem 1 in tissue B will be:

Code:

results(dds, contrast=list(c("chem1","chem1.tissueB")))

...because you need to add the main chem 1 effect (the effect for tissue A, the base level of tissue) and the extra chem 1 effect for tissue B.

If you wanted to test if chem 1 has an effect in tissue B different than tissue A you would only test the interaction term "chem1.tissueB", because this is that difference.

Tissue A is the base level, so the effect of chem 1 in tissue A is simply

Code:

results(dds, name="chem1")

The effect of chem 1 and chem 2 in tissue B is a bit lengthy, you need to add main effects and all relevant interaction terms:

Code:

results(dds, contrast=list(c("chem1","chem2","chem1.tissueB","chem2.tissueB","chem1.chem2","chem1.chem2.tissueB")))

It's simpler for tissue A because this is the base level:

Code:

results(dds, contrast=list(c("chem1","chem2","chem1.chem2")))

And if you want to test only the interaction, answering if the effect of the two chems is different than each separately, you remove some of these effects. For tissue B this contrast would be:

Code:

results(dds, contrast=list(c("chem1.chem2","chem1.chem2.tissueB")))

For tissue A it is again simpler:

Code:

results(dds, name="chem1.chem2")

Note that another way to compare individual groups would be to create a new variable which describes all groups:

Code:

dds$group = factor(paste(dds$chem1, dds$chem2, dds$tissue))
design(dds) = ~ group

Then use the following to compare to specific groups:

Code:

results(dds, contrast=c("group","...","..."))

Dear Michael,

I am afraid I cannot build my contrsats either.
I have cells from 2 species (bovin, porc) that have been treated or not (treatment=pos or treatment=neg)

Here is my model
design (ddsX)<-~species+treatment+species:treatment

resultsNames(ddsX)
#[1] "Intercept" "species_porc_vs_bovin" #"treatment_pos_vs_neg" "speciesporc.treatmentpos"

If I understand well, I can find the genes with a treatment effect with:
results(ddsX, contrast=c("treatment","neg","pos"))

then the genes with a species effect with:
results(ddsX, contrast=c("species","bovin","porc"))

But how can I find the genes:
- from species "bovin" with a treatment effect
- from species "porc" with a treatment effect
- with interaction (that do not react to the treatment in a same way in "bovin" or "porc")?

I am using DESeq2_1.6.2

Thanks for your help

Gwenola

**Michael Love** · 12-03-2014, 01:51 PM

hi Gwenola,

Check in the Examples section of the help for ?results:

You should find, Example 2: two conditions, two groups, with interaction term.

**gtosser** · 12-04-2014, 07:19 AM

Thanks

Thanks a lot.

## Example 2: two conditions, two groups, with interaction term

# the condition effect. Is this the condition effect in groupX?
results(dds, contrast=c("condition","B","A"))

# the condition effect in group B. Is there a typo (condition effect in group Y?)
results(dds, contrast=c(0,0,1,1))
# or, equivalently using list to add these two effects
results(dds, contrast=list(c("condition_B_vs_A","groupY.conditionB")))

If I want the condition effect (whatever the group), is that
results(dds, contrast=c(0,0,1,1/2))?

Best regards
Gwenola

Originally posted by Michael Love View Post

hi Gwenola,

Check in the Examples section of the help for ?results:

You should find, Example 2: two conditions, two groups, with interaction term.

**Michael Love** · 12-04-2014, 09:08 AM

Is this the condition effect in groupX?

yes

Is there a typo (condition effect in group Y?)

yes, thanks! I've just fixed this in devel

If I want the condition effect (whatever the group), is that results(dds, contrast=c(0,0,1,1/2))?

yes, that's the condition effect averaged over both groups

**gtosser** · 12-08-2014, 09:17 AM

Thanks

Thanks for your answer. It is clear now.
Gwenola

**rozitaa** · 03-30-2015, 04:38 AM

Originally posted by Michael Love View Post

There is a way to control for the animal effect in this experiment, by recoding the animals within blocks of chem1*chem2 and adding an interaction term. It gets a bit trickier to form contrasts, but I will show how to do this.

It helps to clean up the colData a bit, I'll show on a clean version, and you can reorder your samples correspondingly. Note that you should reorder the columns of the count matrix so that they match. If you have a SummarizedExperiment, the colData and matrix are tied together, so simply reordering the columns of the SummarizedExperiment does both for you.

I made a sample data.frame like your data:

Code:

d <- data.frame(anim=factor(rep(1:12,each=3)),
  chem1=factor(rep(rep(c("N","Y"),each=6),3)),
  chem2=factor(rep(rep(c("N","Y"),each=3),6)),
  tiss=factor(rep(1:3,12)))

this looks like

Code:

    anim chem1 chem2 tiss
 1     1     N     N    1
 2     1     N     N    2
 3     1     N     N    3
 4     2     N     Y    1
 5     2     N     Y    2
 6     2     N     Y    3
 7     3     Y     N    1
 8     3     Y     N    2
 9     3     Y     N    3
 10    4     Y     Y    1
 11    4     Y     Y    2
 12    4     Y     Y    3
 13    5     N     N    1
 14    5     N     N    2
 15    5     N     N    3
 16    6     N     Y    1
 17    6     N     Y    2
 18    6     N     Y    3
 19    7     Y     N    1
 20    7     Y     N    2
 21    7     Y     N    3
 22    8     Y     Y    1
 23    8     Y     Y    2
 24    8     Y     Y    3
 25    9     N     N    1
 26    9     N     N    2
 27    9     N     N    3
 28   10     N     Y    1
 29   10     N     Y    2
 30   10     N     Y    3
 31   11     Y     N    1
 32   11     Y     N    2
 33   11     Y     N    3
 34   12     Y     Y    1
 35   12     Y     Y    2
 36   12     Y     Y    3

Now reorder by chem1, chem2 and tiss. Then add a new variable which will be used to control animal specific effects within each block of chem1*chem2.

Code:

idx <- with(d, order(chem1, chem2, tiss))
d <- d[idx,]
d$anim.nested <- factor(rep(1:3,12))

You can examine d to see how this works.

The model matrix is then the chem1 and chem2 terms and interaction, as well as the interactions of these with animal, and the interaction of these with tissue:

Code:

m <- model.matrix(~ chem1 + chem2 + chem1:chem2 + chem1:chem2:anim.nested + chem1:chem2:tiss, data=d)

The column names of this are:

Code:

> colnames(m)
  [1] "(Intercept)"                "chem1Y"
  [3] "chem2Y"                     "chem1Y:chem2Y"
  [5] "chem1N:chem2N:anim.nested2" "chem1Y:chem2N:anim.nested2"
  [7] "chem1N:chem2Y:anim.nested2" "chem1Y:chem2Y:anim.nested2"
  [9] "chem1N:chem2N:anim.nested3" "chem1Y:chem2N:anim.nested3"
 [11] "chem1N:chem2Y:anim.nested3" "chem1Y:chem2Y:anim.nested3"
 [13] "chem1N:chem2N:tiss2"        "chem1Y:chem2N:tiss2"
 [15] "chem1N:chem2Y:tiss2"        "chem1Y:chem2Y:tiss2"
 [17] "chem1N:chem2N:tiss3"        "chem1Y:chem2N:tiss3"
 [19] "chem1N:chem2Y:tiss3"        "chem1Y:chem2Y:tiss3"

Many of the contrasts become hard to form, though possible with numeric contrasts. A trick for getting the numeric contrasts of the other comparisons, is to take the column means of the model matrix, for the groups of samples you want to compare. For instance, if you want to find the main chem1 effect in tissue 2, the contrast would be:

Code:

ctrst = colMeans(m[d$tiss == "2" & d$chem1 == "Y" & d$chem2 == "N",]) -
        colMeans(m[d$tiss == "2" & d$chem1 == "N" & d$chem2 == "N",])

This ends up being the chem1 base effect, the chem1Y effect in tissue 2 minus the chem1N effect in tissue 2 and a few 1/3 and -1/3 to the animal interaction effects.

Edit: note that interaction terms can be tested using the same logic, but you need to subtract the groups with only one chem and then add the base group. So it looks like: chem1Y chem2Y - chem1N chem2Y - chem1Y chem2N + chem1N chem2N.

Also, I had to update that custom line of size factor code for when all rows have a zero. If you were using that line of code to avoid the issue that all rows of your count matrix have one or more zeros, you should get the edited version:

DESeq2 Error in varianceStabilizingTransformation

https://support.bioconductor.org/p/62246/

We're implementing a solution here which was on the list of todos for a while now, but it won't be out for 6 months. Typical RNA-Seq doesn't have this issue, but since phyloseq we are seeing more metagenomics datasets which often have no genes without a zero.

Hi Michael,

I have also kinda same experimental design and I tried to use your suggestions in this thread but still I get the same error regarding to linear combination.
In my project each sample has been treated with one type of gut_microbiota: A, B, C, D, E and we have measured expression level of each mouse at three different tissues: s, p, l. Now I would like to see DE genes resulted from comparing of every gut_microbiota status (B, C, D, E) to the baseline (A) on each tissue separately. For each type of gut_microbiota I have different number of replicates, varying from 2 up to 7 replicates. Here is my design matrix:

Code:

d <- data.frame(animal=c(rep(c('5231', '5232', '5233', '5234', '5161', '5162', '5163'), each=3), rep('5164', 4) , rep(c('5165', '5166', '5167', '5168', '5169', '5170', '5171', '5172', '5173', '5174', '5175', '5176', '5177', '5178'), each=3), rep('5179', 2)), 
                           gut_microbiota = c(rep('A', 12), rep('B', 9), rep('E', 10), rep('D', 12), rep('B', 9), rep('D', 9), rep('C', 8)), 
                           tissue=c(rep(c('s', 'p', 'l'), each=1, 7), 's', 'p', 'l', 'l', rep(c('s', 'p', 'l'), each=1, 14), c('s', 'p')))

resulted in:

Code:

animal gut_microbiota tissue
1    5231              A      s
2    5231              A      p
3    5231              A      l
4    5232              A      s
5    5232              A      p
6    5232              A      l
7    5233              A      s
8    5233              A      p
9    5233              A      l
10   5234              A      s
11   5234              A      p
12   5234              A      l
13   5161              B      s
14   5161              B      p
15   5161              B      l
16   5162              B      s
17   5162              B      p
18   5162              B      l
19   5163              B      s
20   5163              B      p
21   5163              B      l
22   5164              E      s
23   5164              E      p
24   5164              E      l
25   5164              E      l
26   5165              E      s
27   5165              E      p
28   5165              E      l
29   5166              E      s
30   5166              E      p
31   5166              E      l
32   5167              D      s
33   5167              D      p
34   5167              D      l
35   5168              D      s
36   5168              D      p
37   5168              D      l
38   5169              D      s
39   5169              D      p
40   5169              D      l
41   5170              D      s
42   5170              D      p
43   5170              D      l
44   5171              B      s
45   5171              B      p
46   5171              B      l
47   5172              B      s
48   5172              B      p
49   5172              B      l
50   5173              B      s
51   5173              B      p
52   5173              B      l
53   5174              D      s
54   5174              D      p
55   5174              D      l
56   5175              D      s
57   5175              D      p
58   5175              D      l
59   5176              D      s
60   5176              D      p
61   5176              D      l
62   5177              C      s
63   5177              C      p
64   5177              C      l
65   5178              C      s
66   5178              C      p
67   5178              C      l
68   5179              C      s
69   5179              C      p

I have reordered design matrix as you mentioned and add a new term called, animal.nested:

Code:

idx <- with(d, order(gut_microbiota, tissue))
d <- d[idx, ]
d$animal.nested <- factor(c(rep(1:4, 3), rep(1:6, 3), c(1:2), rep(1:3, 2), rep(1:7, 3), 1, rep(1:3, 3)))

And finally this is my fitting design:

Code:

dds = DESeqDataSetFromHTSeqCount(sampleTable=sample_table, directory='~/htseq/', design= ~ gut_microbiota + gut_microbiota:animal.nested + gut_microbiota:tissue)

Can you please help with this? What is wrong in my codes?

**Michael Love** · 03-30-2015, 05:57 AM

hi rozitaa,

The issue here is that there are more animals in certain tissues. So when the R function model.matrix tries to form the model matrix internal to the DESeq() function, this results in a column of zeros for the nested animal 6, microbiota A combination for example.

There is an easy solution though. In the next release of Bioconductor, which will be released 17 April 2015, and has DESeq2 v1.8, you can provide a custom model matrix to DESeq(). (You can test the development version of DESeq2 v1.7 already, but this requires downloading the development version of R, and this is not impossible but not always easy.)

So all you need to do is to remove these interaction columns which have no samples:

Code:

design <- ~ gut_microbiota + gut_microbiota:animal.nested + gut_microbiota:tissue
mm <- model.matix(design, data=colData(dds))
## remove columns of mm which have all zeros:
(all.zero <- apply(mm, 2, function(x) all(x == 0)))
mm <- mm[, !all.zero]
dds <- DESeq(dds, full=mm, betaPrior=FALSE)

We do have to turn off the log2 fold change shrinkage for this options though (betaPrior=FALSE). But the inference and results tables work the same.

Topics	Statistics	Last Post
Cancer Metastasis: A Deep Dive into Cellular Plasticity by seqadmin Started by seqadmin, 04-11-2024, 12:08 PM	0 responses 55 views 0 likes	Last Post by seqadmin 04-11-2024, 12:08 PM
Proteogenomic Profiles Offer New Clues in Prostate Cancer by seqadmin Started by seqadmin, 04-10-2024, 10:19 PM	0 responses 52 views 0 likes	Last Post by seqadmin 04-10-2024, 10:19 PM
Novel Diagnostic Assay Enhances Ovarian Cancer Detection by seqadmin Started by seqadmin, 04-10-2024, 09:21 AM	0 responses 45 views 0 likes	Last Post by seqadmin 04-10-2024, 09:21 AM
Evolutionary Dynamics of Centromeres: A Comparative Genomic Analysis by seqadmin Started by seqadmin, 04-04-2024, 09:00 AM	0 responses 55 views 0 likes	Last Post by seqadmin 04-04-2024, 09:00 AM

Seqanswers Leaderboard Ad

Announcement

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News