I'm new to edgeR and have run into a problem that I can't work out how to solve.
I'm interested in making comparisons both between and within subjects. My animals have one of two susceptibility statuses, resistant (R) or susceptible (S), and were sampled at two time points, pre- and post-challenge. For this, I'm following section 3.5 of the edgeR User's Guide (page 32).
Once I had organized my data frame and the design formula as the manual states, I got an error when I ran the dispersion estimation:
targets
animal status tempo
1 1 R Pre
2 1 R Pos
3 2 R Pre
4 2 R Pos
5 3 R Pre
6 3 R Pos
7 4 R Pre
8 4 R Pos
9 5 R Pre
10 5 R Pos
11 6 R Pre
12 6 R Pos
13 7 R Pre
14 7 R Pos
15 8 R Pre
16 8 R Pos
17 9 R Pre
18 9 R Pos
19 10 R Pre
20 10 R Pos
21 11 R Pre
22 11 R Pos
23 12 R Pre
24 12 R Pos
25 13 R Pre
26 13 R Pos
27 14 R Pre
28 14 R Pos
29 15 R Pre
30 15 R Pos
31 16 R Pre
32 16 R Pos
33 17 R Pre
34 17 R Pos
35 18 R Pre
36 18 R Pos
37 19 R Pre
38 19 R Pos
39 20 R Pre
40 20 R Pos
41 1 S Pre
42 1 S Pos
43 2 S Pre
44 2 S Pos
45 3 S Pre
46 3 S Pos
47 4 S Pre
48 4 S Pos
49 5 S Pre
50 5 S Pos
51 6 S Pre
52 6 S Pos
53 7 S Pre
54 7 S Pos
55 8 S Pre
56 8 S Pos
57 9 S Pre
58 9 S Pos
59 10 S Pre
60 10 S Pos
61 11 S Pre
62 11 S Pos
63 12 S Pre
64 12 S Pos
65 13 S Pre
66 13 S Pos
67 14 S Pre
68 14 S Pos
69 15 S Pre
70 15 S Pos
71 16 S Pre
72 16 S Pos
73 17 S Pre
74 17 S Pos
75 18 S Pre
76 18 S Pos
77 19 S Pre
78 19 S Pos
design <- model.matrix(~status+status:animal+status:tempo, data = targets)
dge <- estimateGLMCommonDisp(dge, design)
Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset, :
Design matrix not of full rank. The following coefficients not estimable:
statusS:animal20
Note that I have 20 animals in the first condition (R) and 19 in the second (S). The same error occurs for the other dispersion estimates.
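The error can be reproduced without any count data. The sketch below rebuilds a sample table like `targets` (assuming `animal`, `status` and `tempo` are stored as factors, which the coefficient name `statusS:animal20` suggests) and inspects the design matrix with base R only. Because no susceptible animal is numbered 20, the column `statusS:animal20` is zero for every sample. Dropping such columns is one possible workaround, shown here as an assumption rather than as the manual's prescribed procedure.

```r
# Rebuild a sample table like `targets`: animals 1-20 in R, 1-19 in S,
# each sampled Pre and Pos (all three columns assumed to be factors).
targets <- data.frame(
  animal = factor(c(rep(1:20, each = 2), rep(1:19, each = 2))),
  status = factor(rep(c("R", "S"), times = c(40, 38))),
  tempo  = factor(rep(c("Pre", "Pos"), times = 39), levels = c("Pre", "Pos"))
)

design <- model.matrix(~status + status:animal + status:tempo, data = targets)

# Compare the number of columns with the numerical rank of the matrix.
n_cols <- ncol(design)
n_rank <- qr(design)$rank

# Columns that are zero for every sample cannot be estimated.
zero_cols <- colnames(design)[colSums(design != 0) == 0]
print(zero_cols)

# Possible workaround: drop the all-zero columns and re-check the rank.
design_fixed <- design[, colSums(design != 0) > 0]
stopifnot(qr(design_fixed)$rank == ncol(design_fixed))
```

If the check passes, `design_fixed` could be passed to the dispersion functions in place of `design`.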
Then I changed my data frame to:
targets2
animal status tempo
1 1 R Pre
2 1 R Pos
3 2 R Pre
4 2 R Pos
5 3 R Pre
6 3 R Pos
7 4 R Pre
8 4 R Pos
9 5 R Pre
10 5 R Pos
11 6 R Pre
12 6 R Pos
13 7 R Pre
14 7 R Pos
15 8 R Pre
16 8 R Pos
17 9 R Pre
18 9 R Pos
19 10 R Pre
20 10 R Pos
21 11 R Pre
22 11 R Pos
23 12 R Pre
24 12 R Pos
25 13 R Pre
26 13 R Pos
27 14 R Pre
28 14 R Pos
29 15 R Pre
30 15 R Pos
31 16 R Pre
32 16 R Pos
33 17 R Pre
34 17 R Pos
35 18 R Pre
36 18 R Pos
37 19 R Pre
38 19 R Pos
39 20 R Pre
40 20 R Pos
41 21 S Pre
42 21 S Pos
43 22 S Pre
44 22 S Pos
45 23 S Pre
46 23 S Pos
47 24 S Pre
48 24 S Pos
49 25 S Pre
50 25 S Pos
51 26 S Pre
52 26 S Pos
53 27 S Pre
54 27 S Pos
55 28 S Pre
56 28 S Pos
57 29 S Pre
58 29 S Pos
59 30 S Pre
60 30 S Pos
61 31 S Pre
62 31 S Pos
63 32 S Pre
64 32 S Pos
65 33 S Pre
66 33 S Pos
67 34 S Pre
68 34 S Pos
69 35 S Pre
70 35 S Pos
71 36 S Pre
72 36 S Pos
73 37 S Pre
74 37 S Pos
75 38 S Pre
76 38 S Pos
77 39 S Pre
78 39 S Pos
Now I get the following message:
dge <- estimateGLMCommonDisp(dge, design2)
Warning message:
In estimateGLMCommonDisp.default(y = y$counts, design = design, :
No residual df: setting dispersion to NA
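The post does not show how `design2` was built. Assuming it used the same formula on `targets2`, a likely explanation for the warning is that numbering the animals 1 to 39 makes the `status:animal` term generate a column for every combination of status and animal, so the design ends up with more columns than there are samples. The sketch below, using base R only, counts them.

```r
# Rebuild a sample table like `targets2`: animals 1-20 are R, 21-39 are S.
targets2 <- data.frame(
  animal = factor(rep(1:39, each = 2)),
  status = factor(rep(c("R", "S"), times = c(40, 38))),
  tempo  = factor(rep(c("Pre", "Pos"), times = 39), levels = c("Pre", "Pos"))
)

# ASSUMPTION: design2 used the same formula as before (not shown in the post).
design2 <- model.matrix(~status + status:animal + status:tempo, data = targets2)

n_samples <- nrow(design2)   # one row per sample
n_coefs   <- ncol(design2)   # one column per coefficient

# Columns for combinations that never occur (e.g. an R animal under
# status S) are zero for every sample.
n_zero <- sum(colSums(design2 != 0) == 0)

cat("samples:", n_samples, " coefficients:", n_coefs,
    " all-zero columns:", n_zero, "\n")
```

Numbering the animals from 1 within each status, as in `targets`, keeps the number of animal columns much smaller.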
My third attempt was to change my design formula to:
design3 <- model.matrix(~animal+tempo, data = targets2)
With this I could estimate the dispersions and fit the model (glmFit). However, this is not exactly what the manual describes, and it doesn't look like the best option for me, since it seems important to include the animal status in the comparisons when I run glmLRT.
Moreover, it is still not clear to me how to specify the "coef" and "contrast" arguments of this function as a numeric vector.
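As a sketch of how the two arguments relate to the design matrix: `coef` selects one or more columns of the design (by position or by name), while `contrast` is a numeric vector with one entry per column. The example below only builds such a vector with base R. The `glmLRT()` calls are left as comments because they need a fitted object (called `fit` here, a hypothetical name). It assumes the nested design with animals numbered within each status, the all-zero column removed, and `tempo` releveled so that `Pre` is the reference.

```r
# Sample table like `targets` (animals numbered within each status).
targets <- data.frame(
  animal = factor(c(rep(1:20, each = 2), rep(1:19, each = 2))),
  status = factor(rep(c("R", "S"), times = c(40, 38))),
  tempo  = factor(rep(c("Pre", "Pos"), times = 39), levels = c("Pre", "Pos"))
)
design <- model.matrix(~status + status:animal + status:tempo, data = targets)
design <- design[, colSums(design != 0) > 0]  # drop the inestimable column

# With Pre as the reference level, these columns are the Pos-vs-Pre
# effects within each status.
tempo_cols <- c("statusR:tempoPos", "statusS:tempoPos")
stopifnot(all(tempo_cols %in% colnames(design)))

# contrast: numeric vector with one entry per design column, here the
# challenge response in S minus the challenge response in R.
con <- setNames(numeric(ncol(design)), colnames(design))
con["statusS:tempoPos"] <- 1
con["statusR:tempoPos"] <- -1

# With a fitted object `fit` (hypothetical name) from glmFit(dge, design):
# lrt_R    <- glmLRT(fit, coef = "statusR:tempoPos")  # Pos vs Pre within R
# lrt_diff <- glmLRT(fit, contrast = con)             # S response minus R response
```

Building the vector by column name rather than by position avoids having to count columns in a design this wide.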
Any help would be greatly appreciated.
Sincerely,
Daniela Moré