SEQanswers

08-22-2013, 11:35 PM   #1
NicoBxl
not just another member

Location: Belgium

Join Date: Aug 2010
Posts: 263
DESeq2 error: the model matrix is not full rank

Hi,

I get an error in a DESeq2 analysis. I have 30 samples and a three-factor design. I added a Replicate column to the design to account for technical replicates, but I'm not sure that's a good idea.

When I build the DESeq2 data set:

Code:
dds <- DESeqDataSetFromMatrix(countData = OAR.readCount, colData = design, design = ~ Stranded + Replicate + condition)
I get this error:

invalid class “DESeqDataSet” object: the model matrix is not full rank, i.e. one or more variables in the design formula are linear combinations of the others

Does anyone have an idea how to solve this?

Thanks,

N.

Code:
Sample	condition	Stranded	Replicate
A.1	Ctrl	No	A
B.1	Ctrl	No	B
C.1	Ctrl	No	C
D.1	Tum	No	D
E.1	Tum	No	E
F.1	Tum	No	F
G.1	Tum	No	G
H.1	Tum	No	H
I.1	Tum	No	I
J.1	Tum	No	J
K.1	Tum	No	K
L.1	Tum	No	L
M.1	Ctrl	Yes	M
N.1	Tum	Yes	N
O.1	Tum	Yes	O
P.1	Tum	Yes	P
E.2	Tum	Yes	E
F.2	Tum	Yes	F
I.2	Tum	Yes	I
Q.1	Tum	Yes	Q
K.2	Tum	Yes	K
R.1	Tum	Yes	R
S.1	Tum	Yes	S
T.1	Tum	Yes	T
L.2	Tum	Yes	L
O.2	Tum	Yes	O
T.2	Tum	Yes	T
I.3	Tum	Yes	I
L.3	Tum	Yes	L
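
A quick way to see where the rank deficiency comes from is to build the model matrix by hand and compare its rank with its number of columns. The following is a minimal sketch in base R, assuming only six rows copied from the table above (individuals A, M, E and F); the object names other than `design` are illustrative, and this is not the poster's actual analysis.

```r
# Six rows copied from the sample table above (individuals A, M, E and F).
design <- data.frame(
  row.names = c("A.1", "M.1", "E.1", "E.2", "F.1", "F.2"),
  condition = factor(c("Ctrl", "Ctrl", "Tum", "Tum", "Tum", "Tum")),
  Stranded  = factor(c("No", "Yes", "No", "Yes", "No", "Yes")),
  Replicate = factor(c("A", "M", "E", "E", "F", "F"))
)

# Model matrix for the design formula used in the failing call.
mm <- model.matrix(~ Stranded + Replicate + condition, design)
n_coef  <- ncol(mm)      # coefficients the formula asks for
rank_mm <- qr(mm)$rank   # coefficients that can actually be estimated

# Each individual has a single condition, so the condition column equals
# the sum of the Replicate columns of the Tum individuals in this subset.
tum_from_replicate <- all(
  mm[, "conditionTum"] == mm[, "ReplicateE"] + mm[, "ReplicateF"]
)

# Without Replicate, every remaining column is linearly independent here.
mm_reduced <- model.matrix(~ Stranded + condition, design)

c(columns = n_coef, rank = rank_mm)
c(columns = ncol(mm_reduced), rank = qr(mm_reduced)$rank)
```

On this subset the full design has one more column than its rank, because Replicate identifies the individual and every individual belongs to a single condition.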
08-25-2013, 06:57 AM   #2
Wolfgang Huber
Senior Member

Location: Heidelberg, Germany

Join Date: Aug 2009
Posts: 108

Dear N.

Try removing one of the columns "Sample" or "Replicate"; they seem to be redundant with each other.

Best wishes
Wolfgang
__________________
Wolfgang Huber
EMBL
03-26-2015, 05:28 AM   #3
rozitaa
Member

Location: Sweden

Join Date: Jun 2013
Posts: 51

Hi,

I get the same error, but I don't have the same columns as N. Can anyone help me with this? My sample table looks like this:

Code:
muss_log	tissue	gut_microbiota
1	5231	Si5	GF
2	5231	PC	GF
3	5231	liver	GF
4	5232	Si5	GF
5	5232	PC	GF
6	5232	liver	GF
7	5233	Si5	GF
8	5233	PC	GF
9	5233	liver	GF
10	5234	Si5	GF
11	5234	PC	GF
12	5234	liver	GF
13	5161	Si5	mono
14	5161	PC	mono
15	5161	liver	mono
16	5162	Si5	mono
17	5162	PC	mono
18	5162	liver	mono
19	5163	Si5	mono
20	5163	PC	mono
21	5163	liver	mono
22	5164	Si5	prevExci
23	5164	PC	prevExci
24	5164	liver	prevExci
25	5164	liver	prevExci
26	5165	Si5	prevExci
27	5165	PC	prevExci
28	5165	liver	prevExci
29	5166	Si5	prevExci
30	5166	PC	prevExci
31	5166	liver	prevExci
32	5167	Si5	prev
33	5167	PC	prev
34	5167	liver	prev
35	5168	Si5	prev
36	5168	PC	prev
37	5168	liver	prev
38	5169	Si5	prev
39	5169	PC	prev
40	5169	liver	prev
41	5170	Si5	prev
42	5170	PC	prev
43	5170	liver	prev
44	5171	Si5	mono
45	5171	PC	mono
46	5171	liver	mono
47	5172	Si5	mono
48	5172	PC	mono
49	5172	liver	mono
50	5173	Si5	mono
51	5173	PC	mono
52	5173	liver	mono
53	5174	Si5	prev
54	5174	PC	prev
55	5174	liver	prev
56	5175	Si5	prev
57	5175	PC	prev
58	5175	liver	prev
59	5176	Si5	prev
60	5176	PC	prev
61	5176	liver	prev
62	5177	Si5	prevMono
63	5177	PC	prevMono
64	5177	liver	prevMono
65	5178	Si5	prevMono
66	5178	PC	prevMono
67	5178	liver	prevMono
68	5179	Si5	prevMono
69	5179	PC	prevMono
03-26-2015, 06:11 AM   #4
dpryan
Devon Ryan

Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,455

What's your design?
03-26-2015, 06:16 AM   #5
rozitaa
Member

Location: Sweden

Join Date: Jun 2013
Posts: 51

Quote:
Originally Posted by dpryan
What's your design?

Code:
dds = DESeqDataSetFromHTSeqCount(sampleTable=sample_table, directory='~/dataAnalysis/Petia/barley_mice_rna-seq/htseq/', design= ~ muss_log + tissue + gut_microbiota)
03-26-2015, 06:19 AM   #6
dpryan
Devon Ryan

Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,455

You've already accounted for "gut_microbiota" with "muss_log"; the latter determines the former.
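
To check the claim that "muss_log" determines "gut_microbiota", one can count the statuses per mouse and compare the model-matrix rank with and without muss_log. This is a minimal sketch in base R, assuming a toy subset of four mice from the table in post #3; the object name `sample_table` is taken from the call above, while the subset and the other names are assumptions made for illustration.

```r
# Toy subset of the sample table: four mice, three tissues each.
sample_table <- data.frame(
  muss_log       = factor(rep(c(5231, 5232, 5161, 5162), each = 3)),
  tissue         = factor(rep(c("Si5", "PC", "liver"), times = 4)),
  gut_microbiota = factor(rep(c("GF", "GF", "mono", "mono"), each = 3))
)

# Number of distinct gut_microbiota levels seen for each mouse.
# A value of 1 everywhere means the mouse ID fixes the status.
status_per_mouse <- tapply(
  sample_table$gut_microbiota, sample_table$muss_log,
  function(x) length(unique(x))
)

# Compare columns and rank with and without the mouse ID in the design.
mm_full    <- model.matrix(~ muss_log + tissue + gut_microbiota, sample_table)
mm_reduced <- model.matrix(~ tissue + gut_microbiota, sample_table)

status_per_mouse
c(columns = ncol(mm_full), rank = qr(mm_full)$rank)
c(columns = ncol(mm_reduced), rank = qr(mm_reduced)$rank)
```

When the rank is smaller than the number of columns, at least one coefficient is a linear combination of the others, which is the condition the error message reports.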
03-26-2015, 06:38 AM   #7
rozitaa
Member

Location: Sweden

Join Date: Jun 2013
Posts: 51

Quote:
Originally Posted by dpryan
You've already accounted for "gut_microbiota" with "muss_log"; the latter determines the former.
How? I don't understand that. It's true that each replicate can have only one gut_microbiota status, but I also have three measurements per sample (three different tissues), and I want to account for these within-sample correlations. For example, the GF status occurs in four different samples (5231, 5232, 5233, 5234), so I cannot see how gut_microbiota can cover muss_log.
03-26-2015, 07:23 AM   #8
dpryan
Devon Ryan

Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,455

Quote:
How?
Let's take a simpler example:

Code:
df <- data.frame(Sample=factor(c(1:10)), Group=c(rep("A",5), rep("B",5)))
mm <- model.matrix(~Sample+Group, df)
mm
Have a look at "mm". The last column is simply the sum of columns 6-10, meaning that if you estimate those coefficients, then the last coefficient is completely determined by them (or inversely, if you estimate it then the others are already determined). Something along these lines is also the case for the model matrix in your design.
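
The dependence described above can also be checked numerically instead of by eye. A small sketch in base R, reusing the same toy data frame; the names `dependent`, `q` and `dropped` are only illustrative.

```r
# Same toy data as in the example above.
df <- data.frame(Sample = factor(c(1:10)), Group = c(rep("A", 5), rep("B", 5)))
mm <- model.matrix(~ Sample + Group, df)

# Is the GroupB column exactly the sum of the Sample6..Sample10 columns?
dependent <- all(mm[, "GroupB"] == rowSums(mm[, paste0("Sample", 6:10)]))

# The QR decomposition gives the rank; compare it with the column count.
q <- qr(mm)
c(columns = ncol(mm), rank = q$rank)

# Columns that the pivoted QR treats as redundant (listed after the rank).
dropped <- colnames(mm)[q$pivot[seq_len(ncol(mm)) > q$rank]]
dependent
dropped
```

A rank below the number of columns is what DESeq2's full-rank check rejects, so the same comparison can be run on a real colData table before constructing the data set.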
03-26-2015, 08:00 AM   #9
rozitaa
Member

Location: Sweden

Join Date: Jun 2013
Posts: 51

Quote:
Originally Posted by dpryan
Let's take a simpler example:

Code:
df <- data.frame(Sample=factor(c(1:10)), Group=c(rep("A",5), rep("B",5)))
mm <- model.matrix(~Sample+Group, df)
mm
Have a look at "mm". The last column is simply the sum of columns 6-10, meaning that if you estimate those coefficients, then the last coefficient is completely determined by them (or inversely, if you estimate it then the others are already determined). Something along these lines is also the case for the model matrix in your design.

Thanks!
So I will remove muss_log!
03-14-2017, 06:10 AM   #10
chammer
Junior Member

Location: Lausanne

Join Date: Feb 2014
Posts: 2

I have been trying to use DESeq2 for analysing RNA-seq data in R and I have a small question about that.

So to start, the conditions table looks like this:

Code:
sample         donor    virus    vpu    sex
DonorA1_01    A1    none    mock    male
DonorA1_02    A1    CH293    wt    male
DonorA1_03    A1    CH293    stop    male
DonorA1_04    A1    CH293    R50K    male
DonorA1_05    A1    CH293    teth_count    male
DonorA1_06    A1    CH077    wt    male
DonorA1_07    A1    CH077    stop    male
DonorA1_08    A1    CH077    R50K    male
DonorA1_09    A1    CH077    teth_count    male
DonorA1_10    A1    STC01    wt    male
DonorA1_11    A1    STC01    stop    male
DonorA1_12    A1    STC01    R50K    male
DonorA1_13    A1    STC01    teth_count    male
DonorX_01    X    none    mock    female
DonorX_02    X    CH293    wt    female
DonorX_03    X    CH293    stop    female
DonorX_04    X    CH293    R50K    female
DonorX_05    X    CH293    teth_count    female
DonorX_06    X    CH077    wt    female
DonorX_07    X    CH077    stop    female
DonorX_08    X    CH077    R50K    female
DonorX_09    X    CH077    teth_count    female
DonorX_10    X    STC01    wt    female
DonorX_11    X    STC01    stop    female
DonorX_12    X    STC01    R50K    female
DonorX_13    X    STC01    teth_count    female
DonorY_01    Y    none    mock    male
DonorY_02    Y    CH293    wt    male
DonorY_03    Y    CH293    stop    male
DonorY_04    Y    CH293    R50K    male
DonorY_05    Y    CH293    teth_count    male
DonorY_06    Y    CH077    wt    male
DonorY_07    Y    CH077    stop    male
DonorY_08    Y    CH077    R50K    male
DonorY_09    Y    CH077    teth_count    male
DonorY_10    Y    STC01    wt    male
DonorY_11    Y    STC01    stop    male
DonorY_12    Y    STC01    R50K    male
DonorY_13    Y    STC01    teth_count    male
DonorZ_01    Z    none    mock    female
DonorZ_02    Z    CH293    wt    female
DonorZ_03    Z    CH293    stop    female
DonorZ_04    Z    CH293    R50K    female
DonorZ_05    Z    CH293    teth_count    female
DonorZ_06    Z    CH077    wt    female
DonorZ_07    Z    CH077    stop    female
DonorZ_08    Z    CH077    R50K    female
DonorZ_09    Z    CH077    teth_count    female
DonorZ_10    Z    STC01    wt    female
DonorZ_11    Z    STC01    stop    female
DonorZ_12    Z    STC01    R50K    female
DonorZ_13    Z    STC01    teth_count    female
Now when I specify the model for differential expression analysis as dds <- DESeqDataSetFromTximport(txi, samples, ~vpu+donor+virus), I get an error message:

Error in checkFullRank(modelMatrix) :
the model matrix is not full rank, so the model cannot be fit as specified.
One or more variables or interaction terms in the design formula are linear
combinations of the others and must be removed.

I get the same problem if the model is "vpu + virus" or "donor + sex".

I understand that this is because of collinearity among the variables, but I am not sure how to resolve it, since in my case all covariates are collinear with each other (and not just one pair of variables).

Any help on this will be highly appreciated!

Thanks!
Chris
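
The collinearity can be made visible by rebuilding the conditions table and cross-tabulating the factors. The sketch below is an illustration in base R under stated assumptions: it reconstructs the 52 rows from the pattern in the table above rather than reading the real file, and the merged `group` factor at the end is a hypothetical workaround that does not come from this thread.

```r
donors <- c("A1", "X", "Y", "Z")

# Every donor has all 3 viruses x 4 vpu variants, plus one mock sample.
infected <- expand.grid(
  vpu   = c("wt", "stop", "R50K", "teth_count"),
  virus = c("CH293", "CH077", "STC01"),
  donor = donors,
  stringsAsFactors = FALSE
)
mock <- data.frame(vpu = "mock", virus = "none", donor = donors,
                   stringsAsFactors = FALSE)
samples <- rbind(mock, infected)

# Fix the level order explicitly so the reference levels are mock and none.
samples$vpu   <- factor(samples$vpu,
                        levels = c("mock", "wt", "stop", "R50K", "teth_count"))
samples$virus <- factor(samples$virus,
                        levels = c("none", "CH293", "CH077", "STC01"))
samples$donor <- factor(samples$donor)

# "mock" only ever occurs with "none", so vpu and virus overlap.
crosstab <- table(vpu = samples$vpu, virus = samples$virus)
crosstab

# Design from the post: more columns than rank.
mm <- model.matrix(~ vpu + donor + virus, samples)
n_coef  <- ncol(mm)
rank_mm <- qr(mm)$rank

# Hypothetical workaround: one level per observed virus/vpu combination.
samples$group <- factor(paste(samples$virus, samples$vpu, sep = "_"))
mm_group <- model.matrix(~ donor + group, samples)
full_rank_group <- qr(mm_group)$rank == ncol(mm_group)

c(columns = n_coef, rank = rank_mm)
full_rank_group
```

In the cross-tabulation, "mock" appears only together with "none", which is why vpu and virus cannot both be estimated additively. The design "donor + sex" fails for a similar reason: in the table each donor has a single sex. Whether a merged factor suits the biological question is a separate decision.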