SEQanswers

Old 01-26-2015, 11:52 AM   #1
nw328
Junior Member
 
Location: DC

Join Date: Jan 2015
Posts: 4
DESeq2 Error

Hi All,
I am trying to use DESeq2 to analyze some RNA-seq count data from a .csv file. It is an experiment with 5 conditions done in duplicate (WT, null, condition1, condition2, condition3). I am relatively new to R and have mostly been borrowing snippets of applicable scripts, largely from this thread. Here is what I have so far:



>library(ggplot2)
>library(DESeq2)
>library(Biobase)

>df<-read.csv('/Volumes/NICK3/R/data/RNAseq_sandbox.csv', header=T, sep=",", row.names=1)

>countsTable <-data.matrix(df[1:10], rownames.force = NA)
> head(countsTable)
# WT_a WT_b null_a null_b test1_a test1_b test2_a test2_b test3_a test3_b
#s00010 6293 8602 4339 6691 4614 8632 4093 8830 7656 9657
#s00020 4009 6370 2647 4488 3001 6044 2586 5885 5441 6468
#s00030 858 1413 624 1122 684 1220 648 1452 1170 1276
#s00040 5350 8466 4354 7175 4106 8349 4329 8047 7283 7729
#s00050 666 1033 448 884 446 1030 453 980 829 1157
#s00060 11259 12059 8609 9902 8496 11568 8281 12140 10827 11676

>conditions<-c("X1", "X2", "A1", "A2", "B1", "B2", "C1", 'C2', "D1", "D2")
>samples <- data.frame(row.names=conditions, condition=as.factor(c(rep("ctl",2), rep("test",8))))
> samples
# condition
#X1 ctl
#X2 ctl
#A1 test
#A2 test
#B1 test
#B2 test
#C1 test
#C2 test
#D1 test
#D2 test

dds <- DESeqDataSetFromMatrix(countData = countsTable, colData=samples, design=~condition)
#Error in if (any(assay(se) < 0)) { :
# missing value where TRUE/FALSE needed


I am a little unsure of where to go from here with this error. I don't have null values or negatives. It seems that I am not using the right syntax for making the SummarizedExperiment.

Any help would be appreciated; thanks in advance!
Old 01-27-2015, 10:01 AM   #2
hiddenrisk
Junior Member
 
Location: Texas

Join Date: Sep 2011
Posts: 7

Hi, as I understand the documentation, the sample data frame needs to record which samples are replicates of the same condition. As I read your sample table at present, it has 10 rows (one per sample) but only two levels in the condition column ("ctl" and "test"). You are therefore telling DESeq2 that there are two conditions to test (ctl and test) rather than the 5 conditions you actually tested. My suggestion would be to use the following code to make your colData:

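> samples <- c("X1", "X2", "A1", "A2", "B1", "B2", "C1", "C2", "D1", "D2")
> # one factor level per condition, two replicates each
> condition <- factor(c(rep("ctrl",2), rep("A",2), rep("B",2), rep("C",2), rep("D",2)))
> # colData should be a data frame with one row per sample, not a character matrix from cbind()
> pData <- data.frame(row.names=samples, condition=condition)
> dds <- DESeqDataSetFromMatrix(countData = countsTable, colData=pData, design=~condition)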
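
As a rough illustration of the layout suggested above, here is a base-R sketch that builds a five-level condition factor and checks it against a toy count matrix. The object names `toyCounts` and `colData` and the count values are made up for illustration; only the column names are taken from the original post.

```r
# Toy stand-in for the real count matrix (values are made up).
sampleNames <- c("WT_a", "WT_b", "null_a", "null_b", "test1_a", "test1_b",
                 "test2_a", "test2_b", "test3_a", "test3_b")
toyCounts <- matrix(1:20, nrow = 2,
                    dimnames = list(c("g1", "g2"), sampleNames))

# One row per sample, one factor level per experimental condition.
conditionLevels <- c("WT", "null", "test1", "test2", "test3")
colData <- data.frame(
  row.names = colnames(toyCounts),
  condition = factor(rep(conditionLevels, each = 2), levels = conditionLevels)
)

# Checks worth running before building the DESeqDataSet.
stopifnot(ncol(toyCounts) == nrow(colData))
stopifnot(identical(rownames(colData), colnames(toyCounts)))
table(colData$condition)
```

Listing "WT" first in `levels` makes it the reference level of the factor, which is usually what is wanted for a control group.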
Old 01-27-2015, 10:15 AM   #3
Michael Love
Senior Member
 
Location: Boston

Join Date: Jul 2013
Posts: 333

I can't see where the error is coming from either. Maybe run some sanity checks on the matrix you provide to countData, to make sure it is an integer matrix with 10 columns and no NAs.

class(countsTable)
dim(countsTable)
apply(countsTable, 2, summary)
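
A few more targeted checks along the same lines, sketched on a toy matrix with one planted NA. The matrix here is made up and only reuses the name `countsTable` for continuity; on real data you would skip the first statement.

```r
# Toy matrix with one deliberately missing value (made-up data).
countsTable <- matrix(c(10L, 0L, 5L, NA, 7L, 3L), nrow = 3,
                      dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

is.matrix(countsTable)        # should be TRUE
is.integer(countsTable)       # counts should be stored as integers
anyNA(countsTable)            # TRUE here because of the planted NA
colSums(is.na(countsTable))   # number of NAs per sample

# Row and column position of each NA.
naPositions <- which(is.na(countsTable), arr.ind = TRUE)
naPositions
```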
Old 01-27-2015, 11:15 AM   #4
nw328
Junior Member
 
Location: DC

Join Date: Jan 2015
Posts: 4

Hi Michael Love and hiddenrisk,

Thank you so much for your replies- both helped fix the error, for I did need to rework the comparisons, and I did have a stray NA. Thanks again!
Old 01-27-2015, 11:36 AM   #5
Michael Love
Senior Member
 
Location: Boston

Join Date: Jul 2013
Posts: 333

I added an NA check to the constructor in front of the negative value check, since I only had an NA check in the validity function which is called later on.
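
To illustrate why the order of the two checks matters, here is a minimal base-R sketch. `checkCounts` is a hypothetical helper written for this example, not DESeq2's actual code. The point is that `any(x < 0)` evaluates to NA when `x` contains an NA and no negative values, and `if (NA)` is what raises "missing value where TRUE/FALSE needed".

```r
# Hypothetical helper (not DESeq2's code): test for NAs before negatives.
checkCounts <- function(x) {
  if (anyNA(x)) stop("NA values are not allowed in the count matrix")
  if (any(x < 0)) stop("some values in the count matrix are negative")
  invisible(TRUE)
}

x <- c(3, 0, NA, 12)

# Comparing NA with 0 gives NA, so any() returns NA rather than FALSE.
negCheck <- any(x < 0)
negCheck

# With the NA check first, the user gets a readable message instead.
msg <- tryCatch(checkCounts(x), error = function(e) conditionMessage(e))
msg
```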
Old 02-13-2015, 10:59 AM   #6
lmolokin
Member
 
Location: Beltsville, MD, USA

Join Date: Jul 2012
Posts: 23
same error

I am getting the same error as OP. What are you guys referring to when you say NAs?

I have 24 samples: 2 treatment groups, 6 subjects with 2 time points each.

My code is as follows:
Code:
designframe <- data.frame(row.names = colnames(countData),
                          tx = factor(c(rep("c",12),rep("g",12))),
                          patient = factor(c("1","1","2","2","3","3","4","4","5","5",
                                             "6","6","1","1","2","2","3","3","4","4","5","5","6","6")),
                          time = factor(c(rep(1:2,12))))

patient <- factor(designframe$patient)
tx <- factor(designframe$tx)
time <- factor(designframe$time)
designframe <- data.frame(tx,patient,time)

dds <- DESeqDataSetFromMatrix(countData, colData = designframe, formula(~patient+time+tx:time))
The DESeqDataSetFromMatrix line results in:

Error in if (any(assay(se) < 0)) { :
missing value where TRUE/FALSE needed


The "sanity checks" yield the following:

Code:
> class(countData)
[1] "data.frame"
> dim(countData)
[1] 51798    24
> apply(countData, 2, summary)
           C211BL   C211M3    C215BL   C215M3    C220BL    C220M3    C305BL    C305M3    C317BL    C317M3   C324BL
Min.          0.0      0.0       0.0      0.0       0.0      0.00      0.00       0.0       0.0       0.0      0.0
1st Qu.       0.0      0.0       0.0      0.0       0.0      0.00      0.00       0.0       0.0       0.0      0.0
Median        0.0      0.0       0.0      0.0       0.0      0.00      0.00       0.0       0.0       0.0      0.0
Mean        288.3    112.4     231.4    159.2     404.6     59.02     74.74     254.6     273.5     327.6    162.1
3rd Qu.      19.0      8.0      17.0     10.0      24.0      2.00      6.00      19.0      15.0      24.0     14.0
Max.    3128000.0 863800.0 2138000.0 746700.0 1714000.0 211400.00 494600.00 1805000.0 5676000.0 5882000.0 419700.0
NA's          1.0      1.0       1.0      1.0       1.0      1.00      1.00       1.0       1.0       1.0      1.0
          C324M3   G211BL   G211M3 G215BL   G215M3   G220BL   G220M3  G305BL  G305M3   G317BL   G317M3   G324BL
Min.         0.0      0.0      0.0      0      0.0      0.0      0.0     0.0     0.0      0.0      0.0      0.0
1st Qu.      0.0      0.0      0.0      0      0.0      0.0      0.0     0.0     0.0      0.0      0.0      0.0
Median       0.0      0.0      0.0      0      0.0      0.0      0.0     0.0     0.0      0.0      0.0      0.0
Mean       199.5    202.7    196.2    209    301.1    170.6    121.6   167.8   155.9    219.8    278.5    211.4
3rd Qu.     17.0     16.0     16.0     18     22.0     11.0      5.0    14.0    12.0     18.0     29.0     17.0
Max.    609700.0 111000.0 146500.0 107700 168800.0 135900.0 132600.0 79130.0 84860.0 104900.0 108000.0 125200.0
NA's         1.0      1.0      1.0      1      1.0      1.0      1.0     1.0     1.0      1.0      1.0      1.0
         G324M3
Min.        0.0
1st Qu.     0.0
Median      0.0
Mean      116.6
3rd Qu.    10.0
Max.    77110.0
NA's        1.0
I see NA's of 1.0 under each column but I'm not sure what that means.

Thanks!
Old 02-13-2015, 11:11 AM   #7
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479

Likely unrelated, but you shouldn't have fractional counts.

The summary indicates that you have NA ("not available", R's missing-value marker) somewhere. You probably just have a row of them, so:
Code:
countData[which(is.na(countData[,1])),]
will probably show the row in question. Just remove it.
Old 02-13-2015, 11:11 AM   #8
Michael Love
Senior Member
 
Location: Boston

Join Date: Jul 2013
Posts: 333

That means that you have some NA's in the matrix, which should only have non-negative integers.

You can try to find them with:

Code:
narows <- apply(countData, 1, function(x) any(is.na(x)))

table(narows)
You need to remove these rows from the countData first:

Code:
countDataClean <- countData[ !narows, ]
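
An equivalent way to do the same clean-up, sketched on a toy data frame: `complete.cases()` is TRUE for rows without any NA, so it plays the role of `!narows`. The data values and the name `countMatrix` are made up for this example, and converting to an integer matrix afterwards is an extra step that is not part of the suggestion above.

```r
# Toy data frame standing in for countData (made-up values).
countData <- data.frame(s1 = c(5, 0, NA, 8), s2 = c(2, 1, NA, 4),
                        row.names = c("g1", "g2", "g3", "g4"))

# Keep only the rows in which every sample has a value.
countDataClean <- countData[complete.cases(countData), ]

# Convert the cleaned data frame to an integer matrix of counts.
countMatrix <- as.matrix(countDataClean)
storage.mode(countMatrix) <- "integer"

dim(countMatrix)
rownames(countMatrix)
```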
Old 02-13-2015, 11:22 AM   #9
lmolokin
Member
 
Location: Beltsville, MD, USA

Join Date: Jul 2012
Posts: 23

There appeared to be a row of NAs, but how is that possible? When I checked countData.csv, all counts were integers. Did the NAs somehow get introduced upon import?

Code:
countData = read.csv (file.choose(), header=TRUE, row.names=1)
?
Old 02-13-2015, 02:28 PM   #10
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479

If you have a blank line at the end then that'll happen.
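
A small sketch of how such a row can arise on import, assuming the last line of the file is not truly empty but holds only separators (as can happen when a file is exported from a spreadsheet). The inline CSV text and the names `withCommas` and `trulyBlank` are made up for this example.

```r
# A trailing line that contains only separators is read as a row of NAs.
withCommas <- read.csv(text = "gene,s1,s2\ng1,1,2\ng2,3,4\n,,\n")
nrow(withCommas)
is.na(withCommas$s1)

# A completely empty trailing line is skipped by default,
# because read.csv() uses blank.lines.skip = TRUE.
trulyBlank <- read.csv(text = "gene,s1,s2\ng1,1,2\ng2,3,4\n\n")
nrow(trulyBlank)
```

Opening the file in a plain-text editor rather than a spreadsheet program is a quick way to see which of the two cases applies.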
Tags
biobase, bioconductor, deseq2