SEQanswers

Old 07-15-2014, 10:55 PM   #1
KYR
Member
 
Location: Toronto

Join Date: May 2012
Posts: 18
DESeq2 error in data.frame (multiple treatments and multiple replicates)

I have a text file containing read counts per gene for each treatment and the control, with the following columns:

[Gene Symbol] [C1] [C2] [C3] [A1] [A2] [A3] [B1] [B2] [B3]
  • C is Control
  • A is Treatment 1
  • B is Treatment 2
-> Each of C, A and B has 3 replicates

When I call data.frame(), it generates an error:

Code:
library( "DESeq2" )
library("Biobase")
mydata = read.table("matrix.txt", header=TRUE)
col1 <- mydata[,1]

## Error message
ExpDesign = data.frame(row.names=col1, condition=c("C", "C", "C", "A", "A", "A", "B", "B", "B"))
Error in data.frame(row.names = col1, condition = c("C", "C", "C", "A",  : 
  row names supplied are of the wrong length
## The following is what I would do next if I didn't get an error message
Code:
countdata <- assay( mydata )
head( countdata )
coldata <- colData( mydata )
rownames( coldata ) <- coldata$run
colnames( countdata ) <- coldata$run
head( coldata[ , c("C", "C", "C", "A", "A", "A", "B", "B", "B") ] )
Eventually, the goal is to have a heatmap showing each replicate of the control, treatment A and treatment B.

I think the problem comes from the fact that I should subset my data, though I have no clue how to do that. Any suggestions on where the error message is coming from and how to subset the data? (If I should subset it at all.)

Last edited by KYR; 07-15-2014 at 10:57 PM. Reason: typo
Old 07-16-2014, 06:35 AM   #2
Michael Love
Senior Member
 
Location: Boston

Join Date: Jul 2013
Posts: 333

This is your countData, which has as many rows as genes:

Code:
mydata = read.table("matrix.txt", header=TRUE)
col1 <- mydata[,1]
It looks like this will be the colData (sample information table).

Code:
ExpDesign = data.frame(row.names=col1, condition=c("C", "C", "C", "A", "A", "A", "B", "B", "B"))
...which has as many rows as samples.

So the error comes when you try to name the rows of your colData using the gene names in col1.

You will also get an error later when you try to run
Code:
assay( mydata )
because mydata is a data.frame. assay() is a function for getting a matrix from SummarizedExperiment objects. You can just use
Code:
as.matrix( mydata )
in order to supply a matrix to DESeqDataSet.
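
To make the row/column bookkeeping above concrete, here is a minimal base-R sketch. It assumes a table laid out like the one in the question (a gene symbol column followed by C1..C3, A1..A3, B1..B3). The gene names and counts are invented stand-ins for matrix.txt, and `counts` and `coldata` are only illustrative variable names.

```r
# Toy stand-in for read.table("matrix.txt", header=TRUE); all values are invented.
set.seed(1)
sample_ids <- c("C1", "C2", "C3", "A1", "A2", "A3", "B1", "B2", "B3")
mydata <- data.frame(
  Gene = c("geneA", "geneB", "geneC", "geneD"),
  matrix(rpois(4 * 9, lambda = 50), nrow = 4,
         dimnames = list(NULL, sample_ids)),
  stringsAsFactors = FALSE
)

# countData: genes in rows, samples in columns.
# The gene symbols move to the row names so the matrix stays numeric.
counts <- as.matrix(mydata[, -1])
rownames(counts) <- mydata$Gene

# colData: one row per sample, named after the columns of the count matrix
# (not after the genes, which is what triggered the original error).
coldata <- data.frame(
  row.names = colnames(counts),
  condition = factor(rep(c("C", "A", "B"), each = 3), levels = c("C", "A", "B"))
)

# The two tables must agree: rows of colData correspond to columns of countData.
stopifnot(identical(rownames(coldata), colnames(counts)))
```

With DESeq2 loaded, objects shaped like these could then be passed to DESeqDataSetFromMatrix(countData = counts, colData = coldata, design = ~ condition).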
Old 03-19-2019, 08:51 AM   #3
rookie_genomics
Junior Member
 
Location: USA

Join Date: Mar 2019
Posts: 4

Hi,

I followed your advice and tried to import the data as a matrix, but when I try to set up the colData I still get an error.

This is my code:

Quote:
deseq2_analysis2 <- read_excel("deseq2_analysis2.xlsx")
> View(deseq2_analysis2)
> analysis3 <- as.matrix(deseq2_analysis2)
> (condition <- factor(c(rep("group1", 4), rep("group2", 4), rep("group3", 4), rep("group4", 4))))
[1] group1 group1 group1 group1 group2 group2 group2 group2 group3 group3 group3 group3 group4
[14] group4 group4 group4
Levels: group1 group2 group3 group4
> (coldata <- data.frame(row.names=colnames(analysis3), condition))
Error in data.frame(row.names = colnames(analysis3), condition) :
row names supplied are of the wrong length
This is the output of head():

Quote:
head(deseq2_analysis2)
# A tibble: 6 x 17
gene Sample1_group1 Sample2_group1 Sample3_group1 Sample4_group1 Sample1_group2 Sample2_group2
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 YAL0~ 0 0 0 0 2 0
2 YAL0~ 0 0 0 0 0 0
3 YAL0~ 243 242 109 130 271 233
4 YAL0~ 16 7 52 30 23 10
5 YAL0~ 23 21 21 33 11 28
6 YAL0~ 38 42 76 88 47 40
# ... with 10 more variables: Sample3_group2 <dbl>, Sample4_group2 <dbl>, Sample1_group3 <dbl>,
# Sample2_group3 <dbl>, Sample3_group3 <dbl>, Sample4_group3 <dbl>, Sample1_group4 <dbl>,
# Sample2_group4 <dbl>, Sample3_group4 <dbl>, Sample4_group4 <dbl>
What am I doing wrong here?
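
A possible explanation, offered as a sketch rather than a definitive diagnosis: the tibble is 6 x 17 (the `gene` column plus 16 samples), so colnames(analysis3) has 17 entries while `condition` has 16. Converting with the `gene` column still in place would also turn the whole matrix into character. The toy data frame below only mimics the posted layout; its values are invented, and `tab` and `counts` are illustrative names.

```r
# Toy stand-in for the imported spreadsheet: a gene column plus 16 sample columns.
sample_names <- paste0("Sample", rep(1:4, times = 4), "_group", rep(1:4, each = 4))
tab <- data.frame(gene = c("g1", "g2", "g3"), stringsAsFactors = FALSE)
for (s in sample_names) tab[[s]] <- c(0, 243, 16)

condition <- factor(rep(paste0("group", 1:4), each = 4))

# 17 column names vs. 16 conditions: the length mismatch behind the error.
length(colnames(tab))   # 17
length(condition)       # 16

# Move the gene identifiers to the row names and drop that column
# before converting, so the matrix is numeric and has 16 columns.
counts <- as.matrix(tab[, -1])
rownames(counts) <- tab$gene

# Now the sample names and the condition vector have the same length.
coldata <- data.frame(row.names = colnames(counts), condition)
```

The same two steps (drop the first column, then set the row names) should also apply to the tibble returned by read_excel().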

Tags
deseq2

All times are GMT -8.