SEQanswers


ndeshpan 07-20-2011 11:45 PM

DESeq input table upload
 
Hi,

I am new to DESeq (R version 2.13.1) and am trying to load a tab-delimited file (I am using the sample file "TagSeqExample.tab"), but I am getting an error:

==================================================
> library( DESeq )
Loading required package: Biobase

Welcome to Bioconductor

Vignettes contain introductory material. To view, type
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")' and for packages 'citation("pkgname")'.

Loading required package: locfit
Loading required package: akima
Loading required package: lattice
locfit 1.5-6 2010-01-20
Warning messages:
1: '.readRDS' is deprecated.
Use 'readRDS' instead.
See help("Deprecated")
2: '.readRDS' is deprecated.
Use 'readRDS' instead.
See help("Deprecated")
3: '.readRDS' is deprecated.
Use 'readRDS' instead.
See help("Deprecated")
> over <- read.delim("C:\\Nandan\\TagSeqExample_top.tab", header=TRUE, stringsAsFactors=TRUE)
> conds=c(rep("GZ", 1), rep("DZ", 1))
> head(over)
gene T1a T1b T2 T3 N1 N2
1 Gene_00001 0 0 2 0 0 1
2 Gene_00002 20 8 12 5 19 26
3 Gene_00003 3 0 2 0 0 0
4 Gene_00004 75 84 241 149 271 257
5 Gene_00005 10 16 4 0 4 10
6 Gene_00006 129 126 451 223 243 149
> conds=c(rep("GZ", 1), rep("DZ", 1))
> cds <- newCountDataSet( over, conds )
Error in round(countData) : Non-numeric argument to mathematical function
==================================================

I tried my own dataset (a tab-delimited file) with the same result. I distinctly remember my tab-separated file loading fine with the previous version of DESeq; however, I need this version in order to use a few new methods.

I have confirmed that the count columns contain only numeric characters.

Can anyone point me in the right direction?

cheers,

Nandan

labunit 07-21-2011 12:13 AM

If I remember correctly you should first do the following:

over <- read.delim("C:\\Nandan\\TagSeqExample_top.tab", header=TRUE, stringsAsFactors=TRUE)
conds=c(rep("GZ", 1), rep("DZ", 1))

rownames(over) <- over$gene
over <- over[,-1]

This problem arises because non-numeric values are being passed on (the first column, which holds the gene names, counts as well). Once the gene names are moved to the row names and that column is dropped, only numeric values remain in your count table.
Also check whether there are any empty entries in your table.
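
The checks described above can be sketched in base R. This is a minimal, untested-against-DESeq illustration: the inline text `tab_text` is a stand-in for the first rows of the poster's file, and `counts`, `all_numeric`, and `any_missing` are names chosen here for illustration. It uses the `row.names = 1` argument of `read.delim` as a one-step alternative to moving the gene column by hand.

```r
# Stand-in for the tab-delimited file; with a real file, pass its path
# to read.delim() instead of using the text= argument.
tab_text <- paste(
  "gene\tT1a\tT1b\tT2\tT3\tN1\tN2",
  "Gene_00001\t0\t0\t2\t0\t0\t1",
  "Gene_00002\t20\t8\t12\t5\t19\t26",
  "Gene_00003\t3\t0\t2\t0\t0\t0",
  sep = "\n"
)

# row.names = 1 turns the first column (gene IDs) into row names,
# so the data frame itself holds only the count columns.
counts <- read.delim(text = tab_text, header = TRUE, row.names = 1)

# Every remaining column should be numeric, with no empty entries.
all_numeric <- all(sapply(counts, is.numeric))
any_missing <- any(is.na(counts))

print(all_numeric)
print(any_missing)
```

If `all_numeric` is FALSE, some column was probably read as text (for example because of a stray non-numeric entry), which would be one way to trigger the same error.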

dariober 07-21-2011 03:09 AM

Hi,
I think the problem is in the count table (over). In your case the first column contains the gene names rather than the first sample's counts. You should be able to fix it by removing the first column and assigning the gene names to the row names (untested):
Code:

over <- read.delim("C:\\Nandan\\TagSeqExample_top.tab", header=TRUE, stringsAsFactors=TRUE)
gene_names<- over[,1]
over<- over[,2:ncol(over)]
rownames(over)<- gene_names
## ...etc

Also, I don't remember the functions of DESeq exactly, but I think the vector of conditions should be the same length as the number of samples, so your conds=c(rep("GZ", 1), rep("DZ", 1)) is incorrect: it has only two entries for six sample columns.
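
As a rough sketch of the length requirement, the condition vector can be built so that it has one entry per sample column. The labels here are an assumption: they are simply the first letter of each column name in the posted table ("T" or "N"), not the poster's GZ/DZ grouping, which the thread does not map to specific samples.

```r
# Sample (column) names as shown in the head(over) output.
sample_names <- c("T1a", "T1b", "T2", "T3", "N1", "N2")

# Hypothetical labelling: take the first letter of each sample name,
# giving one condition label per count column.
conds <- factor(substr(sample_names, 1, 1))

# The condition vector must line up with the columns of the count table.
stopifnot(length(conds) == length(sample_names))

print(conds)

# With DESeq loaded and the gene column moved to row names, the call
# from the original post would then be:
# cds <- newCountDataSet( over, conds )
```

For the GZ/DZ design the same idea applies: write out one label per column, in column order, rather than one label per group.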

Dario

