  • DESeq input table upload

    Hi,

    I am new to DESeq (R version 2.13.1) and am trying to load a tab-delimited file (I am using the sample file "TagSeqExample.tab"), but I am getting an error:

    ==================================================
    > library( DESeq )
    Loading required package: Biobase

    Welcome to Bioconductor

    Vignettes contain introductory material. To view, type
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")' and for packages 'citation("pkgname")'.

    Loading required package: locfit
    Loading required package: akima
    Loading required package: lattice
    locfit 1.5-6 2010-01-20
    Warning messages:
    1: '.readRDS' is deprecated.
    Use 'readRDS' instead.
    See help("Deprecated")
    2: '.readRDS' is deprecated.
    Use 'readRDS' instead.
    See help("Deprecated")
    3: '.readRDS' is deprecated.
    Use 'readRDS' instead.
    See help("Deprecated")
    > over <- read.delim("C:\\Nandan\\TagSeqExample_top.tab", header=TRUE, stringsAsFactors=TRUE)
    > conds=c(rep("GZ", 1), rep("DZ", 1))
    > head(over)
    gene T1a T1b T2 T3 N1 N2
    1 Gene_00001 0 0 2 0 0 1
    2 Gene_00002 20 8 12 5 19 26
    3 Gene_00003 3 0 2 0 0 0
    4 Gene_00004 75 84 241 149 271 257
    5 Gene_00005 10 16 4 0 4 10
    6 Gene_00006 129 126 451 223 243 149
    > conds=c(rep("GZ", 1), rep("DZ", 1))
    > cds <- newCountDataSet( over, conds )
    Error in round(countData) : Non-numeric argument to mathematical function
    ==================================================

    I tried my own dataset (a tab-delimited file) with the same result. I distinctly remember my tab-separated file loading fine with the previous version of DESeq (however, I need this version in order to use a few new methods).

    I have confirmed that the columns contain only numeric characters

    Can anyone point me in the right direction?

    cheers,

    Nandan

  • #2
    (Replying to the original post by ndeshpan.)

    If I remember correctly you should first do the following:

    over <- read.delim("C:\\Nandan\\TagSeqExample_top.tab", header=TRUE, stringsAsFactors=TRUE)
    conds=c(rep("GZ", 1), rep("DZ", 1))

    rownames(over) <- over$gene
    over <- over[,-1]

    This problem arises because non-numeric values are being passed on (the first column counts as well). With the two lines above, only numeric values remain in your count table.
    Also check whether there are any empty entries in your table.
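
A quick way to run both checks, as a minimal sketch in base R only. The small data frame is a toy stand-in for the table in the first post (not the real file), and the names non_numeric and na_per_column are made up for illustration. When a file is read with read.delim, blank fields in numeric columns come back as NA, so counting NA values per column is one way to spot empty entries.

```r
# Toy stand-in for the table from the first post (three rows, three samples).
over <- data.frame(
  gene = c("Gene_00001", "Gene_00002", "Gene_00003"),
  T1a  = c(0, 20, 3),
  T1b  = c(0, 8, 0),
  N1   = c(0, 19, 0),
  stringsAsFactors = TRUE
)

# Columns that are not numeric; these are what round() chokes on.
non_numeric <- names(over)[!sapply(over, is.numeric)]
print(non_numeric)

# Number of missing (NA) entries in each column.
na_per_column <- colSums(is.na(over))
print(na_per_column)
```

If non_numeric lists anything other than the gene column, that column probably contains stray text in the file and needs a closer look.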

    • #3
      Hi,
      I think the problem is in the count table (over). In your case the first column contains the gene name instead of containing the first count. You should be able to fix it by removing the first column and by assigning gene names to row names (untested):
      Code:
      over <- read.delim("C:\\Nandan\\TagSeqExample_top.tab", header=TRUE, stringsAsFactors=TRUE)
      gene_names <- over[, 1]
      over <- over[, 2:ncol(over)]
      rownames(over) <- gene_names
      ## ...etc
      Also, I don't remember the DESeq functions exactly, but I think the vector of conditions should be the same length as the number of samples, so your conds=c(rep("GZ", 1), rep("DZ", 1)) is incorrect: it has two entries, while the table has six sample columns.

      Dario
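
Putting the two suggestions in this thread together, here is a hedged sketch rather than a tested recipe. The data frame rebuilds the six rows shown by head(over) in the first post, so the example does not depend on the file. The condition labels "T" and "N" are placeholders chosen for illustration; the real grouping of the six samples is not stated in the thread. The call to newCountDataSet only runs if DESeq happens to be installed.

```r
# The six rows printed by head(over) in the first post.
over <- data.frame(
  gene = sprintf("Gene_%05d", 1:6),
  T1a  = c(0, 20, 3, 75, 10, 129),
  T1b  = c(0, 8, 0, 84, 16, 126),
  T2   = c(2, 12, 2, 241, 4, 451),
  T3   = c(0, 5, 0, 149, 0, 223),
  N1   = c(0, 19, 0, 271, 4, 243),
  N2   = c(1, 26, 0, 257, 10, 149)
)

# Drop the gene column and keep the names as row names instead.
counts <- as.matrix(over[, -1])
rownames(counts) <- as.character(over$gene)

# One condition label per sample column (placeholder labels, see above).
conds <- c("T", "T", "T", "T", "N", "N")

# Sanity checks before handing the table to DESeq.
stopifnot(is.numeric(counts), length(conds) == ncol(counts))

# Only attempted when the DESeq package is available.
if (requireNamespace("DESeq", quietly = TRUE)) {
  cds <- DESeq::newCountDataSet(counts, conds)
}
```

The stopifnot line fails early, with a clearer message than the round(countData) error, if a text column slipped through or conds has the wrong length.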
