  • Error Running DESeq2

    Hi

    I am trying to run DESeq2 following the reference manual on the Bioconductor website. However, I run into an error at this step:

    ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ condition)

    I created my count files using HTSeq.

    Error message:

    Error in Ops.factor(a$V1, l[[1]]$V1) : level sets of factors are different
    In addition: Warning message:
    In is.na(e1) | is.na(e2) : longer object length is not a multiple of shorter object length

    Any help is appreciated, as I am new to both R/Bioconductor and DESeq2.

  • #2
    That's a new one. What are the contents of "sampleTable" and "condition"? Also, which version are you using (just type "sessionInfo()" and post the results)?
    Last edited by dpryan; 04-15-2014, 08:04 AM. Reason: Someday I'll reread things before posting...


    • #3
      Thanks for your reply.
      Here are the contents of sampleTable including the condition. I have 2 conditions with 2 replicates each.

        sampleNames    sampleFiles condition stringAsFactors
      1      231un1  231-un1-DESeq       un1            TRUE
      2      231un2  231-un2-DESeq       un1            TRUE
      3     231trt1 231-trt1-DESeq      trt2            TRUE
      4     231trt2 231-trt2-DESeq      trt2            TRUE

      Here's the version info:

      > sessionInfo()
      R version 3.0.2 (2013-09-25)
      Platform: x86_64-pc-linux-gnu (64-bit)

      locale:
      [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
      [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
      [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
      [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
      [9] LC_ADDRESS=C LC_TELEPHONE=C
      [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

      attached base packages:
      [1] parallel stats graphics grDevices utils datasets methods
      [8] base

      other attached packages:
      [1] DESeq2_1.2.10 RcppArmadillo_0.4.200.0 Rcpp_0.11.1
      [4] GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7
      [7] DESeq_1.14.0 lattice_0.20-24 locfit_1.5-9.1
      [10] Biobase_2.22.0 BiocGenerics_0.8.0

      loaded via a namespace (and not attached):
      [1] annotate_1.40.1 AnnotationDbi_1.24.0 DBI_0.2-7
      [4] genefilter_1.44.0 geneplotter_1.40.0 grid_3.0.2
      [7] RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.0.2
      [10] stats4_3.0.2 survival_2.37-7 XML_3.98-1.1
      [13] xtable_1.7-3


      • #4
        After a bit of poking around the source code, it looks like this might happen if there's something wrong with the count files, namely if they're different lengths. From the command line, you might run:

        Code:
        cut -f 1 231-un1-DESeq | sort | uniq -c
        Run the same command on each of your other count files and compare the results. uniq -c prefixes each distinct gene/feature name with the number of times it occurs, so the output should be identical for every file...but may not be for you.
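
        As a rough way to automate that comparison, here is a minimal shell sketch. It assumes tab-separated count files and compares the first column of each file, in order, against the first file you pass in. The function names compare_feature_ids and ids_checksum are made up for illustration, and the order-sensitive comparison is an assumption based on the a$V1 / l[[1]]$V1 comparison visible in the error message.

```shell
# Checksum of the feature-ID column (column 1) of one count file.
ids_checksum() {
    cut -f 1 "$1" | cksum
}

# Compare the feature-ID column of every file against the first file given.
# Prints one line per file that differs; returns 1 if any file differs.
compare_feature_ids() {
    ref="$1"
    shift
    ref_sum=$(ids_checksum "$ref")
    status=0
    for f in "$@"; do
        if [ "$(ids_checksum "$f")" != "$ref_sum" ]; then
            echo "feature IDs differ from $ref: $f"
            status=1
        fi
    done
    return "$status"
}
```

        For example, compare_feature_ids 231-un1-DESeq 231-un2-DESeq 231-trt1-DESeq 231-trt2-DESeq prints nothing and returns 0 if all four files list the same features in the same order.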

        BTW, you can get rid of the last column of sampleTable ("stringAsFactors").


        • #5
          Thanks a lot for the help. It was indeed a problem with my count files. I didn't realize I had to redirect the output of HTSeq into a separate file; I was using the file generated with the -o option as input.
          I reran the script and was able to generate the correct file (I also filtered out the last few lines, which start with __). The rest of the code seems to be working well now.
          I also got rid of the last column in sampleTable. It was just one of the many things I tried while attempting to solve my issue.
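
          For anyone hitting the same thing, the filtering step can be sketched like this. It is only an illustration: strip_htseq_summary is a made-up helper name, and it assumes the only lines to drop are the htseq-count summary rows whose names start with "__" (for example __no_feature and __ambiguous).

```shell
# Write a copy of an htseq-count table without the summary rows,
# i.e. without any line that begins with "__".
# $1: input count file, $2: output file
strip_htseq_summary() {
    grep -v '^__' "$1" > "$2"
}
```

          Usage would be something like strip_htseq_summary raw_counts.txt filtered_counts.txt, where both file names are placeholders.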


          • #6
            Hello, I have more or less the same problem:

            > ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = des)
            Error in Ops.factor(a$V1, l[[1]]$V1) :
            level sets of factors are different

            The error is always about the factors, but I don't understand why.

            sampleCondition is a factor, and sampleTable = data.frame(sampleName = sampleFiles, fileName = sampleFiles, condition = sampleCondition)

            des <- formula(~ condition)

            I do not know if you can help with this error.

            Thank you very much


            • #7
              hi essepf,

              Did you check the length of the count files, as Devon recommended above?

              What does sampleCondition look like?

              What's your sessionInfo()?


              • #8
                DESeq2 --- htseq-count

                Hi Michael,

                Thank you for your suggestion. My answers follow each of your questions.

                Originally posted by Michael Love:
                "Did you check the length of the count files, as Devon recommended above?"

                Yes, all files have the same length: 38932.

                "What does sampleCondition look like?"

                > sampleCondition
                [1] pr pr pr wt wt wt
                Levels: pr wt

                "What's your sessionInfo()"

                > sessionInfo()
                R version 3.1.0 (2014-04-10)
                Platform: x86_64-apple-darwin13.1.0 (64-bit)

                locale:
                [1] C

                attached base packages:
                [1] parallel stats graphics grDevices utils datasets methods base

                other attached packages:
                [1] BiocInstaller_1.14.2 gplots_2.14.1 ggplot2_1.0.0 DESeq_1.16.0 lattice_0.20-29 locfit_1.5-9.1 Biobase_2.24.0
                [8] DESeq2_1.4.5 RcppArmadillo_0.4.320.0 Rcpp_0.11.2 GenomicRanges_1.16.4 GenomeInfoDb_1.0.2 IRanges_1.22.10 BiocGenerics_0.10.0

                loaded via a namespace (and not attached):
                [1] AnnotationDbi_1.26.0 DBI_0.2-7 KernSmooth_2.23-12 MASS_7.3-33 RColorBrewer_1.0-5 RSQLite_0.11.4 XML_3.98-1.1 XVector_0.4.0
                [9] annotate_1.42.1 bitops_1.0-6 caTools_1.17 colorspace_1.2-4 digest_0.6.4 gdata_2.13.3 genefilter_1.46.1 geneplotter_1.42.0
                [17] grid_3.1.0 gtable_0.1.2 gtools_3.4.1 munsell_0.4.2 plyr_1.8.1 proto_0.3-10 reshape2_1.4 scales_0.2.4
                [25] splines_3.1.0 stats4_3.1.0 stringr_0.6.2 survival_2.37-7 tools_3.1.0 xtable_1.7-3


                • #9
                  What happens if you do this instead: des <- ~sampleCondition


                  • #10
                    Hi

                    this is my script:

                    library("DESeq2")
                    sampleFiles <- list.files(path="/Users/me/Desktop/RNASeq/htseq-count_Results_6Samples/htseq_Adp/")
                    sampleCondition=factor(c(rep("pr",3), rep("wt",3)))
                    sampleTable=data.frame(sampleName=sampleFiles, fileName=sampleFiles,condition=sampleCondition)
                    directory <- c("/Users/me/Desktop/RNASeq/htseq-count_Results_6Samples/htseq_Adp/")
                    des <- formula(~ condition)
                    ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = des)

                    If I do what you suggest, I get exactly the same error:

                    > ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = des)
                    Error in Ops.factor(a$V1, l[[1]]$V1) :
                    level sets of factors are different

                    Thank you


                    • #11
                      Did you check the count files generated by HTSeq? I had an issue with the count file itself, which is why I was getting the error. The count files need to be filtered. See my previous reply in the thread above.


                      • #12
                        Hi pm2012

                        Thanks for your reply.

                        The output files I have are in this format:

                        gene reads_WR1
                        610005C13Rik 2473
                        0610007N19Rik 15
                        0610007P14Rik 1291
                        0610008F07Rik 149
                        0610009B14Rik 0
                        0610009B22Rik 361
                        0610009D07Rik 272
                        0610009E02Rik 4
                        0610009L18Rik 8

                        When you say "filtered", what are you referring to?

                        The command I used to generate the counts:

                        samtools view file.bam | htseq-count -s no -i gene_name - mus_musculus.gff > WT_results_counts.txt

                        Thank you for your help


                        • #13
                          You might just post those files somewhere so we can reproduce and track down the cause of this problem.
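
                          In the meantime, a quick layout check might narrow things down. As far as I know, htseq-count itself writes two tab-separated columns (feature ID, then an integer count) with no header line, so anything else in a file is worth a look. The sketch below flags such lines; check_count_format is a hypothetical helper name, and it only checks the layout, not whether the IDs match between files.

```shell
# Print every line that is not exactly "<feature ID><TAB><non-negative integer>".
# Returns 0 when the file is clean and 1 when at least one line was flagged.
# $1: count file to check
check_count_format() {
    awk -F '\t' '
        NF != 2 || $2 !~ /^[0-9]+$/ {
            printf "%s:%d: %s\n", FILENAME, NR, $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

                          Run it on each file, e.g. check_count_format WT_results_counts.txt; it prints file:line: content for every flagged line.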


                          • #14
                            Hi dpryan

                            I can send you two htseq-count output files by email, if that works.
                            Could you give me your email address?


                            • #15
                              Sure, at least as long as those 2 files are sufficient to cause the problem. You can email me at [email protected].
