  • DESeq problems

    I'm running DESeq for the first time and am having a few problems. I'm working through the documentation dated August 11, 2011.

    I had no problems setting up a CountDataSet object with my counts table and can access the counts and estimate size factors for samples fine.

    However when I tried to call the normalised counts, using the "normalized = TRUE" argument for the counts accessor, it says that there's no argument with that name. When I look at the man page for "counts", it also appears to have no arguments other than the name of the object it is counting.

    I skipped that step and went on to the variance estimation stage, but when I try to run the estimateDispersions function I get a "function not found" message.

    Is there something else I need to have installed?

    Cheers!

  • #2
    I think DESeq is a bit out of sync with the documentation/tutorial at the moment. I ran into the same problems as you, but after looking at some help pages and/or the vignette (can't remember which) I ended up using:

    cds <- newCountDataSet(data, conds)
    cds <- estimateSizeFactors(cds)
    cds <- estimateVarianceFunctions(cds)

    res <- nbinomTest(cds, "case", "control")

    ... which seems to work.
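Because the function names differ between DESeq releases, one defensive option is to check at run time which dispersion function the installed version provides. This is only a sketch: `first_available` is a hypothetical helper, not part of DESeq, and it relies on nothing beyond base R's `exists()`.

```r
# Hypothetical helper (not part of DESeq): return the name of the first
# candidate that exists as a function in the current session, or NA.
first_available <- function(candidates) {
  is_fn <- vapply(candidates,
                  function(nm) exists(nm, mode = "function"),
                  logical(1))
  found <- candidates[is_fn]
  if (length(found) == 0) NA_character_ else found[[1]]
}

# With DESeq attached, newer releases should provide estimateDispersions
# and older ones estimateVarianceFunctions.
fn_name <- first_available(c("estimateDispersions", "estimateVarianceFunctions"))
if (!is.na(fn_name)) {
  message("Using ", fn_name)
  # cds <- match.fun(fn_name)(cds)   # cds as created in the commands above
} else {
  message("Neither function found; is DESeq attached?")
}
```

The call through `match.fun()` is left commented out so the snippet also runs in a session where DESeq is not installed.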

    • #3
      Thanks, Kopi.

      I'm using it on a remote server, so I am waiting for one of our sysadmins to install the latest versions of R and DESeq. Much googling revealed that using the development version of R and the latest version of DESeq should cure my problems.
      Last edited by janec; 10-04-2011, 02:32 AM.

      • #4
        Version numbers matter

        Originally posted by kopi-o:
        I think DESeq is a bit out of sync with the documentation/tutorial at the moment. I ran into the same problems as you, but after looking at some help pages and/or the vignette (can't remember which) I ended up using:
        Hi Kopi,

        for DESeq, as for all Bioconductor packages, the software and the documentation are delivered together within the same package, and some attention is being paid that they are in sync. So, just use the documentation that comes with the version of DESeq that you are using.

        Of course, if you download the software from one place, and look at documentation that Google finds in another place, these may be out of sync, as should be expected for a package that is actively being maintained.

        Best wishes
        Wolfgang
        Wolfgang Huber
        EMBL

        • #5
          Hi Wolfgang,

          It's probably my mistake then, I may have 'de-synced' the documentation with the actual version I am using myself :-) Anyway, my post above may still give a pointer to a set of commands that works for some versions.

          • #6
            I had similar problems and worked around them by using the old version of the manual (2010-01-19) with R 2.11.1. I have given up on "normalized" and "estimateDispersions" for now. Maybe somebody else can get correct results with both (I cannot); it would be great if they could share the details of their experience with "normalized" and "estimateDispersions".

            In addition, I am using R 2.13.2 and cannot even install the latest DESeq successfully. Could anybody give me some ideas? Thanks.

            > library("DESeq")
            Loading required package: Biobase

            Welcome to Bioconductor

            Vignettes contain introductory material. To view, type
            'browseVignettes()'. To cite Bioconductor, see
            'citation("Biobase")' and for packages 'citation("pkgname")'.

            Loading required package: locfit
            Loading required package: akima
            locfit 1.5-6 2010-01-20
            Error in library.dynam(lib, package, package.lib) :
            DLL 'genefilter' not found: maybe not installed for this architecture?
            In addition: Warning messages:
            1: '.readRDS' is deprecated.
            Use 'readRDS' instead.
            See help("Deprecated")
            2: '.readRDS' is deprecated.
            Use 'readRDS' instead.
            See help("Deprecated")
            Error: package/namespace load failed for 'DESeq'

            Last edited by byou678; 10-07-2011, 07:45 AM.

            • #7
              I installed DESeq (1.4.1) after running the code below in R 2.13.2 (to update all installed packages that are out of date):

              source("http://bioconductor.org/biocLite.R")
              update.packages(repos=biocinstallRepos(), ask=FALSE, checkBuilt=TRUE)

              And it showed:
              > library("DESeq")
              Warning message:
              '.readRDS' is deprecated.
              Use 'readRDS' instead.
              See help("Deprecated")

              But I still cannot find "normalized" or estimateDispersions; ?estimateDispersions gives:
              > ?estimateDispersions
              No documentation for 'estimateDispersions' in specified packages and libraries:
              you could try '??estimateDispersions'
              Last edited by byou678; 10-07-2011, 10:02 AM.

              • #8
                I believe you need to be running the dev build of R version 2.14 to take advantage of the latest DESeq package. If you install R 2.14 and then install DESeq via the biocLite function I think you'll get the updated version. I haven't done this yet because I don't want to run a dev version of R since I do many other things in R.
                /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
                Salk Institute for Biological Studies, La Jolla, CA, USA */

                • #9
                  In Bioconductor, you can use the command 'browseVignettes()' to see the vignettes (PDF manuals) that come with all the Bioconductor packages you have installed. If you read a vignette found via Google instead, you may be out of sync, as Wolfgang stressed. If you have to do that, note that all Bioconductor vignettes end with a 'sessionInfo' listing the version numbers of the packages used when building the vignette. Be sure to compare this with the output of 'sessionInfo()' in your R session to make sure that the vignette you are reading matches the package versions you have installed.
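As a rough illustration of that comparison, the sketch below checks an installed package version against a version string copied from a vignette's sessionInfo. `matches_vignette` is a hypothetical helper, and the version "1.5.24" is only a placeholder, not a value taken from any particular vignette.

```r
# Hypothetical helper: compare the installed version of a package with the
# version string printed in the sessionInfo at the end of a vignette.
matches_vignette <- function(pkg, vignette_version) {
  installed <- utils::packageVersion(pkg)
  installed == package_version(vignette_version)
}

# Placeholder version string; replace it with the one from the vignette
# you are actually reading.
if (requireNamespace("DESeq", quietly = TRUE)) {
  print(matches_vignette("DESeq", "1.5.24"))
}
```

If the result is FALSE, the vignette shipped with the installed package (via 'browseVignettes()') is the safer reference.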

                  • #10
                    Could anyone tell me which version of Galaxy can run DESeq?

                    As far as I know, the Penn State version doesn't integrate DESeq, and the Ratsch Lab version has not been working recently.

                    Thanks a lot for any response!!

                    • #11
                      Help with DESeq analysis and filtering the DEGs from its output

                      I have just started using DESeq and am trying to compare my DEG results between cuffdiff, DESeq and RankProd. I am confused at one point after the analysis is done. I am comparing 2 tumor conditions with 5 samples in total: 3 samples for peripheries giving tumor (PGT) and 2 for peripheries not giving tumor (PDGT). Following the DESeq documentation, I created a matrix of raw fragment counts for the conditions, since DESeq works only with raw counts, and rounded the values to the nearest integer, since the package accepts only integer values.

                      Then I used the standard DESeq commands to produce my list of DEGs, but the output does not contain only the DEGs; it lists all the genes. Can you tell me where I am going wrong, and whether I should do any pre-filtering or post-filtering to extract the list of DEGs from the output? I am sending the output file and the script as well.

                      Another problem is that padj, the corrected p-value, does not give usable values, so I cannot use it to list the up- and down-regulated DEGs with their log2FC values. The padj values are either 1 or NA. I am also not getting proper values in the baseMean field, which is sometimes 0. In those rows log2FC shows up as #NAME? in Excel, meaning Excel cannot interpret the value: where one of the baseMean values is 0, the FC is also zero and the log2FC cannot be calculated.

                      dat1<- read.table("/Users/vdas/Documents/RNA-Seq_Smaples_Udine_08032013/GBM_29052013/UD_RP_25072013/RP_matrix_RF_PGTvsPDGT.txt",sep="",header=TRUE,stringsAsFactors=FALSE)

                      dat1[,-1]<- lapply(lapply(dat1[,-1],round),as.integer)

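                      # row.names=FALSE keeps the gene column first, so row.names=1 picks it up when the file is read back
                      write.table(dat1,"/Users/vdas/Documents/RNA-Seq_Smaples_Udine_08032013/GBM_29052013/UD_RP_25072013/rev_RF_PGTvsPDGT.txt",sep="\t",row.names=FALSE)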

                      count_table<-read.table("/Users/vdas/Documents/RNA-Seq_Smaples_Udine_08032013/GBM_29052013/UD_RP_25072013/rev_RF_PGTvsPDGT.txt",header=T,sep="\t",row.names=1)

                      expt_design <- data.frame(row.names = colnames(count_table),
                      condition = c("PGT","PGT","PGT","PDGT","PDGT"))

                      expt_design

                      conditions = expt_design$condition

                      conditions

                      data <- newCountDataSet(count_table, conditions)

                      head(counts(data))

                      data <- estimateSizeFactors(data)

                      sizeFactors(data)

                      data <- estimateDispersions(data)

                      results <- nbinomTest(data, "PGT", "PDGT")

                      Is there anything wrong in the analysis script? Please let me know or if I have to introduce some post filtering or not. Please let me know if you want any more infos.
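On the filtering question: nbinomTest returns one row per gene by design, so a DEG list is normally obtained by subsetting the result afterwards. The sketch below shows one way this might look. It uses a small mock data frame with invented values, borrowing only the column names from nbinomTest output (id, baseMean, log2FoldChange, padj); `filter_degs` and the 0.1 cutoff are assumptions for illustration.

```r
# Mock result table using nbinomTest's column names; the values are invented.
res <- data.frame(
  id             = c("geneA", "geneB", "geneC", "geneD"),
  baseMean       = c(250, 0, 40, 1200),
  log2FoldChange = c(2.1, NaN, -0.2, -1.8),
  padj           = c(0.003, NA, 0.85, 0.04),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep genes below an adjusted p-value cutoff,
# drop rows where padj is NA, and sort the rest by padj.
filter_degs <- function(res, padj_cutoff = 0.1) {
  keep <- !is.na(res$padj) & res$padj < padj_cutoff
  sig <- res[keep, , drop = FALSE]
  sig[order(sig$padj), , drop = FALSE]
}

sig  <- filter_degs(res)
up   <- sig[sig$log2FoldChange > 0, ]   # positive log2 fold change
down <- sig[sig$log2FoldChange < 0, ]   # negative log2 fold change
```

Rows with a baseMean of 0 carry no information and could also be dropped before testing, for example with count_table[rowSums(count_table) > 0, ]. Which condition counts as "up" depends on the order of the two conditions passed to nbinomTest, so the sign convention is worth checking in the package documentation.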

                      • #12
                        On your second line, you're rounding and casting to an integer, which suggests that you don't actually have raw counts. Can you post a snippet of RP_matrix_RF_PGTvsPDGT.txt file?

                        BTW, it might be best to start a new thread, since the one you're replying to is quite old.

                        • #13
                          Sorry, I am trying to start a new thread but am unable to; I do not know why. Yes, I am converting the values to integers, because with the raw counts as they are I cannot run the DESeq commands: they are not integer values. So I have to round them to the nearest integer and then carry out the analysis.

                          gene PGT-1 PGT-0 PGT-2 PDGT-0 PDGT-1
                          XLOC_000001 2603 1534 1764 9030 4309
                          XLOC_000002 304 175 208 1095 835
                          XLOC_000003 195 80 109 687 454
                          XLOC_000004 66 49 54 236 90
                          XLOC_000092 365 211 242 1523 624
                          XLOC_000093 0.666667 0.5 1 1.66667 3.33333
                          XLOC_000094 0 0 0 0 0
                          XLOC_000095 6 11.4802 4.56786 8.49762 7.22143
                          XLOC_000096 0 0.25 0 0 0
                          XLOC_000097 195.561 90.88 114.348 262.98 246.68
                          XLOC_000098 0 7.79035 3.89757 0 1.30276
                          XLOC_000099 39 18 23 55 27
                          XLOC_000100 10.9163 5 3 12.8974 8
                          XLOC_000101 533 32 28 854 288
                          XLOC_000102 2756.33 3090.17 2311 4873 1677.25


                          You can see that the raw counts contain decimal values, which do not work with DESeq, so I rounded them to the nearest integer.

                          • #14
                            If the "raw" counts have decimals, then they're not raw counts. How did you generate these counts? The typical workflow would be to use htseq-count.
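A quick way to see how much of a table is affected is to flag the rows that contain non-integer values before any rounding. The sketch below uses four rows copied from the matrix posted above; the variable names are only illustrative.

```r
# A few rows copied from the matrix posted above.
counts <- rbind(
  XLOC_000001 = c(2603, 1534, 1764, 9030, 4309),
  XLOC_000093 = c(0.666667, 0.5, 1, 1.66667, 3.33333),
  XLOC_000094 = c(0, 0, 0, 0, 0),
  XLOC_000099 = c(39, 18, 23, 55, 27)
)
colnames(counts) <- c("PGT-1", "PGT-0", "PGT-2", "PDGT-0", "PDGT-1")

# TRUE for rows in which at least one value is not a whole number.
fractional <- apply(counts, 1, function(x) any(x != round(x)))

# Names of the genes whose "counts" are fractional.
names(which(fractional))
```

Any gene flagged here was assigned fractional fragments upstream, which is a sign that the numbers are estimates rather than raw read counts.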

                            • #15
                              I am using the genes.read_group_tracking output file of cuffdiff, from which you can create a matrix of the raw fragment counts. I create the matrix from this tracking file. Here is a snippet of it:

                              tracking_id condition replicate raw_frags internal_scaled_frags external_scaled_frags FPKM effective_length status
                              XLOC_000001 PGT 1 2603 1669.52 1669.52 71.2509 - OK
                              XLOC_000001 PGT 0 1534 1601.85 1601.85 68.3629 - OK
                              XLOC_000001 PGT 2 1764 2534.53 2534.53 108.168 - OK
                              XLOC_000001 PDGT 0 9030 9030 9030 195.012 - OK
                              XLOC_000001 PDGT 1 4309 4309 4309 93.0574 - OK
                              XLOC_000002 PGT 1 304 194.98 194.98 8.71171 - OK
                              XLOC_000002 PGT 0 175 182.74 182.74 8.16483 - OK
                              XLOC_000002 PGT 2 208 298.856 298.856 13.3529 - OK
                              XLOC_000002 PDGT 0 1095 1095 1095 24.7572 - OK
                              XLOC_000002 PDGT 1 835 835 835 18.8788 - OK
                              XLOC_000003 PGT 1 195 125.07 125.07 14.6047 - OK
                              XLOC_000003 PGT 0 80 83.5384 83.5384 9.75499 - OK
                              XLOC_000003 PGT 2 109 156.612 156.612 18.288 - OK
                              XLOC_000003 PDGT 0 687 687 687 40.595 - OK
                              XLOC_000003 PDGT 1 454 454 454 26.827 - OK
                              XLOC_000092 PDGT 1 624 624 624 39.1462 - OK
                              XLOC_000093 PGT 1 0.666667 0.427588 0.427588 0.0107606 - OK
                              XLOC_000093 PGT 0 0.5 0.522115 0.522115 0.0131394 - OK
                              XLOC_000093 PGT 2 1 1.43681 1.43681 0.0361585 - OK
                              XLOC_000093 PDGT 0 1.66667 1.66667 1.66667 0.0212244 - OK
                              XLOC_000093 PDGT 1 3.33333 3.33333 3.33333 0.0424487 - OK
                              XLOC_000094 PGT 1 0 0 0 0 - OK
                              XLOC_000094 PGT 0 0 0 0 0 - OK
                              XLOC_000094 PGT 2 0 0 0 0 - OK
                              XLOC_000094 PDGT 0 0 0 0 0 - OK
                              XLOC_000094 PDGT 1 0 0 0 0 - OK
                              XLOC_000095 PGT 1 6 3.84829 3.84829 0.0342992 - OK
                              XLOC_000095 PGT 0 11.4802 11.9879 11.9879 0.106847 - OK
                              XLOC_000095 PGT 2 4.56786 6.56314 6.56314 0.0584964 - OK
                              XLOC_000095 PDGT 0 8.49762 8.49762 8.49762 0.0383257 - OK
                              XLOC_000095 PDGT 1 7.22143 7.22143 7.22143 0.0325698 - OK


                              The 4th column is the raw fragment count, and as you can see, some of the values are decimals.
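For readers who want to reproduce the matrix, the following is one possible way to pivot the long tracking table into a genes-by-samples matrix with base R. It is a sketch under assumptions, not necessarily the procedure used here: the data frame is typed in from the snippet above (two genes, first four columns), and the sample naming as condition-replicate is inferred from the matrix header posted earlier.

```r
# Rows copied from the tracking-file snippet above (first four columns only).
tracking <- data.frame(
  tracking_id = rep(c("XLOC_000001", "XLOC_000093"), each = 5),
  condition   = rep(c("PGT", "PGT", "PGT", "PDGT", "PDGT"), times = 2),
  replicate   = rep(c(1, 0, 2, 0, 1), times = 2),
  raw_frags   = c(2603, 1534, 1764, 9030, 4309,
                  0.666667, 0.5, 1, 1.66667, 3.33333),
  stringsAsFactors = FALSE
)

# One column per sample, named condition-replicate.
tracking$sample <- paste(tracking$condition, tracking$replicate, sep = "-")

# tapply fills a genes x samples matrix; each cell holds exactly one value
# here, so sum() simply returns it.
frag_matrix <- tapply(tracking$raw_frags,
                      list(tracking$tracking_id, tracking$sample),
                      sum)
frag_matrix
```

Note that the pivot only rearranges the raw_frags column; it does not make fractional values into raw counts.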
