  • #16
    Thanks Alejandro, you saved me an email to you! I expect this will save a lot of people some grief!



    • #17
      Error with DEXSeq

      Hi Alejandro and dpryan,

      I did use as.character, and I am getting the following error:
      Error: all(unlist(lapply(design, class)) == "factor") is not TRUE

      Any help or advice is greatly appreciated. Here's what I am doing:

      > Table <- data.frame(
      + row.names = c( "P110", "P124", "P149", "P185", "P189", "P192", "P218", "P227", "P235", "P280", "P308", "P351", "P357", "P367", "P377", "P384", "P426", "P543", "P584", "P590", "P594", "P610" ),
      + countFile = c( "P110.counts", "P124.counts", "P149.counts", "P185.counts", "P189.counts", "P192.counts", "P218.counts", "P227.counts","P235.counts", "P280.counts", "P308.counts", "P351.counts", "P357.counts", "P367.counts", "P377.counts", "P384.counts", "P426.counts", "P543.counts", "P584.counts", "P590.counts", "P594.counts", "P610.counts" ),
      + condition = c( "pre", "pre", "pre", "pre", "pre", "pre", "pre", "pre", "pre", "pre", "pre", "post", "post", "post", "post", "post", "post", "post", "post", "post", "post", "post" ),
      + stringsAsFactors=FALSE)
      > Table
      countFile condition
      P110 P110.counts pre
      P124 P124.counts pre
      P149 P149.counts pre
      P185 P185.counts pre
      P189 P189.counts pre
      P192 P192.counts pre
      P218 P218.counts pre
      P227 P227.counts pre
      P235 P235.counts pre
      P280 P280.counts pre
      P308 P308.counts pre
      P351 P351.counts post
      P357 P357.counts post
      P367 P367.counts post
      P377 P377.counts post
      P384 P384.counts post
      P426 P426.counts post
      P543 P543.counts post
      P584 P584.counts post
      P590 P590.counts post
      P594 P594.counts post
      P610 P610.counts post
      > ecs <- read.HTSeqCounts(
      + as.character( Table$countFile ),
      + Table,
      + "GRCh37_E64_1kg.gff" )
      Error: all(unlist(lapply(design, class)) == "factor") is not TRUE
      > sapply(Table,class)
      countFile condition
      "character" "character"
      >



      • #18
        Try the following instead:
        Code:
        Table <- data.frame(
            row.names = c( "P110", "P124", "P149", "P185", "P189", "P192", "P218", "P227", "P235", "P280", "P308", "P351", "P357", "P367", "P377", "P384", "P426", "P543", "P584", "P590", "P594", "P610" ),
            countFile = c( "P110.counts", "P124.counts", "P149.counts", "P185.counts", "P189.counts", "P192.counts", "P218.counts", "P227.counts","P235.counts", "P280.counts", "P308.counts", "P351.counts", "P357.counts", "P367.counts", "P377.counts", "P384.counts", "P426.counts", "P543.counts", "P584.counts", "P590.counts", "P594.counts", "P610.counts" ),
            condition = factor(c( "pre", "pre", "pre", "pre", "pre", "pre", "pre", "pre", "pre", "pre", "pre", "post", "post", "post", "post", "post", "post", "post", "post", "post", "post", "post" )),
        stringsAsFactors=FALSE)



        • #19
          It works now! I just didn't need to specify "stringsAsFactors"; without it, everything worked fine. Thank you, Alejandro and Ryan, for your great work.
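A minimal base-R sketch of why dropping `stringsAsFactors=FALSE` helped, assuming the condition quoted in the error message is applied to every column of the table passed as the design. The shortened `Table` and the helper name `allFactors` are made up for illustration and are not DEXSeq code. Note that R 4.0.0 and later default to `stringsAsFactors=FALSE`, so on a current R the columns need to be converted explicitly.

```r
# Toy table (hypothetical, shortened from the one posted above).
Table <- data.frame(
    row.names = c( "P110", "P124", "P351", "P357" ),
    countFile = c( "P110.counts", "P124.counts", "P351.counts", "P357.counts" ),
    condition = factor( c( "pre", "pre", "post", "post" ) ),
    stringsAsFactors = FALSE )

# The condition from the error message, wrapped in a helper (the name is mine).
allFactors <- function( design ) {
    all( unlist( lapply( design, class ) ) == "factor" )
}

# countFile is still a character column, so the check fails for the whole table.
checkBefore <- allFactors( Table )

# Convert every column explicitly; this does not depend on the R version's default.
Table[] <- lapply( Table, factor )
checkAfter <- allFactors( Table )

# The file names can still be recovered as plain strings for reading the counts.
files <- as.character( Table$countFile )

print( c( before = checkBefore, after = checkAfter ) )
```

This also suggests why `as.character( Table$countFile )` is still needed in the call once the column has become a factor.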



          • #20
            Hi guys,
            I followed your instructions and it is working well now. You saved my day! Thank you very much.
            I just have one small issue now. When I ran it on BioLinux with 16 cores, this error showed up:
            >library("parallel")
            >ecs <- estimateDispersions( ecs, nCores=16)
            >ecs <- fitDispersionFunction( ecs )
            Error in fitDispersionFunction(ecs) :
            no CR dispersion estimations found, please first call estimateDispersions function

            I searched some forums for a solution, and it seems to be due to an old R version.
            Could anyone help me clarify that?
            When I ran it on my own Mac laptop with 4 cores, it worked fine, but the function ecs <- testForDEU( ecs, nCores=4) runs really slowly.
            Thank you very much.
            Thanh



            • #21
              Hi @thanhhoang,

              It does not look like a common error. What is the output of your sessionInfo()?
              What is the output of doing:

              all( is.na( fData(ecs)$dispBeforeSharing ) )

              ?

              What does the distribution of exon counts look like:

              hist( log( rowMeans(counts(ecs)) ) )

              ?

              Alejandro



              • #22
                Hello Alejandro and Ryan

                I ran DEXSeq on 22 samples (2 conditions: pre and post, biological replicates) as per my post earlier. I did get 50 warnings after running > ecs <- estimateDispersions ( ecs )

                > Done
                There were 50 or more warnings (use warnings() to see the first 50).
                Here's one of the warnings:

                Warning messages:
                1: In chol.default(XVX + lambda * I, pivot = TRUE) :
                the matrix is either rank-deficient or indefinite
                Error in cat(list(...), file, sep, fill, labels, append) :
                argument 2 (type 'S4') cannot be handled by 'cat'


                Any advice what I may have done wrong?

                thank you
                Last edited by nbahlis; 11-16-2013, 03:44 AM.



                • #23
                  Hi nbahlis,

                  One question, do you have paired samples? If so, your analysis might benefit from adding the pairing information as an additional covariate, and might also help with the error message.

                  The reason for the warning is that DEXSeq assumes a mean-variance relation a bit different from the one in your data (e.g. http://www.ncbi.nlm.nih.gov/pmc/arti...195/figure/F2/). I could not read the x-axis labels, but the left part of your plot seems to be a bit strange. Have you tried to filter more strictly on lower counts (e.g. only allowing exon bins with more than 200 counts)? The minCount parameter of estimateDispersions could do this for you.
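To illustrate the kind of filtering meant here, a rough base-R sketch on a made-up count matrix. It assumes the threshold is compared against each exon bin's count summed over all samples; that is my reading of `minCount`, so please check the help page of `estimateDispersions` for your DEXSeq version. All numbers and bin names below are invented.

```r
# Toy count matrix: rows are exon bins, columns are samples (made-up numbers).
counts <- matrix(
    c(   3,   0,   5,   1,
       150, 220, 180, 260,
        40,  60,  55,  70,
       900, 850, 990, 870 ),
    nrow = 4, byrow = TRUE,
    dimnames = list( c( "E001", "E002", "E003", "E004" ),
                     c( "P110", "P124", "P351", "P357" ) ) )

# Threshold taken from the suggestion above (200 counts per exon bin).
minCount <- 200

# Keep only the bins whose total count over all samples exceeds the threshold.
keep <- rowSums( counts ) > minCount
filtered <- counts[ keep, , drop = FALSE ]

print( rownames( filtered ) )
```

Looking at how many bins survive different thresholds is a cheap way to pick a cutoff before rerunning the dispersion estimation.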

                  Alejandro



                  • #24
                    Thank you, Alejandro.

                    They are paired samples. Can you please advise how to add the pairing as a variable? I will repeat the analysis with stricter filtering of the counts.
                    Thank you for your prompt responses; I truly appreciate it.



                    • #25
                      Hi @nbahlis,

                      No prob!

                      You can find instructions on how to specify this in the vignette section "Additional technical or experimental variables". In short, you have to specify the pairing information in your design matrix when creating your ExonCountSet object and modify your formulas as in the vignette to add the pairing information as a covariate. The vignette's example does this with the sequencing library type; you should substitute your pairing variable for it!
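A rough sketch of what that could look like, with loud caveats: the `patient` column and its labels are placeholders (the thread does not say which pre sample pairs with which post sample), and the formulas follow the pattern of the vignette section as I remember it, with `patient` swapped in for the library-type variable. Please compare them with the vignette shipped with your DEXSeq version before using them.

```r
# Shortened design table; the patient labels are PLACEHOLDERS, not the real pairing.
design <- data.frame(
    row.names = c( "P110", "P124", "P351", "P357" ),
    condition = factor( c( "pre", "pre", "post", "post" ), levels = c( "pre", "post" ) ),
    patient   = factor( c( "A", "B", "A", "B" ) ) )

# Every patient should appear exactly once in each condition.
pairing <- table( design$patient, design$condition )

# Formulas in the style of the vignette, with `patient` as the extra covariate
# (assumed pattern; verify against the vignette for your version).
formulaDispersion <- count ~ sample + ( exon + patient ) * condition
formula0 <- count ~ sample + patient * exon + condition
formula1 <- count ~ sample + patient * exon + condition * I( exon == exonID )

print( pairing )
```

The formulas are only built here, not evaluated; in the vignette they are then handed to the dispersion estimation and testing steps.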

                      Alejandro



                      • #26
                        Hi Alejandro

                        Is there a way to figure out exactly what the bin number represents? In other words, which exon or exon fusion/splice does it correspond to?

                        thank you



                        • #27
                          Hi nbahlis,

                          It is possible to map the exonic bins to a transcript database by overlapping the exonic bins with the coordinates of the genomic transcripts, for example:

                          library(DEXSeq)
                          library(GenomicFeatures)
                          library(GenomicRanges)

                          data("pasillaExons", package="pasilla")
                          exonBins <- GRanges(
                              seqnames=fData(pasillaExons)$chr,
                              ranges=IRanges(
                                  fData(pasillaExons)$start, fData(pasillaExons)$end ),
                              strand=fData(pasillaExons)$strand)
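                          # Replace the path with your own GTF file. Newer GenomicFeatures releases renamed this function to makeTxDbFromGFF.
                          transcriptDb <- makeTranscriptDbFromGFF("/home/alejandro/Work/Graveley/Reanalisis/Annotations/Drosophila_melanogaster.BDGP5.25.62.mychr_tss.gtf", format="gtf")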

                          exonsByTranscript <- exonsBy(transcriptDb, "tx", use.names=TRUE)

                          findOverlaps( unlist(exonsByTranscript), exonBins )
                          This will go back from the exonic bin definitions to the original transcript annotations!



                          • #28
                            Many thanks!
