  • DEXSeq memory requirements

    Dear all,
    I am analyzing an RNA-seq experiment with DEXSeq. There are 73 samples (from 23 different conditions).
    When I try to estimate the dispersions for the whole exonCountSet (all conditions together), the job runs out of memory and terminates. I increased the maximum allowed memory for this job to 128G, but that still seems to be too little.
    Is this function supposed to use this much memory?
    Does anyone have some experience to share about the analysis of large datasets with DEXSeq?

    Code:
    ## The design and sample annotation is in data frame called "samples"
    > library("DEXSeq")
    > library(parallel)
    > allExons <- read.HTSeqCounts(countfiles = file.path("prepared_counts", rownames(samples), "counts_DEXSeq.txt"),
                              design = samples,
                              flattenedfile = annotationfile)
    > sampleNames(allExons) <- rownames(samples)
    > allExons <- estimateSizeFactors(allExons)
    > allExons <- estimateDispersions(allExons, nCores=8, minCount = 100, file = "DEXSeq_output.out")
    
    > sessionInfo()
    R version 2.15.0 (2012-03-30)
    Platform: x86_64-redhat-linux-gnu (64-bit)
    
    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
     [7] LC_PAPER=C                 LC_NAME=C                 
     [9] LC_ADDRESS=C               LC_TELEPHONE=C            
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
    
    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods  
    [8] base     
    
    other attached packages:
    [1] DEXSeq_1.4.0       Biobase_2.16.0     BiocGenerics_0.4.0
    
    loaded via a namespace (and not attached):
    [1] biomaRt_2.12.0 hwriter_1.3    plyr_1.7.1     RCurl_1.91-1   statmod_1.4.16
    [6] stringr_0.6    XML_3.9-4

  • #2
    Hi Julien Roux,

    I think this post will also be helpful in your case:



    Best wishes,
    Alejandro

    • #3
      Thanks Alejandro for your help!
      Indeed, the "TRT" functions ran much faster and used a reasonable amount of memory.
      When you say:
      > And you can see that you get the same results:
      > plot(fData(pasillaExons)$pvalue, fData(pasillaExonsTRT)$pvalue, log="xy")
      ... I actually find that the p-values are well correlated but not identical (often larger p-values are seen for the TRT implementation). Do you have any idea why this is happening?
      Thanks
      Julien

      • #4
        The TRT method is a different way of testing for differential exon usage than DEXSeq's default method, so the p values are only expected to be similar, not identical.
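
One way to put a number on "similar, not identical" is to compare the two p-value vectors directly. The following is only a minimal base-R sketch: the vectors `p_default` and `p_trt` are simulated stand-ins for `fData(pasillaExons)$pvalue` and `fData(pasillaExonsTRT)$pvalue`, and the helper `compare_pvalues` is a hypothetical name, not part of DEXSeq.

```r
# Simulated stand-ins for the two p-value vectors (assumption: in a real
# analysis these would come from fData(...)$pvalue of the two objects).
set.seed(1)
p_default <- runif(1000)
# Perturb on the log scale to mimic "correlated but not identical" p-values.
p_trt <- pmin(1, p_default * exp(rnorm(1000, mean = 0.2, sd = 0.5)))

# Hypothetical helper: summarise agreement between two p-value vectors.
compare_pvalues <- function(p1, p2) {
  ok <- !is.na(p1) & !is.na(p2)   # skip entries that are NA in either vector
  list(
    n = sum(ok),                                          # pairs compared
    spearman = cor(p1[ok], p2[ok], method = "spearman"),  # rank agreement
    frac_second_larger = mean(p2[ok] > p1[ok])            # share with p2 > p1
  )
}

res <- compare_pvalues(p_default, p_trt)
print(res)
```

A rank correlation close to 1 together with a `frac_second_larger` well above 0.5 would correspond to the pattern described in #3: the two methods order the exons similarly, but the second one tends to report larger p-values.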
