  • Cufflinks timing out - computing power required?

    I am analysing human transcriptome data (Illumina) via the Tophat -> Cufflinks pipeline (v2.0.2) using iGenomes references. My dataset comprises 14 patients and 6 controls, so I have 2 "conditions" to analyse with 14 and 6 biological replicates respectively.

    Until now I have been bypassing the full cufflinks protocol and just running cuffdiff providing a GTF, as follows:

    PHP Code:
    cuffdiff -p 8 -o ./cuffdiff_out -b genome.fa genes.gtf P1.bam,P2.bam,P3.bam,P4.bam,P5.bam,P6.bam,P7.bam,P8.bam,P9.bam,P10.bam,P11.bam,P12.bam,P13.bam,P14.bam C1.bam,C2.bam,C3.bam,C4.bam,C5.bam,C6.bam
    This operation runs across 8 cores of our server (4GB per core) in 11-12h.
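
The replicate lists in the command above are long enough that a typo is easy to make. Below is a minimal sketch of generating them, assuming the files really are named P1.bam to P14.bam and C1.bam to C6.bam as above. replicate_list is a hypothetical helper, and the script only prints the cuffdiff command rather than running it.

```shell
#!/bin/sh
# Sketch: build the comma-separated replicate lists that cuffdiff takes,
# one list per condition. File naming (P<n>.bam, C<n>.bam) is assumed.

# replicate_list PREFIX COUNT : print PREFIX1.bam,PREFIX2.bam,...,PREFIX<COUNT>.bam
replicate_list() {
  prefix=$1
  count=$2
  list=""
  i=1
  while [ "$i" -le "$count" ]; do
    # Add a comma only when the list already holds an entry.
    list="${list:+${list},}${prefix}${i}.bam"
    i=$((i + 1))
  done
  printf '%s\n' "$list"
}

patient_list=$(replicate_list P 14)
control_list=$(replicate_list C 6)

# Print the command so it can be checked before it is submitted.
cuffdiff_cmd="cuffdiff -p 8 -o ./cuffdiff_out -b genome.fa genes.gtf ${patient_list} ${control_list}"
printf '%s\n' "$cuffdiff_cmd"
```

Swapping genes.gtf for merged.gtf in the printed line gives the command for the merged assembly.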

    However, I have been trying to run the full cufflinks -> cuffmerge -> cuffdiff protocol (as per the Nature Protocols publication) but have not yet been able to complete the entire process. My IT support team have been very helpful, but the final cuffdiff job requires HUGE amounts of computing power and time, and I wonder what other people's experience of this is, or whether I am doing something wrong.

    I have successfully run these operations:-

    Cufflinks for each BAM file:
    PHP Code:
    cufflinks -p 8 -o ./output_dir -b genome.fa -g genes.gtf P1.bam
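
Because the same command is repeated for every BAM file, each run needs its own output directory, otherwise a later run would overwrite the results of an earlier one. Below is a minimal sketch of generating the per-sample commands. The ./cufflinks_out/<sample> layout and the cufflinks_cmd helper are assumptions for illustration, not part of the original protocol, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: print one cufflinks command per BAM file, each with its own
# output directory. The ./cufflinks_out/<sample> layout is hypothetical.

# cufflinks_cmd BAM : print the cufflinks command for one BAM file
cufflinks_cmd() {
  bam=$1
  # Sample name = file name without directory and without the .bam suffix.
  sample=$(basename "$bam" .bam)
  printf 'cufflinks -p 8 -o ./cufflinks_out/%s -b genome.fa -g genes.gtf %s\n' "$sample" "$bam"
}

# Example with three of the twenty samples; extend the list as needed.
for bam in P1.bam P2.bam C1.bam; do
  cufflinks_cmd "$bam"
done
```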
    Then create the assemblies.txt file:
    PHP Code:
    ./path/to/P1.bam
    ./path/to/P2.bam
    ...
    etc 
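
The list file can also be generated rather than typed by hand. Below is a minimal sketch under two assumptions: that each Cufflinks run wrote to its own directory (the ./cufflinks_out/<sample> layout is hypothetical), and that the list should name the transcripts.gtf file from each run, which is what the Cuffmerge manual describes as its input. write_assembly_list is an illustrative helper.

```shell
#!/bin/sh
# Sketch: write a Cuffmerge assembly list, one transcripts.gtf path per line.
# The ./cufflinks_out/<sample> directory layout is an assumption.

# write_assembly_list OUTFILE SAMPLE... : write one path per sample to OUTFILE
write_assembly_list() {
  outfile=$1
  shift
  : > "$outfile"   # start from an empty file
  for sample in "$@"; do
    printf './cufflinks_out/%s/transcripts.gtf\n' "$sample" >> "$outfile"
  done
}

# Example with three of the twenty samples.
write_assembly_list assemblies.txt P1 P2 C1
cat assemblies.txt
```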
    Cuffmerge (this took 1h):
    PHP Code:
    cuffmerge -p 8 -o ./cuffmerge_out -g genes.gtf -s genome.fa assemblies.txt
    Cuffdiff:
    PHP Code:
    cuffdiff -p 8 -o ./cuffdiff_out -b genome.fa -u merged.gtf P1.bam,P2.bam,P3.bam,P4.bam,P5.bam,P6.bam,P7.bam,P8.bam,P9.bam,P10.bam,P11.bam,P12.bam,P13.bam,P14.bam C1.bam,C2.bam,C3.bam,C4.bam,C5.bam,C6.bam
    The last time I tried to run the cuffdiff step I was allocated 160GB across 8 cores for 5 days. The job timed out at the "Testing for differential expression and regulation in locus" step. It also only ever used ~30GB of the 160GB allocated.

    Can anyone offer any advice / suggestions / or even let me know how much computing power / time they use for their runs?

    Much appreciated
    Helen

  • #2
    Is this an issue just with the newest version of cufflinks (v2.0.2), or did it also occur with older versions?

    • #3
      Hi hlwright,

      I am having the same problem. Could you please tell me how you solved it?

      Thanks!
