  • cufflinks stuck at a locus?

    When I run cufflinks v2.0.2 (command lines below), it processes quickly for a few moments and then gets stuck at locus chr1:16765605-16765782. Other threads say to wait it out, but the job has been running for over 51 hours on 8 processors. It makes no difference whether I use the -g option, or the -M option with a .gtf of all known rRNAs and tRNAs (downloaded from rmsk in the UCSC Table Browser, filtered for rRNA and tRNA, with the files then concatenated). The reads were aligned with tophat2 without problems (~90% aligned, or 50,000,000 paired-end reads, from ribosomal depletion). Any suggestions? My only thought is to use the -M option with all of rmsk, but I'm not sure that will help.

    Code:
    cufflinks -p 8 -M rrnatrnacatandsort.gtf -o C31_cufout accepted_hits.bam
    or
    Code:
    cufflinks -p 8 -M rrnatrnacatandsort.gtf -g ucscknowngenes.gtf -o C31_cufout accepted_hits.bam
    or
    Code:
    cufflinks -p 8 -g ucscknowngenes.gtf -o C31_cufout accepted_hits.bam
    or
    Code:
    cufflinks -p 8 -o C31_cufout accepted_hits.bam

  • #2
    Hi Caballien,

    Have you found out why cufflinks gets stuck at a locus?

    I have a similar problem, at the "Inspecting reads and determining fragment length distribution" stage. I have no idea what is causing it. I found a workaround to get my data through, but that doesn't mean the underlying problem is solved.

    Code:
    #cufflinks -p 8 -M mask.gff -o ./data.th.cl ./data.th/accepted_hits.bam
    Here is how I ran cufflinks with the same dataset on 3 computers. The mask.gff file excludes some genes I don't need.

    Computer-1:
    HP G7 server with CentOS 6 and 28GB RAM. It got stuck at "Processing Locus Tb427_01_v4:1064380-1064569".
    The same problem occurred when running cufflinks compiled from source code.

    Computer-2:
    DELL PC with CentOS 5.7 and 8GB RAM. It got stuck at "Processing Locus Tb427_01_v4:1064380-1064569".

    PS: 1) Tb427 is a T. brucei strain, 01 is chr1 and v4 is just a version. The length of chr1 is 1064569.
    2) It works when mask.gff includes every item except CDS and exon, but that doesn't make sense.

    Computer-3:
    Rocks cluster server with CentOS 5.6 and 256GB RAM. It ran fine here, but it is beyond me why it works.

    ---

    What I found across these three computers: 1) Computer-3 makes heavy use of virtual memory and stack data, as seen with the "top" command; 2) the other two computers get stuck processing loci once virtual memory and stack data reach about 3GB, even when left running all day; 3) plenty of physical memory remains free on both Computer-1 and Computer-2; and 4) all three computers have more than 10GB of swap space.
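
    The figures watched in "top" can also be sampled directly from the kernel, which makes it easier to log them while a locus is being processed. This is only a minimal sketch, assuming a Linux system with /proc mounted; the function name mem_snapshot is made up for illustration and is not part of cufflinks.

```shell
# mem_snapshot PID
# Print the virtual size, resident size, stack size and swapped-out size of
# one process, as reported by the Linux kernel in /proc/PID/status.
mem_snapshot() {
    grep -E '^(VmSize|VmRSS|VmStk|VmSwap):' "/proc/$1/status"
}

# Example (commented out): sample the newest cufflinks process once a minute.
# while sleep 60; do date; mem_snapshot "$(pgrep -n cufflinks)"; done
```

    Comparing the logged values between a machine that stalls and one that finishes shows whether the stall really coincides with a fixed virtual-memory size.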

    Does anyone have an idea that explains this, or does anyone think something is wrong with how cufflinks allocates data across swap, stack and physical memory?

    Many thanks in advance.



    • #3
      In my tests of Cufflinks over the past couple of weeks, ignoring reads annotated as rRNA, tRNA and so forth solved the problem of getting stuck at "Processing Locus ...". This is mentioned on the Cufflinks website.

      Hopefully this information will help people who meet the same problem in the future.
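
      For anyone wanting to try this, the mask file can be put together roughly as follows. This is a sketch under assumptions: the input names rrna.gtf and trna.gtf are hypothetical (for example, separate rRNA and tRNA exports from the UCSC rmsk table), build_mask is a made-up helper name, and the sort step only mirrors the "catandsort" file name used in the first post; it is not known to be required by cufflinks.

```shell
set -eu

# build_mask OUT IN1 [IN2 ...]
# Merge one or more GTF files into OUT, ordered by chromosome name (column 1)
# and then numerically by start position (column 4). sort reads all inputs
# itself, so a separate cat step is not needed.
build_mask() {
    out=$1
    shift
    tab=$(printf '\t')
    LC_ALL=C sort -t "$tab" -k1,1 -k4,4n "$@" > "$out"
}

# Example (commented out; input file names are hypothetical):
# build_mask rrnatrnacatandsort.gtf rrna.gtf trna.gtf
# cufflinks -p 8 -M rrnatrnacatandsort.gtf -o C31_cufout accepted_hits.bam
```

      The -M option is the same one used in the commands earlier in this thread; reads overlapping the masked features are ignored by cufflinks.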
