  • Subread segmentation faults on Scientific Linux

    Hi,

    I am running Subread on Scientific Linux on an HPC cluster with a GPFS file system. With big FASTQ files (>35M reads), Subread falls over halfway through with a segmentation fault.

    It seems to be memory related, as smaller files work fine. Increasing the memory available to the process to 31GB makes no difference.

    Has anyone seen this before?

  • #2
    Memory does not sound like the culprit here. Are you running into some other limit, such as a storage quota, tmp space, or the time assigned to the job?
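A quick way to rule these limits in or out is to print them from inside the job itself, since the limits on a compute node can differ from those on a login node. The sketch below is only an illustration: `report_limits` is a hypothetical helper name, and the stack size is included because a low stack limit is a common cause of segmentation faults in general, not because anything in this thread points to it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the per-process limits and the free scratch
# space as seen by the current shell (run it inside the batch job).
report_limits() {
    local scratch="${1:-${TMPDIR:-/tmp}}"
    echo "max virtual memory: $(ulimit -v)"
    echo "max stack size:     $(ulimit -s)"
    echo "max file size:      $(ulimit -f)"
    echo "max core file size: $(ulimit -c)"
    echo "free space in ${scratch}:"
    # -P gives one portable line per file system, -k reports in kilobytes.
    df -P -k "${scratch}" | tail -n 1
}

report_limits "${TMPDIR:-/tmp}"
```

Values are printed in whatever units the shell's `ulimit` builtin uses; "unlimited" means no limit is set for that resource.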

    • #3
      Unfortunately not. The disk space assigned to the node is 10TB, the walltime is 5 days (the job typically takes 2-3 hours), and the temp space is nominally "unlimited", although in practice it is about 4TB.

      • #4
        Dr. Shi (the author of Subread) participates here, so we may hear something enlightening from him. I would have thought that once the genome index is read into memory, the memory requirement should be more or less satisfied, unless Subread works differently from other aligners (I don't use Subread).

        • #5
          Originally posted by abeggs
          Hi,

          I am running Subread on Scientific Linux on an HPC cluster with a GPFS file system. With big FASTQ files (>35M reads), Subread falls over halfway through with a segmentation fault.

          It seems to be memory related, as smaller files work fine. Increasing the memory available to the process to 31GB makes no difference.

          Has anyone seen this before?

          Could you please provide the screen output and the commands you used? Subread has no problem processing more than 35 million reads.
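One way to gather the command and the screen output in a single file is sketched below. `run_logged` is a hypothetical helper, not part of Subread; it assumes only a POSIX-style shell and records the command line, everything the program prints (stdout and stderr), and the exit status.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command, saving its command line, all of its
# output and its exit status to a log file that can be attached to a report.
run_logged() {
    local log="$1"
    shift
    printf 'command: %s\n' "$*" > "$log"
    # Send stdout and stderr to the same file so messages stay in order.
    "$@" >> "$log" 2>&1
    local status=$?
    echo "exit status: ${status}" >> "$log"
    return "$status"
}

# Example call (not executed here; arguments are placeholders):
#   run_logged subjunc.log subjunc -i index -r R1.fastq.gz -R R2.fastq.gz -o out.bam -T 8
```

The helper returns the wrapped command's own exit status, so a calling script can still react to a failure.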

          • #6
            Hi,

            The command I use for subjunc is:
            subjunc -i $TMPDIR/hg19 -r $TMPDIR/$2_R1.fastq.gz -R $TMPDIR/$2_R2.fastq.gz -o $1/$2.bam -T 8 --gzFASTQinput --allJunctions
            And the output is:

            ==========     _____ _    _ ____  _____  ______          _____
            =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
              =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
                ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
                  ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
            ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
            v1.5.0-p1

            //============================= subjunc setting ==============================\\
            || ||
            || Function : Read alignment + Junction/Fusion detection (RNA-Seq) ||
            || Threads : 8 ||
            || Input file 1 : /scratch/beggsa_909907.bb2torque.bb2.cluster/s018 ... ||
            || Input file 2 : /scratch/beggsa_909907.bb2torque.bb2.cluster/s018 ... ||
            || Output file : /gpfs/projects/s-beggsa01/P141115-N-DW-28-2673271 ... ||
            || Index name : /scratch/beggsa_909907.bb2torque.bb2.cluster/hg19 ||
            || Phred offset : 33 ||
            || ||
            || All subreads : 14 ||
            || Min read1 votes : 1 ||
            || Min read2 votes : 1 ||
            || Max fragment size : 600 ||
            || Min fragment size : 50 ||
            || ||
            || Allowed mismatch : 3 bases ||
            || Max indels : 5 ||
            || # of Best mapping : 1 ||
            || Unique mapping : no ||
            || Hamming distance : no ||
            || Quality scores : no ||
            || ||
            \\===================== http://subread.sourceforge.net/ ======================//

            //====================== Running (31-Mar-2016 15:30:00) ======================\\
            || ||
            || The input file contains base space reads. ||
            || The range of Phred scores observed in the data is [2,36] ||
            || Load the 1-th index block... ||
            || Map fragments... ||
            || 0% completed, 0.3 mins elapsed, rate=3.7k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            || 5% completed, 11 mins elapsed, rate=3.7k fragments per second ||
            || 5% completed, 11 mins elapsed, rate=3.8k fragments per second ||
            || 6% completed, 11 mins elapsed, rate=3.9k fragments per second ||
            || 6% completed, 12 mins elapsed, rate=4.0k fragments per second ||
            || 6% completed, 12 mins elapsed, rate=4.1k fragments per second ||
            || 7% completed, 12 mins elapsed, rate=4.2k fragments per second ||
            || 7% completed, 13 mins elapsed, rate=4.3k fragments per second ||
            || Map fragments... ||
            || 7% completed, 13 mins elapsed, rate=4.4k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            || 13% completed, 24 mins elapsed, rate=4.1k fragments per second ||
            || 13% completed, 24 mins elapsed, rate=4.1k fragments per second ||
            || 13% completed, 25 mins elapsed, rate=4.1k fragments per second ||
            || 14% completed, 25 mins elapsed, rate=4.2k fragments per second ||
            || 14% completed, 26 mins elapsed, rate=4.2k fragments per second ||
            || 14% completed, 26 mins elapsed, rate=4.2k fragments per second ||
            || 15% completed, 26 mins elapsed, rate=4.3k fragments per second ||
            || Map fragments... ||
            || 15% completed, 27 mins elapsed, rate=4.3k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            || 20% completed, 38 mins elapsed, rate=4.1k fragments per second ||
            || 21% completed, 38 mins elapsed, rate=4.1k fragments per second ||
            || 21% completed, 39 mins elapsed, rate=4.2k fragments per second ||
            || 21% completed, 39 mins elapsed, rate=4.2k fragments per second ||
            || 22% completed, 39 mins elapsed, rate=4.2k fragments per second ||
            || 22% completed, 40 mins elapsed, rate=4.2k fragments per second ||
            || 22% completed, 40 mins elapsed, rate=4.2k fragments per second ||
            || 23% completed, 41 mins elapsed, rate=4.2k fragments per second ||
            || Map fragments... ||
            || 23% completed, 41 mins elapsed, rate=4.2k fragments per second ||
            || Finish the 3,495,253 fragments... ||
            ./SubReadRNAPipeline-highmem: line 43: 19380 Segmentation fault subjunc -i $TMPDIR/hg19 -r $TMPDIR/$2_R1.fastq.gz -R $TMPDIR/$2_R2.fastq.gz -o $1/$2.bam -T 8 --gzFASTQinput --allJunctions
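The "Segmentation fault" message above is printed by the shell when the child process is killed by SIGSEGV. By shell convention, a process killed by signal N is reported with exit status 128 + N, so a pipeline script can tell a crash apart from an ordinary error exit. The sketch below illustrates that convention; `describe_status` is a hypothetical name, and a program can also return a value above 128 on its own, so the interpretation is a heuristic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn an exit status into a short description,
# using the shell convention that 128 + N means "killed by signal N".
describe_status() {
    local status="$1"
    if [ "$status" -eq 0 ]; then
        echo "success"
    elif [ "$status" -gt 128 ]; then
        # kill -l maps a signal number to its name (11 is SEGV on Linux).
        echo "killed by signal $(kill -l "$((status - 128))")"
    else
        echo "exited with error code ${status}"
    fi
}

# 139 = 128 + 11, the status a shell reports after a segmentation fault.
describe_status 139
```

In a pipeline script, the status to pass in is `$?` captured immediately after the aligner command.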

            • #7
              Thanks for providing the info. Can you also send us the fastq files so that we can reproduce the problem and find out what went wrong?
              Best,
              Wei

              • #8
                I have solved it!

                I recompiled from source instead of using the precompiled binaries, and it worked fine.

                Our HPC cluster runs Scientific Linux, and something about that environment seems to cause problems with the precompiled binaries.
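For anyone wanting to try the same fix, a rough sketch of a source build follows. It assumes the source tarball is named `subread-1.5.0-p1-source.tar.gz` and unpacks into a directory of the same name (an assumption based on the version shown in the log above, not something stated in this thread), and that the package is built with `Makefile.Linux` in its `src` directory as described in the Subread installation instructions. `build_subread` and the `DRY_RUN` variable are hypothetical names used only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: unpack a Subread source tarball and build it.
# Set DRY_RUN=1 to print the commands instead of running them.
build_subread() {
    local tarball="$1"
    local run="${DRY_RUN:+echo}"
    # Assumption: the tarball unpacks into a directory named after itself.
    local srcdir
    srcdir="$(basename "${tarball%.tar.gz}")"
    $run tar -xzf "$tarball" || return 1
    # make -C enters src/ first, so Makefile.Linux is looked up there.
    $run make -C "${srcdir}/src" -f Makefile.Linux || return 1
    echo "binaries expected in ${srcdir}/bin"
}

# Show what would be run, without needing the tarball to be present.
DRY_RUN=1 build_subread subread-1.5.0-p1-source.tar.gz
```

After a real build, calling the freshly built `subjunc` by its full path avoids accidentally picking up the precompiled binary that is still on `PATH`.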
