  • HGAP assembly failed task on SMRT Portal

    Hi

    So I have 10 SMRT cells that I've been playing with. But de novo assembly with protocols RS_HGAP_Assembly.2 and .3 on SMRT Portal shows an error message during the process. Using smrtanalysis_2.3.0.140936.run with smrtanalysis-patch_2.3.0.140936.p3.run.

    Code:
    [INFO] 2015-04-28 22:02:21,943 [smrtpipe.status refreshTargets 409] Workflow Completion Status 139/212 in ( ...... 65%) tasks completed.
    [ERROR] 2015-04-29 09:52:35,198 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_006of006
    [ERROR] 2015-04-29 09:52:35,199 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_004of006
    [ERROR] 2015-04-29 09:52:35,199 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_005of006
    [ERROR] 2015-04-29 09:52:35,199 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_003of006
    [ERROR] 2015-04-29 09:52:35,200 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006
    [ERROR] 2015-04-29 09:52:35,200 [smrtpipe.status refreshTargets 413] *** Failed task task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_002of006
    [INFO] 2015-04-29 09:52:35,212 [smrtpipe.status execute 627] Found 6 failed tasks.
    [INFO] 2015-04-29 09:52:35,213 [smrtpipe.status execute 629] task hgapAlignForCorrection_004of006 FAILED
    [INFO] 2015-04-29 09:52:35,213 [smrtpipe.status execute 629] task hgapAlignForCorrection_003of006 FAILED
    [INFO] 2015-04-29 09:52:35,214 [smrtpipe.status execute 629] task hgapAlignForCorrection_006of006 FAILED
    [INFO] 2015-04-29 09:52:35,214 [smrtpipe.status execute 629] task hgapAlignForCorrection_001of006 FAILED
    [INFO] 2015-04-29 09:52:35,215 [smrtpipe.status execute 629] task hgapAlignForCorrection_002of006 FAILED
    [INFO] 2015-04-29 09:52:35,215 [smrtpipe.status execute 629] task hgapAlignForCorrection_005of006 FAILED
    [ERROR] 2015-04-29 09:53:06,302 [SMRTpipe.SmrtPipeMain run 608] SmrtExit task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_004of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_003of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_006of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_002of006 Failed task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_005of006 Failed
    That's the part of the log file where the errors start showing up. I'm not sure where the problem is.
    Could it be due to a genome size limitation? I'm working with a eukaryotic genome (ca. 200 Mb).
    Thanks

  • #2
    At this point the failure is in the alignment-for-correction phase, which runs before the genome size information is used to restrict the number of reads going into the assembly. The failure is therefore unrelated to the genome size setting, though that setting will come into play later.

    Can you post the contents of:

    [JOB_DIR]/log/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.log



    • #3
      here they are:
      Code:
      Setting up ENV on cluster5-01.bpcservers.private for task hgapAlignForCorrection_001of006
      #!/bin/bash
      # Setting up SMRTpipe environment
      echo "Setting up ENV on $(uname -n)" for task hgapAlignForCorrection_001of006
      
      SEYMOUR_HOME=/smrtanalysis/install/smrtanalysis_2.3.0.140936
      source $SEYMOUR_HOME/etc/setup.sh
      
      # Create the local TMP dir if it doesn't exist
      tmp_dir=$(readlink -m "/smrtanalysis/tmpdir")
      if [ ! -e "$tmp_dir" ]; then
         stat=0
         mkdir -p $tmp_dir || stat=$?
         if [[ $stat -ne 0 ]]; then
             echo "SMRTpipe Unable to create TMP dir '/smrtanalysis/tmpdir' on $(uname -n)" 1>&2
             exit 1
         else
             echo "successfully created or found TMP dir '/smrtanalysis/tmpdir'"
         fi
      elif [[ ! -d "$tmp_dir" ]]; then
         echo "SMRTpipe TMP /smrtanalysis/tmpdir must be a directory on $(uname -n)" 1>&2
         exit 1
      fi
      
      ########### TASK metadata #############
      # Task            : hgapAlignForCorrection_001of006
      # Module          : P_PreAssemblerDagcon
      # Module Version  : 2.1.124285
      # TaskType        : None
      # URL             : task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006
      # createdAt       : 2015-04-28 17:47:34.515890
      # createdAt (UTC) : 2015-04-28 21:47:34.515909
      # ncmds           : 2
      # LogPath         : /smrtanalysis/userdata/jobs/016/016450/log/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.log
      # Script Path     : /smrtanalysis/userdata/jobs/016/016450/workflow/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.sh
      
      # Input       : /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta
      # Input       : /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta
      # Output      : /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4
      # Output      : /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4.fofn
      #
      ########### END TASK metadata #############
      
      cd /smrtanalysis/userdata/jobs/016/016450/log/P_PreAssemblerDagcon
      # Writing to log file
      cat /smrtanalysis/userdata/jobs/016/016450/workflow/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.sh >> /smrtanalysis/userdata/jobs/016/016450/log/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006.log;
      
      
      
      echo "Running task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006 on $(uname -a)"
      
      echo "Started on $(date -u)"
      echo 'Validating existence of Input Files'
      if [ -e /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta ]
      then
      echo 'Successfully found /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta'
      else
      echo 'WARNING: Unable to find necessary input file, or dir /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta.'
      fi
      if [ -e /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta ]
      then
      echo 'Successfully found /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta'
      else
      echo 'WARNING: Unable to find necessary input file, or dir /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta.'
      fi
      echo 'Successfully validated input files'
      
      # Task hgapAlignForCorrection_001of006 commands:
      
      
      # Completed writing Task hgapAlignForCorrection_001of006 commands
      
      
      # Task 1
      blasr /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta -out /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4 -m 4 -nproc 1 -bestn 10 -nCandidates 10 -noSplitSubreads -minReadLength 200 -maxScore -1000 -maxLCPLength 16 || exit $?
      echo "Task 1 completed at $(date)"
      # Task 2
      echo /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4 > /smrtanalysis/userdata/jobs/016/016450/seeds.chunk001of006.m4.fofn || exit $?
      echo "Task 2 completed at $(date)"
      
      
      
      rcode=$?
      echo "Finished on $(date -u)"
      echo "Task hgapAlignForCorrection_001of006 with nproc 1 with exit code ${rcode}."
      exit ${rcode}
      Running task://016450/P_PreAssemblerDagcon/hgapAlignForCorrection_001of006 on Linux cluster5-01.bpcservers.private 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
      Started on Wed Apr 29 01:52:18 UTC 2015
      Validating existence of Input Files
      Successfully found /smrtanalysis/userdata/jobs/016/016450/data/nocontrol_filtered_subreads.fasta
      Successfully found /smrtanalysis/userdata/jobs/016/016450/filtered_longreads.chunk001of006.fasta
      Successfully validated input files
      [INFO] 2015-04-28T21:52:18 [blasr] started.
      # Writing stdout and stderr from Popen:
      Your job 19310 ("Phga016450") has been submitted
      Job 19310 exited because of signal SIGKILL
      SIGKILL??
      Thanks



      • #4
        What are the hardware specs of the server you are running this on? Are you using a cluster or a stand-alone server?



        • #5
          Assuming you're running on SGE, one of two things happened:
          1) Your sys admin qdel'd your job (unlikely)
          2) Your job hit a resource limit, and SGE killed it automatically for exceeding either the time limit allowed for the job or its CPU/memory limits.

          Talk with your sys admin and find out why the job may have been killed.

          Alternatively, if you were running it locally, the job's memory consumption likely exceeded what the system's hardware provides.
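
If the cluster keeps SGE accounting records, the scheduler's own record of how job 19310 ended can usually be read with `qacct`. The following is only a sketch, assuming `qacct` is on the PATH of the host you run it on and that accounting is enabled; the helper name `report_job` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the accounting fields that usually explain a kill.
report_job() {
    local job_id="$1"
    if ! command -v qacct >/dev/null 2>&1; then
        echo "qacct not found; run this on an SGE submit host"
        return 1
    fi
    # failed / exit_status : how the scheduler recorded the end of the job
    # ru_wallclock         : elapsed run time in seconds
    # maxvmem              : peak virtual memory used
    qacct -j "$job_id" | grep -E '^(failed|exit_status|ru_wallclock|maxvmem)'
}

# Job ID taken from the task log posted above
report_job 19310 || true
```

A `ru_wallclock` value close to the job's run-time limit would point at the time limit, while a `maxvmem` value close to a memory limit would point at memory; your sys admin can confirm which limits apply.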



          • #6
            I'm running on a server (CentOS 6.5) with SGE, with 32 CPUs and 1024 GB.

            Thank you for pointing me in that direction. I'll ask my sys admin.



            • #7
              So the scheduler is killing the job, but we are not sure why the limit is being set. The job is being submitted with a "hard" resource list that includes the parameter h_rt=43200. This means the hard limit on the real-time lifespan of the job is 43200 seconds (12 hours).
              This resource limit isn't something imposed by the cluster. It's coming from whatever is submitting the job (SMRT Portal?).
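
As a quick sanity check, the 12-hour figure can be compared with the timestamps in the logs posted earlier in the thread. This is a rough sketch, assuming GNU `date` and assuming that the `[blasr] started` line and the first `Failed task` line are in the same time zone; the variable names are only illustrative.

```shell
#!/bin/bash
# h_rt from the job's hard resource list, in seconds
h_rt=43200
echo "h_rt = $((h_rt / 3600)) hours"

# Timestamps copied from the logs posted above (assumed to share one time zone)
start="2015-04-28 21:52:18"   # [blasr] started
fail="2015-04-29 09:52:35"    # first "Failed task" line

# GNU date: convert both to epoch seconds; -u avoids daylight-saving jumps
elapsed=$(( $(date -u -d "$fail" +%s) - $(date -u -d "$start" +%s) ))
echo "elapsed = ${elapsed} s ($((elapsed - h_rt)) s past the limit)"
```

The failure is reported 43217 seconds after blasr started, 17 seconds past h_rt, which is consistent with the job being killed at the 12-hour hard limit.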



              • #8
                Check the SGE template scripts in <SMRT Analysis>/analysis/etc/cluster/SGE/*.tmpl



                • #9
                  I recollect that SMRT Portal needs to be able to submit sub-jobs from the original job that gets launched. My hunch is that your SGE may not be set up to allow that. You can ask your admins to verify.



                  • #10
                    Originally posted by cascoamarillo
                    So the scheduler is killing the job, but we are not sure why the limit is being set. The job is being submitted with a "hard" resource list that includes the parameter h_rt=43200. This means the hard limit on the real-time lifespan of the job is 43200 seconds (12 hours).
                    This resource limit isn't something imposed by the cluster. It's coming from whatever is submitting the job (SMRT Portal?).
                    12 hours is the per-task time limit in a default SMRT Analysis install.

                    You can change that by following rhall's advice in a previous post and modifying the SGE scripts to increase the hard time limit that's already preset here:
                    [smrtanalysis_install]/analysis/etc/cluster/SGE/interactive.tmpl*
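
The edit itself could look roughly like the sketch below. It runs on a throwaway file, and the `qsub` line in it is a made-up stand-in rather than the contents of a real SMRT Analysis template, so check what your own `.tmpl` files contain first. The 7-day value is likewise only an example.

```shell
#!/bin/bash
# Demo on a throwaway copy; the qsub line below is a made-up stand-in,
# not the contents of a real SMRT Analysis template.
workdir=$(mktemp -d)
tmpl="$workdir/interactive.tmpl"
printf '%s\n' 'qsub -l h_rt=43200 -N ${JOB_ID} ${CMD}' > "$tmpl"

# 1) Find where the hard run-time limit is set
grep -n 'h_rt=' "$tmpl"

# 2) Raise it from 12 hours to 7 days (604800 s), keeping a .bak backup
new_limit=$((7 * 24 * 3600))
sed -i.bak "s/h_rt=[0-9][0-9]*/h_rt=${new_limit}/" "$tmpl"

# 3) Show the edited line
cat "$tmpl"
```

On a real install, point the `grep` at the template directory named above and edit only after making a backup, since every new job is submitted through these templates.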



                    • #11
                      Thank you all!
                      Wow, how do you know where those config files are? I just set a longer hard time limit. Anyway, de novo assembly with PacBio data seems to be quite a long process. In parallel, I started assembling a subset of filtered subreads (120,000) with CA (using 12 processors and 48G of memory), and I'm still waiting after 4 days.
                      Last edited by cascoamarillo; 05-12-2015, 08:30 AM.



                      • #12
                        Originally posted by cascoamarillo
                        Thank you all!
                        Wow, how do you know where those config files are? I just set a longer hard time limit. Anyway, de novo assembly with PacBio data seems to be quite a long process. In parallel, I started assembling a subset of filtered subreads with CA (using 12 processors and 48G of memory), and I'm still waiting after 4 days.
                        Glad it worked out for you! rhall and I are both PacBio employees who frequent this forum from time to time and are happy to lend a helping hand when time permits.

                        A lot of factors influence how long an assembly takes, e.g. cleanliness of the library prep, size of the genome, repetitiveness of the genome, ploidy, quality and quantity of the input data, and the list goes on...



                        • #13
                          Originally posted by cascoamarillo
                          Thank you all!
                          Wow, how do you know where those config files are? I just set a longer hard time limit. Anyway, de novo assembly with PacBio data seems to be quite a long process. In parallel, I started assembling a subset of filtered subreads (120,000) with CA (using 12 processors and 48G of memory), and I'm still waiting after 4 days.
                          Is the job running for 4 days or is it waiting for 4 days to get on the cluster?



                          • #14
                            Originally posted by GenoMax
                            Is the job running for 4 days or is it waiting for 4 days to get on the cluster?
                            It's running. Apparently it has finished the overlapStoreBuild step and is now working on correct-frags.



                            • #15
                              Hi guys,
                              Let me continue this thread with a question about CA output. In the 9-terminator/ folder there is a summary of the assembly with a Read Depth Histogram. Some contigs (consensus) have more read depth than others. I'd like to extract the subset of contigs with the maximum coverage reported. Is it possible to do that? Thanks.
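
One generic way to approach this, once you have a per-contig depth value, is to filter the FASTA by contig ID with `awk`. The sketch below uses made-up inputs: a two-column table of contig ID and depth, and a tiny FASTA file. Neither is a CA output format, the file names are hypothetical, and the cutoff of 40 is arbitrary, so you would first need to build such a table from your own assembly.

```shell
#!/bin/bash
# Made-up inputs for illustration; real contig IDs and depths come from your assembly.
workdir=$(mktemp -d)
printf 'ctg1\t12.5\nctg2\t48.0\nctg3\t47.1\n' > "$workdir/depth.tsv"
printf '>ctg1\nACGT\n>ctg2\nGGCC\n>ctg3\nTTAA\n' > "$workdir/contigs.fasta"

min_depth=40   # hypothetical cutoff

# Pass 1 (depth.tsv): remember IDs whose depth meets the cutoff.
# Pass 2 (FASTA): print a record only while its header ID was remembered.
awk -v min="$min_depth" '
    NR == FNR { if ($2 + 0 >= min) keep[$1] = 1; next }
    /^>/      { id = substr($1, 2); printing = (id in keep) }
    printing
' "$workdir/depth.tsv" "$workdir/contigs.fasta" > "$workdir/high_depth.fasta"

cat "$workdir/high_depth.fasta"
```

With the sample data, only ctg2 and ctg3 are written to `high_depth.fasta`.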
