  • Problem with Celera Assembler

    Hello,

    I have short-read sequences already in an FRG file (converted from FASTA to AMOS, then to FRG), which I would like to assemble using Celera Assembler. I run the following command:

    runCA -d testDir -p testPrefix testShortReads.frg

    After running several subprograms (gatekeeper, initialTrim, meryl, etc.) successfully, it fails while running overlapTrim:

    ERROR: Failed with signal ABRT (6)
    step OverlapTrim failed with 'Failed to build the obt store.'

    Can anyone help me with this problem?
    I would appreciate any help. Thanks.

  • #2
    CA will not assemble "short reads", if you mean Solexa or SOLiD.
    Have a look at their wiki at https://sourceforge.net/apps/mediawi...lexa_Platforms

    If you have FLX or Titanium data then the information provided is not enough ;-)

    Cheers,
    Sven


    • #3
      Originally posted by sklages:
      CA will not assemble "short reads", if you mean Solexa or SOLiD.
      Have a look at their wiki at https://sourceforge.net/apps/mediawi...lexa_Platforms

      If you have FLX or Titanium data then the information provided is not enough ;-)

      Cheers,
      Sven

      OK, I should have said a little more about the sequences. They are 454 FLX reads. A sample of the original FASTA file is here:

      >000038_0115_1501 length=95 uaccno=EYVY07101AKEWF
      CGTTACGTCTTCAAGCCTCAGAAAATTAAATCTTGATGCAAAAGGTCGAGATAAATGGTGCAGGCGGATTTTTCCCCGNGTCTGGTATCCAAGTT

      However, I am presenting the input data to CA in FRG format, as required and as I said in the original message.

      Any hint?
      Thanks for your time.

      ---sram


      • #4
        This looks like 454 GS20 data. Are you sure it is FLX data?
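One quick way to check which platform a read set resembles is to look at its read-length distribution: GS20 reads are typically around 100 bp, while GS FLX reads are roughly 250 bp. The sketch below is a minimal awk-based summary, assuming a plain FASTA file; the function name `fasta_length_stats` is hypothetical and not part of any Celera Assembler tooling.

```shell
#!/bin/sh
# Print read count, mean, min and max sequence length for a FASTA file.
# Sequences wrapped over several lines are summed per record.
fasta_length_stats() {
    awk '
        # Header line: close the previous record (if any) and start a new one.
        /^>/ { if (seen) record(len); seen = 1; len = 0; next }
        # Sequence line: strip whitespace and add its length to the record.
             { gsub(/[ \t\r]/, ""); len += length($0) }
        END  { if (seen) record(len)
               if (n) printf "reads=%d mean=%.1f min=%d max=%d\n", n, total / n, min, max
               else   print "reads=0" }
        function record(l) {
            n++; total += l
            if (n == 1 || l < min) min = l
            if (l > max) max = l
        }
    ' "$1"
}
```

Calling `fasta_length_stats reads.fasta` (with `reads.fasta` standing in for the original FASTA file) prints a single summary line. A mean near 100 bp would be consistent with the GS20 guess, but it is only a hint, not proof of the platform.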

        Anyway, if possible you should always start from the SFF files. Generating input for CA from SFF files is best done with https://sourceforge.net/apps/mediawi...Inputs#sffToCA

        There will (probably) be no general solution for this kind of failure; some error logs should have been written. Take a look at these.
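A minimal sketch for locating those logs, assuming they are plain-text files ending in `.err`, `.out`, or `.log` somewhere under the assembly directory (the exact file names Celera Assembler writes are not stated in this thread, so the patterns are a guess). The helper name `find_error_logs` is hypothetical.

```shell
#!/bin/sh
# List log files under an assembly directory that mention an error or failure.
find_error_logs() {
    dir=$1
    # Assumption: the interesting logs are *.err, *.out or *.log text files.
    # grep -l prints only the names of matching files; sort makes the
    # listing stable and gives the pipeline a clean exit status.
    find "$dir" -type f \( -name '*.err' -o -name '*.out' -o -name '*.log' \) \
        -exec grep -l -E 'ERROR|Failed|signal ABRT' {} + 2>/dev/null | sort
}
```

For the run in the first post this would be called as `find_error_logs testDir`, since `testDir` was the directory given to `-d`. The files it lists are the ones worth reading in full, or attaching when contacting the authors.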

        Also take a look at the help page https://sourceforge.net/apps/mediawi...php?title=Help
        and contact the authors if you cannot solve the problem.

        Nevertheless, if you have a simple solution you should post it here ;-)

        cheers,
        Sven


        • #5
          Hi All,
          I am trying to run Celera Assembler on the Sun Grid Engine using the options below.

          perl /usr/local/wgs-6.1/Linux-amd64/bin/runCA-OBT.pl useGrid=1 scriptOnGrid=1 -d /roche/Trimmed_Reads/454/Unpaired/ -p grid_test /roche/Trimmed_Reads/454/Unpaired/*.frg ovlMemory="4GB --hashload 0.8 --hashstrings 100000" ovlThreads=2 ovlHashBlockSize=180000 ovlRefBlockSize=2000000 frgCorrBatchSize=200000 frgCorrThreads=2 ovlCorrBatchSize=800000 unitigger=bog

          The script executes, but I am getting a permission error in spite of changing the permissions on the bin directory containing the runCA command. I have pasted the contents of runCA.sge.out and runCA.sge.out.sh below. I would be grateful if someone could help me resolve this issue.

          runCA.sge.out.01

          Warning: no access to tty (Bad file descriptor).
          Thus no job control in this shell.
          /bin/.: Permission denied.
          syst=Linux: Command not found.
          arch=x86_64: Command not found.
          name=454rig.dhmriad.local: Command not found.
          arch: Undefined variable.

          runCA.sge.out.01.sh
          #!/bin/sh
          #
          # Attempt to (re)configure SGE. For reasons Bri doesn't know,
          # jobs submitted to SGE, and running under SGE, fail to read his
          # .tcshrc (or .bashrc, limited testing), and so they don't setup
          # SGE (or ANY other paths, etc) properly. For the record,
          # interactive SGE logins (qlogin, etc) DO set the environment.

          . $SGE_ROOT/$SGE_CELL/common/settings.sh

          # On the off chance that there is a pathMap, and the host we
          # eventually get scheduled on doesn't see other hosts, we decide
          # at run time where the binary is.

          syst=`uname -s`
          arch=`uname -m`
          name=`uname -n`

          if [ "$arch" = "x86_64" ] ; then
              arch="amd64"
          fi
          if [ "$arch" = "Power Macintosh" ] ; then
              arch="ppc"
          fi

          bin="/usr/local/wgs-6.1/$syst-$arch/bin"

          /usr/bin/env perl $bin/runCA "useGrid=1" "scriptOnGrid=1" -d "/roche/Trimmed_Reads/454/Unpaired/" -p "grid_test" "/roche/Trimmed_Reads/454/Unpaired/FR6EAL4.frg" "/roche/Trimmed_Reads/454/Unpaired/FRK90FP0.frg" "/roche/Trimmed_Reads/454/Unpaired/FSIT0PR0.frg" "/roche/Trimmed_Reads/454/Unpaired/FTNMD73.frg" "/roche/Trimmed_Reads/454/Unpaired/FUORSTX0.frg" "/roche/Trimmed_Reads/454/Unpaired/GMEFJA40.frg" "ovlMemory=4GB --hashload 0.8 --hashstrings 100000" "ovlThreads=2" "ovlHashBlockSize=180000" "ovlRefBlockSize=2000000" "frgCorrBatchSize=200000" "frgCorrThreads=2" "ovlCorrBatchSize=800000" "unitigger=bog"
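One possible reading of the log above, not confirmed anywhere in this thread: messages such as "Command not found.", "Undefined variable." and "Thus no job control in this shell." are phrased the way csh/tcsh reports errors. If the grid ran this `#!/bin/sh` script under a C shell, the `syst=...` assignments and the `.` command would fail in exactly this way, and the "Permission denied" would have nothing to do with the bin directory. The sketch below only detects that message style in a log file; the helper name `looks_like_csh_output` is hypothetical.

```shell
#!/bin/sh
# Succeed if a job log contains csh/tcsh-style diagnostics, which would
# suggest the job script was not run by a Bourne-compatible shell.
looks_like_csh_output() {
    grep -q -E ': (Command not found|Undefined variable)\.$|^Thus no job control in this shell\.$' "$1"
}
```

Usage would be along the lines of `looks_like_csh_output runCA.sge.out.01 && echo "log looks like csh output"`. If that is the case, Grid Engine's `qsub` accepts `-S /bin/sh` to name the shell that interprets a job script, so the queue's shell settings may be worth checking with the cluster administrator.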


          Thanks,
          AR


          • #6
            @archananraja,

            I am having the same problem.
            Are you sure that the directory from which you launch the command and the file path in your command line are the same?
            Making them match could solve your permissions problem.
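The check suggested above can be sketched as a small shell function that compares two paths after resolving symlinks and `..` components. The name `same_directory` is hypothetical.

```shell
#!/bin/sh
# Succeed only if both arguments resolve to the same physical directory.
# Returns 2 if either path cannot be entered.
same_directory() {
    a=$(cd "$1" 2>/dev/null && pwd -P) || return 2
    b=$(cd "$2" 2>/dev/null && pwd -P) || return 2
    [ "$a" = "$b" ]
}
```

For the command in post #5 this could be run as `same_directory "$PWD" /roche/Trimmed_Reads/454/Unpaired/ && echo same`, just before launching runCA.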


            • #7
              Hi,

              I am trying out Celera for de novo assembly of my 1.8 billion Illumina reads.
              Celera runCA version 6.1.
              I have a question about the `Sun Grid Engine Options`.
              On the following web page, they explain how to adjust these options for a small Sanger dataset:
              http://sourceforge.net/apps/mediawik...un_Grid_Engine
              But I can find no information on how to use this for a large Illumina dataset of 75-100 bp reads. Since I have 1.8 billion reads, I was wondering how best to set the CPU and memory options for my kind of data with the Sun Grid Engine.

              Does anyone have experience with Celera and large Illumina datasets?
              I know their website says CA1.6 should be able to assemble 1 billion reads, so I am hoping it could work for more too!
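To put the size of this dataset in numbers, here is a rough back-of-the-envelope sketch in shell arithmetic. The read count and read lengths come from the post above; the batch size of 180000 is simply the `ovlHashBlockSize` value quoted earlier in the thread, used as an illustrative number only. How Celera Assembler actually partitions its work is not covered here, and both function names are hypothetical.

```shell
#!/bin/sh
# Total bases for a given read count and read length (integers).
# Note: needs 64-bit shell arithmetic for numbers of this size.
total_bases() {
    echo $(( $1 * $2 ))
}

# Number of batches needed to cover N reads at a given batch size
# (ceiling division).
batch_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

reads=1800000000
echo "75 bp reads:  $(total_bases "$reads" 75) bases"
echo "100 bp reads: $(total_bases "$reads" 100) bases"
echo "batches of 180000 reads: $(batch_count "$reads" 180000)"
```

So the input is somewhere between 135 and 180 billion bases, and a batch size chosen for a 454 run would split it into 10000 batches, which gives a feel for why settings tuned for a small dataset are unlikely to carry over unchanged.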
