  • FALCON assembler without SGE?

    Has anyone had any success getting FALCON to work on a single node without SGE installed?

    -Jason

  • #2
    Hello Jason,

    A recent addition to the FALCON code base allows it to run in standalone mode (via BASH). To use it, set the SGE options to empty values, but do not remove the parameters themselves, otherwise you will encounter an error. You will also need to add the line 'job_type = local' to the CFG file. Lastly, make sure you are running the latest version, since older versions lack this functionality.

    Happy assembling!
    - Roberto
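
For anyone who would rather script the edit described above, here is a minimal sketch in Python 3. It is not part of FALCON, and the helper name make_local_config is made up for illustration. It works on the text of a CFG file: every sge_option_* key is kept but its value is blanked, and job_type is set to local. If the file has no job_type line, one is added just above the first sge_option_* line. Appending it at the end when no sge_option_* lines exist is purely an assumption of this sketch and may not put it in the right place for a real config.

```python
import re

# Matches "sge_option_<name> = <anything>"; group 1 keeps the key and the "=".
SGE_LINE = re.compile(r"^(\s*sge_option_\w+\s*=).*$")
# Matches an existing "job_type = ..." line.
JOB_TYPE_LINE = re.compile(r"^\s*job_type\s*=")


def make_local_config(cfg_text):
    """Return cfg_text with SGE option values blanked and job_type set to local."""
    lines = cfg_text.splitlines()
    # If a job_type line already exists it is rewritten in place, not duplicated.
    inserted = any(JOB_TYPE_LINE.match(line) for line in lines)
    out = []
    for line in lines:
        if JOB_TYPE_LINE.match(line):
            out.append("job_type = local")
            continue
        sge = SGE_LINE.match(line)
        if sge:
            if not inserted:
                # Place job_type just above the first sge_option line.
                out.append("job_type = local")
                inserted = True
            # Keep the parameter name, drop its value.
            out.append(sge.group(1))
        else:
            out.append(line)
    if not inserted:
        # Assumption of this sketch: fall back to appending at the end.
        out.append("job_type = local")
    return "\n".join(out) + "\n"
```

To use it, read the CFG file into a string, pass it to make_local_config, and write the result to a new file so the original stays untouched.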

    • #3
      That worked Roberto, thank you very much!


      • #4
        Even after I modified the main script (vim fc_run.py) to read:

        def run_script(job_data, job_type = "local" ):

        and emptied these parameters in fc_run.py:

        sge_option_da =
        sge_option_la =
        sge_option_pda =
        sge_option_pla =
        sge_option_fc =
        sge_option_cns =

        I still get this error msg:

        /export/arrayPRO2/PacBio/FALCON/fc_env/lib/python2.6/site-packages/falcon_kit-0.2.1-py2.6-linux-x86_64.egg/falcon_kit/FastaReader.py:40: DeprecationWarning: the md5 module is deprecated; use hashlib instead
        import md5
        sh: qsub: command not found

        What am I doing wrong?


        • #5
          Hi Lance,

          Please note: I am a beginner when it comes to FALCON and bioinformatics, so what I did below might not be correct, but I thought I would share how I handled a similar error message.


          I had a similar qsub error after installing FALCON and trying the E. coli example locally from a terminal in Ubuntu. It was unclear to me from rlleras' comment above which file needed its options removed while retaining the parameters.
          So I tried inserting 'job_type = local' into the fc_run_ecoli.cfg file above the first sge_option line, so that it now reads:


          job_type = local

          sge_option_da = -pe smp 8 -q jobqueue


          I then saved it as fc_run_ecoli_local.cfg and ran it with the following command:

          path/to/my/FALCON-master/ecoli_test$ fc_run.py fc_run_ecoli_local.cfg


          I think that FALCON ran properly and generated the correct files.

          From what I understood of the FALCON manual, contigs were not generated by the above process, so I tried constructing the graph from the overlaps like this:


          path/to/my/FALCON-master/ecoli_test$ fc_ovlp_to_graph.py /path/to/my/FALCON-master/ecoli_test/2-asm-falcon/preads.ovl


          Then I tried constructing the contigs from the graph as follows:

          path/to/my/FALCON-master/ecoli_test$ fc_graph_to_contig.py


          Note: Before entering the command above, I copied the 'preads4falcon shortcut' file from the 2-asm-falcon folder into the ecoli_test folder, as this script seemed to be looking for it.


          My question now is: where do I look for the contig files, and what are they called? Or did I do something incorrectly?


          • #6
            Hi all,

            I think that I have answered my question:

            It appears that the E. coli genome contig (~4.6 Mbp) is p_ctg.fa, and that it is written to: path/to/my/FALCON-master/ecoli_test/2-asm-falcon

            It also appears that my extra steps of constructing the graph from the overlaps and constructing the contigs from the graph were unnecessary, as these steps had already been completed by the fc_run.py run.
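
A quick way to check that a contig file holds what you expect (for the E. coli example, one contig of roughly 4.6 Mbp) is to total the sequence length per FASTA record. The following is a minimal Python 3 sketch, not a FALCON tool; the function name contig_lengths is made up for illustration, and it assumes a plain, uncompressed FASTA file.

```python
def contig_lengths(fasta_text):
    """Map each FASTA record name to the total length of its sequence lines."""
    lengths = {}
    name = None
    for raw in fasta_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            # Sequence line belonging to the most recent header.
            lengths[name] += len(line)
    return lengths
```

Reading p_ctg.fa from the 2-asm-falcon folder into a string and passing it to contig_lengths gives one entry per contig, which makes it easy to see whether the assembly came out as a single contig of about the expected size.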


            • #7
              Dear ATㄣGC
              Thank you so much!!! Thanks to you, the program is finally running on a non-cluster server.


              • #8
                Hello guys,

                I am very happy to see this thread. How do you specify the number of concurrent jobs on a local computer, or the maximum number of cores and amount of memory FALCON can use?

                I see that resource specifications are written for every job separately, but since it is not clear how many jobs can run simultaneously, I am not sure how to estimate the total load (and I do not want to overload our server).

                If there is only one job at a time (which is kind of logical for all non-overlap jobs), I should probably set the -sXXX parameter of pa_DBsplit_option to something really big (so that there is only one process).

                I could probably find out by running it on lambda phage or similar, but if you have done this already, I would appreciate it if you shared your experience...

                Cheers,
                Kamil
