  • MEGAN by command line

    Hello friends

    I am having trouble analyzing some BLAST results when running MEGAN 5 from the command line. When I use the "import Blast" option of the GUI, I have no problems reading the file and getting results. But when I try to do the same from the command line, I get nothing. Here is my command:

    ~/megan/MEGAN -g -x "load taxGIFile=gi_taxid_prot.bin; import blastfile=./my.blastx meganfile=test.blastx.rma"

    Result: no hits are recognized (trace below), even though it correctly recognizes the BlastTAB format of the file. Any hint on this, please? Thanks a lot!

    Input format: BlastTAB
    Processing my.blastx
    Processing BlastTAB file(s)
    Note: Reads file(s) not given or found, RMA file will not contain read sequences.
    Total reads: 161001
    Total no-hits: 161001
    Total matches: 8360785
    Matches discarded: 5234055
    Parsing required 118 seconds
    Running Data analyzer: Init
    Analyzing all matches
    Applying min-support filter
    Number of changes due to min-support filter: 0
    Number of reads: 80501
    Low complexity: 0
    With valid hits: 0
    MEGAN> Total reads: 80501
    Assigned reads: 0
    Unassigned reads: 80501
    Reads with no hits: 0
    Reads low comp.: 0
    Induce Taxonomy tree, keeping 2 of 1111248 nodes

  • #2
    MEGAN 5.3.0 - More Questions

    Thanks for starting this thread jtamames!

    I too am having difficulties getting MEGAN to work from the command line. I've tried commands similar to yours, and I've also followed the manual's recommendation to put the commands in a separate text file and pass that file with the "-c" flag. Unfortunately, nothing happens with either approach. I don't even get the output you've posted (and I've tried enabling verbose messages with -v).

    Here are the commands I'm trying to run. In the "command.txt approach" I've tried to emulate the start-up routine of MEGAN run in GUI mode by loading the treeFiles. But that routine isn't transparent. If there were some way of seeing what commands the GUI is issuing, that would go a long way toward helping me troubleshoot.

    WITHOUT COMMAND FILE:
    MEGAN -g false -v true -x "import blastFile=./AA10.compiled.subset.fa.blast.out fastaFile=./AA10.compiled.subset.fa meganFile=test.blastx.rma blastFormat=BlastTAB;"


    WITH COMMAND FILE:
    MEGAN -g false -v true -c command.txt

    Contents of the command file (one command per line, without quotation marks):
    load treeFile=ncbi.map;
    load treeFile=ncbi.tre;
    load taxGIFile=~/Metagenomes/megan/gi_taxid_nucl.bin;
    import blastFile=./AA10.compiled.subset.fa.blast.out fastaFile=./AA10.compiled.subset.fa meganFile=test.blastx.rma blastFormat=BlastTAB;

    Thanks

    • #3
      How to run MEGAN in command line mode

      Dear Roli,

      the -x command line option only works correctly for a single command, don't use it for multiple commands. (A future release of MEGAN will check that only one command has been submitted using -x.)

      There are two correct ways to have MEGAN perform multiple commands in command line mode:

      a) (As you mentioned): Put all commands into a file and give the file to MEGAN using -c, e.g.:
      megan/MEGAN -g -E -c commands.txt
      This works for me, using the following list of commands in commands.txt:
      load taxGIFile='malt-data/gi_taxid_prot-2014Jan04.bin';
      import blastfile='data/megan/ecoli/x.blastx' meganfile='x.rma';
      quit;

      In your example of a commands file you list the following commands:
      load treeFile=ncbi.map;
      load treeFile=ncbi.tre;

      Don't do this unless you intend to load an alternative taxonomic tree. But even then this is incorrect because the map file must be loaded using load mapFile=... and not load treeFile=...

      The crucial thing is that each command appears on a separate line, because MEGAN updates the program state after each newline. If the newlines are missing then some commands may have no effect. (That is why -x doesn't work as expected).

      b) The other option is to pipe the commands from a file, thus:
      megan/MEGAN -g -E < commands.txt

      Note the -E command line option: this causes MEGAN to exit when an exception is thrown, which is often what you want.
      Also, best to put file paths in single quotes; essential if they contain spaces.

      Best wishes
      Daniel
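
A minimal sketch of option (a) in Python, for anyone who wants to generate the command file from a script. The helper names render_command_file and megan_argv are made up for illustration and are not part of MEGAN; the commands and flags are the ones quoted in the post above. It only enforces the one-command-per-line rule and builds the argument list. It does not run MEGAN.

```python
def render_command_file(commands):
    """Return command-file text: one MEGAN command per line, each ending in ';'."""
    lines = []
    for command in commands:
        command = command.strip()
        # The post stresses that each command must sit on its own line.
        if "\n" in command:
            raise ValueError("a command must not span several lines")
        if not command.endswith(";"):
            command += ";"
        lines.append(command)
    return "\n".join(lines) + "\n"


def megan_argv(command_file, megan="megan/MEGAN"):
    """Argument list matching the invocation quoted in option (a)."""
    return [megan, "-g", "-E", "-c", str(command_file)]


if __name__ == "__main__":
    text = render_command_file([
        "load taxGIFile='malt-data/gi_taxid_prot-2014Jan04.bin'",
        "import blastfile='data/megan/ecoli/x.blastx' meganfile='x.rma'",
        "quit",
    ])
    print(text, end="")
    print(" ".join(megan_argv("commands.txt")))
```

To use it, write the returned text to commands.txt (for example with pathlib.Path.write_text) and pass the argument list to subprocess.run.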

      • #4
        Hello

        Thanks for the answer, I am getting results now.

        By the way, where can I get that taxonomy file gi_taxid_prot-2014Jan04.bin? The one in the web page is from 2013 and it is getting very outdated.

        Thanks

        Javier

        • #5
          Dear Javier,

          I have just uploaded them on to the MEGAN5 download webpage:

          http://ab.inf.uni-tuebingen.de/data/...d/welcome.html

          Best wishes
          Daniel

          • #6
            Great, thanks Daniel!

            • #7
              Thank you Dr. Huson for your help!

              Unfortunately, I'm still floundering...

              The MEGAN command you've specified (megan/MEGAN -g -E -c command.txt) doesn't suppress the GUI and I get the following error:

              "MEGAN fatal error:
              java.awt.HeadlessException:
              No X11 DISPLAY variable was set, but this program performed an operation which requires it.
              java.awt.HeadlessException:
              No X11 DISPLAY variable was set, but this program performed an operation which requires it."


              When I specify "-g false" or "-g+" the error disappears, but nothing happens. No '.rma' file is created nor is there any standard output. Do you have any idea what might be causing this problem? I'm working with version 5.3.0

              Also, I noticed that in your command list you wrote "blastfile", though the manual uses camel case, like this: "import blastFile". It doesn't seem to matter for my problem (I tried both), but clarification might help whoever reads this thread.

              Thanks in advance!

              • #8
                MEGAN opens windows even in non-GUI mode because I couldn't get the program to work correctly otherwise.
                Run Megan using the Linux command xvfb-run on a server without graphical display.
                The exact syntax can be found in the MEGAN manual.
                I uploaded a new installer today that provides both a GUI executable and a non-GUI executable.
                Daniel

                • #9
                  GREAT! I have it working! Following your advice I was successful when I ran:

                  xvfb-run --auto-servernum --server-num=1 MEGAN -g -E -c command.txt

                  Command.txt
                  load taxGIFile='~/Metagenomes/megan/gi_taxid_nucl-2014Jan04.bin';
                  import blastFile=FILE.blast.out fastaFile=FILE.fa meganFile=FILE.rma blastFormat=BlastTAB;

                  (Note to reader: I use blast output format 7)
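
For scripts that run both on a desktop and on a headless server, the xvfb-run wrapping could be made conditional. This is only a sketch under one assumption: an unset or empty DISPLAY variable means no X server is available. The function name wrap_for_headless is hypothetical; the xvfb-run flags are copied from the command above.

```python
import os
import shutil


def wrap_for_headless(argv, environ=None, which=shutil.which):
    """Prefix argv with xvfb-run when no X11 display is available.

    environ and which are injectable so the logic can be checked
    without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    if environ.get("DISPLAY"):
        # A display is present (assumption: it is usable), run MEGAN directly.
        return list(argv)
    if which("xvfb-run") is None:
        raise RuntimeError("DISPLAY is not set and xvfb-run was not found on PATH")
    # Same flags as in the post above.
    return ["xvfb-run", "--auto-servernum", "--server-num=1"] + list(argv)
```

Calling wrap_for_headless(["MEGAN", "-g", "-E", "-c", "command.txt"]) returns a list that can be handed to subprocess.run.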

                  • #10
                    Important update

                    I was too quick to celebrate in the above post. I had nearly everything correct.

                    The command to initiate MEGAN from the prompt worked great:

                    xvfb-run --auto-servernum --server-num=1 MEGAN -g -E -c command.txt

                    BUT I was missing a key ingredient in my command file: since I am using BLAST output format 7, GI numbers have to be mapped to taxa. I had successfully loaded the "taxGIFile", but using it requires an additional argument in the "import blastFile" command.

                    Command.txt
                    load taxGIFile='~/Metagenomes/megan/gi_taxid_nucl-2014Jan04.bin';
                    import blastFile=FILE.blast.out fastaFile=FILE.fa meganFile=FILE.rma blastFormat=BlastTAB mapping='Taxonomy:BUILT_IN=true,Taxonomy:GI_MAP=true';

                    For simplification, I've made a python script that automates the command-line process. It is a pretty clunky script, but should do the trick. Find it with some documentation at my github account: https://github.com/Roli-Wilhelm.
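
As an illustration only (this is not the script from the GitHub account mentioned above), the import line with the extra mapping argument could be assembled like this. The function names are hypothetical; the parameter names and the mapping string are the ones shown in this post, and file paths are wrapped in single quotes as recommended earlier in the thread.

```python
def quote_path(path):
    """Wrap a path in single quotes for a MEGAN command file."""
    path = str(path)
    if "'" in path:
        raise ValueError("path contains a single quote: " + path)
    return "'" + path + "'"


def import_command(blast_file, megan_file, fasta_file=None, use_gi_map=True):
    """Build an 'import blastFile=...' command for BlastTAB input."""
    parts = ["import", "blastFile=" + quote_path(blast_file)]
    if fasta_file is not None:
        parts.append("fastaFile=" + quote_path(fasta_file))
    parts.append("meganFile=" + quote_path(megan_file))
    parts.append("blastFormat=BlastTAB")
    if use_gi_map:
        # The argument that was missing in the earlier post.
        parts.append("mapping='Taxonomy:BUILT_IN=true,Taxonomy:GI_MAP=true'")
    return " ".join(parts) + ";"


if __name__ == "__main__":
    print(import_command("FILE.blast.out", "FILE.rma", fasta_file="FILE.fa"))
```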

                    • #11
                      I have another question about running MEGAN from the command line. I have multiple samples I am trying to analyze. I am able to import from BLAST and generate the .rma files from a command-line script. I then have another command-line script that I use to try to extract the taxonomy assignments. I run the command:

                      megan/MEGAN -g -E -f MG100128.rma -c commands.txt

                      The file, commands.txt, looks like this:

                      collapse rank=SuperKingdom;
                      select nodes=all;
                      export what=DSV format=taxonname_count separator=tab counts=summarized file='MG100128_superkingdom.txt';
                      collapse rank=Phylum;
                      select nodes=leaves;
                      export what=DSV format=taxonname_count separator=tab counts=summarized file='MG100128_phyla.txt';
                      collapse rank=Genus;
                      select nodes=leaves;
                      export what=DSV format=taxonname_count separator=tab counts=summarized file='MG100128_genera.txt';
                      collapse rank=Species;
                      select nodes=leaves;
                      export what=DSV format=taxonname_count separator=tab counts=summarized file='MG100128_species.txt';
                      quit;

                      However, I only get output in MG100128_superkingdom.txt -- 0 lines are written to the other files. If I open MEGAN and do the commands manually, I get at least 1 line per output file (phylum, genus, and species level). Is there something I am missing in my command file?

                      Thanks for your help!
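
With several samples, the command file above could be generated per sample instead of edited by hand. The sketch below just reproduces the same commands for a given sample name; it does not address the empty-output problem described in this post. The names RANKS and rank_export_commands are hypothetical.

```python
# (rank, node selection, file suffix), copied from the command file above.
RANKS = [
    ("SuperKingdom", "all", "superkingdom"),
    ("Phylum", "leaves", "phyla"),
    ("Genus", "leaves", "genera"),
    ("Species", "leaves", "species"),
]


def rank_export_commands(sample, ranks=RANKS):
    """Return the collapse/select/export commands for one sample."""
    commands = []
    for rank, selection, suffix in ranks:
        commands.append("collapse rank=%s;" % rank)
        commands.append("select nodes=%s;" % selection)
        commands.append(
            "export what=DSV format=taxonname_count separator=tab "
            "counts=summarized file='%s_%s.txt';" % (sample, suffix)
        )
    commands.append("quit;")
    return commands


if __name__ == "__main__":
    # One command per line, as MEGAN expects in a command file.
    print("\n".join(rank_export_commands("MG100128")))
```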

                      • #12
                        Hey JSM!

                        I've done the same, but instead of "DSV" format, I've simply exported "paths".

                        My command file looks like:
                        open file='/home/user/MEGAN/FILE.rma';
                        export what=paths file='/home/user/MEGAN/EXPORT/FILE.taxonomy.export.txt';

                        The output contains single lines per read that are semi-colon delimited:

                        OM1_scaffold-50553; root; 100; cellular organisms; 100; Bacteria; 100; Actinobacteria <phylum>; 100; Actinobacteria; 100; Actinobacteridae; 100; Actinomycetales; 100; Catenulisporineae; 100; Catenulispora; 100; Catenulispora acidiphila; 100; Catenulispora acidiphila DSM 44928; 100;

                        For trouble-shooting, I recommend using the MEGAN GUI and doing the same analysis and selecting "Message Window" from "View." This way you can review all of the commands that MEGAN used. You can then use the manual to incorporate those commands into your command text.


                        Hope that helps!
                        Last edited by roliwilhelm; 09-03-2014, 07:53 AM.
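
For downstream processing, the semicolon-delimited "paths" lines shown above could be parsed roughly as follows. This is a sketch that assumes every line has the layout of the example: a read name followed by alternating taxon-name and number fields. The post does not say what the number means, so it is kept as an opaque value. The function name parse_path_line is hypothetical.

```python
def parse_path_line(line):
    """Split one exported path line into (read_name, [(taxon, value), ...])."""
    fields = [field.strip() for field in line.split(";")]
    # The example line ends with ';', which leaves an empty last field.
    if fields and fields[-1] == "":
        fields.pop()
    if not fields or not fields[0]:
        raise ValueError("empty line or missing read name")
    read_name, rest = fields[0], fields[1:]
    if len(rest) % 2 != 0:
        raise ValueError("expected alternating taxon/value fields")
    path = [(rest[i], float(rest[i + 1])) for i in range(0, len(rest), 2)]
    return read_name, path


if __name__ == "__main__":
    name, path = parse_path_line(
        "OM1_scaffold-50553; root; 100; cellular organisms; 100; Bacteria; 100;"
    )
    print(name, [taxon for taxon, _ in path])
```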

                        • #13
                          Thanks for the help! I got it to work both ways using xvfb-run.

                          • #14
                            Originally posted by Daniel.Huson:
                            MEGAN opens windows even in non-GUI mode because I couldn't get the program to work correctly otherwise.
                            Run Megan using the Linux command xvfb-run on a server without graphical display.
                            The exact syntax can be found in the MEGAN manual.
                            I uploaded a new installer today that provides both a GUI executable and a non-GUI executable.
                            Daniel

                            Hello Daniel,

                            I have downloaded 5.3.3 (both Linux and Windows) and I cannot find the command-line version of MEGAN (I would really like to get rid of xvfb in my scripts).
                            In the version I have, it is mentioned in the manual but does not exist in the file system. Can you please clarify?

                            many thanks

                            Paul

                            • #15
                              Hello friends

                              I am having trouble exporting COG data using a command file. I can export taxonomy data with no problem. Here is my command file:

                              load taxGIFile='gi_taxid_prot-2014Jan04.bin';
                              load cogRefSeqFile='ref2cog.map';
                              import blastFile='my.blastx' meganFile='megan.rma' maxMatches=100 minScore=50.0 maxExpected=0.01 topPercent=10.0 minSupport=50 minComplexity=0.0 useMinimalCoverageHeuristic=false useSeed=false useCOG=true useKegg=false paired=false useIdentityFilter=false textStoragePolicy=Embed blastFormat=BlastTAB mapping='Taxonomy:BUILT_IN=true,Taxonomy:GI_MAP=true,COG:REFSEQ_MAP=true';
                              show window=cogViewer;
                              select nodes=leaves;
                              export what=DSV format=cogname_readname separator=tab file='m_cogname.txt';


                              And this is the result. As you will see, COGs are assigned correctly but not exported. I think the problem is that the command show window=cogViewer is not working properly, so the tool remains stuck on the phylogenetic data, and therefore exporting COG names does not make sense. Any help on this would be very welcome!


                              Executing: load treeFile='ncbi.tre';
                              Loading mapping file: ncbi.map
                              Reading file: ncbi.map: 913476
                              Loading taxonomy file: ncbi.tre
                              Reading file: ncbi.tre: 913475
                              Executing: ;
                              Executing: load taxGIFile='gi_taxid_prot-2014Jan04.bin';
                              GI lookup file 'gi_taxid_prot-2014Jan04.bin': 142099258 entries
                              100% (0.0s)
                              Executing: load cogRefSeqFile='ref2cog.map';
                              Loading cog.map: 94592
                              Loading cog.tre: 94592
                              Loading RefSeq2IdMap from file: ref2cog.map
                              done (847997 entries)
                              100% (1.5s)
                              Executing: import blastFile='my.blastx' meganFile='megan.rma' maxMatches=100 minScore=50.0 maxExpected=0.01 topPercent=10.0 minSupport=50 minComplexity=0.0 useMinimalCoverageHeuristic=false useSeed=false useCOG=true useKegg=false paired=false useIdentityFilter=false textStoragePolicy=Embed blastFormat=BlastTAB mapping='Taxonomy:BUILT_IN=true,Taxonomy:GI_MAP=true,COG:REFSEQ_MAP=true';
                              Deleting existing file: /home/tamames/metagenomes/MGRAST/megan.rma
                              Importing data:
                              Importing data: 0 reads file(s), 1 blast file(s)
                              Input format: BlastTAB
                              TextStoragePolicy: Embed matches and reads in MEGAN file
                              Will stream through reads, not load them into memory (assumes that reads occur in same order in BLAST and FASTA files)
                              Processing my.blastx
                              Processing BlastTAB file(s)
                              Note: Reads file(s) not given or found, RMA file will not contain read sequences.
                              Total reads: 10633
                              Total no-hits: 10633
                              Total matches: 1251343
                              Matches discarded: 891770
                              Parsing required 18 seconds
                              Running Data analyzer: Init
                              Analyzing all matches
                              Applying min-support filter
                              Number of changes due to min-support filter: 308
                              Number of reads: 5317
                              Low complexity: 0
                              With valid hits: 4732
                              With COG-ids: 3495
                              Writing classification tables
                              Number of taxa identified: 56
                              Number of COG classes identified: 1575
                              Syncing
                              Data processor required: 2 secs
                              Executing: export what=DSV format=cogname_readname separator=tab file='m_cogname.txt';
                              Export in DSV format: Initializing
                              done 0
                              Message - Wrote 0 line(s) to file: m_cogname.txt
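
When MEGAN runs unattended, an empty export like the one above is easy to miss. A small check over the captured output could flag it. This sketch assumes the message always has the wording seen in this log ("Wrote N line(s) to file: NAME"); the function name empty_exports is hypothetical.

```python
import re

# Pattern taken from the log line "Message - Wrote 0 line(s) to file: m_cogname.txt".
WROTE = re.compile(r"Wrote (\d+) line\(s\) to file: (.+)")


def empty_exports(log_text):
    """Return the names of export files that MEGAN reported as having 0 lines."""
    empty = []
    for line in log_text.splitlines():
        match = WROTE.search(line)
        if match and int(match.group(1)) == 0:
            empty.append(match.group(2).strip())
    return empty


if __name__ == "__main__":
    log = "Export in DSV format: Initializing\nMessage - Wrote 0 line(s) to file: m_cogname.txt\n"
    print(empty_exports(log))
```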
