  • simpson
    Member
    • Dec 2012
    • 11

    parse xml to tabular format

    Hi, I've run blastx with the -m 7 option, so the output file is XML.
    I'd like to work out what percentage of my unigene set has a BLAST hit, so I want to convert the .xml file to another format.
    Does anyone know of a script that can do this?
    Thanks,
    Vivi
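For the percentage itself, the XML can also be counted directly without converting it first. The following is a minimal sketch, not something posted in this thread: it assumes the standard BLAST XML layout in which each query is an `Iteration` element and each match is a `Hit` inside `Iteration_hits`. The function name and the two-query sample are made up for illustration. If your BLAST version does not write an `Iteration` for queries without hits, take the total from the number of sequences in the input FASTA file instead.

```python
import io
import xml.etree.ElementTree as ET


def count_queries_with_hits(source):
    """Return (queries_with_hits, total_queries) for BLAST XML.

    `source` is a file path or a binary file object.
    """
    total = 0
    with_hits = 0
    # iterparse streams the document, so large files are not loaded at once.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "Iteration":
            total += 1
            if elem.find("Iteration_hits/Hit") is not None:
                with_hits += 1
            elem.clear()  # drop the finished query to keep memory use low
    return with_hits, total


# Made-up two-query example: the first query has one hit, the second none.
SAMPLE = b"""<BlastOutput><BlastOutput_iterations>
<Iteration><Iteration_query-def>unigene1</Iteration_query-def>
<Iteration_hits><Hit><Hit_id>subject1</Hit_id></Hit></Iteration_hits></Iteration>
<Iteration><Iteration_query-def>unigene2</Iteration_query-def>
<Iteration_hits></Iteration_hits></Iteration>
</BlastOutput_iterations></BlastOutput>"""

hits, total = count_queries_with_hits(io.BytesIO(SAMPLE))
print("%d of %d queries have a hit (%.1f%%)" % (hits, total, 100.0 * hits / total))
```

For a real file, pass its path instead of the sample, for example `count_queries_with_hits("example.xml")`.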
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    See this thread on Biostar: http://www.biostars.org/p/7290/

    Stand-alone parser: https://github.com/pjotrp/blastxmlparser
    Last edited by GenoMax; 04-30-2013, 09:35 AM.


    • simpson
      Member
      • Dec 2012
      • 11

      #3
      Originally posted by GenoMax
      See this thread on Biostar: http://www.biostars.org/p/7290/

      Stand-alone parser: https://github.com/pjotrp/blastxmlparser
      I had already seen that website, but it doesn't seem to work. Do you have a more specific script and command that I can use directly?


      • maubp
        Peter (Biopython etc)
        • Jul 2009
        • 1544

        #4
        If you specifically want to regenerate the BLAST+ tabular output, try this (standalone) Python script of mine:
        https://bitbucket.org/peterjc/galaxy...ar.py?at=tools *DEAD LINK*
        https://github.com/peterjc/galaxy_bl..._to_tabular.py *NEW LINK*

        This is available with a Galaxy wrapper on the Galaxy Tool Shed as part of the BLAST+ suite:
        http://toolshed.g2.bx.psu.edu/view/d...cbi_blast_plus
        Last edited by maubp; 04-17-2014, 03:41 AM. Reason: Updating link; I moved this code from BitBucket to GitHub


        • simpson
          Member
          • Dec 2012
          • 11

          #5
          Originally posted by maubp
          If you specifically want to regenerate the BLAST+ tabular output, try this (standalone) Python script of mine:


          This is available with a Galaxy wrapper on the Galaxy Tool Shed as part of the BLAST+ suite:
          http://toolshed.g2.bx.psu.edu/view/d...cbi_blast_plus
          Thanks. Another question: how do I run the Python script? Could you give me the command, please?


          • maubp
            Peter (Biopython etc)
            • Jul 2009
            • 1544

            #6
            Originally posted by simpson
            Thanks. Another question: how do I run the Python script? Could you give me the command, please?
            Download the 'raw' python script and save it in your folder as blastxml_to_tabular.py then:

            Code:
            $ python blastxml_to_tabular.py
            Expect 3 arguments: input BLAST XML file, output tabular file, out format (std or ext)
            For example, if you want the standard 12 column tab separated variables,

            Code:
            $ python blastxml_to_tabular.py example.xml example.tsv std
            If you want more details, it does an extended 24 column output mode too:

            Code:
            $ python blastxml_to_tabular.py example.xml example.tsv ext
            The command line interface was deliberately minimal as this was intended primarily for use via the Galaxy interface.
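Once the tabular file exists, the percentage asked about in the first post can be sketched as follows. This is an illustrative example, not part of the script above: it assumes the query ID is the first tab-separated column of the tabular output and that the total number of unigenes is taken by counting `>` header lines in the query FASTA file. The function names and the sample data are made up.

```python
def query_ids_with_hits(tabular_lines):
    """Collect the unique query IDs (column 1) from BLAST tabular lines."""
    ids = set()
    for line in tabular_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        ids.add(line.split("\t")[0])
    return ids


def count_fasta_records(fasta_lines):
    """Count sequences in FASTA text by counting header lines."""
    return sum(1 for line in fasta_lines if line.startswith(">"))


def percent_with_hits(tabular_lines, fasta_lines):
    """Percentage of FASTA records that appear as a query in the table."""
    n_hit = len(query_ids_with_hits(tabular_lines))
    n_total = count_fasta_records(fasta_lines)
    return 100.0 * n_hit / n_total if n_total else 0.0


# Made-up data: four unigenes, two of which have at least one hit.
TABULAR = [
    "unigene1\tsubjA\t98.5\t200\t3\t0\t1\t600\t10\t209\t1e-50\t350\n",
    "unigene1\tsubjB\t90.0\t180\t18\t0\t1\t540\t5\t184\t1e-30\t250\n",
    "unigene3\tsubjC\t75.0\t100\t25\t0\t1\t300\t1\t100\t1e-10\t120\n",
]
FASTA = [">unigene1\n", "ATGC\n", ">unigene2\n", "ATGC\n",
         ">unigene3\n", "ATGC\n", ">unigene4\n", "ATGC\n"]

print("%.1f%% of unigenes have a hit" % percent_with_hits(TABULAR, FASTA))
```

With real files, open file handles can be passed in place of the lists, for example `percent_with_hits(open("example.tsv"), open("unigenes.fasta"))`, where `unigenes.fasta` is a placeholder name for the query file.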


            • simpson
              Member
              • Dec 2012
              • 11

              #7
              Traceback (most recent call last):
              File "xml.py", line 70, in <module>
              import xml.etree.cElementTree as ElementTree
              File "/scratch/hpc/tianwenl/blastall/xml.py", line 70, in <module>
              import xml.etree.cElementTree as ElementTree
              ImportError: No module named etree.cElementTree
              [tianwenl@submit1 blastall]$ module load python
              [tianwenl@submit1 blastall]$ python xml.py jatropha.unigene20.nr.xml jatropha.tabular.tsv ext
              Traceback (most recent call last):
              File "xml.py", line 70, in <module>
              import xml.etree.cElementTree as ElementTree
              File "/scratch/hpc/tianwenl/blastall/xml.py", line 70, in <module>
              import xml.etree.cElementTree as ElementTree
              ImportError: No module named etree.cElementTree
              [tianwenl@submit1 blastall]$ python xml.py jatropha.unigene20.nr.xml jatropha.tabular.tsv std
              Traceback (most recent call last):
              File "xml.py", line 70, in <module>
              import xml.etree.cElementTree as ElementTree
              File "/scratch/hpc/tianwenl/blastall/xml.py", line 70, in <module>
              import xml.etree.cElementTree as ElementTree
              ImportError: No module named etree.cElementTree
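One possible reading of this traceback, which nobody in the thread confirms: the script was saved as xml.py rather than blastxml_to_tabular.py. A file called xml.py in the script's directory takes precedence over Python's standard `xml` package, so `import xml.etree.cElementTree` would load the script itself, which would fit line 70 of xml.py appearing twice in each traceback. The sketch below is a hypothetical helper, not part of the script, that checks whether a script's file name collides with a standard library module. It relies on `sys.stdlib_module_names`, which exists only in Python 3.10 and later, and falls back to a short hand-picked list otherwise.

```python
import os
import sys

# Small fallback list for interpreters older than Python 3.10 (an assumption
# made for this sketch; it is far from complete).
_FALLBACK_NAMES = frozenset(["xml", "json", "csv", "re", "os", "sys", "string"])


def shadows_stdlib(script_path):
    """Return True if the script's file name matches a stdlib module name."""
    name = os.path.splitext(os.path.basename(script_path))[0]
    stdlib_names = getattr(sys, "stdlib_module_names", _FALLBACK_NAMES)
    return name in stdlib_names


for path in ["/scratch/hpc/tianwenl/blastall/xml.py", "blastxml_to_tabular.py"]:
    status = "shadows a standard module" if shadows_stdlib(path) else "looks safe"
    print("%s: %s" % (path, status))
```

If this is the cause, renaming the file to blastxml_to_tabular.py (and deleting any leftover xml.pyc in the same folder) should be enough.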


              • maubp
                Peter (Biopython etc)
                • Jul 2009
                • 1544

                #8
                What version of python do you have?


                • simpson
                  Member
                  • Dec 2012
                  • 11

                  #9
                  Originally posted by maubp
                  What version of python do you have?
                  It is Biopython 1.59.


                  • maubp
                    Peter (Biopython etc)
                    • Jul 2009
                    • 1544

                    #10
                    This script doesn't use Biopython - I meant which version of Python do you have? e.g. python 2.5?


                    • simpson
                      Member
                      • Dec 2012
                      • 11

                      #11
                      Originally posted by maubp
                      This script doesn't use Biopython - I meant which version of Python do you have? e.g. python 2.5?
                      sorry - -

                      Python 2.6.6 (r266:84292, Oct 12 2012, 14:23:48)


                      • maubp
                        Peter (Biopython etc)
                        • Jul 2009
                        • 1544

                        #12
                        Strange. The cElementTree library would normally be included with Python, however I've updated the script to fall back on the pure Python ElementTree library instead. Could you try that please (same link - that points at the latest version)? Thanks.
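A fallback of the kind described here is usually written as a guarded import. The snippet below is a generic sketch of that pattern and may differ from what the updated script actually does. The small XML string is made up to show that both modules offer the same interface. On current Python versions the fallback branch is always taken, because `xml.etree.cElementTree` was removed in Python 3.9.

```python
try:
    # C-accelerated parser, present in Python 2.5 to 3.8
    import xml.etree.cElementTree as ElementTree
except ImportError:
    # Pure Python module with the same API (accelerated internally on Python 3)
    import xml.etree.ElementTree as ElementTree

# Either import gives the same interface to the rest of the script.
root = ElementTree.fromstring("<Hit><Hit_id>example</Hit_id></Hit>")
print(root.findtext("Hit_id"))
```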


                        • simpson
                          Member
                          • Dec 2012
                          • 11

                          #13
                          it's working now!!!
                          Thank you very much!!


                          Originally posted by maubp
                          Strange. The cElementTree library would normally be included with Python, however I've updated the script to fall back on the pure Python ElementTree library instead. Could you try that please (same link - that points at the latest version)? Thanks.


                          • amitbik
                            Member
                            • May 2013
                            • 53

                            #14
                            Hi maubp,

                            I want to convert BLAST XML output to tabular form. I followed your link, but it does not open.
                            Can you send the link again?

                            Thank you.


                            • maubp
                              Peter (Biopython etc)
                              • Jul 2009
                              • 1544

                              #15
                              Originally posted by amitbik
                              Hi maubp,

                              I want to convert BLAST XML output to tabular form. I followed your link, but it does not open.
                              Can you send the link again?

                              Thank you.
                              Sorry - the old BitBucket link is dead now; that code has moved to GitHub:
                              https://github.com/peterjc/galaxy_bl..._to_tabular.py
