  • Blast > parsing result in Excel

    Hi everybody,

    I have a BLAST (-m 1) result file that looks like this:

    Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
    Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
    "Gapped BLAST and PSI-BLAST: a new generation of protein database search
    programs", Nucleic Acids Res. 25:3389-3402.

    Query= 132-291
    (59 letters)

    Database: Scrivania/orchidea/mature_mirBase.fa
    21,643 sequences; 470,608 total letters

    Searching..................................................done



    Score E
    Sequences producing significant alignments: (bits) Value

    mtr-miR2644b MIMAT0013413 Medicago truncatula miR2644b 28 0.031
    mtr-miR2644a MIMAT0013412 Medicago truncatula miR2644a 28 0.031
    gga-miR-1704 MIMAT0007596 Gallus gallus miR-1704 22 1.9
    gga-miR-1557 MIMAT0007414 Gallus gallus miR-1557 22 1.9
    mmu-miR-880-5p MIMAT0017266 Mus musculus miR-880-5p 22 1.9

    132_0 8 cagccgctcagattgatggtgcctacagccttgccagcccgctcagattgat 59
    12631 5 .............. 18
    12630 5 .............. 18
    7826 5 ........... 15
    7644 19 ........... 9
    5394 3 ........... 13
    5394 3 ........... 13
    BLASTN 2.2.21 [Jun-14-2009]


    Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
    ...
    ....
    ..........

    ______________________________________________________________
    I need to parse this into an Excel sheet with these columns:

    1) ID 2) Name of the hit 3) E-value 4) Score 5) Species

    For example:

    1) 132-291 2) miR2644b 3) 0.031 4) 28 5) Medicago truncatula


    Is it possible, starting from a big BLAST result file, to get an Excel sheet with 5 columns where each row holds the first hit of one BLAST result? Can anyone help me solve this problem? A small Perl script would also be fine.


    Thank you very much

  • #2
    Use -m 8 for tabular output and then import it into Excel.
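
    If only the first hit per query is wanted, the -m 8 table can be trimmed before it goes into Excel. Below is a minimal Python sketch, assuming the usual 12-column layout (query, subject, % identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score) and that hits for each query are listed best first. The function names and the sample rows are made up for illustration. As far as I know the table carries only the hit identifier, not its full description.

```python
import csv
import io

# Column positions in the 12-column -m 8 table (0-based).
QUERY, SUBJECT, EVALUE, BITSCORE = 0, 1, 10, 11


def first_hits(lines):
    """Yield (query, hit, evalue, bitscore) for the first row of each query."""
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):  # skip blank and comment lines
            continue
        cols = line.split("\t")
        if cols[QUERY] in seen:  # a later, weaker hit for a query already reported
            continue
        seen.add(cols[QUERY])
        yield (cols[QUERY], cols[SUBJECT],
               cols[EVALUE].strip(), cols[BITSCORE].strip())


def to_csv(rows):
    """Return the rows as CSV text that Excel can open."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["query", "hit", "evalue", "bitscore"])
    writer.writerows(rows)
    return out.getvalue()


# Made-up rows in -m 8 layout, for illustration only.
SAMPLE = (
    "132-291\tmtr-miR2644b\t100.00\t14\t0\t0\t5\t18\t5\t18\t0.031\t28.2\n"
    "132-291\tmtr-miR2644a\t100.00\t14\t0\t0\t5\t18\t5\t18\t0.031\t28.2\n"
    "133-100\tgga-miR-1704\t100.00\t11\t0\t0\t5\t15\t5\t15\t1.9\t22.3\n"
)

if __name__ == "__main__":
    print(to_csv(first_hits(SAMPLE.splitlines())))
```

    For a real file, pass an open file handle to first_hits() instead of SAMPLE.splitlines() and write the text returned by to_csv() to a .csv file.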

    • #3
      I know the -m 8 view, but it gives me a different result from -m 1 and lacks some of the information. So I am asking for a small script to handle the text file and parse it into Excel.

      • #4
        You really don't want to try and use Excel for parsing plain text BLAST output. Parsing plain text BLAST output is annoying enough in a proper language like Perl or Python - BioPerl, Biopython and the NCBI don't recommend it. Instead, they recommend using the tabular output (simpler) or the XML output (richer).

        Note BLAST+ lets you request quite a lot of extra columns of information in the tabular output. If that still isn't enough, I would write a script (not using Excel) to parse the extra information from the XML BLAST output.
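
        As a rough illustration of the XML route, here is a Python sketch that uses only the standard library and pulls the first hit of each query out of BLAST XML. The element names (Iteration, Hit, Hit_def, Hsp_evalue, Hsp_bit-score) are the ones used in BLAST XML output. The species split is an assumption: it expects Hit_def to hold a miRBase-style header of the form "name accession Genus species miRNA". The function name and the trimmed sample document are made up for the example.

```python
import io
import xml.etree.ElementTree as ET


def first_hits_from_xml(source):
    """Return one dict per query that has hits, built from its first <Hit>."""
    rows = []
    # iterparse streams the file, so a big result does not need to fit in memory.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "Iteration":
            continue
        hit = elem.find("Iteration_hits/Hit")  # first hit, or None if no hits
        if hit is not None:
            words = hit.findtext("Hit_def", default="").split()
            rows.append({
                "query": elem.findtext("Iteration_query-def", default=""),
                "hit": words[0] if words else "",
                # Assumption: header is "name accession Genus species miRNA".
                "species": " ".join(words[2:-1]),
                "evalue": hit.findtext("Hit_hsps/Hsp/Hsp_evalue", default=""),
                "bitscore": hit.findtext("Hit_hsps/Hsp/Hsp_bit-score", default=""),
            })
        elem.clear()  # free the finished record
    return rows


# Hand-written, heavily trimmed sample; real files contain many more elements.
SAMPLE_XML = b"""<BlastOutput><BlastOutput_iterations>
<Iteration>
  <Iteration_query-def>132-291</Iteration_query-def>
  <Iteration_hits>
    <Hit>
      <Hit_def>mtr-miR2644b MIMAT0013413 Medicago truncatula miR2644b</Hit_def>
      <Hit_hsps><Hsp>
        <Hsp_bit-score>28.2</Hsp_bit-score>
        <Hsp_evalue>0.031</Hsp_evalue>
      </Hsp></Hit_hsps>
    </Hit>
  </Iteration_hits>
</Iteration>
<Iteration>
  <Iteration_query-def>no-hit-query</Iteration_query-def>
  <Iteration_hits></Iteration_hits>
</Iteration>
</BlastOutput_iterations></BlastOutput>"""

if __name__ == "__main__":
    for row in first_hits_from_xml(io.BytesIO(SAMPLE_XML)):
        print(row)
```

        The function also accepts a file name in place of the BytesIO object. Queries without hits are skipped here; whether to report them as empty rows is a design choice.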

        In fact, you really shouldn't want to use Excel for Bioinformatics in the first place. One very nicely documented reason is here http://dx.doi.org/10.1186/1471-2105-5-80
        Last edited by maubp; 11-15-2011, 04:57 AM. Reason: Note about BLAST+ extra columns in tabular output; recommendation

        • #5
          Thank you very much, I did not know about this limitation. I'll read the paper.

          Cheers

          • #6
            I see you're opting for Perl, in which case using BioPerl to parse the BLAST text output is a very good idea: http://lists.open-bio.org/pipermail/...er/035872.html
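
            For anyone without BioPerl at hand, the same first-hit extraction can be roughed out in plain Python. This is only a sketch, not the BioPerl approach from the link: it assumes the legacy text layout shown in the first post, where each record has a "Query=" line and a "Sequences producing significant alignments" summary whose lines end in the score and the E-value. It is fragile in exactly the way described earlier in the thread (query names wrapped over several lines would break it, for instance). The function name is made up.

```python
def first_hits_from_text(lines):
    """Yield (query, hit description, score, evalue) for the top summary line."""
    query = None
    in_summary = False
    for line in lines:
        text = line.strip()
        if text.startswith("Query="):
            query = text[len("Query="):].strip()
            in_summary = False
        elif text.startswith("Sequences producing significant alignments"):
            in_summary = True  # the next non-blank line is the best hit
        elif in_summary and text:
            parts = text.split()
            if len(parts) >= 3:
                # The last two fields are the bit score and the E-value.
                yield query, " ".join(parts[:-2]), parts[-2], parts[-1]
            in_summary = False  # ignore the remaining hits of this query


# Lines copied from the example in the first post.
SAMPLE_TEXT = """\
Query= 132-291
(59 letters)

Score E
Sequences producing significant alignments: (bits) Value

mtr-miR2644b MIMAT0013413 Medicago truncatula miR2644b 28 0.031
mtr-miR2644a MIMAT0013412 Medicago truncatula miR2644a 28 0.031
"""

if __name__ == "__main__":
    for row in first_hits_from_text(SAMPLE_TEXT.splitlines()):
        print("\t".join(row))
```

            The tab-separated lines it prints can be saved to a file and opened in Excel. E-values are kept as strings on purpose, so that whatever notation BLAST printed is passed through unchanged.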

            • #7
              Yes, indeed! Thank you very much!
