  • BLAST > parsing results in Excel

    Hi everybody,

    I have a BLAST (-m 1) result file that looks like this:

    Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
    Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
    "Gapped BLAST and PSI-BLAST: a new generation of protein database search
    programs", Nucleic Acids Res. 25:3389-3402.

    Query= 132-291
    (59 letters)

    Database: Scrivania/orchidea/mature_mirBase.fa
    21,643 sequences; 470,608 total letters

    Searching..................................................done



    Score E
    Sequences producing significant alignments: (bits) Value

    mtr-miR2644b MIMAT0013413 Medicago truncatula miR2644b 28 0.031
    mtr-miR2644a MIMAT0013412 Medicago truncatula miR2644a 28 0.031
    gga-miR-1704 MIMAT0007596 Gallus gallus miR-1704 22 1.9
    gga-miR-1557 MIMAT0007414 Gallus gallus miR-1557 22 1.9
    mmu-miR-880-5p MIMAT0017266 Mus musculus miR-880-5p 22 1.9

    132_0 8 cagccgctcagattgatggtgcctacagccttgccagcccgctcagattgat 59
    12631 5 .............. 18
    12630 5 .............. 18
    7826 5 ........... 15
    7644 19 ........... 9
    5394 3 ........... 13
    5394 3 ........... 13
    BLASTN 2.2.21 [Jun-14-2009]


    Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
    ...
    ....
    ..........

    ______________________________________________________________
    I need to parse this into an Excel sheet with these columns:

    1)ID 2)Name of the hit 3)E-value 4)Score 5)Species


    1) 132-291 2)mir2644b 3) 0,031 4)28 5) Medicago truncatula


    Is it possible, from a big BLAST result file, to obtain an Excel sheet with 5 columns where each row holds the first hit of one BLAST result? Can anyone help me solve this problem? A little Perl script would be fine too.


    Thank you very much

  • #2
    Use -m 8 for tabular output and then import it into Excel.
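A minimal Python sketch of that route, assuming the standard 12-column tabular layout and that the first row listed for each query is its best hit. The function names and the sample lines are hypothetical; the sample values only loosely echo the first post.

```python
import csv
import io

# The 12 standard columns of BLAST tabular output (-m 8 in legacy blastall).
COLUMNS = ["query", "subject", "pct_identity", "length", "mismatches",
           "gap_opens", "q_start", "q_end", "s_start", "s_end",
           "evalue", "bit_score"]


def first_hits(lines):
    """Yield the first row seen for each query ID, as a dict."""
    seen = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(COLUMNS, line.split("\t")))
        if row["query"] not in seen:
            seen.add(row["query"])
            yield row


def to_csv(rows):
    """Return CSV text that Excel can open, one row per query."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["ID", "Hit", "E-value", "Score"])
    for row in rows:
        writer.writerow([row["query"], row["subject"],
                         row["evalue"], row["bit_score"]])
    return out.getvalue()


# Illustrative lines only: the query ID 132-292, the coordinates and the
# decimal scores are invented placeholders, not real BLAST output.
SAMPLE = [
    "132-291\tmtr-miR2644b\t100.00\t14\t0\t0\t5\t18\t8\t21\t0.031\t28.2",
    "132-291\tmtr-miR2644a\t100.00\t14\t0\t0\t5\t18\t8\t21\t0.031\t28.2",
    "132-292\tgga-miR-1704\t100.00\t11\t0\t0\t5\t15\t3\t13\t1.9\t22.3",
]

print(to_csv(first_hits(SAMPLE)))
```

Note that the tabular format carries only the hit ID, not its description, so a species column would have to be looked up separately, for example from the FASTA headers of the database.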


    • #3
      I know the -m 8 view, but it gives me a different result from -m 1 and lacks some information. So I am asking for a little script to handle the text file and parse it into Excel.


      • #4
        You really don't want to try and use Excel for parsing plain text BLAST output. Parsing plain text BLAST output is annoying enough in a proper language like Perl or Python - BioPerl, Biopython and the NCBI don't recommend it. Instead, they recommend using the tabular output (simpler) or the XML output (richer).

        Note BLAST+ lets you request quite a lot of extra columns of information in the tabular output. If that still isn't enough, I would write a script (not using Excel) to parse the extra information from the XML BLAST output.
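As a rough illustration of the XML route, here is a sketch using only Python's standard library. It assumes a single well-formed BLAST XML document, and that Hit_def holds the full description line in the miRBase layout "name accession species... name" (whether it does depends on how the database was formatted). The function name and the embedded sample are hypothetical; the sample is trimmed to the few elements the code reads.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily trimmed sample for illustration; real BLAST XML
# contains many more elements. The bit score decimal is a placeholder.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>132-291</Iteration_query-def>
      <Iteration_hits>
        <Hit>
          <Hit_def>mtr-miR2644b MIMAT0013413 Medicago truncatula miR2644b</Hit_def>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>28.2</Hsp_bit-score>
              <Hsp_evalue>0.031</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
        <Hit>
          <Hit_def>mtr-miR2644a MIMAT0013412 Medicago truncatula miR2644a</Hit_def>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>28.2</Hsp_bit-score>
              <Hsp_evalue>0.031</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def first_hits_from_xml(xml_text):
    """Return one dict per query, describing its first hit and first HSP."""
    root = ET.fromstring(xml_text)
    rows = []
    for iteration in root.iter("Iteration"):
        hit = iteration.find("Iteration_hits/Hit")
        if hit is None:
            continue  # this query had no hits
        hsp = hit.find("Hit_hsps/Hsp")
        tokens = hit.findtext("Hit_def").split()
        rows.append({
            "id": iteration.findtext("Iteration_query-def"),
            "hit": tokens[0],
            "evalue": float(hsp.findtext("Hsp_evalue")),
            "score": float(hsp.findtext("Hsp_bit-score")),
            # Assumed layout: everything between the accession and the
            # trailing miRNA name is the species.
            "species": " ".join(tokens[2:-1]),
        })
    return rows


for row in first_hits_from_xml(SAMPLE_XML):
    print(row)
```

The rows can then be written out with the csv module and opened in a spreadsheet if that is really needed.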

        In fact, you really shouldn't want to use Excel for Bioinformatics in the first place. One very nicely documented reason is here http://dx.doi.org/10.1186/1471-2105-5-80
        Last edited by maubp; 11-15-2011, 04:57 AM. Reason: Note about BLAST+ extra columns in tabular output; recommendation


        • #5
          Thank you very much. I did not know about this limitation; I'll read the paper.

          Cheers


          • #6
            I see you're opting for Perl, in which case using BioPerl to parse the BLAST text output is a very good idea: http://lists.open-bio.org/pipermail/...er/035872.html
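For readers without BioPerl, the same first-hit extraction can be sketched in plain Python. This is not the BioPerl approach from the link, only a fragile illustration that assumes the -m 1 layout shown in the first post: a "Query=" line, then a "Sequences producing significant alignments" header, then one summary line per hit ending in the score and the E-value. The function name is hypothetical. Values are kept as strings, and truncated description lines would break the species guess.

```python
def first_hits_from_text(lines):
    """Return one dict per query from the summary table of a text report."""
    rows = []
    query = None
    expect_hit = False
    for line in lines:
        text = line.strip()
        if text.startswith("Query="):
            query = text[len("Query="):].strip()
            expect_hit = False
        elif text.startswith("Sequences producing significant alignments"):
            expect_hit = True
        elif expect_hit and text:
            # First non-blank line after the header is the best hit.
            tokens = text.split()
            rows.append({
                "id": query,
                "hit": tokens[0],
                "evalue": tokens[-1],
                "score": tokens[-2],
                # Assumed layout: name, accession, species words,
                # miRNA name, score, E-value.
                "species": " ".join(tokens[2:-3]),
            })
            expect_hit = False
    return rows


# Lines copied from the report in the first post.
SAMPLE = [
    "Query= 132-291",
    "(59 letters)",
    "",
    "Score E",
    "Sequences producing significant alignments: (bits) Value",
    "",
    "mtr-miR2644b MIMAT0013413 Medicago truncatula miR2644b 28 0.031",
    "mtr-miR2644a MIMAT0013412 Medicago truncatula miR2644a 28 0.031",
]

for row in first_hits_from_text(SAMPLE):
    print(row)
```

Queries with no hits never reach the header line, so they simply produce no row.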


            • #7
              Yes, indeed! Thank you very much!
