  • SAM(tools) and BLAST

    Hi there,

    I am currently trying out different aligners for short and medium-length metagenomic sequences, and SAM seems to be a good intermediate format for my analyses. For sensitivity reasons my current reference is based on BLAST, but I cannot convert any BLAST result file into SAM format.

    The Perl script blast2sam.pl contained in the samtools release 0.1.17 does not really work and its messages aren't very informative. Does anybody have experience with BLAST and samtools? I tried different BLAST output formats but none of them worked.

    -------------------------------------------------------------------------

    > blast2sam.pl test.blastn
    Use of uninitialized value $qend in subtraction (-) at /local/programs/samtools/blast2sam.pl line 58, <> line 1823.
    Use of uninitialized value $qlen in subtraction (-) at /local/programs/samtools/blast2sam.pl line 58, <> line 1823.
    Use of uninitialized value in substr at /local/programs/samtools/blast2sam.pl line 63, <> line 1823.
    Use of uninitialized value in concatenation (.) or string at /local/programs/samtools/blast2sam.pl line 63, <> line 1823.
    Use of uninitialized value in bitwise and (&) at /local/programs/samtools/blast2sam.pl line 65, <> line 1823.
    Use of uninitialized value $sam in join or string at /local/programs/samtools/blast2sam.pl line 72, <> line 1823.
    Use of uninitialized value $sam in join or string at /local/programs/samtools/blast2sam.pl line 72, <> line 1823.
    Use of uninitialized value $sam in join or string at /local/programs/samtools/blast2sam.pl line 72, <> line 1823.
    255 M * 0 0 * *

  • #2
    BLAST support will be dropped unless someone wants to maintain it. I realize that it would be better to have less functionality, to avoid being blamed for having too many bugs. I just thought this script might occasionally be useful to someone, but it is now causing more trouble than good. Sorry.


    • #3
      more verbose

      I was aware that the script is not mature, but to understand how it works I need some more information on its usage, e.g. which BLAST output format the program expects as input. I was hoping someone could give me a hint in the right direction before I start reading through the Perl code.

      Since I have settled on the SAM format, I may well end up writing my own converter from BLAST to SAM.

      How about keeping the script in svn but not including it in the release tarballs?


      • #4
        The blast2sam.pl script works for me on the default verbose text output from blastall 2.2.19.

        SAM format is well documented and easy to produce. BioPerl has a solid parser for blast output. How about writing an output module for SAM format? Bio::AlignIO::sam would be great to have!
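
To illustrate how little is needed to produce a SAM record, here is a minimal sketch in Python of the core of such a converter. It is not what blast2sam.pl or BioPerl does. It assumes the hit has already been parsed into the gapped query and subject rows plus 1-based coordinates (for example from a tabular BLAST report that includes the aligned sequences), and it handles forward-strand hits only. The function names `cigar_from_alignment` and `sam_line` are hypothetical.

```python
from itertools import groupby


def cigar_from_alignment(qseq: str, sseq: str) -> str:
    """Build a CIGAR string from two gapped alignment rows of equal length."""
    if len(qseq) != len(sseq):
        raise ValueError("aligned rows must have equal length")
    ops = []
    for q, s in zip(qseq, sseq):
        if q == "-" and s == "-":
            raise ValueError("alignment column has a gap in both rows")
        if q == "-":
            ops.append("D")  # base present only in the reference: deletion
        elif s == "-":
            ops.append("I")  # base present only in the query: insertion
        else:
            ops.append("M")  # aligned column (match or mismatch)
    # Collapse runs of identical operations into <length><op> pairs.
    return "".join(f"{len(list(run))}{op}" for op, run in groupby(ops))


def sam_line(qname, rname, sstart, qstart, qend, qlen, qseq, sseq):
    """Format one forward-strand local hit as a tab-separated SAM record.

    Coordinates are 1-based. Query bases outside the local alignment are
    reported as hard clips (H), so SEQ holds only the aligned query bases.
    """
    cigar = cigar_from_alignment(qseq, sseq)
    left, right = qstart - 1, qlen - qend
    if left:
        cigar = f"{left}H{cigar}"
    if right:
        cigar = f"{cigar}{right}H"
    fields = [
        qname,
        "0",                     # FLAG: forward strand, no other bits set
        rname,
        str(sstart),             # POS: leftmost reference position
        "255",                   # MAPQ: 255 means "not available"
        cigar,
        "*", "0", "0",           # RNEXT, PNEXT, TLEN: unpaired
        qseq.replace("-", ""),   # SEQ: aligned query bases without gaps
        "*",                     # QUAL: not available from BLAST
    ]
    return "\t".join(fields)


if __name__ == "__main__":
    print(sam_line("read1", "ref1", 10, 3, 7, 9, "ACG-TA", "ACGGT-"))
```

A real converter would also have to reverse-complement minus-strand hits and set FLAG 16 for them, write `@SQ` header lines with the reference lengths, and decide how to flag multiple hits per read.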


        • #5
          @fungs

          blast2sam.pl works with the default output. It probably fails because one of the regular expressions does not match. Since you have the BLAST output that triggers the problem, it may be easier for you to debug it (look around line 1823 of your input). If you can fix it, please let me know. Thank you.


          • #6
            I think this has been implemented via Bio::Assembly::IO::sam.

            Originally posted by Heikki
            The blast2sam.pl script works for me on the default verbose text output from blastall 2.2.19.

            SAM format is well documented and easy to produce. BioPerl has a solid parser for blast output. How about writing an output module for SAM format? Bio::AlignIO::sam would be great to have!


            • #7
              Native support

              Just to complete this: NCBI BLAST+ now supports SAM output directly.
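
As a hedged illustration of the native route: in the BLAST+ releases I am aware of, output format 17 selects SAM, but check `blastn -help` for your installed version. The sketch below only assembles the command line in Python. The file and database names are hypothetical placeholders, and `blastn_sam_command` is a helper made up for this example.

```python
import shlex


def blastn_sam_command(query: str, db: str, out: str) -> list[str]:
    """Assemble a blastn call that asks for SAM output (-outfmt 17)."""
    return [
        "blastn",
        "-query", query,   # FASTA file with the reads (placeholder name)
        "-db", db,         # BLAST nucleotide database (placeholder name)
        "-outfmt", "17",   # assumption: 17 = SAM in your BLAST+ version
        "-out", out,
    ]


cmd = blastn_sam_command("reads.fasta", "refdb", "hits.sam")
print(shlex.join(cmd))
# To actually run it (requires BLAST+ on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```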
