
  • How to convert blastn output file to sam/bam

    I am trying to convert BLAST output into a .sam or .bam file using the blast2sam tool.

    The reads were aligned with the command:

    blastn -query 130205_UNC11-SN627_0280_AC1NEKACXX_TTAGGC_L004_1.fasta -db blast_ref -word_size 15 -outfmt "6 qseqid sseqid pident nident length mismatch positive gapopen gaps ppos qframe sframe sstrand qcovs qstart qend qseq sstart send sseq evalue bitscore score" -out blast_tab

    This is the first line of the output blast_tab:

    UNC11-SN627:280:C1NEKACXX:4:1101:11031:1976 sequenzadifusione 93.62 44 3 44 0 0 93.62 1 1 plus 98 2 48 TGAACCCGGGAGGTGGAGGTTGCAGTGAGCCGAGATTGCGCCACTGC 24710 24756 TGAACCCGGGAGGTGGAGGCTGCAGTGAGCTGAGATAGCGCCACTGC 6e-16 71.3 38

    The conversion was then done with blast2sam (not blast2bam):

    blast2sam.pl blast_tab > blast.sam

    For the conversion we used BLAST's tabular output rather than the default output format.

    The conversion reports no errors, but the output file blast.sam is empty.



    Where could the error be?
    Is there another tool for the conversion, or another alignment tool that can write its output directly as .sam or .bam?
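One likely reason for the empty blast.sam: as far as I know, blast2sam.pl parses BLAST's default pairwise text report, so a tabular file gives it nothing to match. The tabular columns requested above do carry enough information to build basic SAM records directly. The following is a minimal Python sketch under stated assumptions, not a tested replacement for blast2sam. The function names (`cigar_from_alignment`, `hit_to_sam`) are made up for illustration. The input is assumed to be tab-separated with exactly the columns of the -outfmt string above. Only the aligned part of each read is written, because the columns do not include the read length and so no clipping can be emitted. MAPQ is set to 255 (unavailable), and no @SQ header is produced.

```python
from itertools import groupby

# Column order copied from the -outfmt string in the post above.
FIELDS = ("qseqid sseqid pident nident length mismatch positive gapopen gaps "
          "ppos qframe sframe sstrand qcovs qstart qend qseq sstart send sseq "
          "evalue bitscore score").split()

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def cigar_from_alignment(qseq, sseq):
    """Return a list of (length, op) pairs from the gapped query/subject strings."""
    ops = []
    for q, s in zip(qseq, sseq):
        if q == "-":
            ops.append("D")  # base present only in the reference
        elif s == "-":
            ops.append("I")  # base present only in the read
        else:
            ops.append("M")  # aligned column (match or mismatch)
    return [(len(list(run)), op) for op, run in groupby(ops)]


def hit_to_sam(line, fields=FIELDS):
    """Convert one tabular BLAST hit into a minimal SAM record (illustrative)."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} columns, got {len(values)}")
    hit = dict(zip(fields, values))

    cigar = cigar_from_alignment(hit["qseq"], hit["sseq"])
    seq = hit["qseq"].replace("-", "")  # aligned read bases without gap characters
    sstart, send = int(hit["sstart"]), int(hit["send"])

    flag = 0
    if hit["sstrand"] == "minus":
        # SAM stores reverse-strand reads as the reverse complement,
        # with the CIGAR read left to right along the reference.
        flag = 16
        seq = seq.translate(_COMPLEMENT)[::-1]
        cigar.reverse()

    cigar_text = "".join(f"{n}{op}" for n, op in cigar)
    pos = min(sstart, send)  # leftmost reference coordinate, 1-based
    return "\t".join([hit["qseqid"], str(flag), hit["sseqid"], str(pos),
                      "255", cigar_text, "*", "0", "0", seq, "*"])
```

Records written this way have no header, so a tool such as samtools would still need the reference names and lengths (for example from a FASTA index) before it can turn them into BAM.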

  • #2
    Try the new code mentioned at the end of this thread: https://www.biostars.org/p/53434/ You will need to have your BLAST results in XML format (according to the README for the new code).



    • #3
      I downloaded the code, but I'm not able to create the ref.dict. How can I do it?
      Also, the "src" folder contains two source files (blastSam.c and blastSam.h), so which one should I use?
      Thanks



      • #4
        Originally posted by federica.r
        I downloaded the code, but I'm not able to create the ref.dict. How can I do it?
        Also, the "src" folder contains two source files (blastSam.c and blastSam.h), so which one should I use?
        Thanks
        That is source code. You will need to compile it into an executable using a compiler (gcc). What OS are you using?

        You may also have to re-run your blast to get output in XML format (unless there is a tab to XML converter available).



        • #5
          Originally posted by GenoMax
          That is source code. You will need to compile it into an executable using a compiler (gcc). What OS are you using?

          You may also have to re-run your blast to get output in XML format (unless there is a tab to XML converter available).
          I am using Linux.
          Where can I find the tab to XML converter?



          • #6
            Originally posted by federica.r
            I am using Linux.
            Where can I find the tab to XML converter?
            Were you able to compile the program?

            XML converter: https://www.biostars.org/p/7981/

            Ideally you should re-run the blast and save output as XML.



            • #7
              * ref.dict is created with Picard CreateSequenceDictionary: http://broadinstitute.github.io/picard/
              * To compile the C program you need GNU make and the GCC C compiler.
              * "Where can I find the tab to XML converter?" You can't: it's like creating a cow from a steak.
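To show roughly what a ref.dict contains, here is a minimal Python sketch that builds the header lines of a sequence dictionary from FASTA text. It is an illustration only, and Picard CreateSequenceDictionary remains the tool to use. The function name `sequence_dictionary` is made up, every FASTA header is assumed to carry a name, and it is an assumption that a file with only these fields would be accepted by Blast2Bam (Picard also writes further tags such as UR).

```python
import hashlib


def sequence_dictionary(fasta_lines):
    """Return SAM header lines: one @HD line plus one @SQ line per FASTA record."""
    records = []  # each entry is [name, [sequence chunks]]
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The sequence name is the header text up to the first whitespace.
            records.append([line[1:].split()[0], []])
        elif records:
            records[-1][1].append(line)

    header = ["@HD\tVN:1.6\tSO:unsorted"]
    for name, chunks in records:
        seq = "".join(chunks).upper()
        # M5 is the MD5 checksum of the uppercase sequence.
        md5 = hashlib.md5(seq.encode("ascii")).hexdigest()
        header.append(f"@SQ\tSN:{name}\tLN:{len(seq)}\tM5:{md5}")
    return header
```

Calling it on an open FASTA file handle and joining the result with newlines gives text in the same general layout as a .dict file.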



              • #8
                Originally posted by GenoMax
                Were you able to compile the program?

                XML converter: https://www.biostars.org/p/7981/

                Ideally you should re-run the blast and save output as XML.
                We tried to compile the program, but got the following error:

                xsltproc --output parseXML.c --stringparam fileType c schema2c.xsl schema.xml
                make: xsltproc: Command not found
                make: *** [parseXML.c] Error 127

                What does it mean?

                We are also trying to run BLAST to obtain the XML output, but it's taking a really long time (almost 24 hours, probably because the output is very big).

                Thank you



                • #9
                  As specified in the Requirements section of https://github.com/guyduche/Blast2Bam , xsltproc is required. Most Linux distributions have a command to install it quickly, something like 'sudo apt-get install xsltproc'.
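The "Error 127" from make is the shell's exit status for a command that could not be found, so it can help to check for the build tools before running make. A minimal Python sketch, assuming the tools named in this thread (xsltproc, make, gcc) are the ones needed; the function name `missing_tools` is made up for illustration:

```python
import shutil


def missing_tools(tools=("xsltproc", "make", "gcc")):
    """Return the names of the given tools that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list means every tool was found on PATH; any name returned still has to be installed with the distribution's package manager.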

                  "probably because the output is very big": yes, because the sequences are fetched and added to the output. You can pipe the output into gzip to reduce the size of the XML, or pipe the output of blastn directly into blast2sam as shown in https://github.com/guyduche/Blast2Ba...ster/README.md
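The gzip suggestion can be sketched in Python as follows. This is a rough illustration, not part of Blast2Bam: the helper names `run_and_gzip` and `blastn_xml_command` are made up, the blastn options are the ones from the first post with `-outfmt 5` (XML) in place of the tabular format, and blastn is assumed to be on PATH.

```python
import gzip
import shutil
import subprocess


def blastn_xml_command(query, db, word_size=15):
    """Build a blastn command line that writes XML (-outfmt 5) to standard output."""
    return ["blastn", "-query", query, "-db", db,
            "-word_size", str(word_size), "-outfmt", "5"]


def run_and_gzip(cmd, out_path):
    """Run cmd and stream its standard output into a gzip-compressed file."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc, \
            gzip.open(out_path, "wb") as out:
        # Copy in chunks so the uncompressed XML never has to fit in memory.
        shutil.copyfileobj(proc.stdout, out)
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
    return out_path
```

A call such as `run_and_gzip(blastn_xml_command("reads.fasta", "blast_ref"), "blast.xml.gz")` would then write the compressed XML, where reads.fasta and blast.xml.gz are placeholder file names.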



                  • #10
                    You may need to install libxslt (and perhaps libxml2 as well). Installation instructions vary depending on which Linux distribution you are using.



                    • #11
                      It isn't ready yet, but NCBI seems to be working on adding SAM output to BLAST+ itself:
                      Recently NCBI BLAST+ 2.2.31 was released, and it contains an undocumented "Easter Egg" - this is still very rough around the edges but they...
