
  • Why don't mapping programs map directly into BAM format?

    Is this a historical contingency? Or is there a functional reason why mapping programs like BWA or Stampy don't map directly to BAM?

  • #2
    1) because writing BAM is non-trivial (it is a compressed binary format) and no one wants to add a dependency just to write BAM

    2) because you can easily generate BAM by piping the output to samtools
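
The piping in point 2 can be sketched roughly as follows. This is a minimal Python sketch rather than anything taken from BWA or samtools: `pipe_commands` is a hypothetical helper, and the file names in the usage note are placeholders.

```python
import subprocess


def pipe_commands(producer, consumer):
    """Run `producer | consumer` without an intermediate file.

    Both arguments are argument lists (no shell is involved).
    Returns the consumer's stdout as bytes.
    Hypothetical helper, for illustration only.
    """
    first = subprocess.Popen(producer, stdout=subprocess.PIPE)
    second = subprocess.Popen(consumer, stdin=first.stdout, stdout=subprocess.PIPE)
    # Close our copy of the pipe so the producer is notified
    # if the consumer exits early.
    first.stdout.close()
    output, _ = second.communicate()
    first.wait()
    if first.returncode != 0 or second.returncode != 0:
        raise RuntimeError("pipeline failed")
    return output
```

Assuming `bwa` and `samtools` are on the PATH, a call such as `pipe_commands(["bwa", "mem", "ref.fa", "reads.fq"], ["samtools", "view", "-b", "-o", "aln.bam", "-"])` would stream the mapper's SAM output straight into an unsorted BAM file.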



    • #3
      But assuming it were trivial, is there a good reason to output SAM? It seems like all downstream operations would be BAM-based anyway.



      • #4
        When mapping sequences, each sequence is handled independently, and the mapping program just wants to write the result to the output file and move on (even more so when running multiple threads).

        Many people who use BAM files want them indexed, which allows efficient random access. Indexing requires the file to be coordinate-sorted, and sorting can only happen once all of the mapping is done; the mapper can't just write sorted records as it goes along. A mapping program would therefore still have to write unsorted output as it went and then sort and convert it to BAM at the end. Since this is easy to do with a samtools command, mapping programs generally don't bother and leave it to the user.
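
To illustrate why sorting has to wait until mapping is finished, here is a minimal Python sketch of a coordinate sort over SAM text lines. It is a simplification, not how samtools implements sorting: `coordinate_sort` is a hypothetical name, input lines are assumed to have no trailing newline, and ties and unmapped reads are handled in the simplest possible way.

```python
def coordinate_sort(sam_lines):
    """Sort SAM alignment lines by reference order, then position.

    Header lines (starting with '@') keep their order. Reference order
    is taken from the @SQ header lines; reads on references not listed
    there (including unmapped reads, RNAME '*') are placed last.
    """
    header = [line for line in sam_lines if line.startswith("@")]
    records = [line for line in sam_lines if not line.startswith("@")]

    # Rank each reference by the order of its @SQ line.
    ref_rank = {}
    for line in header:
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    ref_rank[field[3:]] = len(ref_rank)

    def sort_key(line):
        fields = line.split("\t")
        rname, pos = fields[2], int(fields[3])  # SAM columns 3 and 4
        return (ref_rank.get(rname, len(ref_rank)), pos)

    # sorted() needs every record in hand before it can emit the first
    # one, which is why a mapper cannot sort while it is still mapping.
    return header + sorted(records, key=sort_key)
```

The last read the mapper produces may belong at the very start of the sorted file, so no sorted output can be finalized until the mapper is done.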



        • #5
          Indeed. Or to put it another way,

          Output of unsorted SAM is easy.

          Output of pre-sorted SAM/BAM is hard.

          Since you (or the tool) will have to do a SAM/BAM coordinate sorting step anyway, you might as well do the unsorted SAM to sorted BAM conversion in one go with samtools.
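
As a rough illustration of the "one go" step, the Python sketch below assembles a shell pipeline that maps, sorts, and indexes. It assumes samtools 1.x command syntax and `bwa mem` as the mapper; `sorted_bam_pipeline` is a hypothetical helper and the file names are placeholders. It only builds the command string and does not run anything.

```python
import shlex


def sorted_bam_pipeline(reference, reads, sorted_bam):
    """Build a shell command line: map, coordinate-sort to BAM, then index.

    Assumes samtools 1.x syntax, where '-' means "read from stdin".
    """
    mapper = ["bwa", "mem", reference, reads]
    sorter = ["samtools", "sort", "-o", sorted_bam, "-"]
    indexer = ["samtools", "index", sorted_bam]
    # shlex.join quotes each argument safely for a POSIX shell.
    return f"{shlex.join(mapper)} | {shlex.join(sorter)} && {shlex.join(indexer)}"


print(sorted_bam_pipeline("ref.fa", "reads.fq", "aln.sorted.bam"))
```

Older samtools releases used a different `sort` syntax, so the exact flags may need adjusting for the installed version.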
