  • dmacmillan
    Member
    • Jan 2012
    • 49

    BWA sampe to bam?

    As everyone who uses bwa knows, the sampe command writes its output in SAM format. What I want to do is convert that SAM output to BAM in some sort of pipe. It seems easy to implement, but I keep getting an error from samtools.

    cat file.sam | samtools view -Sb

    That does not work!
  • kopi-o
    Senior Member
    • Feb 2008
    • 319

    #2
    Look at the samtools manual page: http://samtools.sourceforge.net/samtools.shtml

    You are looking for samtools view -bS or samtools view -bt

    • swbarnes2
      Senior Member
      • May 2008
      • 910

      #3
      What you want is something like:

      bwa sampe ref.fa r1.sai r2.sai r1.fq r2.fq | samtools view -bSho out.bam -;

      • xied75
        Senior Member
        • Feb 2012
        • 129

        #4
        Dear all,

        Just wondering: SAM is far bigger than BAM, and it seems not many people open the SAM and read it. If BWA output BAM directly, it would save a lot of effort, and disk I/O would be faster thanks to the smaller file size. Does this make sense, or have I forgotten something?

        Best,

        dong

        • arvid
          Senior Member
          • Jul 2011
          • 156

          #5
          Originally posted by xied75
          Dear all,

          Just wondering: SAM is far bigger than BAM, and it seems not many people open the SAM and read it. If BWA output BAM directly, it would save a lot of effort, and disk I/O would be faster thanks to the smaller file size. Does this make sense, or have I forgotten something?

          Best,

          dong
          Theoretically, when your server is more CPU-limited than I/O-limited and you only need to read the whole file sequentially, SAM will be faster than BAM (due to the compression overhead in BAM). I have found that this is never the case for our applications, so I pipe aligners directly into a samtools chain to get a sorted BAM straight onto disk. The -m option to samtools sort lets most alignments fit in memory, which avoids writing temporary files to disk.
          Last edited by arvid; 04-22-2012, 11:14 PM.

          • dmacmillan
            Member
            • Jan 2012
            • 49

            #6
            Originally posted by swbarnes2
            What you want is something like:

            bwa sampe ref.fa r1.sai r2.sai r1.fq r2.fq | samtools view -bSho out.bam -;
            I understand what you are doing here, but what is with the '-;' at the end (ignoring the single quotations)?

            • swbarnes2
              Senior Member
              • May 2008
              • 910

              #7
              Originally posted by dmacmillan
              I understand what you are doing here, but what is with the '-;' at the end (ignoring the single quotations)?
              The '-' means "the thing that's being piped", i.e. samtools reads its input from standard input instead of a file. At least, that's how I understand it. The ';' just ends the shell command and can be left off. That command works, I use it all the time just like I wrote it there, and so would this:

              Code:
              bwa sampe ref.fa r1.sai r2.sai r1.fq r2.fq | samtools view -bSh - > out.bam;
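
The '-' convention is not specific to samtools. As a minimal sketch that needs neither bwa nor samtools installed, cat accepts '-' as "read standard input" in the same way; the two read names below are made up for illustration.

```shell
# "-" as a file argument means "read from standard input".
# cat follows the same convention as samtools view in the commands above.
# The read names are illustrative placeholders, not real data.
piped=$(printf 'read1\nread2\n' | cat -)
echo "$piped"
```

Swapping cat for samtools view -bS should behave the same way with respect to where the input comes from, assuming a samtools build that accepts '-' as in the posts above.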

              • sdriscoll
                I like code
                • Sep 2009
                • 436

                #8
                I don't know if it's necessary for BWA output, but with output from bowtie I like to use the -F option to keep unaligned reads out of the BAM file. Also, the -h option isn't necessary in this example: the BAM header gets created appropriately. In fact, I don't think samtools will allow you to create a BAM file from a SAM file unless the SAM file already has the correct header information. I've only needed the -h option when I view BAM files. By default the header is left off when viewing a BAM file as SAM via samtools.

                So what I always use is this:

                Code:
                bwa sampe ref.fa r1.sai r2.sai r1.fq r2.fq | samtools view -bS -F 0x04 - > out.bam
                sometimes followed by this:

                Code:
                samtools sort out.bam out-sorted
                Bowtie doesn't properly sort its output and I don't remember if BWA does either. If you use the BAM file for any downstream analysis you usually need it to be sorted by chromosome and position.
                /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
                Salk Institute for Biological Studies, La Jolla, CA, USA */
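
On the -F 0x04 filter used above: -F drops every record whose FLAG field has any of the given bits set, and 0x4 is the "read unmapped" bit in the SAM format. The sketch below only mimics that bit test in shell arithmetic so it can be tried without samtools; the helper name is_unmapped and the example FLAG values are assumptions for illustration.

```shell
# Hypothetical helper: succeeds if a SAM FLAG value has the
# "read unmapped" bit (0x4) set, i.e. the record -F 0x04 would drop.
is_unmapped() {
    [ $(( $1 & 0x4 )) -ne 0 ]
}

# Example FLAG values (illustrative, not taken from the thread):
#   99 = 0x1 + 0x2 + 0x20 + 0x40  paired, proper pair, mate reverse, first in pair
#   77 = 0x1 + 0x4 + 0x8 + 0x40   paired, read unmapped, mate unmapped, first in pair
kept=""
for flag in 99 77 4 0; do
    if is_unmapped "$flag"; then
        echo "FLAG $flag: dropped by -F 0x04"
    else
        echo "FLAG $flag: kept"
        kept="$kept $flag"
    fi
done
```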

                • dmacmillan
                  Member
                  • Jan 2012
                  • 49

                  #9
                  Interesting tips, I will try both, thanks!

                  • arvid
                    Senior Member
                    • Jul 2011
                    • 156

                    #10
                    To reduce the I/O load (and total CPU time as well) even further, this is my favourite:

                    Code:
                    bwa sampe ref.fa r1.sai r2.sai r1.fq r2.fq | samtools view -bSu -F 0x04 - | samtools sort -m 4294967296 - out.sorted 
                    samtools index out.sorted.bam
                    Set -m as high as you can afford; in my hands samtools sort needs RAM up to 2x the value specified there in bytes (I set this to 16 GB when running on a server, which is enough for most BAMs to be sorted without writing temporary files to disk). -u removes the compression/decompression overhead in the pipe between view and sort.
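
For reference, the 4294967296 above is 4 GiB written out in bytes. A small sketch of the arithmetic, assuming a shell with 64-bit integer arithmetic; the variable names are made up for illustration:

```shell
# Convert a memory budget in GiB to the byte count passed to -m above.
gib=4
mem_bytes=$(( gib * 1024 * 1024 * 1024 ))
echo "$mem_bytes"

# Going by the "up to 2x" observation in this post, budget roughly
# twice that much RAM for the sort itself.
peak_bytes=$(( 2 * mem_bytes ))
echo "$peak_bytes"
```

With gib=16 the same arithmetic gives the 16 GB setting mentioned above.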

                    • swbarnes2
                      Senior Member
                      • May 2008
                      • 910

                      #11
                      Piping into samtools sort works? I was afraid that would get ugly.

                      How can I ask the server I'm on how much memory I can devote to sort?
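
One possible way to ask a Linux server, sketched under assumptions: /proc/meminfo reports memory in kB, and reasonably recent kernels include a MemAvailable line (older ones only have MemFree). Dividing by three is an arbitrary safety margin chosen here because of the "up to 2x" observation above; it is not a samtools recommendation. Running free -g gives a similar picture interactively.

```shell
# Linux-only sketch: read available memory (in kB) from /proc/meminfo.
# Falls back to MemFree if MemAvailable is missing, and to 0 if neither is found.
avail_kb=$(awk '/^MemAvailable:/ {print $2; exit}' /proc/meminfo 2>/dev/null)
if [ -z "$avail_kb" ]; then
    avail_kb=$(awk '/^MemFree:/ {print $2; exit}' /proc/meminfo 2>/dev/null)
fi
avail_kb=${avail_kb:-0}

# Assumption: leave headroom by giving sort one third of what is available.
sort_mem_bytes=$(( avail_kb * 1024 / 3 ))
echo "suggested value for samtools sort -m: $sort_mem_bytes"
```

On a shared machine other users' jobs can change the available figure from minute to minute, so treat the result as a starting point.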

                      • nilshomer
                        Nils Homer
                        • Nov 2008
                        • 1283

                        #12
                        Use the "-m" option in samtools sort instead.
