  • BWA sam -> bam in pipeline

    I am trying to use BWA mem to align paired-end reads to a single genome and pipe the SAM output into a BAM file and then a BED file. The pipeline doesn't seem to be working.

    It creates the output sam, bam and bed files but the bam and bed files are empty. It completes the alignment but an error message keeps appearing: Failed to open BAM file stdin.

    Here is the code I am using:
    Code:
    bwa mem NC_005148.fa ./seqtk_1/subsample_1/sub_NC_005148_1.fq ./seqtk_1/subsample_1/sub_NC_005148_2.fq > NC_005148_BWA.sam | \
    samtools view -S -h -u - | \
    samtools sort - | \
    samtools rmdup -s - | \
    tee NC_005148_BWA_sorted.bam | \
    bamToBed > NC_005148_BWA_sorted.bed
    I am new to coding so any help would be appreciated.

  • #2
    This is probably based on this Biostars thread. Compare your command with the one there and you will notice that you are missing a couple of inputs (-).

    • #3
      That is the thread I used as the basis for the pipeline. I tried it with the '- -' as shown in the thread the first time, but that didn't work either.

      Also, do I need to index the resulting bam files?

      • #4
        Since newer versions of samtools have slightly different options, try this variation:

        bwa mem index R1.fq.gz R2.fq.gz | samtools view -Shu - | samtools sort - | samtools rmdup -s - - | tee NC_005148_BWA_sorted.bam | bamToBed > NC_005148_BWA_sorted.bed
        When debugging a pipeline, start at the left side of the command and add one operation at a time to see where things go wrong.
        Last edited by GenoMax; 05-17-2017, 05:52 AM.
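
The left-to-right debugging approach can be sketched roughly as below. This is only an illustration under stated assumptions: `count_bytes` is a hypothetical helper, and the three toy stages stand in for the real `bwa mem`, `samtools view` and `samtools sort` steps so that the sketch runs without any alignment data. The idea is to run a growing prefix of the pipeline and note the first stage whose output drops to zero bytes.

```shell
# count_bytes is a hypothetical helper (not part of BWA or samtools):
# it runs a pipeline given as a string and prints how many bytes reach stdout.
count_bytes() {
    sh -c "$1" | wc -c | tr -d ' '
}

# Toy stages standing in for the real tools in the thread.
stage1='printf "read2\nread1\n"'      # stand-in for the aligner
stage2="$stage1 | sort"               # stand-in for a conversion/sort step
stage3="$stage2 | grep -v read"       # deliberately discards everything

# Grow the pipeline one stage at a time, starting from the left.
n1=$(count_bytes "$stage1")
n2=$(count_bytes "$stage2")
n3=$(count_bytes "$stage3")

echo "after stage 1: $n1 bytes"
echo "after stage 2: $n2 bytes"
echo "after stage 3: $n3 bytes"
```

With the real tools, each string would hold the actual command prefix instead; the stage where the byte count first becomes zero is the one to inspect.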

        • #5
          Thank you. I've tried the code that was posted. It produced the sam file and a sorted bam file, but the bam file appears to be empty.

          I used samtools flagstat on the sorted bam file just to check the alignment. This was the output:
          0 + 0 in total (QC-passed reads + QC-failed reads)
          0 + 0 secondary
          0 + 0 supplementary
          0 + 0 duplicates
          0 + 0 mapped (N/A : N/A)
          0 + 0 paired in sequencing
          0 + 0 read1
          0 + 0 read2
          0 + 0 properly paired (N/A : N/A)
          0 + 0 with itself and mate mapped
          0 + 0 singletons (N/A : N/A)
          0 + 0 with mate mapped to a different chr
          0 + 0 with mate mapped to a different chr (mapQ>=5)

          If I use samtools flagstat on the sam file, this is the output:
          2000000 + 0 in total (QC-passed reads + QC-failed reads)
          0 + 0 secondary
          0 + 0 supplementary
          0 + 0 duplicates
          2000000 + 0 mapped (100.00% : N/A)
          2000000 + 0 paired in sequencing
          1000000 + 0 read1
          1000000 + 0 read2
          1999996 + 0 properly paired (100.00% : N/A)
          2000000 + 0 with itself and mate mapped
          0 + 0 singletons (0.00% : N/A)
          0 + 0 with mate mapped to a different chr
          0 + 0 with mate mapped to a different chr (mapQ>=5)


          I am struggling to think what is happening to the bam file.
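
The comparison made by eye above can also be scripted. A minimal sketch, assuming the flagstat text layout shown in this post: `total_reads` is a hypothetical helper that pulls the first number from the "in total" line, and the two input lines are copied from the outputs above rather than produced by running samtools.

```shell
# total_reads is a hypothetical helper: it prints the first field of the
# "in total" line of samtools flagstat output read from stdin.
total_reads() { awk '/in total/ { print $1; exit }'; }

# The "in total" lines quoted in this post (BAM first, then SAM).
bam_total=$(printf '%s\n' '0 + 0 in total (QC-passed reads + QC-failed reads)' | total_reads)
sam_total=$(printf '%s\n' '2000000 + 0 in total (QC-passed reads + QC-failed reads)' | total_reads)

# Fewer reads in the BAM than in the SAM means records were lost after alignment.
if [ "$bam_total" -lt "$sam_total" ]; then
    echo "BAM has $bam_total reads, SAM has $sam_total: reads were lost after alignment"
fi
```

On real files the helper would be fed directly, for example `samtools flagstat NC_005148_BWA_sorted.bam | total_reads`.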

          • #6
            The command is working for me with Samtools v.1.4 and BedTools v.2.25.0. If you are using an older version of these packages, try upgrading.

            As for debugging why the command is not working for you, start at the left and keep adding one pipe at a time to figure out which component fails. I assume you are using real index/file names as applicable in your case.
            Last edited by GenoMax; 05-17-2017, 06:28 AM.
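
One likely explanation for the symptoms in this thread (a populated sam file, an empty bam file and "Failed to open BAM file stdin"), though none of the posters confirms it: in the command from the first post, `> NC_005148_BWA.sam` sends the output of bwa to the file, so nothing is left to travel down the pipe to samtools. The sketch below shows that behaviour with a toy `produce` function standing in for the aligner, assuming a POSIX-style shell such as bash or sh.

```shell
# produce is a hypothetical stand-in for the aligner; it writes two lines to stdout.
produce() { printf 'read1\nread2\n'; }

workdir=$(mktemp -d)

# Redirect before the pipe: the file receives both lines, the pipe receives none.
piped_after_redirect=$(produce > "$workdir/redirect.txt" | wc -l | tr -d ' ')

# tee in place of the redirect: the file is written and the lines still flow on.
piped_after_tee=$(produce | tee "$workdir/tee.txt" | wc -l | tr -d ' ')

echo "lines reaching the pipe after '>': $piped_after_redirect"
echo "lines reaching the pipe after tee: $piped_after_tee"
```

Applied to the original command, that would mean either replacing `> NC_005148_BWA.sam |` with `| tee NC_005148_BWA.sam |` or dropping the sam file altogether, as the variation in #4 does.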
