Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • bwa mem don't set PROPER_PAIR flag in pair end align

    $ cat r1.fq
    @1
    GGGCTAAGAGACAGCAAATACACATGAACAGAAAGAAGAGGGCAAAGAAAAGGCTGACGG
    +
    CCCFFFFFHHFHHJIJIJIJJJJJJJJDIJJJJJIJJIJJIIGIJJJJHIHJJJJIHIHI

    $ cat r2.fq
    @1
    TAGTTTAAACGAGATTGCCAGGACCGGGTATCATTCACCATTTTTCTTTTCGTTAACTTG
    +
    CCCFFFFFHHHHHJJJIIIHIIJGGIGIJEIJIFGGGIJJJGFIIIJIFFEF?CFFEDCB

    $ bwa mem ../ref/ref.fa r1.fq r2.fq > mem.sam;cat mem.sam
    @SQ SN:1 LN:100000
    @PG ID:bwa PN:bwa VN:0.7.10-r1017-dirty CL:bwa mem ../ref/ref.fa r1.fq r2.fq
    1 81 1 1741 60 60M = 1681 -120 CCGTCAGCCTTTTCTTTGCCCTCTTCTTTCTGTTCATGTGTATTTGCTGTCTCTTAGCCC IHIHIJJJJHIHJJJJIGIIJJIJJIJJJJJIDJJJJJJJJIJIJIJHHFHHFFFFFCCC NM:i:1 MD:Z:18A41 AS:i:55 XS:i:0
    1 161 1 1681 60 60M = 1741 120 TAGTTTAAACGAGATTGCCAGGACCGGGTATCATTCACCATTTTTCTTTTCGTTAACTTG CCCFFFFFHHHHHJJJIIIHIIJGGIGIJEIJIFGGGIJJJGFIIIJIFFEF?CFFEDCB NM:i:1 MD:Z:21C38 AS:i:55 XS:i:0

    $ bwa aln ../ref/ref.fa r1.fq >r1.sai
    $ bwa aln ../ref/ref.fa r2.fq >r2.sai
    $ bwa sampe -r "@RG\tID:foo\tSM:bar" ../ref/ref.fa r1.sai r2.sai r1.fq r2.fq |\
    samtools view -bS - > aln.bam
    $ samtools view aln.bam
    1 83 1 1741 60 60M = 1681 -120 CCGTCAGCCTTTTCTTTGCCCTCTTCTTTCTGTTCATGTGTATTTGCTGTCTCTTAGCCC IHIHIJJJJHIHJJJJIGIIJJIJJIJJJJJIDJJJJJJJJIJIJIJHHFHHFFFFFCCC RG:Z:foo XT:A:U NM:i:1 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:18A41
    1 163 1 1681 60 60M = 1741 120 TAGTTTAAACGAGATTGCCAGGACCGGGTATCATTCACCATTTTTCTTTTCGTTAACTTG CCCFFFFFHHHHHJJJIIIHIIJGGIGIJEIJIFGGGIJJJGFIIIJIFFEF?CFFEDCB RG:Z:foo XT:A:U NM:i:1 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:21C38

    the problem is:
    samtools mpileup can't output valid results when the PROPER_PAIR flag not set.

  • #2
    If you want mpileup to include these cases use the -A option (-A: count anomalous read pairs). However, the question of way bwa mem doesn't like this pair remains...

    Comment


    • #3
      thanks. got it. Maybe bwa mem just like paired reads overlapped? I'll have a try.

      Comment


      • #4
        Originally posted by gada View Post
        Maybe bwa mem just like paired reads overlapped?
        Mmm.. I doubt it... Non overlapping pairs make perfect sense and I don't think such a bug could have sneaked into bwa mem undetected.

        Comment


        • #5
          $ bwa mem ../ref/ref.fa r1.fq r2.fq
          [M::bwa_idx_load_from_disk] read 0 ALT contigs
          @SQ SN:1 LN:100000
          @PG ID:bwa PN:bwa VN:0.7.10-r1017-dirty CL:bwa mem ../ref/ref.fa r1.fq r2.fq
          [M:rocess] read 2 sequences (182 bp)...
          [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 1, 0, 0)
          [M::mem_pestat] skip orientation FF as there are not enough pairs
          [M::mem_pestat] skip orientation FR as there are not enough pairs
          [M::mem_pestat] skip orientation RF as there are not enough pairs
          [M::mem_pestat] skip orientation RR as there are not enough pairs
          [M::mem_process_seqs] Processed 2 reads in 0.000 CPU sec, 0.000 real sec
          1 97 1 1861 60 120M = 1979 180 CTTCATCTGCAGGTGTCTGACTTCCAGCAACTGCTGGCCTGTGCCAGGGTGCAAGCTGAGCACTGGAGTGGAGTTTTCCTGTGGAGAGGAGCCATGCCTAGAGTGGGATGGGCCATTGTT CCCCFFFFFHHFHHJIJIJIJJJJJJJJDIJJJJJIJJIJJIIGIJJJJHIHJJJJIHIHICCFFFFFHHFHHJIJIJIJJJJJJJJDIJJJJJIJJIJJIIGIJJJJHIHJJJJIHIHI NM:i:0 MD:Z:120 AS:i:120 XS:i:0
          1 145 1 1979 60 62M = 1861 -180 TTCATCTTCTGGCCCCTGTTGTCTGCATGTAACTTAATACCACAACCAGGCATAGGGGAAAG BBBCDEFFC?FEFFIJIIIFGJJJIGGGFIJIEJIGIGGJIIHIIIJJJHHHHHFFFFFCCC NM:i:0 MD:Z:62 AS:i:62 XS:i:0
          [main] Version: 0.7.10-r1017-dirty
          [main] CMD: bwa mem ../ref/ref.fa r1.fq r2.fq
          [main] Real time: 0.019 sec; CPU: 0.004 sec

          $ ruby -e 'puts 97.to_s(2)'
          1100001
          ruby -e 'puts 145.to_s(2)'
          10010001

          nothing changed.

          Comment

          Latest Articles

          Collapse

          • seqadmin
            Strategies for Sequencing Challenging Samples
            by seqadmin


            Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
            03-22-2024, 06:39 AM
          • seqadmin
            Techniques and Challenges in Conservation Genomics
            by seqadmin



            The field of conservation genomics centers on applying genomics technologies in support of conservation efforts and the preservation of biodiversity. This article features interviews with two researchers who showcase their innovative work and highlight the current state and future of conservation genomics.

            Avian Conservation
            Matthew DeSaix, a recent doctoral graduate from Kristen Ruegg’s lab at The University of Colorado, shared that most of his research...
            03-08-2024, 10:41 AM

          ad_right_rmr

          Collapse

          News

          Collapse

          Topics Statistics Last Post
          Started by seqadmin, Yesterday, 06:37 PM
          0 responses
          11 views
          0 likes
          Last Post seqadmin  
          Started by seqadmin, Yesterday, 06:07 PM
          0 responses
          10 views
          0 likes
          Last Post seqadmin  
          Started by seqadmin, 03-22-2024, 10:03 AM
          0 responses
          51 views
          0 likes
          Last Post seqadmin  
          Started by seqadmin, 03-21-2024, 07:32 AM
          0 responses
          68 views
          0 likes
          Last Post seqadmin  
          Working...
          X