
  • BWA paired end mapping quality

    I used BWA to map my paired-end (PE) sequencing data to a reference genome, and I am trying to use the paired mapping quality to filter out bad read pairs before downstream analysis.
    How does BWA calculate the paired mapping quality? I understand that it calculates single-end mapping quality the way MAQ does, but I am not sure how it proceeds once it has the single-end mapping quality for both ends. Does it simply add them up, or is it something more complicated? I’ve checked the source code, but it is hard to follow without a good understanding of the variable names and notation. FYI, the relevant source code is in the ‘static int pairing’ function of bwape.c.
    I would really appreciate your input.
    pparg
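For the filtering step itself, here is a minimal sketch, assuming the alignments are in SAM text format (where the fifth tab-separated column is MAPQ) and that each read name appears on exactly two lines, one per mate. The function name `filter_pairs_by_mapq` and the default threshold of 20 are illustrative assumptions, not anything BWA prescribes, and the sketch does not reproduce how BWA derives its scores.

```python
from collections import defaultdict


def filter_pairs_by_mapq(sam_lines, min_mapq=20):
    """Keep header lines plus read pairs whose two mates both have MAPQ >= min_mapq.

    Hypothetical helper for illustration; min_mapq=20 is an arbitrary default.
    """
    headers = []
    by_name = defaultdict(list)  # read name -> [(mapq, original line), ...]
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):  # SAM header lines start with '@'
            headers.append(line)
            continue
        fields = line.split("\t")
        # Column 1 is the read name (QNAME), column 5 is MAPQ.
        by_name[fields[0]].append((int(fields[4]), line))

    kept = []
    for records in by_name.values():
        # Assumption: exactly two records per pair (no secondary lines).
        if len(records) == 2 and all(mapq >= min_mapq for mapq, _ in records):
            kept.extend(line for _, line in records)
    return headers + kept
```

Requiring both mates to pass is the stricter choice; whether a pair should instead be judged on the higher or lower of its two scores depends on the downstream analysis.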

  • #2
    Hello, does anybody have any ideas on this? Thank you!


    • #3
      Hi all, I'm interested too! Could someone post a link to, or a brief description of, how BWA scores mapping quality?

      Thanks in advance.


      • #4
        +1, I have the exact same question, too

        I'd also like to know how the mapping quality for paired-end reads is computed. Is it just the sum of the qualities of the two separate reads?


        • #5
          Unfortunately, the best documentation is the original paper (for single-end reads) and the code (for paired-end reads). Try modifying the code to print out the relevant variables so you can follow the calculation.


          • #6
            Hey, I'm interested in this too. In particular, what if one read maps to one location on the reference, but the other read maps somewhere else (such that the pair does not have the correct orientation and/or distance)? What I really want to know is whether such pairs are down-weighted with a low mapping quality in some way.


            • #7
              It says in the paper that BWA will find all single-end alignments for each mate and sort them in ascending order of chromosomal coordinates. Then it uses an estimated insert size to determine which of the chromosomal coordinates are best for both mates.

              The insert size is determined in the function infer_isize, and I believe the pairing is determined in the function pairing :-) both are contained in bwape.c.
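The idea described above can be sketched roughly as follows. This is not BWA's code: it is a simplified illustration that assumes each mate's candidate hits are given as `(chromosome, position)` tuples, ignores strand orientation and read length, and uses a brute-force comparison instead of BWA's sorted scan. The names `best_pair`, `mean_isize`, and `max_dev` are hypothetical.

```python
def best_pair(hits1, hits2, mean_isize, max_dev):
    """Pick the combination of hits whose distance best matches the insert size.

    hits1, hits2: lists of (chrom, pos) candidate alignments for mate 1 and mate 2.
    mean_isize:   estimated insert size (assumed to be known already).
    max_dev:      largest accepted deviation from mean_isize.
    Returns ((chrom, pos1), (chrom, pos2)), or None if no combination fits.
    """
    best = None
    best_dev = None
    for c1, p1 in hits1:
        for c2, p2 in hits2:
            if c1 != c2:  # mates on different chromosomes cannot form a proper pair
                continue
            dev = abs(abs(p2 - p1) - mean_isize)
            if dev <= max_dev and (best_dev is None or dev < best_dev):
                best, best_dev = ((c1, p1), (c2, p2)), dev
    return best


# Mate 1 has two candidate hits and mate 2 has three; only one combination
# lies close to the assumed 250 bp insert size.
hits1 = [("chr1", 100), ("chr2", 5000)]
hits2 = [("chr1", 350), ("chr1", 90000), ("chr2", 9000)]
print(best_pair(hits1, hits2, mean_isize=250, max_dev=100))
```

When no combination falls within the allowed deviation the function returns `None`, which corresponds to the case raised in #6 where the two mates map to inconsistent locations.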


              • #8
                Hello All,

                I have a whole-exome paired-end sample, and I have reached the step where I align it to the human genome (hg19.fa) on a 10-node cluster.

                I am running the command:
                bwa aln hg19.fa sample1_1.fastq > sample1_1.sai
                bwa aln hg19.fa sample1_2.fastq > sample1_2.sai

                But it's taking forever. I understand this could be due to a couple of reasons, the main one being that I am not doing any pre-filtering. I saw that packages like GenomeQuest do a lot of pre-filtering, which can make the alignment faster.

                I am a total newbie, and I am wondering if I can get help here on how to pre-filter this sample, and what kind of pre-filtering to run, before using bwa for alignment. I am in a bit of a hurry to get some results, so any help would be extremely appreciated.

                Thanks,
                angel


                • #9
                  I've run bwa on exome capture DNA with no filtering at all. It takes a while, but it doesn't take forever, and every minute or so it updates the screen to tell me how many more reads it has finished processing.

                  Using multiple processors with the -t option would certainly speed things along, if your computer has that capacity.
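As a small illustration of the `-t` suggestion, the sketch below builds the two `bwa aln` commands from #8 with a thread count added. The helper name `bwa_aln_command` and the choice of 8 threads are assumptions for the example; the commands are only printed, not run. Note that threads run within a single process, so `-t` uses the cores of one machine and does not by itself spread the work across cluster nodes.

```python
import shlex


def bwa_aln_command(reference, fastq, threads=1):
    """Return the argument list for a `bwa aln` call (hypothetical helper)."""
    cmd = ["bwa", "aln"]
    if threads > 1:
        cmd += ["-t", str(threads)]  # -t sets the number of threads
    cmd += [reference, fastq]
    return cmd


# Print the commands for both mates; bwa aln writes the .sai data to stdout,
# so the redirect is shown as it would be typed in a shell.
for mate in ("sample1_1", "sample1_2"):
    cmd = bwa_aln_command("hg19.fa", f"{mate}.fastq", threads=8)
    print(shlex.join(cmd), ">", f"{mate}.sai")
```

Because the two mates are aligned independently at this stage, the two commands could also be started at the same time on separate nodes, if the cluster setup allows it.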


                  • #10
                    Thanks swbarnes2 very much for your reply.

                    I hope my files will finish by tomorrow. The size of one paired-end fastq file in my case is 63GB.

                    I will try the multi-threading mode you mentioned tomorrow.

                    Angel
