
  • MAQ paired end mapping statistic

    Here is a sample stdout when using maq for paired-end reads.
    The number of reads mapped in pairs is odd. How can that be?
    94% of reads mapped, but only 41% of those mapped in pairs, and well under 1% could be rescued by 'moving' the reads. So do the remaining mapped reads (more than half of them) map to different chromosomes?
    Are only the reads mapped in pairs used for SNP calling in downstream MAQ analysis?

    -- 48 potential soa-indels pass the filter.
    -- 3551 potential pe-indels pass the filter.

    -- == statmap report ==

    -- # single end (SE) reads: 0
    -- # mapped SE reads: 0 (/ 0 = NA%)
    -- # paired end (PE) reads: 21532606
    -- # mapped PE reads: 20163428 (/ 21532606 = 93.64%)
    -- # reads that are mapped in pairs: 8326749 (/ 20163428 = 41.29%)
    -- # Q>=30 reads that are moved to meet mate-pair requirement: 899 (/ 8326749 = 0.01%)
    -- # Q<30 reads that are moved to meet mate-pair requirement: 5519 (0.06%)
    --
    bioinfosm
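For anyone who wants to check their own report the same way, here is a minimal Python sketch that pulls the counts out of a statmap report and tests whether the "mapped in pairs" count is odd. The `parse_statmap` helper and the variable names are hypothetical (they are not part of MAQ), and the parsing assumes the `-- # label: count` layout shown in the report above.

```python
import re

# Lines copied from the statmap report quoted above.
REPORT = """\
-- # paired end (PE) reads: 21532606
-- # mapped PE reads: 20163428 (/ 21532606 = 93.64%)
-- # reads that are mapped in pairs: 8326749 (/ 20163428 = 41.29%)
"""


def parse_statmap(text):
    """Return {label: count} for every '-- # label: count' line.

    Hypothetical helper: assumes the report layout shown in this thread.
    """
    counts = {}
    for line in text.splitlines():
        m = re.match(r"--\s+#\s+(.+?):\s+(\d+)", line)
        if m:
            counts[m.group(1)] = int(m.group(2))
    return counts


counts = parse_statmap(REPORT)
mapped = counts["mapped PE reads"]
paired = counts["reads that are mapped in pairs"]

# If the count were of reads whose mate is also mapped as a proper pair,
# one would expect it to be even; an odd value is what prompted the question.
paired_is_odd = paired % 2 == 1

# Reads that mapped but were not counted as mapped in pairs.
unpaired_mapped = mapped - paired

print(f"paired={paired} odd={paired_is_odd} mapped_not_paired={unpaired_mapped}")
```

The sketch only restates what the report already says; it does not explain why MAQ reports an odd count.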

  • #2
    bioinfosm, did you ever work out an answer to this? I also have an odd number of reads mapped as pairs and am wondering if it is a sign that something has gone wrong.

    You have probably already worked this other part out, but the answer to your question about whether only pair-mapped reads are used for SNP calling is that it depends on which flags you set. If you use Maq easyrun, I believe that the default setting is to include reads that are not mapped in pairs. You can change this behaviour if you run the scripts yourself, by setting the -p flag for assemble.


    • #3
      Did you have a look at what the 'maq map' step itself told about the mapping?
      (The last few lines of the ... I think stderr output)
      Perhaps there is some discrepancy when reading the paired-information to produce your output.

      (By the way, how was it generated? I haven't seen this report before.)


      • #4
        Hi Jonathan,

        Thanks for the suggestion. I've seen this same output consistently for two of my resequencing datasets, but never for a third generated from the same barcoded Illumina lane. It makes no difference whether I map the reads using easyrun or run the scripts individually (all with default values, except for maq map, which I have run with both default and specified max insert sizes, and maq assemble, which I have run both with and without the -p flag). Either way, I always get an odd number of reads mapped in pairs for those two datasets.

        I can't see anything obviously odd in the maq map output, but I am also not totally sure what I am looking for... Here are the tails of two stderr outputs, the first for the well-behaved dataset, and the second for one of the odd-number datasets. Thanks for any insights!


        [ma_load_reads] loading reads...
        [ma_load_reads] 2451785*2 reads loaded.
        [mapping_count_single] 5, 17, 22, 35
        [match_data2mapping] 218 pairs are added afterwards.
        [match_data2mapping] (328642, 316222) reads are moved to meet paired-end requirement.
        [match_data2mapping] quality counts of the first reads: (305, 328337); second reads (265, 315957)
        [cal_insert_size] 624562 read pairs counted. insert size: 262.259648 +/- 26.880905
        [maq_make_aln_candidate] 144376 candidates added
        [maq_indel_pe] 144376 candidates for alignment
        [maq_indel_pe] CPU time: 10.790 sec
        [match_data2mapping] 4447176 out of 4903570 raw reads are mapped with 4217234 in pairs.
        -- (total, isPE, mapped, paired) = (4903570, 1, 4447176, 4217234)




        [ma_load_reads] loading reads...
        [ma_load_reads] 3018849*2 reads loaded.
        [mapping_count_single] 5, 17, 22, 36
        [match_data2mapping] 269 pairs are added afterwards.
        [match_data2mapping] (459993, 455444) reads are moved to meet paired-end requirement.
        [match_data2mapping] quality counts of the first reads: (203, 459790); second reads (168, 455276)
        [cal_insert_size] 811214 read pairs counted. insert size: 247.372625 +/- 17.279953
        [maq_make_aln_candidate] 178357 candidates added
        [maq_indel_pe] 178357 candidates for alignment
        [maq_indel_pe] CPU time: 13.950 sec
        [match_data2mapping] 5856680 out of 6037698 raw reads are mapped with 5625861 in pairs.
        -- (total, isPE, mapped, paired) = (6037698, 1, 5856680, 5625861)
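To compare the two runs side by side, the final summary line of each stderr tail can be parsed and checked. This is a minimal Python sketch; `summarize` is a hypothetical helper (not part of MAQ), and it assumes the `(total, isPE, mapped, paired) = (...)` line format shown in the two outputs above.

```python
import re

# Final summary lines copied from the two stderr tails above.
WELL_BEHAVED = "-- (total, isPE, mapped, paired) = (4903570, 1, 4447176, 4217234)"
ODD_DATASET = "-- (total, isPE, mapped, paired) = (6037698, 1, 5856680, 5625861)"

SUMMARY = re.compile(
    r"\(total, isPE, mapped, paired\) = \((\d+), (\d+), (\d+), (\d+)\)"
)


def summarize(line):
    """Parse one summary line into a dict of counts plus a parity flag."""
    m = SUMMARY.search(line)
    if m is None:
        raise ValueError("not a (total, isPE, mapped, paired) summary line")
    total, is_pe, mapped, paired = map(int, m.groups())
    return {
        "total": total,
        "is_pe": bool(is_pe),
        "mapped": mapped,
        "paired": paired,
        "mapped_not_paired": mapped - paired,
        "paired_is_odd": paired % 2 == 1,
    }


good = summarize(WELL_BEHAVED)
odd = summarize(ODD_DATASET)

for name, s in (("well-behaved", good), ("odd-number", odd)):
    print(name, s)
```

In both runs the total equals twice the number of read pairs reported by `[ma_load_reads]`, so the input itself looks properly paired; only the second run ends with an odd `paired` count.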


        • #5
          I'm bumping this back up - hopefully someone has an answer.

          I have a dataset generated from 12 bar-codes in a 2x50 Illumina lane. Following MAQ Easyrun alignment, I'm getting a similar result, and I'm wondering whether anyone has a solution. Any suggestions would be greatly appreciated.

          The stat report for one of the bar-coded samples looks like the following:

          -- == statmap report ==

          -- # single end (SE) reads: 0
          -- # mapped SE reads: 0 (/ 0 = NA%)
          -- # paired end (PE) reads: 1985630
          -- # mapped PE reads: 1805746 (/ 1985630 = 90.94%)
          -- # reads that are mapped in pairs: 92591 (/ 1805746 = 5.12%)
          -- # Q>=30 reads that are moved to meet mate-pair requirement: 5 (/ 92591 = 0%)
          -- # Q<30 reads that are moved to meet mate-pair requirement: 14 (0.01%)


          • #6
            I have the same problem

            -- == statmap report ==

            -- # single end (SE) reads: 0
            -- # mapped SE reads: 0 (/ 0 = NA%)
            -- # paired end (PE) reads: 30130376
            -- # mapped PE reads: 28705774 (/ 30130376 = 95.27%)
            -- # reads that are mapped in pairs: 7988607 (/ 28705774 = 27.82%)
            -- # Q>=30 reads that are moved to meet mate-pair requirement: 660 (/ 7988607 = 0%)
            -- # Q<30 reads that are moved to meet mate-pair requirement: 76318 (0.95%)


            Very confusing!

            Has anyone solved this issue?

            Thanks.
