  • indelpe from MAQ

    Hi,

    I would like to know how the indelpe function (from the MAQ software) calls consistent indels from paired-end reads. Does it flag a read pair when its insert size is more than 3 standard deviations above or below the mean of the insert size distribution?

    Thanks in advance,
    Fadista

  • #2
    Hi Fadista,

    indelpe calls small indels (typically ranging from 1-1x nucleotides). It does so by looking for read pairs where one read is mapped but its mate could not be mapped. It then performs a Smith-Waterman alignment of the unmapped mate in the region at roughly the insert size distance from the mapped read, which is why it can only detect small indels.

    For the identification of large scale insertion/deletion events you have to use the maq.pl sv command.
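To make the Smith-Waterman step concrete, here is a minimal sketch of a local alignment of an unmapped mate against a reference window. It is not MAQ's code: the function name, the scoring values (match=2, mismatch=-1, gap=-2) and the linear gap penalty are all assumptions chosen for illustration.

```python
def smith_waterman(read, window, match=2, mismatch=-1, gap=-2):
    """Best local alignment of `read` against `window`.

    Scoring values are illustrative assumptions, not MAQ's parameters.
    Returns (score, aligned_read, aligned_window); '-' marks a gap.
    """
    rows, cols = len(read) + 1, len(window) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    def sub(i, j):
        # Substitution score for read[i-1] versus window[j-1].
        return match if read[i - 1] == window[j - 1] else mismatch

    # Fill the dynamic programming matrix.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub(i, j),  # match or mismatch
                score[i - 1][j] + gap,            # extra base in the read
                score[i][j - 1] + gap,            # base missing from the read
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    a_read, a_win = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        s = score[i][j]
        if s == score[i - 1][j - 1] + sub(i, j):
            a_read.append(read[i - 1])
            a_win.append(window[j - 1])
            i, j = i - 1, j - 1
        elif s == score[i - 1][j] + gap:
            a_read.append(read[i - 1])
            a_win.append("-")
            i -= 1
        else:
            a_read.append("-")
            a_win.append(window[j - 1])
            j -= 1
    return best, "".join(reversed(a_read)), "".join(reversed(a_win))


# Made-up example: the read lacks two reference bases ("AA").
score, aligned_read, aligned_window = smith_waterman(
    "ACGTACGTTTGACCA", "GGGACGTACGTAATTGACCAGGG"
)
print(aligned_read)
print(aligned_window)
```

A run of `-` characters in the aligned read suggests a deletion relative to the reference, and a run in the aligned window suggests an insertion. Because the window is only as wide as the expected mate position allows, only small indels can show up this way.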

    Cheers,

    Sebastian



    • #3
      approximate insert size

      Hi,

      Thanks for the reply. I didn't even know there was a maq.pl sv command, because it's not in the manual. So I was just doing my own manual SV calling. But I still have some questions:

      1 - What do you mean by "approximate insert size" and how can maq compute that?

      2 - Concerning the maq.pl sv command, does it mean that all paired-end reads with an insert size above the -i parameter (maximum insert size) will be called as deletions? And how can it detect large insertions? And what does the -s parameter (minimum length of a region) mean?

      Thanks in advance,
      Fadista



      • #4
        Hi,

        1. I think maq computes it the same way Eland does in its summary.htm output: as the average distance between the reads of all correctly mapped read pairs. By "approximate insert size" I just meant that if your average insert size is 200, maq will search with Smith-Waterman at a distance of maybe 150-250 from the first read. I don't know the exact parameters maq uses for Smith-Waterman, though; it would be nice if somebody who actually knows the details could chime in.
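As a rough illustration of point 1, the window could be derived like this. This is only a sketch of the idea, not what maq actually does: the function name, the `slack` parameter and the assumption that the mate lies downstream of a forward-strand anchor are all hypothetical.

```python
from statistics import mean


def mate_search_window(anchor_pos, insert_sizes, slack=50):
    """Return (start, end) of the region in which to look for the unmapped mate.

    anchor_pos   -- position of the mapped read (assumed forward strand)
    insert_sizes -- insert sizes observed for correctly mapped pairs
    slack        -- hypothetical tolerance around the mean, in bases
    """
    avg = round(mean(insert_sizes))
    return anchor_pos + avg - slack, anchor_pos + avg + slack


# With an average insert size of 200, the window spans 150-250 bases
# downstream of the anchor.
print(mate_search_window(1000, [190, 200, 210]))
```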

        2. Concerning maq.pl sv, I also do not know many details since, as you said, it is not in the manual, but I found this a while back on the maq sourceforge mailing list:

        "maq.pl sv" implements a very simple SV detector for paired end reads
        (less than 100 lines in Perl). Basically, it first tries to group
        reads of abnormal pairs in a short region (specified by -i and -l) and
        then builds a graph by taking a node as a group of reads and adding an
        edge if two groups of reads are linked by mate-pair relationship.
        "maq.pl sv" then regards each connected component on the graph as a
        potential structural variation.
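The grouping and graph step described above can be sketched as follows. This is not the Perl code from maq.pl: the function names, the `max_gap` threshold and the input layout (each abnormal pair as two `(chrom, pos)` tuples) are assumptions made for illustration.

```python
from collections import defaultdict


def cluster_ends(ends, max_gap):
    """Assign a group id to each (chrom, pos) read end.

    Sorted neighbours on the same chromosome that lie within max_gap
    of each other share a group.
    """
    group_of, gid, prev = {}, -1, None
    for chrom, pos in sorted(set(ends)):
        if prev is None or prev[0] != chrom or pos - prev[1] > max_gap:
            gid += 1
        group_of[(chrom, pos)] = gid
        prev = (chrom, pos)
    return group_of


def sv_candidates(pairs, max_gap=500):
    """Group abnormal pairs into connected components (candidate SVs)."""
    if not pairs:
        return []
    group_of = cluster_ends([end for pair in pairs for end in pair], max_gap)
    parent = list(range(max(group_of.values()) + 1))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each mate pair adds an edge between the groups of its two ends.
    for a, b in pairs:
        ra, rb = find(group_of[a]), find(group_of[b])
        if ra != rb:
            parent[rb] = ra

    components = defaultdict(list)
    for pair in pairs:
        components[find(group_of[pair[0]])].append(pair)
    return list(components.values())


# Made-up example: two pairs support the same event, one pair stands alone.
pairs = [
    (("1", 1000), ("1", 9000)),
    (("1", 1100), ("1", 9050)),
    (("2", 500), ("7", 300)),
]
for component in sv_candidates(pairs):
    print(len(component), component)
```

The number of pairs in a component plays the role of the "number of read pairs that support the SV" in the output.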

        The output of "maq.pl sv" is TAB delimited. Each line consists of:
        FLAG, number of read pairs that support the SV, chr of node1, start of
        node1, end of node1, useless field, number of reads in node1 mapped on
        the forward strand, # reads mapped in node1 on the reverse strand, chr
        of node2, start of node2, end of node2, useless field, # reads in
        node2 on the forward strand and # reads in node2 on the reverse
        strand. A FLAG can be DEL for deletion, DIF for translocation between
        chr, AMB or LOP for ambiguous cases. Also in case of complex SV, a
        node may appear several times in the output.
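Going only by the field list above, one line of output could be parsed roughly like this. The 14-column layout is taken from that description; the names `SvNode`, `SvCall` and `parse_sv_line` and the example line are made up, and other revisions of the script may print a different layout.

```python
from collections import namedtuple

SvNode = namedtuple("SvNode", "chrom start end n_forward n_reverse")
SvCall = namedtuple("SvCall", "flag n_pairs node1 node2")


def parse_sv_line(line):
    """Parse one TAB-delimited line, assuming the 14 fields listed above."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 14:
        raise ValueError("expected 14 fields, got %d" % len(f))

    def node(offset):
        # chrom, start, end, <unused field>, forward reads, reverse reads
        return SvNode(f[offset], int(f[offset + 1]), int(f[offset + 2]),
                      int(f[offset + 4]), int(f[offset + 5]))

    return SvCall(f[0], int(f[1]), node(2), node(8))


# Made-up example line, not real maq.pl sv output.
example = "DEL\t12\t1\t1000\t1500\t.\t12\t0\t1\t9000\t9500\t.\t0\t12"
call = parse_sv_line(example)
print(call.flag, call.n_pairs, call.node1.chrom, call.node2.start)
```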

        This function is not perfect. It only finds SVs located in unique
        regions. It does not resolve complex SV (deletion followed by
        inversion, for example), either. Nonetheless, this function may give
        you a fairly good list of confident SVs and show hints of complex SV.

        regards,

        Heng

        The size of a detectable insertion is limited by the insert size, so afaik large insertions cannot be detected using this approach.

        Hope that helps a bit,

        Cheers,

        Sebastian



        • #5
          Ken Chen from WashU has a better script for SV detection that works on maq output. You can download it here:



          If you use it, do not forget to acknowledge Ken.



          • #6
            Here are 3 lines from the maq sv output, but I cannot find much documentation on how to interpret them.

            $ perl maq.sv.r8.pl -q -1 -r 2 -i 500 lane1/reads-1.map.mapview > lane1.sv
            $ sort -nr lane1.sv | head
            3686 AMB 5 99414644 99417337 . 7 63204008 63204022 * Y 9039505 9039535 * 7 57257754 57258137 * 1 554324 560134 . 2 87905568 87905895 * 4 12251121 12251135 * 5 99418238 99418614 *
            3544 DIF 1 554324 560134 . 12 40043706 40043761 *
            3537 AMB 2 147739240 147739270 * 17 21946716 21948790 . 1 554324 560134 .
            ...
            Last edited by bioinfosm; 03-05-2009, 01:16 PM. Reason: Upgraded problem :)
            --
            bioinfosm



            • #7
              __________
              <bump>
              --
              bioinfosm
