  • Unique mapped reads definition confusing...

    Hi,

    I have been reading a few assembly papers over the past few days, and all of them mention "unique mapped reads" a lot.
    Can anybody share the definition of unique mapped reads?
    I would appreciate some simple examples that would help me understand what unique mapped reads mean in assembly and bioinformatics.
    Thanks a lot for your advice and suggestions.

  • #2
    I would use this term to describe a read which mapped only once in a genome with a given number of mismatches. Hopefully the match would be a unique exact match, but if there was a single SNP then so long as there was no other place in the genome which the read could match with only one mismatch then it would still count as a uniquely mapped read.

    You normally find that once you get above 2 mismatches in a 36bp read you're very unlikely to be able to map it uniquely so the majority of uniquely mapping reads will be exact matches or have just 1 or 2 mismatches.
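
    The definition above can be tried out on a toy example. This is only a minimal sketch, assuming ungapped placements on the forward strand and a made-up 23-base "genome"; the names `hamming`, `find_hits`, `is_uniquely_mapped` and `GENOME` are hypothetical and do not come from any real mapper.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_hits(read: str, genome: str, max_mismatches: int) -> list[tuple[int, int]]:
    """Return (position, mismatches) for every ungapped forward-strand
    placement of the read that stays within the mismatch limit."""
    hits = []
    for pos in range(len(genome) - len(read) + 1):
        mm = hamming(read, genome[pos:pos + len(read)])
        if mm <= max_mismatches:
            hits.append((pos, mm))
    return hits


def is_uniquely_mapped(read: str, genome: str, max_mismatches: int) -> bool:
    """A read counts as uniquely mapped if exactly one placement qualifies."""
    return len(find_hits(read, genome, max_mismatches)) == 1


# Made-up toy genome (assumption); "TTGACC" occurs twice in it.
GENOME = "TTGACCAGGATCCGTTGACCTGG"

print(find_hits("GATCCG", GENOME, 0))  # exact match, one placement
print(find_hits("GATCCA", GENOME, 1))  # one SNP, still one placement
print(find_hits("TTGACC", GENOME, 0))  # repeat: two exact placements
print(find_hits("GATCCG", GENOME, 2))  # looser limit: no longer unique
```

    In this toy case the same read is unique with 0 or 1 allowed mismatches but not with 2, which is the effect described above for short reads.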


    • #3
      Hi simon,

      Thanks a lot for your explanation. It is very clear and easy to understand, and it resolved my doubts.
      Does the definition of uniquely mapped reads that you gave also apply to long reads, such as 454 or Sanger reads?
      I have been reading some bioinformatics journal papers recently.
      Some scientists use uniquely mapped reads to assemble a high-quality consensus sequence of a specific organism's genome.
      Do you know why they use uniquely mapped reads for this?
      Thanks again for your help.

      Originally posted by simonandrews: see #2 above.

      • #4
        "Does the definition of uniquely mapped reads that you gave also apply to long reads, such as 454 or Sanger reads?"
        Certainly it does. A set of long 1000-base Sanger reads could map either uniquely or non-uniquely to a reference genome. Depending on the number of SNPs expected, the number of allowed mismatches may need to be raised.

        "Do you know why scientists use uniquely mapped reads to assemble a high-quality consensus sequence of a specific organism's genome?"
        Probably for the same reason that anyone wants a sequence: to find out what makes that specific organism's genome different from other genomes (SNPs, indels, unique genes, unique control mechanisms, etc.).

        It may be obvious, but you can assemble a sequence via:

        1) De-novo assembly
        or
        2) Mapping unique reads onto a reference
        or
        3) Mapping unique and non-unique reads onto a reference
        or
        4) A combination of the above
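
        Options 2 and 3 can be compared on a toy pileup. This is a rough sketch rather than anyone's actual pipeline: the reference, the reads and their hit positions are invented, `consensus` is a hypothetical helper, and option 3 is modelled in the simplest possible way (a multi-mapping read is placed at every hit). It illustrates one plausible reason to prefer unique reads, namely that a read from one repeat copy can otherwise vote at the other copy.

```python
from collections import Counter


def consensus(reference: str, placements, unique_only: bool = True) -> str:
    """Majority-vote consensus from ungapped read placements.

    placements: list of (read, positions), where positions holds every
    start coordinate at which the (hypothetical) mapper placed the read.
    Positions without coverage fall back to the reference base.
    """
    piles = [Counter() for _ in reference]
    for read, positions in placements:
        if unique_only and len(positions) != 1:
            continue  # option 2: skip reads with more than one placement
        for start in positions:
            for offset, base in enumerate(read):
                piles[start + offset][base] += 1
    return "".join(
        pile.most_common(1)[0][0] if pile else ref_base
        for pile, ref_base in zip(piles, reference)
    )


# Invented reference with the repeat "GCTAGC" at positions 0 and 8.
REFERENCE = "GCTAGCTTGCTAGCAA"
PLACEMENTS = [
    ("TAGCTT", [2]),     # unique read, agrees with the reference
    ("GCTTGC", [4]),     # unique read, agrees with the reference
    ("GCTGGC", [0, 8]),  # carries an A->G change, fits both repeat copies
    ("GCTGGC", [0, 8]),  # second copy of the same ambiguous read
]

print(consensus(REFERENCE, PLACEMENTS, unique_only=True))
print(consensus(REFERENCE, PLACEMENTS, unique_only=False))
```

        With unique reads only, the consensus equals the reference; with the ambiguous reads placed everywhere, the A->G change shows up in both repeat copies, although the reads cannot tell which copy it belongs to.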


        • #5
          Thanks a lot, westerman.
          Your reply helps me better understand how scientists analyze the data.

          Sadly, I am still not clear about why scientists would use the uniquely mapped reads of organism A's genome to assemble a high-quality consensus sequence of organism B's genome.
          What is the purpose of analyzing the data this way?
          Do you know what the general pipeline is for analyzing 454 or Illumina data?
          I really appreciate your suggestions and opinions.

          Originally posted by westerman: see #4 above.


          • #6
            It is easy to define uniqueness when you require the entire read to be aligned without gaps. Things get complicated when you allow clipping and gaps: both depend on the underlying scoring system, so uniqueness depends on the scoring system too. In addition, although we could call a read non-unique only when its two best matches have identical scores under that scoring system, such a definition is not useful in practice. What if the second-best match scores lower by just 1 or 2?
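
            The point about near-ties can be made concrete with a small sketch. It assumes each candidate placement already has an alignment score where higher is better; the function `classify` and its `min_gap` parameter are hypothetical, chosen only to show how the verdict changes with the required margin between the best and second-best score.

```python
def classify(scores: list[int], min_gap: int = 1) -> str:
    """Label a read from the alignment scores of its candidate placements.

    scores: one score per placement, higher is better (assumption).
    min_gap: how far the best score must exceed the runner-up before the
             read is called unique.
    """
    if not scores:
        return "unmapped"
    ranked = sorted(scores, reverse=True)
    if len(ranked) == 1:
        return "unique"
    gap = ranked[0] - ranked[1]
    return "unique" if gap >= min_gap else "ambiguous"


print(classify([60, 60]))             # tie between the two best placements
print(classify([60, 59]))             # "unique" under the strict-tie rule
print(classify([60, 59], min_gap=5))  # same read, stricter margin
```

            The same read flips between the two labels depending on `min_gap`, which is why a yes/no notion of uniqueness is fragile once scores are involved. Many aligners report a mapping quality per read instead of a binary label.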


            • #7
              Thanks for your reply.
              What you mention makes sense too.
              I will try to find out more about "unique mapped reads" and share it with everybody.

              Originally posted by lh3: see #6 above.
