  • colorspace mapping

    If I am not doing SNP detection, can I just translate the colorspace reads to base space and then do the mapping? One thing that concerns me is that a single mismatch in colorspace affects not only that particular base but also all the downstream bases, so a read with a single colorspace mismatch is likely to be misaligned in base space. Does this mean more false positives for colorspace mapping?

    What is the advantage of mapping in colorspace for projects that do not involve SNP detection?

    Any suggestions and comments?

    Many thanks

  • #2
    You are correct that a true sequencing error can affect the interpretation of all downstream bases in _DNAspace_; in colorspace, however, the read would still align correctly to a colorspace reference, and you would see only a 1nt mismatch (mm). This is actually one of the reasons that colorspace can be useful outside of SNP detection. Depending on your application, alignment to a reference sequence in colorspace allows you not only to distinguish a sequencing error (one nt mm in the colorspace alignment) from a true SNP (a limited subset of specific _2nt_ mm in colorspace), but also to 'correct' a true sequencing error (again depending on your reference sequence and application) prior to decoding the sequence to DNAspace.
    I would caution anyone against arbitrarily decoding CS to DNA prior to alignment. You are certainly going to introduce errors in DNA space that can cause spurious alignments. However, if you are simply counting tags, this is not necessarily a problem as long as you have a good estimate of background noise that a true signal should stand out against.
    Long story short, it definitely depends on the application, but in most cases, if you can avoid immediately jumping to DNA space you will get the most out of your dataset.

    -Loyal
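
The distinction drawn above between an isolated color mismatch and a "limited subset" of two adjacent mismatches can be sketched in a few lines of Python. This is only an illustration, not how any particular aligner does it. It assumes the standard 2-base encoding in which A, C, G, T map to 0 to 3 and each color is the XOR of two neighbouring bases, so a single-base change leaves the XOR of the two affected colors unchanged. The function name and labels are made up for this example.

```python
def classify_mismatches(ref_colors: str, read_colors: str) -> list[tuple[int, str]]:
    """Label each run of mismatches between two equal-length color strings.

    Returns (start_index, label) pairs. The labels are illustrative only.
    """
    if len(ref_colors) != len(read_colors):
        raise ValueError("color strings must have the same length")
    calls = []
    i, n = 0, len(ref_colors)
    while i < n:
        if ref_colors[i] == read_colors[i]:
            i += 1
            continue
        # Find the end of this run of consecutive mismatches.
        j = i
        while j < n and ref_colors[j] != read_colors[j]:
            j += 1
        run = j - i
        if run == 1:
            # One changed color with matching neighbours: likely a sequencing error.
            calls.append((i, "error"))
        elif run == 2:
            # A single-base change preserves the XOR of the two flanking colors.
            ref_xor = int(ref_colors[i]) ^ int(ref_colors[i + 1])
            read_xor = int(read_colors[i]) ^ int(read_colors[i + 1])
            calls.append((i, "snp_candidate" if ref_xor == read_xor else "invalid_pair"))
        else:
            # Longer runs (e.g. adjacent SNPs, indels) are out of scope for this sketch.
            calls.append((i, "other"))
        i = j
    return calls


print(classify_mismatches("2322312311212", "2312312311212"))
print(classify_mismatches("2322312311212", "2311312311212"))
```

A valid adjacent pair is only a candidate: two independent errors can produce the same pattern, and a mismatch in the last color of a read is ambiguous because a SNP in the final base changes only one color.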



    • #3
      I agree that we need to avoid immediately jumping to base space for alignment. Here is what concerns me.

      For example, take the reference sequence
      Base space: TCGAGCAGCACTGA
      color space: T2322312311212

      If we allow 1 mismatch in color space, then readA will be mapped to this reference sequence

      Reference: T2322312311212
      readA: T2312312311212

      The base space for readA is CGTCGTCGTGACT, which differs from the reference sequence in base space at every base downstream of the mismatch.

      Reference: CGAGCAGCACTGA
      readA: CGtcgtcgtgact
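
To make the propagation effect in this example concrete, here is a minimal Python sketch that decodes the two color strings above and counts mismatches in each space. It assumes the usual 2-base encoding (A, C, G, T as 0 to 3, color = XOR of adjacent bases) and a leading primer base that is dropped after decoding. The helper names `decode`, `encode` and `hamming` are made up for this illustration.

```python
BASES = "ACGT"  # index of each base is its 2-bit code: A=0, C=1, G=2, T=3


def decode(primer_and_colors: str) -> str:
    """Translate e.g. 'T2312...' to bases, dropping the leading primer base."""
    prev = BASES.index(primer_and_colors[0])
    out = []
    for color in primer_and_colors[1:]:
        prev ^= int(color)  # next base = previous base XOR color
        out.append(BASES[prev])
    return "".join(out)


def encode(primer: str, bases: str) -> str:
    """Inverse of decode: bases plus primer base back to a color string."""
    prev = BASES.index(primer)
    colors = []
    for base in bases:
        cur = BASES.index(base)
        colors.append(str(prev ^ cur))
        prev = cur
    return primer + "".join(colors)


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


ref_cs = "T2322312311212"
read_cs = "T2312312311212"

print("color-space mismatches:", hamming(ref_cs, read_cs))
print("decoded readA:", decode(read_cs))
print("base-space mismatches:", hamming(decode(ref_cs), decode(read_cs)))
```

The single color difference becomes a mismatch at every decoded base from that position onward, which is why the comparison only makes sense before decoding.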

      How are reads mapped in color space?



      • #4
        I think you make your point exactly...the difference in colorspace as you point out is:

        Reference: 2322312311212
        readA: 23*1*2312311212

        If you were to align this read to a colorspace version of the genome, there would be only one mismatch. Since a SNP requires two consecutive mismatches in colorspace, you know this is a sequencing error. You have two options at this point: flag the read as 'bad' and do not report this alignment, or use the colorspace reference to 'correct' the read by changing the mm to match the reference sequence. This correction would then resolve the DNAspace sequence issue. The point is that the alignment is done in colorspace; the DNAspace translation of the read is irrelevant until after the read has been aligned to the colorspace reference.
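
The second option, correcting the read against the colorspace reference before decoding, could look roughly like the Python sketch below. It is a simplified illustration under stated assumptions: the read is already aligned without gaps, colors follow the usual XOR-based 2-base encoding, and only a mismatch whose neighbours both match is treated as an error. The function names are hypothetical.

```python
BASES = "ACGT"  # A=0, C=1, G=2, T=3; color = XOR of adjacent base codes


def decode(primer: str, colors: str) -> str:
    """Decode a color string to bases, starting from the given primer base."""
    prev = BASES.index(primer)
    out = []
    for color in colors:
        prev ^= int(color)
        out.append(BASES[prev])
    return "".join(out)


def correct_isolated(ref_colors: str, read_colors: str) -> str:
    """Replace isolated single-color mismatches with the reference color.

    Adjacent mismatches are left untouched because they may reflect a SNP.
    The last color is also left alone: a SNP in the final base changes only
    that one color, so a lone mismatch there is ambiguous.
    """
    if len(ref_colors) != len(read_colors):
        raise ValueError("color strings must have the same length")
    fixed = list(read_colors)
    n = len(read_colors)
    for i in range(n - 1):
        if read_colors[i] == ref_colors[i]:
            continue
        left_ok = i == 0 or read_colors[i - 1] == ref_colors[i - 1]
        right_ok = read_colors[i + 1] == ref_colors[i + 1]
        if left_ok and right_ok:
            fixed[i] = ref_colors[i]
    return "".join(fixed)


ref = "2322312311212"
read_a = "2312312311212"

corrected = correct_isolated(ref, read_a)
print(corrected == ref)
print(decode("T", corrected))
```

Whether to correct or simply discard such reads depends on the application, as noted above; the sketch only shows that the fix is applied to the colors, and decoding happens afterwards.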



        • #5
          Many thanks, I got it.
