  • How to explain this scenario ?

    When I was assembling the reads , I found this scenario:

    TAA-CCTCCCCC-AAANTT-CAGA Consensus
    TAA-CCTCCCCC-AAACTT
    TAA-CCTCCCCC-AAACTTACAGA
    TAA-CCT-CCCC-AAACTTACAGA
    TAA-CCTCCCCCAAAACTT-CAGA
    TAACCCTCCCCC-AAAATT-CAGA
    TAA-CCTCCCCC-AAAATT-CAGA
    TAA-CCTCCCCA-AAACTTACAGA
    TAA-CCTCCCCC-AAAATT-CAGA
    TAA-CCTCCCCC-AAAATT-CAGA
    ---A-CCTCCCCC-AAAATT-CAGA

    The first line is the consensus sequence, which contains an N.
    The N arises because 5 reads have a C and 5 reads have an A at that position.
    Someone told me this is a homopolymer artifact: the
    C observed at that position is likely to be part of the preceding
    C homopolymer. Have you seen this problem before? Do you think that explanation is plausible?
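
    As a rough illustration of how a 5 C / 5 A column can end up as N, here is a minimal sketch of a per-column majority call. The function name `call_column` and the 0.6 threshold are assumptions made for illustration; they are not the rules used by the assembly script or by the 454 software.

```python
from collections import Counter

def call_column(bases, min_fraction=0.6):
    """Call one alignment column: the most common symbol if it reaches
    min_fraction of the reads covering the column, otherwise 'N'.
    The 0.6 threshold is an arbitrary, illustrative choice."""
    counts = Counter(b for b in bases.upper() if b in "ACGT-")
    if not counts:
        return "N"  # no read covers this column
    base, top = counts.most_common(1)[0]
    return base if top / sum(counts.values()) >= min_fraction else "N"

# Bases seen at the ambiguous position in the ten reads above (5 C, 5 A).
ambiguous_column = "CCCCAACAAA"
print(call_column(ambiguous_column))

# A column where one base clearly dominates.
print(call_column("CCCCCCCCCA"))
```

    With an even split no base reaches the threshold, so the call falls back to N. Weighting each read by its quality score instead of counting reads equally would be one way to break such ties.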

  • #2
    You really need to give a lot more information than what you've supplied for any reasonable hypothesis to be provided. Just off the top of my head, I would say there are several plausible explanations.

    Homopolymers are definitely possible - but the likelihood depends on the platform. (Oops! I didn't realize this was in the 454 forum! Homopolymers are more common with 454 than some of the other platforms, so yes, this is possible. However, I think my other comments stand; homopolymers are far from the only reason you would see the above scenario.)

    If it's from a diploid organism, there could be two alleles - and one of them has a SNP.

    If it's from a haploid organism, there could be paralogs, one of which has a single-base difference compared to the other, while the reference genome has only one copy.

    I'm sure there are many other biological explanations. Since you haven't given probability scores or any other useful information, all we can do is guess.

    Good luck figuring it out.
    Last edited by apfejes; 01-15-2009, 07:50 AM. Reason: didn't realize this was posted to the 454 forum!
    The more you know, the more you know you don't know. —Aristotle

    • #3
      Hi apfejes,

      Thanks for your reply. This is a pilot study on how to assemble a genome from 454 data. With the 454 software (runAssembly, runMapping), we found that the consensus is too long to be true, which is due to the influence of homopolymers. The result is even worse with SeqMan, so we wrote our own script to do the assembly. So far we haven't integrated the quality values (quality score, flow value), which is why we ran into the problem described above. (The 454 software does not emit an N at these positions; instead, they get a very low quality score.)

      Now I am trying to figure out how the 454 software makes use of the "flow value" and the "quality score". Could anyone point me to a reference? It doesn't seem to be covered in the manuals.

      I have a feeling that the "quality score" is derived from the "flow value". Is that true?
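
      For reference, if the quality scores are taken to be on the standard Phred scale (an assumption here, not something stated in the manuals), a score Q corresponds to an error probability of 10^(-Q/10). The sketch below only shows that conversion; it says nothing about how the 454 software derives Q from the flow values, and the function names are made up for illustration.

```python
import math

def phred_to_error_prob(q):
    """Error probability implied by a Phred-scaled quality score q."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred-scaled quality score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)

for q in (10, 20, 30):
    print(q, phred_to_error_prob(q))
```

      Under that assumption, a "very low quality score" at the ambiguous position simply means the base caller considers the call there close to a guess.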

      • #4
        Mingkunli,
        Any hits with your aligner of 454 for mito data? We are also looking at mt using 454 ..
        --
        bioinfosm

        • #5
          mingkunli

          The quality score algorithm used by 454 Titanium and the most recent FLX is based on a Broad Institute paper. The 454 off-line toolkit also includes a script called "sffrescore" that lets you rescore old read quality scores with the new Broad Institute method.

          Here is the Broad Institute paper that the 454 read quality scores are based on:
          An international, peer-reviewed genome sciences journal featuring outstanding original research that offers novel insights into the biology of all organisms
          Last edited by hlu; 02-19-2009, 09:25 AM.

          • #6
            A suggestion: to judge whether the fifth 'A' is a homopolymer artifact or a SNP, you can amplify this fragment by PCR, clone the product into a T vector, then pick 10 clones and sequence them on an ABI 3730. I think you'll get the correct answer.
