  • Ion Torrent – Rapid Accuracy Improvements (independent analysis)

    The new blog series covers improvements over a three-month time frame using the various data releases, and includes in-house PGM runs.

    The periodic public release of data sets by Life Technologies and others in the scientific community has allowed me to perform a “longitudinal study” of the improvements made on the Ion…

  • #2
    Very interesting. About the quality value discrepancy: for your empirical QVs, were you using a gapped alignment method? If so, I wonder if that is the source of the marked difference between projected QVs and actual accuracy?

    --
    Phillip

    • #3
      Yes, a gapped alignment was used, with the following command line options:
      -g 20 -x 5

      These correspond to a gap opening penalty of 20 and a gap extension penalty of 5.
      The defaults are -g 40 -x 15.


      I may be wrong, but I thought the predicted (or projected) QV is based on a prediction algorithm. For Ion Torrent, it reads in a phredTable of precalculated values for a range of QVs. I'll blog about this one day.

      Therefore, the predicted QV is independent of what goes on with the empirical QVs based on alignment. FYI: Illumina MiSeq is pretty much spot on with their predicted vs. actual. I think that's because the technology it's built on has been around for a while, so their predictive algorithms have had time to mature. Someone correct me if I am wrong; I could just be full of crap.
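For anyone following along, the standard Phred relationship between a quality value and an error probability is Q = -10 * log10(P). The sketch below only shows that conversion; the function names are made up for illustration, and it says nothing about how Ion Torrent's phredTable is built.

```python
import math

def phred_to_error_prob(q):
    """Error probability implied by a Phred quality value."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred quality value implied by an error probability (p must be > 0)."""
    return -10 * math.log10(p)

# Q10, Q20, Q30 correspond to roughly 1 error in 10, 100, and 1000 bases.
for q in (10, 20, 30):
    print(q, phred_to_error_prob(q))
```

A predicted QV is "spot on" when the error rate actually observed for bases carrying that QV matches this implied probability.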

      • #4
        Yes, that sounds right to me.

        What I meant was that if you are calibrating your quality values based on agreement between a set of your reads and a known sequence, then you would need to do an alignment to that reference.

        Then you check each base in a given QV bin and see whether it was right or wrong. If the proportion of wrong to right calls matches what that QV predicts, great. If not, adjust your QV table.

        Right? But what if you used something other than the default Novoalign settings, or used another aligner altogether? Without gaps in your alignment, every indel causes every base downstream to disagree with the reference. That would cause you to recalibrate your QV table and drastically reduce the quality values of your bases.
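As a rough sketch of the check described above (bin aligned bases by predicted QV, count right vs. wrong, compare): the function name, the add-one smoothing, and the toy counts are all assumptions made for illustration, not anything taken from Ion Torrent's or Novoalign's code.

```python
import math
from collections import defaultdict

def empirical_qvs(observations):
    """observations: iterable of (predicted_qv, is_error) pairs, one per aligned base.

    Returns {predicted_qv: (n_bases, n_errors, empirical_qv)}.
    """
    counts = defaultdict(lambda: [0, 0])  # predicted_qv -> [bases, errors]
    for qv, is_error in observations:
        counts[qv][0] += 1
        counts[qv][1] += int(is_error)

    table = {}
    for qv, (n, errs) in sorted(counts.items()):
        # Add-one smoothing (an assumption of this sketch) keeps log10
        # finite when a bin happens to contain zero errors.
        rate = (errs + 1) / (n + 2)
        table[qv] = (n, errs, -10 * math.log10(rate))
    return table

# Made-up toy data: 1000 bases predicted at Q20 and 1000 at Q30,
# each bin containing 10 bases that disagree with the reference.
toy = ([(20, True)] * 10 + [(20, False)] * 990
       + [(30, True)] * 10 + [(30, False)] * 990)
table = empirical_qvs(toy)
for qv, (n, errs, emp) in table.items():
    print(f"predicted Q{qv}: {errs}/{n} errors, empirical Q{emp:.1f}")
```

In this toy case the Q20 bin is about right and the Q30 bin is over-predicted. Whether a base counts as "wrong" depends entirely on the alignment, which is the concern raised here.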

        You give a link to Novoalign -- is that the official alignment engine of the Ion Torrent, or is it possible they are using a more gap-draconian alignment methodology?

        I don't have any particular expertise in calibrating quality values. But it just occurs to me that indels may be the cause of the discrepancies between the QVs assigned and the actual accuracy of the reads with your aligner.

        What do you think?

        --
        Phillip

        • #5
          Novoalign is not the official alignment program. tmap is the official one, created by Nils Homer, so he would be the best person to answer any questions on that.

          Regarding QV calibration, I'm quite new to the concept so would need to brush up before I comment. I will look at the QV prediction code then ask on the Ion Community to make sure I've understood it before blogging what I've learned.

          • #6
            I will be interested to read what you find.

            Seems like there is a real issue here caused by the high indel error frequencies of 454 and Ion Torrent. To a first approximation it is completely legitimate to allow indel miscalls resulting from incorrect estimations of the numbers of bases in a run to produce much lower quality values. But because utilizing gapped alignment procedures can side-step most of the issues caused by indels of this sort, one doesn't want to penalize the QVs too much.

            --
            Phillip

            • #7
              Broadly, QVs are predicted using six metrics and a phred lookup table. This is detailed in my latest blog post.

              In this stand alone blog post, I will attempt to detail the predicted quality value (phred scoring) algorithm that the Ion Torrent is currently using. As the quality values is one of the battlegrou…

              • #8
                Hi lek2k,
                Would it be possible to generate the plots in the figures using a gapless alignment method? Just for comparison?
                One possibility here is that Novoalign is calling correct bases that Ion Torrent would score as miscalls. Every base call after an indel is a miscall if you do not use a gapped aligner.
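A toy illustration of the point about gapless alignment. The sequences are made up, and the "gapped" comparison simply uses the known deletion position instead of running a real aligner, so this is only a sketch of the effect.

```python
def gapless_mismatches(read, ref):
    """Count position-by-position disagreements, with no gaps allowed."""
    return sum(1 for r, g in zip(read, ref) if r != g)

ref = "ACGTTGCATGCCAGTA"

# Toy read: identical to ref except that one base (index 4) is deleted,
# e.g. an under-called homopolymer.
deleted_at = 4
read = ref[:deleted_at] + ref[deleted_at + 1:]

# Gapless: every base after the deletion is compared against the wrong
# reference position, so most of them look like miscalls.
gapless = gapless_mismatches(read, ref)

# Gapped (deletion position known here): compare the two flanks separately.
gapped = (gapless_mismatches(read[:deleted_at], ref[:deleted_at])
          + gapless_mismatches(read[deleted_at:], ref[deleted_at + 1:]))

print("gapless mismatches:", gapless)
print("gapped mismatches:", gapped, "(plus one deletion)")
```

With these sequences the gapless comparison reports mismatches at 10 of the 11 positions downstream of the deletion (one matches by chance), while the gapped comparison reports none beyond the single indel.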

                --
                Phillip

                • #9
                  Great EdgeBio blog post on predicted vs. empirical QVs; it also presents recalibration using GATK.

                  • #10
                    Originally posted by pmiguel
                    Hi lek2k,
                    Would it be possible to generate the plots in the figures using a gapless alignment method? Just for comparison?
                    One possibility here is that Novoalign is calling correct bases that Ion Torrent would score as miscalls. Every base call after an indel is a miscall if you do not use a gapped aligner.

                    --
                    Phillip
                    Yes that would be interesting for comparison. I'll give it a go soon.
