  • Re-calling the same event model -- improvements over 10 months

    People seem to be interested in the error rate of the MinION. I'd like to put this image up to demonstrate one of the reasons why error rate is a fickle beast to calculate:



    This is exactly the same event signal model (combination of current and dwell time inside the pore) recalled at three separate times over the past year. I've selected a small region covering a homopolymer sequence to make the mapping changes more impressive and easier to see. The reference sequence is shown in the middle (at the 0 line), with changes shown above and below the sequence.

  • #2
    How do I make sense of this graph? What software was used to generate it?

    What are the versions of Flowcell, SQK-MAP, Metrichor for each of them?



    • #3
      I used a custom R script to generate the graph, which works on a pairwise alignment of two sequences. The reference sequence appears at the 0 line in the graph (the top letters), and any substitutions appear underneath that, colour-coded according to the three different types of substitution (purine/pyrimidine, methyl/keto, strong/weak). Insertions appear as chartreuse wedges above the reference sequence, and deletions are steel blue triangles that exclude reference sequences.
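The three substitution types described above line up with the IUPAC two-base ambiguity codes: R/Y (purine/pyrimidine), M/K (IUPAC calls M "amino"; the post calls it "methyl") and S/W (strong/weak). The following is a minimal Python sketch of how a gapped pairwise alignment might be tallied into those categories. It is not the R script from this post; the function names and the `-` gap character are assumptions made for illustration.

```python
from collections import Counter

# Each unordered pair of distinct bases belongs to exactly one IUPAC
# two-base ambiguity code, which gives three substitution classes.
SUBSTITUTION_CLASS = {
    frozenset("AG"): "purine/pyrimidine",  # R: both purines
    frozenset("CT"): "purine/pyrimidine",  # Y: both pyrimidines
    frozenset("AC"): "amino/keto",         # M: both amino
    frozenset("GT"): "amino/keto",         # K: both keto
    frozenset("CG"): "strong/weak",        # S: both strong
    frozenset("AT"): "strong/weak",        # W: both weak
}


def classify_substitution(ref_base, query_base):
    """Return the class of a single-base substitution (hypothetical helper)."""
    pair = frozenset((ref_base.upper(), query_base.upper()))
    if len(pair) != 2:
        raise ValueError("bases are identical, so this is not a substitution")
    # Anything outside A/C/G/T (for example N) is reported as "other".
    return SUBSTITUTION_CLASS.get(pair, "other")


def summarise_alignment(ref_aln, query_aln, gap="-"):
    """Count matches, substitution classes, insertions and deletions.

    Both arguments are rows of one gapped pairwise alignment, so they
    must have the same length.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    counts = Counter()
    for ref_base, query_base in zip(ref_aln, query_aln):
        if ref_base == gap and query_base == gap:
            continue  # column with no information
        if ref_base == gap:
            counts["insertion"] += 1  # base present only in the read
        elif query_base == gap:
            counts["deletion"] += 1   # base present only in the reference
        elif ref_base.upper() == query_base.upper():
            counts["match"] += 1
        else:
            counts[classify_substitution(ref_base, query_base)] += 1
    return counts


# Toy alignment, not data from this thread.
print(summarise_alignment("AAAA-AC", "AATAAAG"))
```

Because the three classes partition all twelve possible single-base substitutions, every mismatch column in an A/C/G/T alignment gets exactly one colour in a plot of this kind.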

      I've attached the script I used to create an earlier graph with the same appearance (Image 5 in that script).

      Flow cell and sequencing kits are obviously the same for all sequences, and were current on 2014-Oct-03: R7.3 flow cell, and I think SQK-MAP003.

      I'm not sure about Metrichor, it was just whatever was current at the time. According to the Fast5 files, the first sequence was chimaera v1.2.2, the middle sequence was chimaera v1.6.3, and the third sequence was chimaera v1.14.4 with dragonet v1.14.2.
      Attached Files
      Last edited by gringer; 09-14-2015, 06:32 PM.



      • #4
        Thanks for your reply. The third graph doesn't seem to be an obvious improvement over the second one. It seems to me it just substituted one type of error with another type.



        • #5
          The improvement is that it has detected a single base insertion in the homopolymer region, which is a nice result given that our sample had a single base insertion in that region. There are substitution errors, and the inserted base is incorrect (T instead of A), but it suggests to me that things are moving in the right direction. It also demonstrates that it might be possible to call sequences across long homopolymer regions after all, despite the theoretical model suggesting that there should be no difference in signal between adjacent events in the middle of the region.
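The theoretical point about homopolymers can be illustrated with a toy k-mer model. This is only a sketch under stated assumptions: it supposes the pore signal depends on the k bases in the pore at once (k = 5 is an assumption here), and the flanking bases and homopolymer lengths are invented rather than taken from the 4T1 data. Under that model, consecutive k-mers inside a homopolymer longer than k are identical, so two sequences that differ by one base in the run produce the same series of distinct levels and differ only in how many events (or how much dwell time) the repeated level occupies.

```python
def kmers(sequence, k=5):
    """Return the overlapping k-mers a k-base pore model would see, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def distinct_levels(sequence, k=5):
    """Collapse runs of identical consecutive k-mers into a single entry.

    This mimics a signal in which adjacent events with the same expected
    current cannot be told apart by level alone.
    """
    collapsed = []
    for kmer in kmers(sequence, k):
        if not collapsed or collapsed[-1] != kmer:
            collapsed.append(kmer)
    return collapsed


# Hypothetical sequences: the same flanks around 8 or 9 adenines.
reference = "GC" + "A" * 8 + "TG"
sample = "GC" + "A" * 9 + "TG"

# The series of distinct k-mer levels is the same for both sequences...
print(distinct_levels(reference) == distinct_levels(sample))

# ...and the only difference is one extra repeat of the AAAAA k-mer.
print(len(kmers(sample)) - len(kmers(reference)))
```

In this toy model the extra base can only be recovered from event counts or dwell time, which is why a correctly placed insertion in the newer base call is encouraging.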



          • #6
            Do you mean the T insertion between 9825 and 9826 is real? I thought you were just re-sequencing a reference sample. Did you actually sequence a sample from the same strain as the reference, but not the same sample?



            • #7
              It should be an 'A' insertion, but yes, it's real. We were sequencing 4T1 cancer cells, which have a few variants different from the reference sequence. You can see the paper for more details:



              ResearchGate link if you don't have direct access to the paper through Cell:

              https://www.researchgate.net/publication/270582858



              • #8
                Originally posted by gringer
                People seem to be interested in the error rate of the MinION. I'd like to put this image up to demonstrate one of the reasons why error rate is a fickle beast to calculate:



                This is exactly the same event signal model (combination of current and dwell time inside the pore) recalled at three separate times over the past year. I've selected a small region covering a homopolymer sequence to make the mapping changes more impressive and easier to see. The reference sequence is shown in the middle (at the 0 line), with changes shown above and below the sequence.
                I think it would be a good idea to declare any conflict of interest when praising a platform. Are you involved in the MinION Analysis and Reference Consortium (MARC)?



                • #9
                  Originally posted by nucacidhunter
                  I think it would be a good idea to declare conflict of interest when praising a platform. Are you involved in MinION Analysis and Reference Consortium (MARC)?
                  Yes, and I've also been part of the MAP since the start, and have mentioned my involvement with MAP previously on SEQanswers. It's silly to repeat that every time I talk about the MinION, because everyone who has access to a MinION sequencer has received some amount of shipping-cost-only flow cells and reagents from Oxford Nanopore.

                  The only way you're going to find an interest-free analysis is if someone from outside MAP takes some of the publicly available data and does their own analysis on that. Based on how much feedback I've got on the mitochondrial data I released last year (i.e. none), don't get your hopes up on that.

                  It's also currently impossible to re-call event data without having access to Metrichor, so unless someone from outside MAP writes their own base caller everyone is stuck with what ONT throws at them.

                  Perhaps our MARC paper will change that, because it's a bit more public and has a lot more pre-analysed and mapped data for other people to look at.
                  Last edited by gringer; 10-17-2015, 04:23 AM.



                  • #10
                    Originally posted by gringer
                    It should be an 'A' insertion, but yes, it's real. We were sequencing 4T1 cancer cells, which have a few variants different from the reference sequence. You can see the paper for more details:



                    ResearchGate link if you don't have direct access to the paper through Cell:

                    https://www.researchgate.net/publication/270582858
                    Does it make sense to use long read technology to study somatic mutations?

                    I think the Illumina and X10 combo should work better, because I have yet to encounter a somatic repeat that can take advantage of true long-read technology.



                    • #11
                      Originally posted by ymc
                      Does it make sense to use long read technology to study somatic mutations?
                      Yes, because we were able to do a whole-mitochondria run on two amplified 8kb fragments of mitochondrial DNA for about $100 (approximate cost of non-ONT reagents and shipping-cost-only flow cells). Illumina is overkill for mitochondrial sequencing, so it makes sense to use something cheaper when available. Even without barcoding, we can get at least 4 mitochondrial runs done on the MinION by using wash buffer between runs and running for 1-4 hours.

                      Originally posted by ymc
                      I think the Illumina and X10 combo should work better, because I have yet to encounter a somatic repeat that can take advantage of true long-read technology.
                      The MinION does a reasonable job with SNPs and small INDELs. It's just not (yet) great for long homopolymers as demonstrated here. I found a few other mitochondrial SNPs that did work well with the MinION, and were supported by IonTorrent sequencing.



                      • #12
                        Originally posted by gringer
                        Yes, and I've also been part of the MAP since the start, and have mentioned my involvement with MAP previously on SEQanswers. It's silly to repeat that every time I talk about the MinION, because everyone who has access to a MinION sequencer has received some amount of shipping-cost-only flow cells and reagents from Oxford Nanopore.

                        The only way you're going to find an interest-free analysis is if someone from outside MAP takes some of the publicly available data and does their own analysis on that. Based on how much feedback I've got on the mitochondrial data I released last year (i.e. none), don't get your hopes up on that.

                        It's also currently impossible to re-call event data without having access to Metrichor, so unless someone from outside MAP writes their own base caller everyone is stuck with what ONT throws at them.

                        Perhaps our MARC paper will change that, because it's a bit more public and has a lot more pre-analysed and mapped data for other people to look at.
                        Only a subset of MAP participants and ONT paid consultants are involved with MARC and I think this makes it different from ordinary MAPers.



                        • #13
                          Originally posted by nucacidhunter
                          Only a subset of MAP participants and ONT paid consultants are involved with MARC and I think this makes it different from ordinary MAPers.
                          This comment suggests that MARC is some exclusive club, but it's not. Anyone can be part of MARC, even those outside MAP. There's no fee to pay, and no one cares if members don't say anything on the mailing lists or check in during the meetings.

                          MARC is no different from the rest of MAP in that ONT will give free-excluding-shipping flow cells to anyone who wants to try out a big experiment and publish a paper or present at a meeting. There is some collective bargaining advantage, but we're still all waiting for flow cells to arrive, and are stuck behind the queue of commercial customers just like everyone else in MAP. Any people who pay $1000 (or $500 in bulk) for each flow cell will get faster access to ONT services than anyone in MARC.

                          Anyone inside MAP can see the results that MARC is producing (they're on the MAP wiki), and (when I've got a bit of spare time to write) can also see the minutes of the teleconferences that we have.

                          If anyone else wants to join, just let Ewan Birney know (birney at ebi.ac.uk), and he can add another email address to the mailing list.



                          • #14
                            Thanks for providing more info on MARC.



                            • #15
                              "behind the queue of commercial customers" - What does this mean? Does it mean if you pay (how much?), then you can get a box really quick? Can you elaborate? Thanks

