  • Pre-Capture Pooling with Nimblegen SeqCap EZ v3: SNP detection quality?

    Dear all,

    Does anyone already have experience with the new Nimblegen SeqCap EZ v3 targeted exome enrichment kit, specifically with pre-capture pooling?
    The kit explicitly supports pre-capture pooling of barcoded samples followed by a single pooled targeted enrichment.
    (In a test at my institution, the v3 kit itself performed satisfactorily in terms of on-target rate and target coverage.)

    Now I am especially looking for information concerning the specificity of SNP-calls when performing such a pre-capture pooling approach.

    As far as I gathered, the enrichment involves some PCR cycles after the samples have already been pooled, and I wondered whether, or to what degree, cross-hybridization between the captured fragments of different samples can occur at this step. However, my knowledge (and so far also my understanding) of this technology is limited.

    If such cross-hybridization occurs, wouldn’t the specificity of the SNP calls become much worse compared to a single sample enrichment? I would expect that there is a considerable fraction of reads from a specific sample (assigned via the sample-specific barcodes) which bear SNPs or InDels stemming from cross-hybridization events with the fragments from another sample.

    Or do I maybe misunderstand something in general, and cross-hybridization cannot happen? Or can those events easily be identified and filtered? Or does that only play a very minor role and can be neglected?

    I asked Nimblegen about it... They seem to have a hard time finding someone in the company who can provide any information on that topic (I have been asking repeatedly and waiting for weeks now). I also hounded the competitors at Agilent, but they just stated (of course) that the SNP detection specificity with the Nimblegen pre-capture pooling enrichment is worse than with their own enrichment using the Human All Exon 50 MB or v4 kits, and that they would advise against pre-capture pooling with their kits. They did not provide any arguments or data.

    I studied the publications on in-solution targeted exome enrichment kit comparisons (see this thread: http://seqanswers.com/forums/showthread.php?t=14617), and they largely agree that the Nimblegen capture probe design (DNA probes, shorter, but many) is in the end slightly more efficient than Agilent’s design (RNA probes and longer, meaning higher binding specificity, but fewer of them) for SNP calling (Agilent won for the overall detection counts because of the larger target region compared to the older Nimblegen kits).
    Although the older Nimblegen version (v2) apparently also supports pre-capture pooling, the two studies that compared that kit with Agilent’s SureSelect Human All Exon 50 MB both used single-sample enrichment as far as I can tell.

    I should mention that I am comparatively new to NGS and I am an end user, but I have tried to read up on the field as well as possible over the past few months (btw, SeqAnswers was a great help). I am not directly involved in the practical steps of enrichment and sequencing, which are done by our NGS core facility. However, they cannot answer the pre-capture pooling question either.

    I searched thoroughly for available information on the topic on the web and on SeqAnswers, but I couldn’t find any. If I missed or completely misunderstood something, I would be glad if you could point me towards it.

    Any information or opinion is very welcome!

    Thanks a lot!

  • #2
    You might want to read through this paper: http://nar.oxfordjournals.org/content/40/1/e3.abstract

    I have never used Nimblegen so I can't comment specifically regarding it.


    • #3
      Hi Eviltwin

      Could you share how much on-target and avg coverage you get from SeqCap EZ v3?

      I am about to get results from that kit and need some comparison with others' data. I previously ran SeqCap EZ V2 exomes with post-capture multiplexing and got around 70% on-target and 100X coverage. I am therefore expecting V3 to end up higher than a 70% on-target rate and 70X coverage (64 Mb vs. 40 Mb target regions).
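The coverage expectation above can be checked with a little arithmetic. This is only a rough sketch: it assumes the same number of sequenced bases per sample on both kits, and the function name `expected_mean_coverage` and its parameters are made up for illustration.

```python
def expected_mean_coverage(ref_coverage, ref_target_mb, new_target_mb,
                           ref_on_target=1.0, new_on_target=1.0):
    """Scale a mean coverage from one target size to another.

    Assumption: the total number of sequenced bases per sample is unchanged,
    so the on-target bases change only with the ratio of on-target rates.
    """
    # Coverage times target size is proportional to the on-target bases.
    on_target_bases = ref_coverage * ref_target_mb
    scaled = on_target_bases * (new_on_target / ref_on_target)
    return scaled / new_target_mb


# 100X on a 40 Mb target, moved to a 64 Mb target at equal on-target rate
print(expected_mean_coverage(100, 40, 64))  # 62.5

# Same, but with the on-target rate dropping from 70% to 60%
print(round(expected_mean_coverage(100, 40, 64, 0.70, 0.60), 1))  # 53.6
```

Under these assumptions, the same yield gives roughly 62X on the larger target, so reaching 70X or more would need somewhat more sequence per sample or a better on-target rate.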


      • #4
        Hi sehrrot,

        how exactly do you calculate your on-target and coverage data? Maybe I can learn something :-)

        We tried the SeqCap EZ 3.0 and got ~ 70x average coverage per Exome when pooling 4 samples on an Illumina HiSeq 2000 lane (GATK DoC Walker), with ~ 60 % bases strictly on-target (Picard CollectHSMetrics with the supplied "capture.bed" as target/bait file).

        I talked with some Roche/Nimblegen guys a while ago, and they stated that Nimblegen calculates its on-target values by counting anything within ±150 bp of the target regions as on-target, which of course increases that value. Also, the primary focus in developing SeqCap V3 was to increase the target region size, while it is allegedly very hard to increase the on-target efficiency. So I would not expect much difference from V2 in that respect.
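To illustrate how much the on-target definition matters, here is a minimal sketch comparing a strict on-target rate with one that pads every target by 150 bp. The intervals and reads are invented toy data on a single chromosome, the helper names are hypothetical, and this is not how Picard or Nimblegen actually compute their metrics.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def pad_intervals(intervals, padding):
    """Extend every interval by `padding` bases on both sides, then merge."""
    return merge_intervals((max(0, s - padding), e + padding)
                           for s, e in intervals)


def on_target_fraction(aligned_blocks, targets):
    """Fraction of aligned bases that overlap the (merged) targets."""
    targets = merge_intervals(targets)
    total = sum(e - s for s, e in aligned_blocks)
    on_target = 0
    for s, e in aligned_blocks:
        for ts, te in targets:
            on_target += max(0, min(e, te) - max(s, ts))
    return on_target / total if total else 0.0


# Toy data: two targets and five 100 bp alignments (half-open coordinates)
targets = [(1000, 1200), (2000, 2300)]
reads = [(1050, 1150), (1150, 1250), (1300, 1400), (1900, 2000), (5000, 5100)]

strict = on_target_fraction(reads, targets)
padded = on_target_fraction(reads, pad_intervals(targets, 150))
print(f"strict: {strict:.0%}, padded by 150 bp: {padded:.0%}")
```

With this toy data the strict rate is 30% and the padded rate is 70%, so on-target numbers from different sources are only comparable if the same definition was used.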


        • #5
          @ Heisman,

          I just realized I never thanked you for the hint on that paper, but still… thanks a lot!

          As I understand, our sequencing core facility had at that time already applied the (or at least some) double-indexing method, so erroneous read assignment should be reduced.

          As for the pre-capture pooling, there doesn’t seem to be artificial variant enrichment in the capture pools (e.g. if one sample bears a rare variation, it stays specific for that sample and doesn't show up in the others).


          • #6
            Hi EvilTwin

            I just got my Nimblegen v3 exome data from the sequencer, but I was shocked when I checked the sequence duplication level in FastQC, which is nearly 50%... I will go on with the mapping and check how good the sequencing quality is.


            • #7
              Hi sehrrot,

              that is strange, we typically got 5-10 % (MarkDuplicates in Picard Tools). One run was exceptional with 25 % duplication, but as I understand there was some technical problem...


              • #8
                Hi EvilTwin

                I think so. I am still waiting for my pipeline to report on-target rate and coverage. The duplicate level from Picard is around 20-25%, which is lower than the FastQC value but higher than what I have previously seen with the NimbleGen exome V2 (which was around 5-8%). I ran the NimbleGen V3 exome with pre-capture multiplexing because I had done the same for V2 and got good results (the performance was actually the same between pre- and post-capture multiplexing; I tested this with NimbleGen and Illumina, and also compared with Agilent post-capture multiplexing). I am still not sure why the duplicate level is so high.

                Anyway thanks for your reply.


                • #9
                  Hi sehrrot,

                  Initially we also compared Agilent 50 MB post-capture multiplexing and NimbleGen V3 pre-capture multiplexing on the same set of samples. There were no dramatic differences in duplication in that run (5 vs 6 %), with NimbleGen doing slightly better on on-target rate and per-interval coverage (and naturally on overall coverage, due to the larger target size).
                  I checked some samples with FastQC, and it also displays higher duplication levels, around 15-20 % (so perhaps much of it comes from unmapped reads?).

                  I will ask the sequencing core facility what exactly the problem was with the run that showed the exceptionally high level of duplication.


                  • #10
                    Hi EvilTwin

                    Thanks for sharing. I am guessing my sample had a problem in sample prep or capture efficiency. Otherwise, it might be a problem with my HiSeq, as I have seen dramatic quality drops after cycle 80 and a subsequent loss of clusters in read 2.

                    Anyway, apart from that, I think I found the reason why the duplicate level is higher in FastQC than in Picard. FastQC looks for reads whose sequence is identical to that of other reads, but such reads can also come from different fragments in highly enriched regions and are not necessarily true duplicates. Picard, in contrast, uses the paired-end alignment information: it calls two read pairs duplicates only if both reads of the pairs start at the same mapped positions.
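The difference between the two ways of counting can be shown with a toy example. This is a simplified sketch, not the actual FastQC or Picard logic: as far as I know, FastQC looks at a subsample of reads and at each read file separately, and Picard MarkDuplicates compares the 5' positions and orientations of both reads of a pair. The `Read` tuple and the five reads are invented for illustration.

```python
from collections import namedtuple

# Hypothetical minimal read record: only the fields needed for this comparison
Read = namedtuple("Read", "name seq chrom pos strand mate_pos")


def duplicate_fraction(reads, key):
    """Fraction of reads whose key was already seen on an earlier read."""
    seen = set()
    duplicates = 0
    for read in reads:
        k = key(read)
        if k in seen:
            duplicates += 1
        else:
            seen.add(k)
    return duplicates / len(reads) if reads else 0.0


def by_sequence(read):
    """FastQC-like view: identical read sequence, no mapping information."""
    return read.seq


def by_position(read):
    """Picard-like view: same mapped start, strand and mate start."""
    return (read.chrom, read.pos, read.strand, read.mate_pos)


reads = [
    Read("r1", "ACGTACGT", "chr1", 100, "+", 300),
    Read("r2", "ACGTACGT", "chr1", 100, "+", 300),    # likely PCR duplicate of r1
    Read("r3", "ACGTACGT", "chr1", 100, "+", 320),    # same read 1, different fragment end
    Read("r4", "TTGACCAT", "chr1", 150, "+", 350),    # unique
    Read("r5", "ACGTACGT", "chr7", 5000, "+", 5200),  # same sequence, other locus
]

seq_dup = duplicate_fraction(reads, by_sequence)
pos_dup = duplicate_fraction(reads, by_position)
print(f"by sequence: {seq_dup:.0%}, by paired position: {pos_dup:.0%}")
```

Here three of five reads count as duplicates by sequence alone (60%), but only one of five by paired position (20%), which is the same direction of difference as between the FastQC and Picard numbers discussed above.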
