  • EvilTwin
    End User
    • Jul 2011
    • 8

    Pre-Capture Pooling with Nimblegen SeqCap EZ v3: SNP detection quality?

    Dear all,

    Does anyone already have experience with the new Nimblegen SeqCap EZ v3 targeted exome enrichment kit, specifically with pre-capture pooling?
    The kit explicitly supports pre-capture pooling of (barcoded) samples followed by a pooled targeted enrichment.
    (In a test at my institution, the v3 kit itself performed satisfactorily in terms of on-target rate and target coverage.)

    Now I am especially looking for information concerning the specificity of SNP-calls when performing such a pre-capture pooling approach.

    The enrichment involves some PCR cycles when the samples are already pooled (as far as I gathered), and I wondered if or to what degree cross-hybridization between the captured fragments of different samples occurs or can occur at this step. However, my knowledge (and so far also my understanding) about this technology is limited.

    If such cross-hybridization occurs, wouldn’t the specificity of the SNP calls become much worse compared to a single sample enrichment? I would expect that there is a considerable fraction of reads from a specific sample (assigned via the sample-specific barcodes) which bear SNPs or InDels stemming from cross-hybridization events with the fragments from another sample.

    Or do I maybe misunderstand something in general, and cross-hybridization cannot happen? Or can those events easily be identified and filtered? Or does that only play a very minor role and can be neglected?

    I asked Nimblegen about it... They seem to have a hard time finding someone in the company who can provide any information on that topic (I have been asking repeatedly and waiting for weeks now). I also pestered the competitors at Agilent, but they just stated (of course) that SNP detection specificity with the Nimblegen pre-capture pooling enrichment is worse than with their own enrichment using the Human All Exon 50 MB or v4 kits, and that they would advise against pre-capture pooling with their kits. They did not provide any arguments or data.

    I studied the publications comparing in-solution targeted exome enrichment kits (see this thread: http://seqanswers.com/forums/showthread.php?t=14617), and they largely agree that, for SNP calling, the Nimblegen capture probe design (DNA probes, shorter, but many) is in the end slightly more efficient than Agilent’s design (RNA probes and longer, meaning higher binding specificity, but fewer of them). (Agilent won on overall detection counts because of its larger target region compared to the older Nimblegen kits.)
    Although the older Nimblegen version (v2) apparently also supports pre-capture pooling, the two studies that compared that kit with Agilent’s SureSelect Human All Exon 50 MB both used single-sample enrichment as far as I can tell.

    I should mention that I am comparatively new to NGS, and I am an end user, but I have tried to read up on the field as well as possible during the past few months (btw, SeqAnswers was a great help). I am not directly involved in the practical steps of enrichment and sequencing, which are done by our NGS core facility. However, they cannot answer the pre-capture pooling question either.

    I thoroughly searched for available information on the topic on the web and on SeqAnswers, but I couldn’t find any. If I missed or completely misunderstood something, I would be glad if you could point me towards it.

    Any information or opinion is very welcome!

    Thanks a lot!
  • Heisman
    Senior Member
    • Dec 2010
    • 534

    #2
    You might want to read through this paper: http://nar.oxfordjournals.org/content/40/1/e3.abstract

    I have never used Nimblegen so I can't comment specifically regarding it.


    • sehrrot
      Member
      • Jul 2010
      • 58

      #3
      Hi Eviltwin

      Could you share how much on-target and avg coverage you get from SeqCap EZ v3?

      I am about to get results from that kit and I need some kind of comparison with others' data. I previously did SeqCap EZ V2 exome with post-multiplexing and got around 70% on-target and 100X coverage. Thus, I am expecting that V3 should end up above 70% on-target rate and at about 70X coverage (64 Mb vs. 40 Mb target regions).
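
      A back-of-the-envelope check of the coverage expectation above, as a minimal sketch. It assumes that sequencing yield and on-target fraction stay the same, so the same number of on-target bases is simply spread over a larger target. The function name is made up for illustration; the 100X, 40 Mb and 64 Mb figures are the ones quoted in this post.

```python
def scaled_mean_coverage(old_coverage, old_target_mb, new_target_mb):
    """Mean coverage if the same on-target yield is spread over a new target size.

    Assumption: identical sequencing yield and on-target fraction for both kits.
    """
    return old_coverage * old_target_mb / new_target_mb


# 100X on a 40 Mb target, moved to a 64 Mb target
expected = scaled_mean_coverage(100, 40, 64)
print(f"expected mean coverage: {expected:.1f}X")
```

      Under these assumptions the estimate lands a little below 70X; real runs will differ with yield, duplication and capture efficiency.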


      • EvilTwin
        End User
        • Jul 2011
        • 8

        #4
        Hi sehrrot,

        how exactly do you calculate your on-target and coverage data? Maybe I can learn something :-)

        We tried the SeqCap EZ 3.0 and got ~70x average coverage per exome when pooling 4 samples on an Illumina HiSeq 2000 lane (GATK DoC Walker), with ~60% of bases strictly on-target (Picard CollectHSMetrics with the supplied "capture.bed" as target/bait file).

        I had a talk with some Roche/Nimblegen guys a while ago, and they stated that Nimblegen calculates its on-target values by counting anything within ±150 bp of the target regions as on-target, which of course increases that value. Also, the primary focus in developing SeqCap V3 was to increase the target region size, while it is allegedly very hard to increase the on-target efficiency. So I would not expect much difference to V2 in that respect.
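
        To illustrate how much the ±150 bp definition can change the number, here is a toy sketch on a single chromosome. It is not how Picard or Nimblegen compute their metrics; the interval coordinates and function names are invented for illustration, and intervals are half-open (start, end).

```python
def merge_padded(targets, pad):
    """Pad each (start, end) interval by `pad` bp on both sides and merge overlaps."""
    padded = sorted((max(0, s - pad), e + pad) for s, e in targets)
    merged = []
    for s, e in padded:
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def on_target_fraction(reads, targets, pad=0):
    """Fraction of aligned bases that fall inside the (optionally padded) targets."""
    regions = merge_padded(targets, pad)
    total = sum(e - s for s, e in reads)
    hit = sum(
        max(0, min(e, r_end) - max(s, r_start))
        for s, e in reads
        for r_start, r_end in regions
    )
    return hit / total if total else 0.0


# Hypothetical data: one 200 bp target and four 100 bp aligned reads
targets = [(1000, 1200)]
reads = [(950, 1050), (1100, 1200), (1250, 1350), (2000, 2100)]

strict = on_target_fraction(reads, targets)            # strictly on-target
padded = on_target_fraction(reads, targets, pad=150)   # within 150 bp counts too
print(strict, padded)
```

        In this made-up example the padded definition doubles the on-target fraction, so values are only comparable when the same definition is used.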


        • EvilTwin
          End User
          • Jul 2011
          • 8

          #5
          @ Heisman,

          I just realized I never thanked you for the hint on that paper, but still… thanks a lot!

          As I understand it, our sequencing core facility was already using a (or at least some form of) double-indexing method at that time, so erroneous read assignment should be reduced.

          As for the pre-capture pooling, there doesn’t seem to be any artificial variant enrichment in the capture pools (e.g. if one sample carries a rare variant, it stays specific to that sample and doesn't show up in the others).


          • sehrrot
            Member
            • Jul 2010
            • 58

            #6
            Hi EvilTwin

            I just got my NimbleGen v3 exome data from the sequencer. But I was shocked when I checked the sequence duplication level in FastQC, which is nearly 50%... I will go on with the mapping and check how good the sequencing quality is.


            • EvilTwin
              End User
              • Jul 2011
              • 8

              #7
              Hi sehrrot,

              that is strange, we typically got 5-10 % (MarkDuplicates in Picard Tools). One run was exceptional with 25 % duplication, but as I understand there was some technical problem...


              • sehrrot
                Member
                • Jul 2010
                • 58

                #8
                Hi EvilTwin

                I think so. I am still waiting for my pipeline to report the on-target rate and coverage. The duplicate level from Picard is around 20-25%, which is lower than the FastQC value but higher than what I've previously seen with the NimbleGen exome V2 (which was around 5-8%). I ran the NimbleGen V3 exome with pre-multiplexing, as I did for V2 (performance was actually the same between pre-multiplexing and post-multiplexing; I've tested this with NimbleGen and Illumina and also compared with Agilent post-multiplexing), and got nice results. I am still not sure why the duplicate level is so high...

                Anyway thanks for your reply.


                • EvilTwin
                  End User
                  • Jul 2011
                  • 8

                  #9
                  Hi sehrrot,

                  Initially we also compared Agilent 50MB post-capture multiplexing and NimbleGen V3 pre-capture multiplexing on the same set of samples. There were no dramatic differences in duplication in that run (5 vs 6 %), but NimbleGen did slightly better on on-target rate and per-interval coverage (and naturally on overall coverage, due to the larger target size).
                  I checked some samples with FastQC; it also reports higher duplication levels, around 15-20 % (so perhaps much of it comes from unmapped reads?).

                  I will ask the sequencing core facility what exactly the problem was with the run that showed the exceptionally high level of duplication.


                  • sehrrot
                    Member
                    • Jul 2010
                    • 58

                    #10
                    Hi EvilTwin

                    Thanks for sharing. I am guessing my sample had a problem in sample prep or capture efficiency. Otherwise, it might be a problem with my HiSeq, as I've seen dramatic quality drops after cycle 80 and a subsequent loss of clusters in read 2.

                    Anyway, apart from that, I think I found out why the duplicate level is higher in FastQC than in Picard. FastQC looks for sequences that are identical to others, but these could come from enriched fragments and need not all be true duplicates. Picard, by contrast, uses the paired-end information: it calls two read pairs duplicates only if they also have the same start positions for both mates.
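
                    The difference can be sketched with a toy example. This is a strong simplification and not the actual FastQC or Picard algorithm (Picard MarkDuplicates works on the alignments, using 5' positions and orientation); the read pairs, the key choice and the function name are made up for illustration.

```python
from collections import Counter


def duplicate_fraction(keys):
    """Fraction of items that repeat an already-seen key (first copy is kept)."""
    keys = list(keys)
    if not keys:
        return 0.0
    counts = Counter(keys)
    return sum(c - 1 for c in counts.values()) / len(keys)


# Hypothetical read pairs: (read-1 sequence, read-1 start, read-2 start)
pairs = [
    ("ACGTACGT", 100, 300),
    ("ACGTACGT", 100, 300),  # same sequence, same start for both mates
    ("ACGTACGT", 100, 340),  # same read-1 sequence, but a different fragment end
    ("ACGTACGT", 100, 365),
    ("TTGACCAG", 500, 720),
]

# Sequence-only view (FastQC-like): identical read-1 sequences count as duplicates
by_sequence = duplicate_fraction(seq for seq, _, _ in pairs)

# Position-based view (Picard-like): both mate start positions must match
by_position = duplicate_fraction((p1, p2) for _, p1, p2 in pairs)

print(by_sequence, by_position)
```

                    In a capture experiment many distinct fragments start at the same place in a heavily covered target, so the sequence-only number comes out higher than the position-based one.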

