Performance comparison of exome DNA sequencing technologies
Clark MJ, Chen R, Lam HY, Karczewski KJ, Chen R, Euskirchen G, Butte AJ, Snyder M.
Nat Biotechnol. 2011 Sep 25. doi: 10.1038/nbt.1975. [Epub ahead of print]
Abstract
Whole exome sequencing by high-throughput sequencing of target-enriched genomic DNA (exome-seq) has become common in basic and translational research as a means of interrogating the interpretable part of the human genome at relatively low cost. We present a comparison of three major commercial exome sequencing platforms from Agilent, Illumina and Nimblegen applied to the same human blood sample. Our results suggest that the Nimblegen platform, which is the only one to use high-density overlapping baits, covers fewer genomic regions than the other platforms but requires the least amount of sequencing to sensitively detect small variants. Agilent and Illumina are able to detect a greater total number of variants with additional sequencing. Illumina captures untranslated regions, which are not targeted by the Nimblegen and Agilent platforms. We also compare exome sequencing and whole genome sequencing (WGS) of the same sample, demonstrating that exome sequencing can detect additional small variants missed by WGS.
--------------------------------------------------------
A comparative analysis of exome capture.
Parla JS, Iossifov I, Grabill I, Spector MS, Kramer M, McCombie WR.
Genome Biol. 2011 Sep 29;12(9):R97. [Epub ahead of print]
Abstract
Human exome resequencing using commercial target capture kits has been and is being used for sequencing large numbers of individuals to search for variants associated with various human diseases. We rigorously evaluated the capabilities of two solution exome capture kits. These analyses help clarify the strengths and limitations of those data as well as systematically identify variables that should be considered in the use of those data.
--------------------------------------------------------
Comparison of solution-based exome capture methods for next generation sequencing.
Sulonen AM, Ellonen P, Almusa H, Lepisto M, Eldfors S, Hannula S, Miettinen T, Tyynismaa H, Salo P, Heckman C, Joensuu H, Raivio T, Suomalainen A, Saarela J.
Genome Biol. 2011 Sep 28;12(9):R94. [Epub ahead of print]
Abstract
Background
Techniques enabling targeted re-sequencing of the protein coding sequences of the human genome on next generation sequencing instruments are of great interest. We conducted a systematic comparison of the solution-based exome capture kits provided by Agilent and Roche NimbleGen. A control DNA sample was captured with all four capture methods and prepared for Illumina GAII sequencing. Sequence data from additional samples prepared with the same protocols were also used in the comparison.
Results
We developed a bioinformatics pipeline for quality control, short read alignment, variant identification and annotation of the sequence data. In our analysis, a larger percentage of the high quality reads from the NimbleGen captures than from the Agilent captures aligned to the capture target regions. High GC-content of the target sequence was associated with poor capture success in all exome enrichment methods. Comparison of mean allele balances for heterozygous variants indicated a tendency to have more reference bases than variant bases in the heterozygous variant positions within the target regions in all methods. There was virtually no difference in the genotype concordance compared to genotypes derived from SNP arrays. A minimum of 11x coverage was required to make a heterozygote genotype call with 99% accuracy when compared to common SNPs on GWA arrays.
Conclusions
Libraries captured with NimbleGen kits aligned more accurately to the target regions. The updated NimbleGen kit most efficiently covered the exome with a minimum coverage of 20x, yet none of the kits captured all the Consensus Coding Sequence annotated exons.
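To give a feel for why a minimum-depth threshold such as the 11x figure in the abstract above arises at all, here is a minimal sketch under a simple binomial read-sampling model. It is not the pipeline used in the paper, which measured accuracy empirically against SNP array genotypes. The function names, the rule of requiring at least two variant-supporting reads (`min_alt_reads=2`), and the `alt_fraction` parameter are all illustrative assumptions.

```python
from math import comb


def het_detection_prob(depth, min_alt_reads=2, alt_fraction=0.5):
    """Probability of seeing at least `min_alt_reads` variant reads at a
    heterozygous site covered by `depth` reads, assuming each read
    independently carries the variant allele with probability `alt_fraction`.
    (Toy model: ignores sequencing error, mapping bias and caller details.)
    """
    # Probability of seeing fewer than min_alt_reads variant reads.
    p_miss = sum(
        comb(depth, k) * alt_fraction**k * (1 - alt_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_miss


def min_depth_for(target, min_alt_reads=2, alt_fraction=0.5, max_depth=1000):
    """Smallest depth at which het_detection_prob reaches `target`."""
    for depth in range(min_alt_reads, max_depth + 1):
        if het_detection_prob(depth, min_alt_reads, alt_fraction) >= target:
            return depth
    raise ValueError("target not reached within max_depth")


if __name__ == "__main__":
    print(min_depth_for(0.99))  # → 11 under these assumed parameters
```

With these particular assumed parameters the threshold lands at 11 reads, but that depends entirely on the chosen rule and should not be read as a reproduction of the paper's result. Setting `alt_fraction` below 0.5, to mimic the reference allele bias the abstract mentions, raises the required depth.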
--------------------------------------------------------
Comprehensive comparison of three commercial human whole-exome capture platforms.
Asan NF, Xu Y, Jiang H, Tyler-Smith C, Xue Y, Jiang T, Wang J, Wu M, Liu X, Tian G, Wang J, Wang J, Yang H, Zhang X.
Genome Biol. 2011 Sep 28;12(9):R95. [Epub ahead of print]
Abstract
BACKGROUND:
Exome sequencing, which allows the global analyses of protein coding sequences in the human genome, has become an effective and affordable approach to detecting causative genetic mutations in diseases. Currently, there are several commercial human exome capture platforms; however, the relative performances of these have not been characterized sufficiently to know which is best for a particular study.
RESULTS:
We comprehensively compared three platforms: NimbleGen's Sequence Capture Array and SeqCap EZ, and Agilent's SureSelect. We assessed their performance in a variety of ways, including number of genes covered and capture efficacy. Differences that may impact on the choice of platform were that Agilent SureSelect covered approximately 1,100 more genes, while NimbleGen provided better flanking sequence capture. Although all three platforms achieved similar capture specificity of targeted regions, the NimbleGen platforms showed better uniformity of coverage and greater genotype sensitivity at 30- to 100-fold sequencing depth. All three platforms showed similar power in exome SNP calling, including medically-relevant SNPs. Compared with genotyping and whole-genome sequencing data, the three platforms achieved a similar accuracy of genotype assignment and SNP detection. Importantly, all three platforms showed similar levels of reproducibility, GC bias and reference allele bias.
CONCLUSIONS:
We demonstrated key differences between the three platforms, particularly advantages of solutions over array capture and the importance of a large gene target set.
--------------------------------------------------------
There are also some other interesting exome-seq articles in this issue of Genome Biology, including a fun Q&A session with experts and a review covering the analysis of exome-seq data. I am also working on a series of blog posts about exome-seq, which I kicked off with today's post responding to the same questions posed in the Q&A. I intend to follow up with a discussion of all four of these exome-seq platform comparison papers, and finally I'd like to share my analytical approach in more detail. I can keep the forum posted if there's interest in those.
I think the exome-seq Genome Biology issue was a reasonable success. That said, a call for submissions in January, with submissions due by April for publication in September, is frankly too slow for our field. Our paper in Nature Biotech is the only one of these that looked at the Illumina TruSeq exome capture system, and I have little doubt that's related to the timing--I finished our analysis of everything in early August, well after Genome Biology's cut-off date.