Syndicated from PubMed RSS Feeds
Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding.
Genome Res. 2009 Jun 22;
Authors: McKernan KJ, Peckham HE, Costa G, McLaughlin S, Tsung E, Fu Y, Clouser C, Dunkan C, Ichikawa J, Lee C, Zhang Z, Sheridan A, Fu H, Ranade S, Dimilanta E, Sokolsky T, Zhang L, Hendrickson C, Li B, Kotler L, Stuart J, Malek J, Manning J, Antipova A, Perez D, Moore M, Hayashibara K, Lyons M, Beaudoin R, Coleman B, Laptewicz M, Sanicandro A, Rhodes M, De La Vega F, Gottimukkala RK, Hyland F, Reese M, Yang S, Bafna V, Bashir A, Macbride A, Aklan C, Kidd JM, Eichler EE, Blanchard AP
We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay. The assay enables a unique form of error correction that improves the raw accuracy of the aligned reads to >99.9%, allowing us to accurately call SNPs with as few as 2 reads per allele. We collected several billion mate-pair reads, yielding ~18x haploid coverage of aligned sequence and close to 300x clone coverage. Over 98% of the reference genome is covered by at least one uniquely placed read, and 99.65% is spanned by at least one uniquely placed mate-paired clone. We identify over 3.8 million SNPs, 19% of which are novel. Mate-paired data is used to physically resolve haplotype phases of nearly 2/3 of the genotypes obtained and to produce phased segments of up to 210 Kb. We detect 226,529 intra-read indels, 5,590 indels between mate-paired reads, 91 inversions, and 4 gene fusions. We use a novel approach for detecting indels between mate-paired reads that are smaller than the standard deviation of the insert size of the library, and discover deletions in common with those detected with our intra-read approach. Dozens of previously described disease susceptibility mutations and thousands of novel potentially functional variants, both single-nucleotide and structural, are identified in this individual, which suggests that the human genome can tolerate a higher than expected load of deleterious variants. There is more genetic variation in the human genome still to be uncovered, and we provide guidance for future surveys in populations and cancer biopsies.
PMID: 19546169 [PubMed - as supplied by publisher]