Hi All
Just a quick note to say that our software and statistics for de novo assembly of variants from individuals and from populations are now published in Nature Genetics
"De novo assembly and genotyping of variants using colored de Bruijn graphs",
Iqbal, Caccamo, Turner, Flicek, McVean (doi:10.1038/ng.1028)
This link will work for a bit
Software is available here
which includes a link for how to join our mailing list.
Some highlights for you
1. Cortex allows you to do de novo assembly of multiple samples simultaneously in order to call variants WITHOUT having to build or use a consensus reference. It's a very memory-efficient de Bruijn assembler: for example, if you're assembling bacteria you can assemble thousands simultaneously on a standard 32 GB server, and if you're doing humans you can do 10 on a server with 256 GB of RAM.
2. The paper gives a bunch of example cases.
- Variant calling in a high-coverage human, comparing the results with 1000 Genomes calls by validating both sets against fully sequenced fosmids
- Variant calling in a population of chimps without using the reference at all
- HLA typing
- Assembling 164 humans from the 1000 Genomes project into a 4-colour graph (Europe, Africa, Asia, and the reference), then pulling out novel sequence and estimating its population frequency
3. We provide a mathematical model (validated with simulations and on human and chimp data) that allows you to predict discovery power given experimental parameters (read length, depth), informatic parameters (kmer size), biological parameters (repeat content), and the length of the variant you want to call. This allows you to design your experiment based on your goals (e.g. high-sensitivity SNP calling and high-specificity SV calling would have different designs).
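For anyone new to the idea of a coloured de Bruijn graph, here is a toy Python sketch of the concept. It is not Cortex code and says nothing about how Cortex stores its graph; the function names and the two made-up sequences are purely illustrative. Each kmer is tagged with the set of samples (colours) it was seen in, so a variant shows up as a run of kmers private to one colour between kmers shared by all of them. The sketch ignores reverse complements, sequencing errors and coverage, all of which a real assembler has to deal with.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every kmer of length k in seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_coloured_graph(samples, k):
    """Map each kmer to the set of colours (sample names) it occurs in.

    `samples` is a dict of {colour name: list of sequences}. This is a
    hypothetical toy structure, not the Cortex data format.
    """
    graph = defaultdict(set)
    for colour, sequences in samples.items():
        for seq in sequences:
            for kmer in kmers(seq, k):
                graph[kmer].add(colour)
    return graph


def private_kmers(graph, colour):
    """Return the sorted kmers that occur in exactly one colour."""
    return sorted(kmer for kmer, colours in graph.items() if colours == {colour})


# Two made-up sequences that differ by a single base (A vs T at position 5).
samples = {
    "reference": ["ACGTACGGA"],
    "sample1": ["ACGTTCGGA"],
}
graph = build_coloured_graph(samples, k=4)

# The flanking kmers are shared; the kmers spanning the SNP are private.
print(sorted(graph["ACGT"]))
print(private_kmers(graph, "sample1"))
```

The kmers private to sample1 form one branch of a "bubble" and those private to the reference form the other, which is the kind of structure that can be used to call a variant without aligning reads to a reference.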
Anyway - if you are analysing a species which has no reference or has a bad reference, or if you are analysing a population of bacteria some of which you think are highly diverged from the reference, or if you are interested in getting an unbiased view of variation in a species, you might be interested in giving it a try. I have used it successfully on human, chimp, Plasmodium and S. aureus so far.
Happy New Year and best wishes!
Zam Iqbal