Hi there,
I'm a newbie with sequencing, and I'm currently thinking about new ideas for a grant application. I work on non-model organisms with smallish genomes, and I'm interested in the causes and consequences of DNA methylation.
Existing work on DNA methylation in my field has used whole genome bisulphite sequencing, which is expensive to do at high coverage, so past work often has zero biological replication. Alternatively, people have used poorly repeatable, low-resolution methods, like methylation-sensitive AFLP or densitometry, to get at the general 'vibe' of methylation in a species.
I had an idea involving ddRAD that I think might be great, though I may be missing something crucial (apologies for any serious newbie mistakes). I'd be keen to hear your thoughts on the idea, and suggestions regarding improvements and specifics. Here's the rough protocol:
1. Double digest the genomic DNA from one individual.
2. Add methylated adapters then do size selection.
3. Treat the sample with bisulphite (BiS): the unmethylated Cs in the genomic DNA of each RAD marker will be converted, but the Cs in the methylated adapters will not. Alternatively, use adapters where it doesn't matter if the Cs become Ts in some of the adapter sequences (e.g. they will still bind to the flowcell).
4. Sequence with e.g. HiSeq. Aim for higher coverage than usual in ddRAD work since we are looking for methylation, and so we need consistent high coverage across individuals to make good inferences.
5a. Idea 1: Sequence some of the DNA from a handful of individuals without BiS conversion. Use this as a reference to identify methylation: where the reference has a C, a C retained in the BiS-treated sample indicates methylation, whereas a T indicates an unmethylated C that was converted.
5b. Idea 2: Sequence every sample twice: half the library is BiS-treated and half is non-BiS. Then for each sample, use the non-BiS sample as a reference. Less room for error than idea 1?
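To make step 5 a bit more concrete, here is a rough Python sketch of the comparison I have in mind. It is only an illustration under simplifying assumptions: the reads are already aligned to the locus, are the same length as the reference, and all come from the forward strand (reverse-strand reads would show G-to-A changes instead). The function name `methylation_levels` and the toy sequences are made up for this example.

```python
from collections import Counter


def methylation_levels(reference, bs_reads):
    """Estimate methylation at each reference C of one RAD locus.

    reference: unconverted (non-BiS) sequence of the locus, forward strand.
    bs_reads:  BiS-treated reads, assumed pre-aligned, same length and
               strand as the reference (a simplification for illustration).
    Returns {position: (methylated_fraction, coverage)}.
    """
    levels = {}
    for pos, ref_base in enumerate(reference):
        if ref_base != "C":
            continue  # only cytosines are informative on this strand
        counts = Counter(read[pos] for read in bs_reads)
        retained_c = counts["C"]   # C survived conversion -> methylated
        converted_t = counts["T"]  # C converted to T -> unmethylated
        coverage = retained_c + converted_t
        if coverage == 0:
            continue  # no usable reads at this site
        levels[pos] = (retained_c / coverage, coverage)
    return levels


# Toy data: one locus, four BiS-treated reads (hypothetical).
reference = "ACGTCCGA"
bs_reads = ["ACGTTTGA", "ACGTTCGA", "ATGTTCGA", "ATGTTCGA"]

for pos, (fraction, coverage) in methylation_levels(reference, bs_reads).items():
    print(f"position {pos}: {fraction:.2f} methylated at {coverage}x coverage")
```

One thing this makes obvious: a genuine C/T SNP in the sample looks exactly like an unmethylated C if the reference comes from a different individual, which is part of why idea 2 (a non-BiS reference for every sample) seems safer to me.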
The outcome is hopefully:
1. Identification of methylated loci in randomly distributed bits of the genome for lots of individuals for minimal cost
2. Identification of methylation marks that vary between individuals
3. Simultaneous gathering of SNP and methylation data on the same set of individuals
4. Room for fun analyses: build a phylogenetic tree based on methylome similarity (compare populations/species/ecotypes/treatment groups); ask whether genetically variable regions are more diverse in methylation status, and whether they have more methylation; get a very precise estimate of the proportion of methylated loci in the genome in lots of species (likely more accurate than existing methods for doing this, e.g. HPLC, densitometry)
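For the methylome-similarity idea, one simple starting point might be a pairwise distance between individuals based on per-site methylation fractions. The sketch below is only one possible choice (mean absolute difference over the sites two individuals share), and the sample names, site labels and values are invented for illustration.

```python
from itertools import combinations


def methylation_distance(a, b, min_shared=1):
    """Mean absolute difference in methylation fraction over shared sites.

    a, b: dicts mapping a site label to a methylation fraction in [0, 1].
    Returns None if fewer than min_shared sites are covered in both.
    """
    shared = [site for site in a if site in b]
    if len(shared) < min_shared:
        return None
    return sum(abs(a[site] - b[site]) for site in shared) / len(shared)


def distance_matrix(samples):
    """Pairwise distances for every pair of individuals, keyed by name pair."""
    return {
        (x, y): methylation_distance(samples[x], samples[y])
        for x, y in combinations(sorted(samples), 2)
    }


# Hypothetical per-site methylation fractions for three individuals.
samples = {
    "ind1": {"locus1:12": 1.0, "locus1:40": 0.0, "locus2:7": 0.5},
    "ind2": {"locus1:12": 1.0, "locus1:40": 0.5, "locus2:7": 0.5},
    "ind3": {"locus1:12": 0.0, "locus1:40": 1.0},
}

for pair, dist in distance_matrix(samples).items():
    print(pair, dist)
```

A matrix like this could then be passed to whatever clustering or tree-building method seems appropriate; restricting the comparison to shared sites is my assumption for coping with the uneven locus coverage typical of ddRAD data.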
Thanks for reading!
Luke