Hi,
I'm new to bioinformatics and doing some experimental work in machine learning. I'm looking for sample variant data with the following characteristics:
- human
- ideally vcf format
- samples tagged with phenotypes or marked as case/control for a given phenotype. Alternatively, pairs of cancer/normal samples would probably do
- the more samples, the better
If there are any publicly accessible datasets with these characteristics, pointers to them would be super helpful. I've looked at 1000G, but its samples all seem to be apparently normal, and at ClinVar, but it seems to collect data about variants rather than individuals. I'm really looking for sets of samples where some are affected and some are not, for a given feature.
Thank you!!