Hi all,
I am working with the 1000 Genomes data. As part of the analysis, one has to supply the reference genome file, which 1000 Genomes provides as a FASTA file.
When a VCF file is generated, it does not contain homozygous wild-type (reference) calls; this is by design.
I have a database of variants; some have rs numbers, others do not. However, I do have the chromosome number and coordinates for all my variants, and the sequence as well.
What I want to accomplish is this: if a variant is not found in the VCF file, I want to look up the wild-type allele at that position in the reference FASTA file.
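To make the goal concrete, here is a rough sketch of the lookup I have in mind. It is only an illustration, not a tested pipeline: the function names `load_fasta` and `reference_allele` are made up for this post, and it assumes the coordinates are 1-based (as in VCF) and that the chromosome name matches the first word of the FASTA header line. It also reads the whole file into memory, so it is only practical on a single chromosome or a small test file.

```python
def load_fasta(path):
    """Read a FASTA file into a dict of {sequence name: sequence string}."""
    sequences = {}
    name = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Store the previous record before starting a new one
                if name is not None:
                    sequences[name] = "".join(chunks)
                # Assumption: the first word of the header is the chromosome name
                parts = line[1:].split()
                name = parts[0] if parts else ""
                chunks = []
            else:
                chunks.append(line)
    # Store the last record in the file
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


def reference_allele(sequences, chrom, pos, length=1):
    """Return the reference bases starting at a 1-based position."""
    seq = sequences[chrom]
    if pos < 1 or pos + length - 1 > len(seq):
        raise ValueError("position outside the sequence")
    # Convert the 1-based coordinate to a 0-based Python slice
    return seq[pos - 1 : pos - 1 + length].upper()
```

For a variant missing from the VCF, I would call something like `reference_allele(sequences, "1", 12345)` and record the returned base as the homozygous reference genotype.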
Can anyone suggest a way to accomplish this, or are there any existing tools that can be used for it?
Alternatively, is there a file from 1000 Genomes that lists all the variants that were ever detected?
Thanks in advance.
Ashwin