Hi folks,
I know this question has already been asked here (as in http://seqanswers.com/forums/showthr...hlight=gff+vcf).
But my problem is that my data are from plants and are massive: I ran GATK SNP calling for ALL sites, including indels, and the final VCF file is 49 GB.
My aim is to convert it to GFF, and to include in this GFF the large deletions (regions missing in the queries, either because the sequence does not exist or for technical reasons) as well as the small SNPs and indels. Moreover, I would like to keep the RG information, as I have multiple samples in the VCF.
I wonder if someone has already started working on something like that?
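For illustration, the kind of conversion described above could be sketched as a line-by-line stream in Python, so that the whole file never has to sit in memory. This is only a rough sketch under assumptions, not an existing tool: the function name vcf_to_gff3, the "GATK" source label, and the gt_<sample> attribute names are made up for the example, the feature types SNV and indel are picked by allele length only, and symbolic or "*" alleles and the merging of missing sites into large deletion regions are not handled.

```python
def vcf_to_gff3(lines):
    """Yield GFF3 lines for the variant records of a VCF, read line by line.

    Hypothetical helper: sites with ALT "." (invariant sites in an
    all-sites VCF) are skipped, and each sample's GT is kept as a
    gt_<sample> attribute (an assumed naming convention).
    """
    samples = []
    yield "##gff-version 3"
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos, _id, ref, alt, qual = fields[:6]
        if alt == ".":
            continue  # invariant site, nothing to report
        alts = alt.split(",")
        # Assumption: length-1 REF and ALT alleles mean a SNV, anything else an indel.
        is_snv = len(ref) == 1 and all(len(a) == 1 for a in alts)
        ftype = "SNV" if is_snv else "indel"
        start = int(pos)
        end = start + len(ref) - 1  # GFF3 coordinates are 1-based, inclusive
        attrs = ["Reference_seq=" + ref, "Variant_seq=" + ",".join(alts)]
        if len(fields) > 9:
            keys = fields[8].split(":")
            if "GT" in keys:
                gt_index = keys.index("GT")
                for name, call in zip(samples, fields[9:]):
                    attrs.append("gt_%s=%s" % (name, call.split(":")[gt_index]))
        yield "\t".join(
            [chrom, "GATK", ftype, str(start), str(end), qual, ".", ".", ";".join(attrs)]
        )
```

It could be used as `for out in vcf_to_gff3(open("calls.vcf")): print(out)`, where calls.vcf is a placeholder file name. Because it is a generator over lines, memory use stays flat regardless of file size, but sample names containing ";" or "=" would need escaping before being written as GFF3 attributes.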
Thanks
Francois