I am working with Complete Genomics data from pipeline version 2.5. I need to add 1000 Genomes data to my sample and make a multi-genome VCF file. Since the 1000 Genomes Project data are from pipeline version 2.0.0, should I be concerned about the version difference? If there is a batch effect, what would you normally expect to see in CG data from pipeline version 2.0.0 compared with version 2.5?
Additionally, I would like to know whether mkvcf is the right tool for merging multi-genome data into a combined VCF. Is there a suitable tool for annotating that VCF?