04-29-2019, 11:34 AM   #1
XeroxHero69
Accessing genomic data in a large vcf.gz file

Hello all, I am very new to bioinformatics; I'm a high school intern. I have a 332 GB .vcf.gz file that contains the whole genomes of around 800 dogs, and all I need to do is check each of them for a mutation. Before I can do that, though, I need to figure out how to open the file. I have heard there may be ways to read it without decompressing it, which matters because someone told me the uncompressed file could be up to three terabytes.

If anyone has any suggestions on how to proceed, I would be eternally grateful, as I am nearing the end of my internship and need to figure this out.
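
For anyone landing on this thread with the same problem: one way to read a .vcf.gz without ever writing the uncompressed file to disk is to stream it, decompressing line by line in memory. The Python sketch below is only an illustration under stated assumptions, not a tested pipeline for this dataset: the function name genotypes_at is made up for this example, and it assumes a standard tab-separated VCF where sample columns start at column 10 and GT is the first FORMAT subfield.

```python
import gzip


def genotypes_at(vcf_gz_path, chrom, pos):
    """Stream a .vcf.gz and return {sample: genotype} for one site, or None.

    Hypothetical helper for illustration; assumes a standard VCF layout.
    """
    samples = []
    # gzip.open in text mode decompresses on the fly, so memory use stays small
    with gzip.open(vcf_gz_path, "rt") as handle:
        for line in handle:
            if line.startswith("##"):
                continue  # meta-information lines
            fields = line.rstrip("\n").split("\t")
            if line.startswith("#CHROM"):
                samples = fields[9:]  # sample names follow the FORMAT column
                continue
            if fields[0] == chrom and int(fields[1]) == pos:
                # keep only the GT part of each sample entry, e.g. "0/1:12" -> "0/1"
                calls = [entry.split(":")[0] for entry in fields[9:]]
                return dict(zip(samples, calls))
    return None
```

Calling something like genotypes_at("dogs.vcf.gz", "chr5", 12345) (file name and coordinates are placeholders) would return one genotype per dog at that position. Note that this is a linear scan, so on a 332 GB file it could take a long time; if the file was compressed with bgzip and has a tabix index next to it, tools such as tabix or bcftools can jump straight to a region instead, which is probably the better route.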