I recently received whole-genome sequencing data from my platform, intended as validation for my other RNA-seq data. The idea was that the platform would do all the bioinformatic analyses for this data, so that I could keep my focus on RNA rather than DNA, but it turns out that they don't routinely do the last part of the analysis that I want: finding the mutations that differ between my two sample types. I now have to do that myself, so I've come here for help.
I have two isogenic cell lines, theoretically differing in a single mutation, and that is what we want to confirm (or, at least, that any other mutations are in non-functional or otherwise non-relevant regions). As far as I understand it, the platform has done the alignment and the heaviest parts of the analysis, as well as variant calling relative to the reference. I'm really only interested in the differences between the two cell lines, not in the differences between each cell line and the reference.
I was pointed towards the GenotypeGVCFs function of GATK. If I understand it correctly, what I do is give it my two samples' GVCFs as input and run it. I'm not sure why I need a reference, though. I put together a rudimentary command, shown below.
Is this what I'm looking for? If not, what am I missing, or am I completely off course?
Code:
java -jar GenomeAnalysisTK.jar \
    -T GenotypeGVCFs \
    -nt 16 \
    -R .../human_g1k_v37.fasta \
    --variant .../sample1.clean.dedup.recal.bam.genomic.vcf \
    --variant .../sample2.clean.dedup.recal.bam.genomic.vcf \
    -o .../output.vcf
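To make it clearer what I'm ultimately after, here is a rough sketch of the comparison I imagine running on the joint output. It is only an illustration, not something I have tested on real data. It assumes the output VCF has exactly two sample columns (columns 10 and 11), that GT is the first FORMAT field, and that sites where either sample has a no-call should be skipped. The function name diff_genotypes is my own invention and not part of GATK.

```shell
#!/bin/sh
# Hypothetical helper (not a GATK tool): print the sites where the two
# samples' genotypes differ.
# Assumptions: two-sample VCF (sample columns 10 and 11), tab-delimited,
# with GT as the first FORMAT field.
diff_genotypes() {
    awk -F'\t' '
        /^#/ { next }                       # skip header lines
        {
            split($10, a, ":")              # first sub-field is GT
            split($11, b, ":")
            g1 = a[1]; g2 = b[1]
            gsub(/\|/, "/", g1)             # ignore phasing: treat 0|1 as 0/1
            gsub(/\|/, "/", g2)
            if (g1 ~ /\./ || g2 ~ /\./) next    # skip sites with a no-call
            if (g1 != g2) print $1, $2, $4, $5, g1, g2
        }
    ' OFS='\t' "$1"
}
```

Calling it as diff_genotypes .../output.vcf would print CHROM, POS, REF, ALT and the two genotypes for each differing site. Skipping no-calls is a simplification on my part; a site that is missing in one sample could well deserve a closer look rather than being dropped.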