Hi,
I am comparing the variant calls I get after aligning my resequencing data to both the 1000 Genomes reference genome (human_g1k_v37.fasta) and a major allele reference genome (Dewey et al., PLoS Genetics).
For both the standard and the major allele reference alignments, I plan to use GATK to call variants.
My question is whether I should point GATK at the respective reference genome for each run, e.g.:
For the major allele reference alignment:
java -Xmx4g -jar /path_to.../GenomeAnalysisTK/GenomeAnalysisTK.jar \
-I Major_allele_aligned_input_bam.bam \
-R /PATH_To_MAJOR_ALLELE_REF/CEUref.fasta \ <<----------- THIS
-T IndelRealigner \
-targetIntervals forIndelRealigner.intervals \
-o output_bam.bam \
--known /.../1000G_indels/1000G_biallelic.indels.b37.vcf \
--consensusDeterminationModel KNOWNS_ONLY \
-LOD 0.4
For the 1000 Genomes reference alignment:
java -Xmx4g -jar /path_to.../GenomeAnalysisTK/GenomeAnalysisTK.jar \
-I 1000_genomes_aligned_input_bam.bam \
-R /PATH_To_1000_GENOMES/human_g1k_v37.fasta \ <<----------- THIS
-T IndelRealigner \
-targetIntervals forIndelRealigner.intervals \
-o output_bam.bam \
--known /.../1000G_indels/1000G_biallelic.indels.b37.vcf \
--consensusDeterminationModel KNOWNS_ONLY \
-LOD 0.4
Thanks