**kgulukota** · 06-21-2013, 07:55 AM

I have a similar problem. I get my sequence data in batches (not all samples at once) and would like to have a running list of variants called on samples thus far.

As you said, the biggest issue with straightforward merging of VCFs is that we need to differentiate between

evidence of absence ("there is sufficient depth at this locus and this sample is reference homozygous") and
absence of evidence ("this sample does not have enough coverage to infer whether there is a variant at this locus").
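To make the distinction concrete, here is a toy sketch of the decision for one sample at one locus. It is only an illustration: the function name `classify_site` and the default depth cutoff of 10 reads are my own assumptions, and real callers work from genotype likelihoods rather than a hard threshold.

```shell
# classify_site DEPTH ALT_READS [MIN_DEPTH]
# Toy rule (hypothetical, not what any real caller does):
#   depth below cutoff        -> "no-call"  (absence of evidence)
#   enough depth, no alt read -> "hom-ref"  (evidence of absence)
#   enough depth, alt reads   -> "variant"
classify_site() {
    depth=$1
    alt=$2
    min_depth=${3:-10}   # assumed cutoff, pick your own
    if [ "$depth" -lt "$min_depth" ]; then
        echo "no-call"
    elif [ "$alt" -eq 0 ]; then
        echo "hom-ref"
    else
        echo "variant"
    fi
}

classify_site 42 0    # well covered, no alt reads
classify_site 3 0     # too shallow to say anything
classify_site 42 20   # well covered, alt reads present
```

A plain merge of per-sample VCFs cannot tell the first two cases apart, because neither sample has a record at that locus.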

I am still searching for solutions and will post if I find one.

**kgulukota** · 06-21-2013, 01:17 PM

Re: Create a VCF with your first bam file, say 1.vcf

OK. There is a 3-step procedure that can accomplish what you want (I think).

Step 1. Create VCFs from your first and second BAM files separately, say old.vcf and new.vcf.

Step 2. Next, create a combined VCF from the two. I used the CombineVariants walker in GATK like so:

java -jar GenomeAnalysisTK.jar -T CombineVariants -R GRCh37.fa --variant old.vcf --variant new.vcf -o joined.vcf -genotypeMergeOptions UNIQUIFY

Presumably you could do something similar with bedtools.

Step 3. Finally, run the GATK UnifiedGenotyper with the joined VCF as the target file, i.e. passed to the -L option, like so:

java -jar GenomeAnalysisTK.jar -T UnifiedGenotyper -R GRCh37.fa -L joined.vcf -I old.bam -I new.bam -o final.vcf

I have combined 30 old BAMs with 50 new BAMs using this method, and it seems to work well.
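With dozens of BAMs, typing every -I by hand gets tedious. Below is a rough sketch of a helper that assembles the Step 3 command from a list of BAM files. The function name `build_regenotype_cmd` is made up for illustration, it only prints the command rather than running it, and it assumes the file names contain no spaces.

```shell
# build_regenotype_cmd REF SITES_VCF OUT_VCF BAM [BAM ...]
# Print (do not run) a UnifiedGenotyper command that genotypes every
# listed BAM at the sites in SITES_VCF. Uses only the flags from Step 3.
build_regenotype_cmd() {
    ref=$1
    sites=$2
    out=$3
    shift 3
    cmd="java -jar GenomeAnalysisTK.jar -T UnifiedGenotyper -R $ref -L $sites"
    # Append one -I per BAM (assumes paths without whitespace).
    for bam in "$@"; do
        cmd="$cmd -I $bam"
    done
    printf '%s\n' "$cmd -o $out"
}

build_regenotype_cmd GRCh37.fa joined.vcf final.vcf old.bam new.bam
```

Inspect the printed line first; once it looks right, you can pipe it to `sh` or paste it into your job script.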

However, I hasten to add that best practice is to run variant calling on all samples together. The above procedure is a quick-and-dirty shortcut: I think it will be mostly accurate, but its results will differ somewhat from redoing the whole shebang.
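If you want a rough idea of how large those differences are, one option is to compare the site lists of the two results. This is a minimal sketch under my own assumptions: the helper names `site_keys` and `compare_sites` are hypothetical, the VCFs are uncompressed, and sites are matched by an exact CHROM:POS:REF:ALT string, so the same variant written differently (for example at multi-allelic sites) will count as a mismatch. It also ignores genotypes entirely.

```shell
# site_keys VCF
# Print sorted, unique CHROM:POS:REF:ALT keys, skipping header lines.
site_keys() {
    awk -F'\t' '!/^#/ { print $1 ":" $2 ":" $4 ":" $5 }' "$1" | LC_ALL=C sort -u
}

# compare_sites FIRST.vcf SECOND.vcf
# Report how many sites are unique to each file and how many are shared.
compare_sites() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    site_keys "$1" > "$tmp_a"
    site_keys "$2" > "$tmp_b"
    # comm needs sorted input; site_keys sorts with the same C locale.
    only_a=$(LC_ALL=C comm -23 "$tmp_a" "$tmp_b" | wc -l | tr -d ' ')
    only_b=$(LC_ALL=C comm -13 "$tmp_a" "$tmp_b" | wc -l | tr -d ' ')
    shared=$(LC_ALL=C comm -12 "$tmp_a" "$tmp_b" | wc -l | tr -d ' ')
    rm -f "$tmp_a" "$tmp_b"
    printf 'only_first=%s only_second=%s shared=%s\n' "$only_a" "$only_b" "$shared"
}
```

For example, `compare_sites final.vcf joint.vcf`, where joint.vcf is a hypothetical file from calling all samples together.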

exome/vcf merge question