I'm using PacBio's "variantCaller" tool (v6) to call variants on some older RSII subread data I aligned with blasr. The reference is 14845 bp.
My variantCaller command is:
Code:
variantCaller --algorithm arrow --log-file variantCaller.log --annotateGFF --reportEffectiveCoverage --noEvidenceConsensusCall lowercasereference --minCoverage 40 --coverage 100 --minMapQV 40 --minConfidence 10 --minReadScore 0.75 --minSnr 3.75 --minZScore -3.5 --minAccuracy 0.82 --numWorkers 5 -r ref.fa -o 129C02-vs-ref.fa.sort.vcf -o 129C02-vs-ref.fa.sort.gff -o 129C02-vs-ref.fa.sort.fasta -o 129C02-vs-ref.fa.sort.fastq 129C02-vs-ref.fa.sort.bam
Now, it is my understanding that the fasta (and fastq) output of this command should reflect the differences between the reference and the reads, as reported in the vcf output.
My vcf file reports:
Code:
#CHROM POS ID REF ALT QUAL FILTER INFO
ref 11438 . CA C 25 PASS DP=100
However, when I align the reference against this fasta output, two 1-bp deletions are observed. One is the deletion reported in the vcf. The other is several thousand base pairs away and is not called by variantCaller.
This, of course, is very disturbing. Can you explain this?
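In case it helps anyone reproduce the check, here is a minimal Python sketch of how such a comparison could be scripted. It is not the exact procedure I used. It assumes the reference and the consensus fasta have already been pairwise aligned into two gapped strings of equal length. The function names and the toy sequences are made up for illustration.

```python
def deletions_from_alignment(aligned_ref, aligned_cons):
    """Return 1-based reference positions where the consensus has a gap."""
    if len(aligned_ref) != len(aligned_cons):
        raise ValueError("aligned sequences must have equal length")
    positions = []
    ref_pos = 0  # number of reference bases consumed so far
    for r, c in zip(aligned_ref, aligned_cons):
        if r != "-":
            ref_pos += 1
            if c == "-":
                positions.append(ref_pos)
    return positions


def deleted_positions_from_vcf(vcf_lines):
    """Return the set of 1-based reference positions removed by VCF deletions."""
    deleted = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and headers
        fields = line.split()
        pos, ref, alt = int(fields[1]), fields[3], fields[4]
        if len(ref) > len(alt) and ref.startswith(alt):
            # The leading anchor base(s) are kept; the rest of REF is deleted.
            deleted.update(range(pos + len(alt), pos + len(ref)))
    return deleted


# Toy data (hypothetical, not my real sequences).
aligned_ref = "ACGTCAGGTTAC"
aligned_cons = "ACGTC-GGTT-C"
vcf_lines = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO",
    "ref 5 . CA C 25 PASS DP=100",
]

observed = deletions_from_alignment(aligned_ref, aligned_cons)
called = deleted_positions_from_vcf(vcf_lines)
uncalled = [p for p in observed if p not in called]
print("deletions seen in alignment:", observed)
print("deletions not in the vcf:", uncalled)
```

One caveat with this kind of exact-position check: inside a homopolymer run, an aligner may place the gap at a different base than the one implied by the vcf record, so a nearby mismatch is worth inspecting by eye before treating it as a genuinely uncalled deletion.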
Thanks for your help.