03-22-2017, 01:41 AM   #5
splaisan
senior molecular biologist
 
Location: Belgium

Join Date: Jun 2009
Posts: 31

In my hands, the VCF v3.3 output (an exotic version if ever there was one, with odd call syntax in the ALT field) produced by the gffToVcf tool (v3.0 - pbgenomicconsensus) that accompanies SMRTv4 does not conform to the VCF4 specs, and it leads to errors when used with VCF-compatible tools.

Here is an example from a very simple run:
<pre>
##fileformat=VCFv3.3
##fileDate=2017121
##source=gffToVcf --resolved-tool-contract /opt/pacbio/userdata/jobs_root/000/000096/tasks/genomic_consensus.tasks.gff2vcf-0/resolved-tool-contract.json
##INFO=NS,1,Integer,"Number of Samples with Data"
##INFO=DP,1,Integer,"Total Depth of Coverage"
#CHROM POS ID REF ALT QUAL FILTER INFO
chromosome_2 486515 . C T 93.00 0 NS=1;DP=47
chromosome_2 487451 . C D1 93.00 0 NS=1;DP=47
chromosome_2 511331 . . IA 41.00 0 NS=1;DP=52
chromosome_2 537571 . . IA 55.00 0 NS=1;DP=40
chromosome_2 636693 . A G 93.00 0 NS=1;DP=31
chromosome_2 643391 . G T 93.00 0 NS=1;DP=46
chromosome_2 643959 . A D1 93.00 0 NS=1;DP=50
</pre>

Before I venture into this myself, does anyone have a GFF3 to VCF4 converter that works on Sequel data?
The fields required to build a VCF from their GFF3 are all there; it is 'only' a matter of converting smartly between the two tabular formats, fixing coordinate issues, and handling the alternate-allele cases (if present?!).
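To make the ALT problem concrete, here is a minimal Python sketch of how the `D1` / `IA` style calls from the example above might be rewritten into VCF4-style alleles. It is not a tested converter and it rests on assumptions: that a `D<n>` call is positioned on the first deleted base, that an `I<seq>` call is positioned on the base preceding the insertion, and that a reference lookup is available. The names `vcf33_to_vcf4` and `fetch_base` are hypothetical, invented for this illustration.

```python
def vcf33_to_vcf4(chrom, pos, ref, alt, fetch_base):
    """Rewrite one VCF v3.3-style call as a VCF4-style (pos, ref, alt) tuple.

    fetch_base(chrom, pos) is a hypothetical callback that returns the
    reference base at a 1-based position (e.g. backed by an indexed FASTA).

    Assumptions (check them against real data before relying on this):
      * 'D<n>'   : POS is the first deleted base, n bases are deleted
      * 'I<seq>' : POS is the base preceding the inserted sequence
    A deletion starting at position 1 has no preceding base and is
    not handled here.
    """
    if alt.startswith("D") and alt[1:].isdigit():
        n = int(alt[1:])
        # VCF4 anchors a deletion on the base before the deleted stretch.
        anchor = fetch_base(chrom, pos - 1)
        deleted = "".join(fetch_base(chrom, pos + i) for i in range(n))
        return pos - 1, anchor + deleted, anchor
    if alt.startswith("I") and len(alt) > 1:
        # VCF4 writes an insertion as anchor base -> anchor base + inserted bases.
        anchor = fetch_base(chrom, pos)
        return pos, anchor, anchor + alt[1:]
    # Plain substitutions can be passed through unchanged.
    return pos, ref, alt


if __name__ == "__main__":
    # Toy reference used only for illustration.
    toy_ref = "ACGTACGT"

    def fetch(chrom, pos):
        return toy_ref[pos - 1]

    print(vcf33_to_vcf4("toy", 3, "G", "T", fetch))   # substitution
    print(vcf33_to_vcf4("toy", 4, "T", "D1", fetch))  # deletion
    print(vcf33_to_vcf4("toy", 2, ".", "IA", fetch))  # insertion
```

A full converter would also have to rewrite the header, since VCF4 expects INFO lines in the form `##INFO=<ID=NS,Number=1,Type=Integer,Description="...">`, and would need to check whether the coordinate assumptions above really match what gffToVcf writes.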
Thanks