SEQanswers

-   -   Is there a tool that converts TXT, BED, GFF format to VCF? (http://seqanswers.com/forums/showthread.php?t=16016)

LauraSmith 12-05-2011 08:51 AM

Is there a tool that converts TXT, BED, GFF format to VCF?
 
Hi,

I would like to ask if there is a tool out there that converts variants from a given file format (such as .txt, .gff, or .bed) to VCF format.

Thank you for your help.
Laura

mbblack 12-05-2011 09:14 AM

PacBio's SMRT suite has a python script to supposedly go from GFFv4 to VCF.

However, I recall that Aaron Quinlan mentioned on the BEDTools discussion board that going from GFF or BED to VCF is not a simple task unless the input files were originally created to track all the information required for the VCF output. That makes it difficult to write generic conversion scripts.
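
To make that point concrete, here is a minimal sketch of what a plain BED line can and cannot supply for the eight fixed VCF columns. The function names (`bed_line_to_vcf_fields`, `missing_columns`) are hypothetical, and the sketch assumes a tab-separated BED line with at most a name in the fourth column; it is an illustration, not an existing tool.

```python
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def bed_line_to_vcf_fields(bed_line):
    """Map one BED line onto the fixed VCF columns (None = not recoverable)."""
    fields = bed_line.rstrip("\n").split("\t")
    chrom, start = fields[0], int(fields[1])
    name = fields[3] if len(fields) > 3 else None
    return {
        "CHROM": chrom,
        "POS": start + 1,  # BED starts are 0-based, VCF positions are 1-based
        "ID": name,
        "REF": None,       # would have to come from the reference genome
        "ALT": None,       # plain BED does not store the variant allele
        "QUAL": None,
        "FILTER": None,
        "INFO": None,
    }


def missing_columns(record):
    """List the VCF columns that the BED line could not fill."""
    return [col for col in VCF_COLUMNS if record[col] is None]
```

Unless the BED or GFF file was written with the alleles and quality tucked into extra columns or attributes, the missing columns have to come from somewhere else, which is why a generic converter is hard to write.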

maubp 12-05-2011 11:30 AM

Quote:

Originally Posted by mbblack (Post 58756)
PacBio's SMRT suite has a python script to supposedly go from GFFv4 to VCF.

Was that a typo? Did you mean GFF v3 perhaps?

mbblack 12-05-2011 11:32 AM

Quote:

Originally Posted by maubp (Post 58762)
Was that a typo? Did you mean GFF v3 perhaps?

I was just glancing at their website, but I think it means their script is v.4 (or the entire SMRT suite is v.4), not that they've created their own GFF version!

http://www.pacbiodevnet.com/SMRT-Ana...-to-VCF-Python

splaisan 03-22-2017 01:41 AM

In my hands, the VCF v3.3 format (an exotic version if ever there was one, with a weird call syntax for the ALT field) produced by the gffToVcf script (v3.0, pbgenomicconsensus) that accompanies SMRTv4 does not conform to the VCF4 specs and leads to errors when used with VCF-compatible tools.

Here is an example from a very simple run:
<pre>
##fileformat=VCFv3.3
##fileDate=2017121
##source=gffToVcf --resolved-tool-contract /opt/pacbio/userdata/jobs_root/000/000096/tasks/genomic_consensus.tasks.gff2vcf-0/resolved-tool-contract.json
##INFO=NS,1,Integer,"Number of Samples with Data"
##INFO=DP,1,Integer,"Total Depth of Coverage"
#CHROM POS ID REF ALT QUAL FILTER INFO
chromosome_2 486515 . C T 93.00 0 NS=1;DP=47
chromosome_2 487451 . C D1 93.00 0 NS=1;DP=47
chromosome_2 511331 . . IA 41.00 0 NS=1;DP=52
chromosome_2 537571 . . IA 55.00 0 NS=1;DP=40
chromosome_2 636693 . A G 93.00 0 NS=1;DP=31
chromosome_2 643391 . G T 93.00 0 NS=1;DP=46
chromosome_2 643959 . A D1 93.00 0 NS=1;DP=50
</pre>

Before I venture into this, does anyone have a GFF3 to VCF4 converter that works on Sequel data?
The fields required to make a VCF from their GFF3 are there; it is 'only' a matter of performing a smart conversion between the two tabular formats and fixing coordinate issues and alternate-allele cases (if present?!).
Thanks
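
For the ALT syntax issue specifically, a rough sketch of the kind of fix involved might look like the following. It assumes the VCF 3.3 convention that `D<n>` means "delete n bases starting with the base at POS" and `I<seq>` means "insert seq just before the base at POS"; whether gffToVcf follows that convention exactly is an assumption. The function name `vcf33_to_vcf4` and the `chrom_seq` argument (the reference sequence of the chromosome as a plain string) are hypothetical, and only a single ALT allele per record is handled.

```python
import re


def vcf33_to_vcf4(pos, ref, alt, chrom_seq):
    """Convert one VCF 3.3-style call to a VCF4-style (pos, ref, alt).

    pos is 1-based; chrom_seq is the reference sequence of the chromosome.
    Indels are re-anchored on the base that precedes the event.
    """
    deletion = re.fullmatch(r"D(\d+)", alt)
    insertion = re.fullmatch(r"I([ACGTN]+)", alt)
    if deletion is None and insertion is None:
        return pos, ref, alt  # plain substitution, nothing to change
    if pos < 2:
        raise ValueError("no preceding base to anchor an indel at POS 1")
    anchor = chrom_seq[pos - 2]  # base at 1-based position pos - 1
    if deletion:
        n = int(deletion.group(1))
        deleted = chrom_seq[pos - 1 : pos - 1 + n]
        return pos - 1, anchor + deleted, anchor
    return pos - 1, anchor, anchor + insertion.group(1)


# Toy reference, not real data
toy_ref = "ACGTACGT"
print(vcf33_to_vcf4(4, "T", "D1", toy_ref))
print(vcf33_to_vcf4(5, ".", "IA", toy_ref))
```

The header lines (`##fileformat`, `##INFO`) and the `0` in the FILTER column would still need rewriting separately to satisfy VCF4 parsers.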

