SEQanswers

01-06-2016, 01:32 AM   #1
binfoUser
Member
 
Location: Portugal

Join Date: Jan 2016
Posts: 22
Using the htsjdk library with VCF files

I want to analyse VCF files that contain annotations. I annotated the files with programs such as VEP, and now I want to read and manipulate those annotations.
I'm using the Java library htsjdk, but so far I haven't found a method for working with a single column (the annotations are added to the INFO column).

Can someone help?
01-06-2016, 04:34 AM   #2
lindenb
Senior Member
 
Location: France

Join Date: Apr 2010
Posts: 143

There is currently no automatic method to parse the output of VEP. You need to find the description of the INFO field for VEP in the file header, using VCFHeader ( https://samtools.github.io/htsjdk/ja...VCFHeader.html ) and getInfoHeaderLine("CSQ"):

Code:
##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence type as predicted by VEP. Format: Allele|Gene|Feature|Feature_type|Consequence|cDNA_position|CDS_position|Protein_position|Amino_acids|Codons|Existing_variation|DISTANCE|STRAND|SYMBOL|SYMBOL_SOURCE|HGNC_ID">
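Then, for each record, use something like variantContext.getAttribute("CSQ") to get the CSQ strings, and split each one on "|" so that the values line up with the field names listed after "Format:" in the header description.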
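
The splitting step could be sketched roughly as below. This is only an illustration in plain Java, not part of htsjdk: the class CsqParser and its methods are hypothetical names, the sample CSQ entry in main is made up, and the code assumes that the attribute value arrives either as a single String or as a List of values (one entry per annotated transcript).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of htsjdk) for splitting VEP CSQ annotations. */
final class CsqParser {
    private static final String FORMAT_MARKER = "Format: ";

    /**
     * Extracts the sub-field names from the CSQ header description.
     * With htsjdk the description would come from something like
     * header.getInfoHeaderLine("CSQ").getDescription().
     */
    static List<String> fieldNames(String description) {
        int start = description.indexOf(FORMAT_MARKER);
        if (start < 0) {
            throw new IllegalArgumentException("No 'Format: ' section in: " + description);
        }
        String format = description.substring(start + FORMAT_MARKER.length()).trim();
        List<String> names = new ArrayList<>();
        for (String name : format.split("\\|")) {
            names.add(name.trim());
        }
        return names;
    }

    /**
     * Normalizes the value of variantContext.getAttribute("CSQ") into a list of
     * entries. Assumption: the value is either one String (possibly holding
     * several comma-separated entries) or a List of values.
     */
    static List<String> entries(Object attribute) {
        List<String> result = new ArrayList<>();
        if (attribute == null) {
            return result; // record has no CSQ annotation
        }
        if (attribute instanceof List) {
            for (Object item : (List<?>) attribute) {
                result.add(String.valueOf(item));
            }
        } else {
            for (String item : String.valueOf(attribute).split(",")) {
                result.add(item);
            }
        }
        return result;
    }

    /** Pairs each field name with the value at the same position in one CSQ entry. */
    static Map<String, String> parseEntry(List<String> names, String entry) {
        // The -1 limit keeps trailing empty values, which are common in VEP output.
        String[] values = entry.split("\\|", -1);
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            fields.put(names.get(i), i < values.length ? values[i] : "");
        }
        return fields;
    }

    public static void main(String[] args) {
        String description = "Consequence type as predicted by VEP. Format: "
                + "Allele|Gene|Feature|Feature_type|Consequence|cDNA_position|CDS_position|"
                + "Protein_position|Amino_acids|Codons|Existing_variation|DISTANCE|STRAND|"
                + "SYMBOL|SYMBOL_SOURCE|HGNC_ID";
        // Made-up CSQ value for illustration only; real values come from the data lines.
        Object attribute = "T|GENE_ID|TRANSCRIPT_ID|Transcript|missense_variant|||||||||GENE1||";

        List<String> names = fieldNames(description);
        for (String entry : entries(attribute)) {
            Map<String, String> fields = parseEntry(names, entry);
            System.out.println(fields.get("SYMBOL") + "\t" + fields.get("Consequence"));
        }
    }
}
```

In a real program the description string would be read once from the header, and entries(...) would be called on the attribute of each record while iterating over the file.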
01-08-2016, 12:45 AM   #3
binfoUser
Member
 
Location: Portugal

Join Date: Jan 2016
Posts: 22

But VCFHeader only accesses the file header, and it gives me the line with the ID "CSQ" that you showed. What I want now is to extract the annotation values from the data lines that follow the header.
I don't understand how to use the variantContext.getAttribute("CSQ") call that you mentioned. Does it give me that information?

Tags
annotation, htsjdk, java, vcf
