SEQanswers

07-31-2013, 10:52 AM   #1
quantrix
Member
 
Location: Pennsylvania

Join Date: Jan 2011
Posts: 21
VCF-Annovar question

Hi there,
I am relatively new to this and still working things out. I have searched high and low for an answer and could not find one, so here goes.

I am trying to use ANNOVAR to convert a VCF file for analysis. When I run convert2annovar.pl, I get the following error:

Error: invalid record in VCF file: the GT specifier is not present in the FORMAT string: <chr3 41265950 . ATTCTTTT ATTTT 68.5 . INDEL;IS=10,0.072993;DP=136;VDB=7.627334e-21;AF1=0.5;AC1=1;DP4=8,5,2,20;MQ=44;FQ=68.9;PV4=0.0017,1,1,1 PL 106,0,107>

I examined my .vcf file, and here is a typical record:
chr3 41274764 . C A 222 . DP=4288;VDB=1.423052e-28;AF1=1;AC1=2;DP4=0,0,2246,2027;MQ=44;FQ=-282 PL 255,255,0

The FORMAT column contains only PL; the "GT" tag is missing.
My question is: at what step do I introduce the GT tag?

Is it during Bowtie alignment, during samtools mpileup, or do I have to run vcftools separately to get GT into the FORMAT column?

I know this is a basic question, but I am still figuring things out. Thanks in advance for any reply. I felt the VCF documentation was not quite clear about this.
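
To see which records would trigger the convert2annovar.pl error, one option is to scan the FORMAT column (the ninth tab-separated field) for the GT key. The following is only a minimal Python sketch, not part of ANNOVAR or samtools. The function name `records_missing_gt` is made up for illustration, the second data line in `sample` is a hypothetical record added for contrast, and the code assumes an uncompressed, tab-delimited VCF.

```python
def records_missing_gt(lines):
    """Yield (chrom, pos) for VCF data lines whose FORMAT column lacks GT."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # sites-only record: there is no FORMAT column to check
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            yield fields[0], int(fields[1])


# First data line mirrors the record from the post (INFO shortened);
# the second data line is hypothetical and carries a GT key.
sample = [
    "##fileformat=VCFv4.1",
    "chr3\t41274764\t.\tC\tA\t222\t.\tDP=4288\tPL\t255,255,0",
    "chr3\t41274800\t.\tG\tT\t50\t.\tDP=30\tGT:PL\t0/1:80,0,90",
]
missing = list(records_missing_gt(sample))
print(missing)
```

For a real file, the same function can be fed an open file handle instead of the `sample` list.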
07-31-2013, 11:42 AM   #2
vivek_
PhD Student
 
Location: Denmark

Join Date: Jul 2012
Posts: 163

The GT field should be generated in the VCF file produced at the samtools mpileup variant-calling stage. Which version of samtools are you using?

I found this on the samtools web page:

Quote:
The VCF file produced by BCFtools does not strictly conform the VCF spec. For example, the GT genotype information is not always present because for the purpose of BCF, GT is unnecessary and takes disk space. In addition, GT is not the first as is required by the VCF spec. This can be fixed by the bcf-fix.pl script that comes with the source code package, and will be fixed in future (fixed in r880+).
http://samtools.sourceforge.net/mpileup.shtml
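
The second point in the quote (GT present but not first in FORMAT) can be illustrated with a short Python sketch. This is not the bcf-fix.pl script mentioned above, and it cannot add a GT that is absent; it only reorders the FORMAT keys and the matching per-sample values. The function name `move_gt_first` and the example record are hypothetical.

```python
def move_gt_first(record):
    """Return the VCF data line with GT moved to the front of FORMAT.

    Records without a sample column, without GT, or with a sample whose
    value count does not match FORMAT are returned unchanged.
    """
    line = record.rstrip("\n")
    fields = line.split("\t")
    if len(fields) < 10:
        return line  # no FORMAT plus sample columns
    keys = fields[8].split(":")
    if "GT" not in keys or keys[0] == "GT":
        return line  # nothing to reorder
    gt_index = keys.index("GT")
    order = [gt_index] + [i for i in range(len(keys)) if i != gt_index]
    samples = []
    for sample in fields[9:]:
        values = sample.split(":")
        if len(values) != len(keys):
            return line  # leave inconsistent records untouched
        samples.append(":".join(values[i] for i in order))
    new_format = ":".join(keys[i] for i in order)
    return "\t".join(fields[:8] + [new_format] + samples)


# Hypothetical record in which GT follows PL.
example = "chr1\t100\t.\tA\tG\t50\t.\tDP=10\tPL:GT\t30,0,40:0/1"
fixed = move_gt_first(example)
print(fixed)
```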

Last edited by vivek_; 07-31-2013 at 11:43 AM. Reason: Added link to webpage
07-31-2013, 12:49 PM   #3
quantrix
Member
 
Location: Pennsylvania

Join Date: Jan 2011
Posts: 21

Hi Vivek,
Thanks for the quick reply. I am using samtools-0.1.19. I will look into this further and try to figure it out. I would be glad of any further guidance as well.
Regards
07-31-2013, 02:52 PM   #4
vivek_
PhD Student
 
Location: Denmark

Join Date: Jul 2012
Posts: 163

I'm not aware of your variant calling methods, but I think the basic idea should be: run samtools mpileup on the BAM file -> output BCF -> use bcftools view to generate a VCF file (use the -g option at this step to call genotypes, which adds the GT field).

You could also use GATK's UnifiedGenotyper on the BAM file, which produces VCF output by default.
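
The samtools route described above can be sketched in Python as two piped commands. This is a rough outline under stated assumptions, not a tested pipeline: the flags are those of the samtools/bcftools 0.1.x series discussed in this thread (`mpileup -u -f` for uncompressed BCF against a faidx-indexed reference, `bcftools view -v -c -g` to call variants and genotypes), and later releases changed this interface. The function names and the file names `ref.fa`, `aln.bam`, and `calls.vcf` are placeholders.

```python
import subprocess


def build_commands(reference_fasta, bam_path):
    """Return the two argv lists: mpileup (BCF out) and bcftools view (VCF out)."""
    mpileup_cmd = ["samtools", "mpileup", "-u", "-f", reference_fasta, bam_path]
    # -v: variant sites only, -c: run the caller, -g: call genotypes (adds GT)
    call_cmd = ["bcftools", "view", "-v", "-c", "-g", "-"]
    return mpileup_cmd, call_cmd


def call_variants(reference_fasta, bam_path, vcf_path):
    """Pipe samtools mpileup into bcftools view and write the VCF to vcf_path."""
    mpileup_cmd, call_cmd = build_commands(reference_fasta, bam_path)
    with open(vcf_path, "w") as vcf_out:
        mpileup = subprocess.Popen(mpileup_cmd, stdout=subprocess.PIPE)
        caller = subprocess.Popen(call_cmd, stdin=mpileup.stdout, stdout=vcf_out)
        mpileup.stdout.close()  # so mpileup sees a closed pipe if bcftools exits
        caller_rc = caller.wait()
        mpileup_rc = mpileup.wait()
    if mpileup_rc != 0 or caller_rc != 0:
        raise RuntimeError(
            f"pipeline failed (mpileup={mpileup_rc}, bcftools={caller_rc})"
        )


# Show the commands without running them; the file names are placeholders.
mpileup_cmd, call_cmd = build_commands("ref.fa", "aln.bam")
print(" ".join(mpileup_cmd), "|", " ".join(call_cmd), "> calls.vcf")
```

Calling `call_variants("ref.fa", "aln.bam", "calls.vcf")` would run the pipe, provided both tools are on the PATH and the reference is indexed.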