

07-04-2012, 01:47 AM   #1
Location: Spain

Join Date: Jul 2010
Posts: 68
IUPAC ambiguous bases in vcf file?

Dear all,

I have been calling variants with samtools/bcftools against a reference that contains IUPAC ambiguous bases (such as K, W, etc.).
In the mpileup output, these ambiguity codes are preserved in the reference column, while the read bases are always one of the four standard bases, ACGT.
In the vcf file, however, two different things happen:

1. At SNP positions that coincide with an ambiguous base in the reference, the vcf file reports "N" in the reference field and one of "ACGT" (or a comma-separated combination of them) in the alternative field. This seems to agree with the vcf4.1 format specification, which says that the reference field may contain only ACGTN.

2. At INDEL positions that include an ambiguous base, however, the ambiguity codes are shown in the reference field (and also within the sequence in the alternative field, if the indel includes that position), for example:

Lg10    29366679        .       K       KCG,KG  49.5    PASS    INDEL;DP=19;VDB=0.0318;AF1=1;AC1=2;DP4=0,0,6,9;MQ=20;FQ=-58.5;MPB=U;   GT:PL:DP:SP:GQ  1/1:154,88,64,104,0,83:15:0:45
Lg10    29832925        .       TTAWAKWTATA     TTA     98.5    mrd15   INDEL;DP=17;VDB=0.0404;AF1=1;AC1=2;DP4=0,0,4,6;MQ=20;FQ=-64.5;MPB=U     GT:PL:DP:SP:GQ  1/1:139,30,0:10:0:57
How is this possible? Is the sequence at INDEL positions taken directly from the original reference.fasta? Does the vcf4.1 restriction of the reference field to ACGTN not apply to INDEL positions?
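
For anyone who wants to check how many records in their own file are affected, here is a minimal Python sketch that flags vcf records whose reference or alternative field contains anything other than ACGTN. The function name `ambiguous_fields` is hypothetical, the script assumes whitespace-separated columns as in the two records above, and it does not handle symbolic alleles such as `<DEL>`.

```python
import re

# Characters the vcf4.1 specification allows in the reference field (ACGTN).
PLAIN_BASES = re.compile(r"^[ACGTN]+$", re.IGNORECASE)


def ambiguous_fields(vcf_line):
    """Return which of REF/ALT contain characters outside ACGTN.

    Hypothetical helper: assumes a data line (not a '#' header line)
    with at least five whitespace-separated columns.
    """
    cols = vcf_line.split()
    ref, alt = cols[3], cols[4]
    flagged = []
    if not PLAIN_BASES.match(ref):
        flagged.append("REF")
    # ALT may hold several comma-separated alleles; "." means no alternative.
    alleles = [a for a in alt.split(",") if a != "."]
    if any(not PLAIN_BASES.match(a) for a in alleles):
        flagged.append("ALT")
    return flagged


if __name__ == "__main__":
    # The two INDEL records quoted above, truncated after the FILTER column.
    records = [
        "Lg10 29366679 . K KCG,KG 49.5 PASS",
        "Lg10 29832925 . TTAWAKWTATA TTA 98.5 mrd15",
    ]
    for rec in records:
        print(rec.split()[1], ambiguous_fields(rec))
```

Running the same check over every non-header line of the vcf file would show whether the ambiguity codes appear only in INDEL records, as observed above.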

Any comments would be very much appreciated.

sdvie

indel, iupac, reference, vcf


All times are GMT -8.
