SEQanswers

Old 03-18-2011, 12:44 PM   #1
gcrdb
Junior Member
 
Location: usa

Join Date: Jan 2009
Posts: 9
What is DP4 in VCF (Variant Call Format), and what is AF1?

Hi,
Does anyone know what DP4 is in VCF? (I got it from mpileup -> bcftools view.)
In this VCF line:
chr1 1467416 . G C,T,X 35 . DP=105;AF1=0.5;CI95=0.5,0.5;DP4=36,33,30,5;MQ=47;PV4=0.001,3.5e-05,3.2e-21,0.0073
I understand that:
DP=105 means the depth (coverage) is 105.
Then what is DP4? DP4=36,33,30,5
I counted the number of A/C/G/T bases: the total is 105, but they are only about half G (the reference) and half C:
(see this pileup)
chr1 1467416 G G 62 0 49 105 .$C$C$C.CC,CCCCC.C.C.CCCC.C.C.C.CCCCCC,T..C...,.CCC.,
.,..,Ct,...,.c,c.cc,,,,,,,,.,,,,.,,,,.,..,.,.,,.,.,,^].^].^]. D!0,2))I3+8)3F+H%44'#2I4II-III22I4II7,I0IIIIFIIIIII/II03%HI<0III2IIIIHI%I#FIII)I1III56IIIIIIIBII/I6IE7@EE




thanks a lot!
Old 03-18-2011, 02:00 PM   #2
iansealy
Member
 
Location: Hitchin, UK

Join Date: Oct 2010
Posts: 15

There's a definition of DP4 on http://samtools.sourceforge.net/mpileup.shtml
Old 03-24-2011, 05:38 AM   #3
gcrdb
Junior Member
 
Location: usa

Join Date: Jan 2009
Posts: 9

Thanks, I saw the definition of DP4, but it still doesn't solve my problem.
For example: how many C and G reads are there in this line?

chr1 10725 . A C,G,X 34 . DP=17;AF1=1;CI95=1,1;DP4=0,0,17,0;MQ=18 PL 67,45,72,67,0,29,45,66,72,67
Old 02-20-2012, 03:38 PM   #4
Tally
Member
 
Location: Sydney

Join Date: Aug 2011
Posts: 12

DP4 gives the number of reads supporting, in order: the reference base on the forward strand, the reference base on the reverse strand, an alternate base on the forward strand, and an alternate base on the reverse strand. E.g. DP4=2,3,0,0 means 2 forward-strand reads carry the reference base, 3 reverse-strand reads carry the reference base, and no covering reads have an alternate base at that position. The sum of DP4 will not always equal DP, because some reads are of too low quality to be counted.
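
To make the four numbers concrete, here is a minimal Python sketch that pulls DP4 out of an INFO string and sums the reference and alternate counts. The helper names (parse_info, dp4_summary) are made up for illustration and are not part of samtools or bcftools; the INFO string is the one from the first post. As far as I understand, the two alternate counts lump all non-reference alleles together, so DP4 alone cannot separate C reads from G reads at a multi-allelic site.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-only keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def dp4_summary(info):
    """Return ref/alt read counts from DP4, or None if the field is absent."""
    fields = parse_info(info)
    if "DP4" not in fields:
        return None  # DP4 is caller-specific and may be missing
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in fields["DP4"].split(","))
    ref = ref_fwd + ref_rev
    alt = alt_fwd + alt_rev
    total = ref + alt
    return {
        "ref": ref,
        "alt": alt,
        "total": total,
        "alt_fraction": alt / total if total else None,
        "dp": int(fields["DP"]) if "DP" in fields else None,
    }


# INFO column of the chr1:1467416 record quoted in post #1
info = "DP=105;AF1=0.5;CI95=0.5,0.5;DP4=36,33,30,5;MQ=47;PV4=0.001,3.5e-05,3.2e-21,0.0073"
summary = dp4_summary(info)
print(summary)  # ref=69, alt=35, total=104, i.e. one fewer than DP=105
```

For that record the DP4 total (104) is one short of DP (105), which fits the point that not every covering read is counted in DP4.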
Old 10-08-2014, 07:18 AM   #5
wetSEQer
Member
 
Location: TX

Join Date: Dec 2013
Posts: 15

Quote:
Originally Posted by gcrdb View Post
Thanks, I saw the definition of DP4, but it still doesn't solve my problem.
For example: how many C and G reads are there in this line?

chr1 10725 . A C,G,X 34 . DP=17;AF1=1;CI95=1,1;DP4=0,0,17,0;MQ=18 PL 67,45,72,67,0,29,45,66,72,67
I think that for this purpose you can generate the pileup without the -g or -v flag and look at that locus to see how many Cs and Gs there are.
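
Counting by eye gets tedious, so here is a rough Python sketch of tallying the bases in the read-bases column of a text pileup line. It assumes the usual pileup encoding ('.' and ',' for a reference match, '^' plus one mapping-quality character at a read start, '$' at a read end, '+N'/'-N' followed by N bases for indels). The function name count_pileup_bases and the example string are invented for illustration. Note that it counts every base regardless of quality, so the totals need not match DP4.

```python
from collections import Counter


def count_pileup_bases(ref_base, bases):
    """Tally bases in a pileup read-bases string, ignoring indel sequences."""
    counts = Counter()
    i = 0
    n = len(bases)
    while i < n:
        ch = bases[i]
        if ch == "^":
            i += 2  # skip '^' and the mapping-quality character after it
        elif ch == "$":
            i += 1  # end-of-read marker carries no base
        elif ch in "+-":
            i += 1
            start = i
            while i < n and bases[i].isdigit():
                i += 1
            length = int(bases[start:i]) if i > start else 0
            i += length  # skip the inserted/deleted sequence itself
        elif ch in ".,":
            counts[ref_base.upper()] += 1  # match to the reference base
            i += 1
        elif ch.upper() in "ACGTN":
            counts[ch.upper()] += 1  # mismatch; lower case = reverse strand
            i += 1
        else:
            i += 1  # '*' deletion placeholders and anything unrecognised
    return counts


# Made-up read-bases string for a site whose reference base is G
counts = count_pileup_bases("G", ".$C$c,^].+2AT,-1g.T")
print(counts)  # G: 5, C: 2, T: 1
```

Keeping upper and lower case separate instead of folding them would also give per-strand counts, which is the same split DP4 reports.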
Old 01-19-2015, 02:51 PM   #6
lethalfang
Member
 
Location: San Francisco, CA

Join Date: Aug 2011
Posts: 91

Quote:
Originally Posted by Tally View Post
DP4 gives the number of reads supporting, in order: the reference base on the forward strand, the reference base on the reverse strand, an alternate base on the forward strand, and an alternate base on the reverse strand. E.g. DP4=2,3,0,0 means 2 forward-strand reads carry the reference base, 3 reverse-strand reads carry the reference base, and no covering reads have an alternate base at that position. The sum of DP4 will not always equal DP, because some reads are of too low quality to be counted.
Does anyone know exactly what constitutes "high quality" in DP4?
Old 01-19-2015, 11:49 PM   #7
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

There's no strict definition of high quality; it depends on the program used to call the sites (there's no guarantee that a DP4 field will even exist in any given VCF file).
Old 01-20-2015, 08:06 AM   #8
lethalfang
Member
 
Location: San Francisco, CA

Join Date: Aug 2011
Posts: 91

Quote:
Originally Posted by dpryan View Post
There's no strict definition of high quality; it depends on the program used to call the sites (there's no guarantee that a DP4 field will even exist in any given VCF file).
What about the samtools mpileup | bcftools pipeline in particular?
I know mpileup has a default minimum base quality of 13 and a default minimum mapping quality of 0, but that's the "raw depth", right?
What does "high-quality" mean for samtools?
Old 01-20-2015, 09:37 AM   #9
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

You'd have to go through the source code to find out; I've never seen the specifics described anywhere.
All times are GMT -8.