SEQanswers

11-28-2011, 08:18 AM   #1
Heisman
Senior Member
 
Location: St. Louis

Join Date: Dec 2010
Posts: 535
Dindel is seeing too many reads, potential bug?

Hey all,

I have just figured out how to download and use Dindel, and I am trying to compare it to samtools mpileup. Most of the calls are the same. However, when looking at one Dindel call in IGV, I noticed that only 1 read covers that position. Yet Dindel gave the following output line in the VCF file:

Code:
chr19   11243209        .       c       cG      128     PASS    DP=12;NF=0;NR=4;NRS=3;NFS=1;HP=1        GT:GQ   1/1:12
The depthofcoverage file also states that there is only 1 read at this position, yet Dindel sees 4 reads. Does anybody have any idea why this could happen? I've attached a screenshot of this position in IGV.
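To compare these counts across many calls without reading them by eye, the INFO column can be split into key/value pairs. This is only a minimal sketch in plain Python: `parse_info` is a made-up helper name, not part of Dindel or samtools, and it assumes the VCF line is tab-separated with INFO in the eighth column.

```python
def parse_info(info_field):
    """Split a VCF INFO string such as 'DP=12;NF=0' into a dict."""
    result = {}
    for item in info_field.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        if not sep:
            # Flag-style entries have no '=' and are stored as True.
            result[key] = True
        else:
            # Keep integers as int, leave everything else as a string.
            result[key] = int(value) if value.lstrip("-").isdigit() else value
    return result


# The call from this post, rewritten with explicit tab separators.
line = "chr19\t11243209\t.\tc\tcG\t128\tPASS\tDP=12;NF=0;NR=4;NRS=3;NFS=1;HP=1\tGT:GQ\t1/1:12"
fields = line.split("\t")
info = parse_info(fields[7])
print(info["DP"], info["NR"])
```

With the values in a dict, it is easy to list every call where DP or NR disagrees with the coverage reported by another tool.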
Attached Images
File Type: png Insertion.PNG (19.4 KB, 11 views)
01-09-2013, 09:43 AM   #2
ratope
Junior Member
 
Location: Madrid & La Rioja

Join Date: Apr 2011
Posts: 6

I had the same question. After checking the intermediate results of Dindel, I realized that the DP value in the VCF is the number of reads that cover the whole window processed by Dindel, not just the indel position.

For example, in my case the VCF contains an indel at position 6680:

Code:
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	SAMPLE
chromosome_II	6680	.	TATA	T	118	PASS	DP=230;NF=0;NR=3;NRS=13;NFS=23;HP=1	GT:GQ	0/1:118
The "depth" is:
  • DP=230 according to Dindel (INFO column in the VCF file), but
  • depth=45 according to IGV (in fact, according to the pileup file).

The output printed by step 3 of Dindel showed this:

Code:
(...)
****
 tid: chromosome_II pos: 6681 leftPos: 6620  rightPos: 6742
Fetching reads....
Number of reads: 230 out of 77463 # unmapped reads: 0 numReadsUnknownLib: 0 numChrMismatch: 0 numMappedWithoutMate: 2 numUnmappedWithoutMate: 0
candidate_var@pos: 6681 6680,-ATA
aligned_var@pos 6681 6656 A=>G
aligned_var@pos 6681 6657 T=>A
aligned_var@pos 6681 6680 -ATA
[empiricalDistributionMethod] Number of haplotypes: 8
Filtered 0 haplotypes.
ll_ref: -1085.49 max_ll_indel: -1058.3 qual: 118.099
(...)
My interpretation is that DP is the number of reads covering the window 6620-6742 (leftPos to rightPos in the log above), and not only those covering the starting position of the indel (6680).
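The difference between the two ways of counting can be illustrated with a small sketch. The read intervals below are made up for illustration and are not taken from my BAM file. I am also assuming that a read counts towards the window as soon as it overlaps it, which I have not checked against the Dindel source.

```python
def overlaps_window(read_start, read_end, left, right):
    """True if the closed interval [read_start, read_end] overlaps [left, right]."""
    return read_start <= right and read_end >= left


def covers_position(read_start, read_end, pos):
    """True if the read spans the single position pos."""
    return read_start <= pos <= read_end


# Hypothetical read intervals (start, end), invented for this example.
reads = [
    (6600, 6650),
    (6630, 6679),
    (6660, 6710),
    (6675, 6725),
    (6700, 6750),
    (6743, 6790),
]

# Window and indel position taken from the Dindel log above.
left_pos, right_pos, indel_pos = 6620, 6742, 6680

window_depth = sum(overlaps_window(s, e, left_pos, right_pos) for s, e in reads)
position_depth = sum(covers_position(s, e, indel_pos) for s, e in reads)

print(window_depth, position_depth)  # prints "5 2"
```

Five of the six toy reads touch the window, but only two of them span position 6680. That is the same kind of gap as DP=230 versus depth=45 in my data.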

Hope this is useful.
01-10-2013, 05:11 AM   #3
ratope
Junior Member
 
Location: Madrid & La Rioja

Join Date: Apr 2011
Posts: 6

Even simpler!

My suspicion was correct, but there is a more straightforward way to confirm it.

From the header of the VCF file produced by Dindel:

Code:
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total number of reads in haplotype window">
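The same check works for any other INFO key by reading its description from the header. Here is a minimal sketch, assuming the header lines follow the `##INFO=<ID=...,Number=...,Type=...,Description="...">` layout shown above. `info_descriptions` is a hypothetical helper name, and the `##fileformat` line in the example is made up so that there is one non-INFO line to skip.

```python
import re

# Matches header lines laid out like the DP line quoted above.
INFO_RE = re.compile(
    r'^##INFO=<ID=([^,]+),Number=([^,]+),Type=([^,]+),Description="(.*)">$'
)


def info_descriptions(header_lines):
    """Map each INFO ID to its Description string."""
    out = {}
    for line in header_lines:
        match = INFO_RE.match(line.strip())
        if match:
            out[match.group(1)] = match.group(4)
    return out


header = [
    "##fileformat=VCFv4.0",  # made-up line, only here to show non-INFO lines are skipped
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total number of reads in haplotype window">',
]

descriptions = info_descriptions(header)
print(descriptions["DP"])  # prints "Total number of reads in haplotype window"
```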