Go Back   SEQanswers > Bioinformatics > Bioinformatics

Similar Threads
Thread Thread Starter Forum Replies Last Post
samtools mpileup calls way too less SNPs TuA Bioinformatics 17 03-01-2018 04:17 PM
Impact on quality of SNP calls using samtools mpileup myi Bioinformatics 3 03-03-2014 07:18 AM
samtools mpileup consensus calls A instead of a gap StefanP Bioinformatics 6 12-01-2012 05:55 AM
How samtools mpileup uses paired end data to determine SNP phasing bioinformer Bioinformatics 0 03-06-2012 12:19 PM
Samtools mpileup calls drastically more SNPs with -I agel Bioinformatics 0 01-20-2012 01:20 PM

Thread Tools
Old 08-01-2013, 11:29 PM   #1
Junior Member
Location: Innsbruck

Join Date: Feb 2013
Posts: 6
Default Samtools mpileup for Solid Data - calls wrong and too less SNP/INDELS


i simulated SOLiD-PE Reads with dwgsim and use BWA (0.5-Version) for mapping in Colorspace. When use the mpileup command from Samtools i got no ja wrong and strange output. I use the same Bamfile for other callers like freebayes or gatk and there i got my expected results (round about 18000 Indels/SNP in EXOM)

I added CS:Z: and CQ:Z: tag and READ-Groups to my samfile because GATK_BQSR need it and it works well. So i think this should not be the problem.

Here are my Inputs:
SAM-file TEST_SOLID_5x_header.sam:
@SQ SN:chr21 LN:48129895
@SQ SN:chr22 LN:51304566
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
@RG ID:five_fold_test PL:solid PU:test_unit LB:solid_test SM:five_fold_test
@PG ID:bwa PN:bwa VN:0.5.9-r26-dev
chr1_110292753_110292997_0_1_0_0_0:0:0_1:0:0_0 97 chr1 110292780 37 73M = 110293024 277 TTTGGGAAAGAGGTAAAATAAATAGGTGGTTACTGGGGAGGCTCCAACACAGCCAGAAGGGACACTGTTTGCT ]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]WY]]]]T]]]]]Z]]]L//////////// RG:Z:five_fold_test XT:A:U CM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:73 CS:Z:A210010020022201300033003320110103121000220322010111123012202002111211001320 CQ:Z:FGHFGHHEGGEGGFHBGHBHHH?HHFHBHGFFEHGGCECBADBAC+EDC=74E>AD:7A5@##############

-> convert to Bam-file -> Sort -> Index

Samtools mpileup:

samtools mpileup -uf unmutiert_ucsc_chr1-22XY.cs.fa TEST_SOLID_5x_header_sorted.bam | bcftools view -bvcg - > var.raw.bcf

I got a breakdown:
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
[afs] 0:2107.299 1:27.577 2:29.124

I expect round about 18000 vcf-entries and samtools call 45_ Example:
chr1 36931497 . gga g 7.57 . INDEL;DP=3;VDB=0.0295;AF1=0.5336;CI95=0.5,1;DP4=1,0,1,1;MQ=24;FQ=-25.5;PV4=1,0.13,1,0.2 GT:PL:GQ 0/1:44,0,9:13
chr1 197093832 . tacaca taca 14.4 . INDEL;DP=2;VDB=0.0588;AF1=1;CI95=0.5,1;DP4=0,0,0,2;MQ=29;FQ=-40.5 GT:PL:GQ 1/1:53,6,0:10
chr1 205131047 . tata t 19.3 . INDEL;DP=2;VDB=0.059

Any suggestions?

Best regards
scondo is offline   Reply With Quote
Old 08-02-2013, 01:25 PM   #2
Rick Westerman
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

I suspect that BWA is not handling colorspace correctly. You'll notice that they dropped support for it in the most recent versions. While I am not sure about the following, I suspect that what they are outputting is the so-called "double-encoded" version of color space where the colorspace 0=A, 1=C and so on. If so this would obviously make any SNP calling go haywire unless you are using a double-encoded reference in Samtools. Even then I am not sure if you would get good results.

As I said I could be wrong about the above. But I do suggest using more recent tools that are made to work with native colorspace.
westerman is offline   Reply With Quote
Old 08-02-2013, 01:27 PM   #3
Rick Westerman
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

Also, now that I look more closely, it appears that your reference file is 'unmutiert_ucsc_chr1-22XY.cs.fa'. What is the format of that file? Color-space? double-encoded? Something else?
westerman is offline   Reply With Quote

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

All times are GMT -8. The time now is 10:30 AM.

Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2020, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO