Hello,
I simulated SOLiD paired-end reads with dwgsim and mapped them in color space with BWA (version 0.5). When I run the mpileup command from samtools, I get wrong and strange output. I used the same BAM file with other callers such as FreeBayes and GATK, and there I got the results I expected (roughly 18,000 indels/SNPs in the exome).
I added the CS:Z: and CQ:Z: tags and read groups to my SAM file because GATK BQSR needs them, and that works well, so I don't think this is the problem.
Here are my inputs:
SAM file TEST_SOLID_5x_header.sam:
....
@SQ SN:chr21 LN:48129895
@SQ SN:chr22 LN:51304566
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
@RG ID:five_fold_test PL:solid PU:test_unit LB:solid_test SM:five_fold_test
@PG ID:bwa PN:bwa VN:0.5.9-r26-dev
chr1_110292753_110292997_0_1_0_0_0:0:0_1:0:0_0 97 chr1 110292780 37 73M = 110293024 277 TTTGGGAAAGAGGTAAAATAAATAGGTGGTTACTGGGGAGGCTCCAACACAGCCAGAAGGGACACTGTTTGCT ]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]WY]]]]T]]]]]Z]]]L//////////// RG:Z:five_fold_test XT:A:U CM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:73 CS:Z:A210010020022201300033003320110103121000220322010111123012202002111211001320 CQ:Z:FGHFGHHEGGEGGFHBGHBHHH?HHFHBHGFFEHGGCECBADBAC+EDC=74E>AD:7A5@##############
-> convert to BAM file -> sort -> index
Samtools mpileup:
samtools mpileup -uf unmutiert_ucsc_chr1-22XY.cs.fa TEST_SOLID_5x_header_sorted.bam | bcftools view -bvcg - > var.raw.bcf
This is the output I got:
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
[afs] 0:2107.299 1:27.577 2:29.124
I expect roughly 18,000 VCF entries, but samtools calls 45. Example:
chr1 36931497 . gga g 7.57 . INDEL;DP=3;VDB=0.0295;AF1=0.5336;CI95=0.5,1;DP4=1,0,1,1;MQ=24;FQ=-25.5;PV4=1,0.13,1,0.2 GT:PL:GQ 0/1:44,0,9:13
chr1 197093832 . tacaca taca 14.4 . INDEL;DP=2;VDB=0.0588;AF1=1;CI95=0.5,1;DP4=0,0,0,2;MQ=29;FQ=-40.5 GT:PL:GQ 1/1:53,6,0:10
chr1 205131047 . tata t 19.3 . INDEL;DP=2;VDB=0.059
Any suggestions?
Best regards