Hello,
I want to use the ReadBackedPhasing module of GATK on my data, but I have a problem which I think occurs because my BAM file contains records from both the BWA aligner and Novoalign. Just as a reminder, this is the command to run ReadBackedPhasing:
java
-jar GenomeAnalysisTK.jar
-T ReadBackedPhasing
-R reference.fasta
-I reads.bam <--- Input seq BAM file
--variant SNPs.vcf <--- SNP VCF file
-L Chr.list
-o phased_SNPs.vcf
--phaseQualityThresh 20.0
When I run ReadBackedPhasing, it crashes, so the output contains very few results. I assume the crash happens when it hits a Novoalign record in the BAM file. An example of two records (one from BWA and the other from Novoalign) from the same BAM file is shown below:
1)HWI-ST395_B80MA2ABXX_1:6:6:3197:146608#0 99 chrM 4874 60 104M = 5329 559 AACTAGCCCCCATCTCAATCATATACCAAATCTCTCCCTCACTAAACGTAAGCCTTCTCCTCACTCTCTCAATCTTATCCATCATAGCAGGCAGT
TGAGGTGGA
..fastq seq...
X0:i:1 X1:i:0 MD:Z:104 RG:Z:2_2-600-1st XG:i:0 AM:i:37 NM:i:0 SM:i:37 XM:i:0 XO:i:0 XT:A:U
2)HWI-ST395_B81AMYABXX_0:1:65:18036:20406#0 99 chrM 4874 150 2S93M9S = 2384 -2430 NNAACTAGCCCCCATCTCAATCATATACCAAATCTCTCCCTCACTAAACGTAAGCCTTCTCCTCACTCTCTCAATCTTATCCATCATAGCAGGCA
GTTGAGGTG
..fastq seq..
MD:Z:93 PG:Z:novoalign.23 RG:Z:2_2-MP-novoalign AM:i:150 NM:i:0 SM:i:150 ZO:Z:-+ PQ:i:245 UQ:i:66 AS:i:66
The difference is that, even though both records are in SAM format, Novoalign uses different flags, which might cause this problem. If you could please help me, I would greatly appreciate it. Also, could you suggest a way to convert Novoalign's SAM output into a regular SAM file with the proper flags?
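To make the difference concrete, here is a minimal sketch (plain Python, not part of GATK or Novoalign) that parses the optional TAG:TYPE:VALUE fields of the two records above and lists which tags appear in only one of them. The helper name parse_tags is hypothetical, and the sketch only compares tag names; it does not establish which difference, if any, makes ReadBackedPhasing fail.

```python
def parse_tags(tag_field):
    """Parse whitespace-separated SAM optional fields (TAG:TYPE:VALUE) into a dict."""
    tags = {}
    for item in tag_field.split():
        # Split at most twice so that values containing ':' stay intact.
        tag, typ, value = item.split(":", 2)
        tags[tag] = int(value) if typ == "i" else value
    return tags


# Optional fields copied from the two example records above.
bwa = parse_tags(
    "X0:i:1 X1:i:0 MD:Z:104 RG:Z:2_2-600-1st XG:i:0 AM:i:37 "
    "NM:i:0 SM:i:37 XM:i:0 XO:i:0 XT:A:U"
)
novo = parse_tags(
    "MD:Z:93 PG:Z:novoalign.23 RG:Z:2_2-MP-novoalign AM:i:150 "
    "NM:i:0 SM:i:150 ZO:Z:-+ PQ:i:245 UQ:i:66 AS:i:66"
)

shared = sorted(set(bwa) & set(novo))
only_bwa = sorted(set(bwa) - set(novo))
only_novo = sorted(set(novo) - set(bwa))

print("shared:        ", shared)
print("BWA only:      ", only_bwa)
print("Novoalign only:", only_novo)
```

Note that the two records also differ outside the optional fields: the mapping quality is 60 versus 150, and the CIGAR is 104M versus 2S93M9S (soft-clipped), so the tags may not be the only relevant difference.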
Thanks a lot,
Mike