Hey all,
So I've been banging my head against this for a little while now and I hope someone can help. My .bam file was generated by using BWA-MEM to align the data to the reference, sorted and indexed using samtools, and now I'm trying to validate the file using picard-tools.
The report contains a VERY large number of errors which (it seems to me) indicate that the .bam sequence data does not match the reference. Why am I encountering these errors when the reads were aligned to that same reference using BWA?
I'm not sure if there's been a problem with the actual alignment, or conversion, or with picard itself. Any input would be much appreciated.
Example of the errors below:
---------------
ERROR: Record 811, Read name HWI-ST0733:209:C0CDKACXX:2:1106:13607:160009, NM tag (nucleotide differences) in file [2] does not match reality [3]
ERROR: Record 812, Read name HWI-ST0733:209:C0CDKACXX:2:1301:12877:138617, NM tag (nucleotide differences) in file [2] does not match reality [3]
ERROR: Record 813, Read name HWI-ST0733:209:C0CDKACXX:2:1204:12454:19349, NM tag (nucleotide differences) in file [2] does not match reality [3]
---------------
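For context, my understanding is that the NM tag is meant to be the edit distance between the read and the reference over the aligned region (mismatched bases plus inserted and deleted bases), so each error says the stored value (2) differs from the recomputed one (3). Below is a minimal Python sketch of that calculation as I understand it. The helper name `compute_nm` and the toy sequences are made up for illustration, and this is not Picard's actual implementation:

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def compute_nm(read_seq, ref_seq, cigar):
    """Count mismatches plus inserted and deleted bases for one alignment.

    ref_seq is assumed to start at the alignment's leftmost reference
    position (hypothetical helper, for illustration only).
    """
    nm = 0
    read_pos = 0
    ref_pos = 0
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":
            # Aligned bases: compare read and reference position by position.
            for i in range(length):
                if read_seq[read_pos + i].upper() != ref_seq[ref_pos + i].upper():
                    nm += 1
            read_pos += length
            ref_pos += length
        elif op == "I":
            # Inserted bases count toward NM and consume the read only.
            nm += length
            read_pos += length
        elif op == "D":
            # Deleted bases count toward NM and consume the reference only.
            nm += length
            ref_pos += length
        elif op == "S":
            # Soft-clipped bases are present in the read but not aligned.
            read_pos += length
        elif op == "N":
            # Skipped reference region: not counted, consumes the reference.
            ref_pos += length
        # H and P consume neither the read nor the reference.
    return nm


# Toy example: one mismatch (T vs A) plus one inserted base.
print(compute_nm("ACGTTACG", "ACGAACG", "4M1I3M"))
```

If that is right, a stored NM of 2 against a recomputed 3 would mean one difference is being counted by one tool and not the other, but I can't tell from the report which side is off.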
Command for validation:
picard-tools ValidateSamFile INPUT=C0CDKACXX-2_aln-pe_sorted_backconverted_withreadgroup_AddOrReplace.bam OUTPUT=C0CDKACXX-2_aln-pe_sorted_backconverted_withreadgroup_AddOrReplace_Validated.bam REFERENCE_SEQUENCE=/data/reference/Oarv3.1.alldna.fasta
Cheers