Hi,
I would like to count duplicates in BAM files. I am comparing two mapping tools (CLC and BWA mem) and looking at the total read counts. I would expect both tools to give the same
total, but they differ. My assumption is that BWA mem counts duplicates.
What I did:
1) convert sam -> bam file
2) sort bam file
3) use the Picard tool MarkDuplicates.jar
4) use the BuildBamIndex.jar
Mapped Reads (CLC) 6,876,285
Mapped Reads (BWA) 6,375,889
Unmapped Reads (CLC) 231,927
Unmapped Reads (BWA) 736,730
Total count (CLC) 7,108,212
Total count (BWA) 7,112,619
What would be the next step? I've tried to use the .bai files, but do they contain any information about the number of duplicates?
Do you have any suggestions?
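One idea I had, as a rough sketch only: since MarkDuplicates sets the duplicate bit (0x400) in the FLAG column, the duplicates could be tallied directly from the alignment records instead of from the index. The script below assumes the input is SAM text (for example the output of `samtools view` on the marked BAM file). The function name `count_flags` and the demo records are made up for illustration. It also counts secondary/supplementary records separately, because BWA mem can emit more than one record per read, which might inflate a raw record count.

```python
# SAM FLAG bits (from the SAM specification)
UNMAPPED = 0x4          # segment unmapped
SECONDARY = 0x100       # secondary alignment
DUPLICATE = 0x400       # PCR or optical duplicate (set by MarkDuplicates)
SUPPLEMENTARY = 0x800   # supplementary alignment


def count_flags(sam_lines):
    """Tally alignment records by FLAG bits from an iterable of SAM text lines."""
    counts = {
        "total": 0,
        "mapped": 0,
        "unmapped": 0,
        "duplicates": 0,
        "secondary_or_supplementary": 0,
    }
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second column
        counts["total"] += 1
        if flag & UNMAPPED:
            counts["unmapped"] += 1
        else:
            counts["mapped"] += 1
        if flag & DUPLICATE:
            counts["duplicates"] += 1
        if flag & (SECONDARY | SUPPLEMENTARY):
            counts["secondary_or_supplementary"] += 1
    return counts


# Hypothetical demo records, not taken from my data.
demo = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t1024\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r4\t2048\tchr2\t50\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
print(count_flags(demo))
```

To run it on real data, the records would be passed in line by line, for instance by piping `samtools view` output into a small wrapper that calls `count_flags(sys.stdin)`.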
Best,
Flo