Hi,
I aligned an RNA-Seq data set with the RUM pipeline. I then tried Picard's MarkDuplicates, and it tagged every single read in the BAM files as a duplicate.
I used the following parameters for MarkDuplicates:
java -jar /private/software/packages/picard-tools-1.84/MarkDuplicates.jar I=RUM-sorted.bam O=RUM-sorted-dups_marked.bam METRICS_FILE=dups_metrics AS=true VALIDATION_STRINGENCY=SILENT
(I had sorted the BAM with samtools, but MarkDuplicates did not recognize the sort order, which is why I used AS=true.)
The dups_metrics file, however, lists percent_duplication at 43%.
Any ideas?
Thanks!
Tirza