I know samtools and Picard can remove duplicates, but is it really necessary? A duplicate could be a PCR artifact or the same fragment sequenced twice, and there is no way to tell the two apart.
Also, how do you define a duplicate? Why do both samtools and Picard take BAM files as input? In theory, you could remove duplicates from the raw data already. Is it because they only check the aligned location, not the actual read sequence?
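To make the distinction I am asking about concrete, here is a toy sketch in Python. It is not how samtools or Picard are implemented; the `Read` record, the two grouping functions, and the example reads are all made up for illustration. It only contrasts "same sequence" (possible on raw reads) with "same aligned chromosome, 5' position, and strand" (only possible after alignment).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    """Hypothetical, simplified read record (not a real BAM record)."""
    name: str
    seq: str
    chrom: str
    pos: int      # toy 5' alignment start
    strand: str   # "+" or "-"


def group_by_sequence(reads):
    """Duplicate sets defined by identical read sequence (works on raw reads)."""
    groups = defaultdict(list)
    for r in reads:
        groups[r.seq].append(r.name)
    return [names for names in groups.values() if len(names) > 1]


def group_by_position(reads):
    """Duplicate sets defined by identical (chrom, pos, strand) after alignment."""
    groups = defaultdict(list)
    for r in reads:
        groups[(r.chrom, r.pos, r.strand)].append(r.name)
    return [names for names in groups.values() if len(names) > 1]


reads = [
    Read("r1", "ACGTACGTAC", "chr1", 100, "+"),
    Read("r2", "ACGTACGTAC", "chr1", 100, "+"),  # exact copy of r1
    Read("r3", "ACGTACCTAC", "chr1", 100, "+"),  # same position, one base differs
    Read("r4", "TTGACCAGTA", "chr2", 500, "-"),  # unrelated read
]

seq_dups = group_by_sequence(reads)
pos_dups = group_by_position(reads)
print(seq_dups)  # [['r1', 'r2']]
print(pos_dups)  # [['r1', 'r2', 'r3']]
```

In this toy case, sequence matching misses `r3` because of a single base difference (for example a sequencing error), while position matching groups it with `r1` and `r2`. If my guess is right, that would be one reason to work on aligned reads rather than raw data.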