01-18-2010, 01:56 PM   #2
Nils Homer
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285

Originally Posted by julien

I don't really have a good appreciation for how critical it is to remove potential PCR duplicates from the alignments (either via samtools rmdup or picard). It would appear (based on posts in this forum and pipelines that I have found online) that many ??? do not worry about this issue. Am I wrong? It did not seem to affect my data very much, but I would like to know how important an issue this is.

Thank you !!!!
Most whole-genome human resequencing papers try to detect PCR duplicates, and I find it extremely important to remove them (or flag them), especially when dealing with low-complexity libraries. Low-complexity libraries can result from small amounts of input DNA or from re-using the same library. When searching for variants, it is assumed that each read is conditionally independent given the underlying DNA sequence (i.e. that no two reads come from the same DNA fragment read twice). PCR duplicates violate that assumption, so a single fragment, including any error introduced during amplification, can be counted as several independent observations.
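
To make the idea concrete, here is a minimal Python sketch of how duplicate flagging works in principle. It is not the actual samtools rmdup or Picard implementation: the `Read` record, the `mark_duplicates` function, and the example reads are all hypothetical. It also assumes single-end reads, where sharing the same chromosome, 5' position, and strand is taken as evidence that two reads came from the same fragment.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical, simplified alignment record (not a real BAM record)."""
    name: str
    chrom: str
    five_prime_pos: int    # 5' alignment position of the read
    strand: str            # "+" or "-"
    base_quality_sum: int  # used to choose which copy to keep


def mark_duplicates(reads):
    """Return the names of reads that would be flagged as duplicates.

    Reads are grouped by (chrom, 5' position, strand). Within each group
    the read with the highest summed base quality is kept and every
    other read in the group is flagged.
    """
    groups = defaultdict(list)
    for read in reads:
        groups[(read.chrom, read.five_prime_pos, read.strand)].append(read)

    duplicates = set()
    for group in groups.values():
        best = max(group, key=lambda r: r.base_quality_sum)
        duplicates.update(r.name for r in group if r is not best)
    return duplicates


# Made-up example: r1 and r2 share position and strand, so one is flagged.
reads = [
    Read("r1", "chr1", 1000, "+", 900),
    Read("r2", "chr1", 1000, "+", 950),
    Read("r3", "chr1", 1000, "-", 800),  # same position, opposite strand
    Read("r4", "chr1", 1042, "+", 870),
]
flagged = mark_duplicates(reads)
print(sorted(flagged))  # ['r1']
```

Flagging rather than deleting keeps the reads in the file, so downstream tools can decide whether to skip them. With paired-end data, real tools use both ends of the fragment, which this sketch ignores.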