Old 06-06-2011, 07:32 AM   #2
Heisman
Senior Member
 
Location: St. Louis

Join Date: Dec 2010
Posts: 535

PCR duplicates are reads that derive from the same original DNA molecule, which was copied many times over the course of the PCR cycles. Once you have sequenced one copy, sequencing more of them gives you no new biological information; you are just reading the same parent molecule again.

If two reads map to exactly the same location and have the same sequence, that is evidence they are PCR duplicates. This can also happen by random chance, especially at high coverage. With paired-end reads it is easier to pick out duplicates, because both reads of one pair have to start at the same locations as both reads of the other pair.
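As a rough illustration of the idea above, here is a minimal Python sketch that flags reads sharing the same mapping coordinates. The `Read` record, its field names, and the `find_duplicates` helper are all made up for this example and are not the API of any real tool. It also assumes that comparing positions and strand is enough, so it does not compare the read sequences themselves.

```python
from typing import List, NamedTuple, Optional


class Read(NamedTuple):
    """Hypothetical minimal record for one mapped read (or read pair)."""
    name: str
    chrom: str
    start: int                        # mapped start position of this read
    strand: str                       # "+" or "-"
    mate_start: Optional[int] = None  # start of the mate; None if single-end


def duplicate_key(read: Read) -> tuple:
    # Reads with identical keys are treated as candidate PCR duplicates.
    # For paired-end data the mate position is part of the key, so both
    # ends have to agree before a pair is called a duplicate.
    return (read.chrom, read.start, read.strand, read.mate_start)


def find_duplicates(reads: List[Read]) -> List[str]:
    """Return names of reads whose key was already seen (first one is kept)."""
    seen = set()
    duplicates = []
    for read in reads:
        key = duplicate_key(read)
        if key in seen:
            duplicates.append(read.name)
        else:
            seen.add(key)
    return duplicates


example_reads = [
    Read("r1", "chr1", 100, "+", 300),
    Read("r2", "chr1", 100, "+", 300),  # same start and same mate start as r1
    Read("r3", "chr1", 100, "+", 350),  # same start, different mate start
    Read("r4", "chr1", 100, "+"),       # single-end read at the same start
]

print(find_duplicates(example_reads))  # ['r2']
```

Note how r3 is not flagged even though it starts where r1 does: the mate position separates it, which is why paired-end data makes chance collisions less likely to be mistaken for duplicates.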

You should definitely remove PCR duplicates, as they do not add any information. In fact, they make it look as though you have more independent evidence than you really do, which can misrepresent the actual sample.
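To make the misrepresentation concrete, here is a small worked example in Python. The numbers are invented purely for illustration: assume 10 original fragments cover a site, 2 of them carry a variant base "T", and one of those "T" fragments happens to be over-amplified so that 5 extra copies get sequenced.

```python
def allele_fraction(bases, allele):
    """Fraction of observed bases at a site that match the given allele."""
    return bases.count(allele) / len(bases)


# Hypothetical site: one base call per original (unique) fragment.
unique_calls = ["T"] * 2 + ["C"] * 8

# Same site, but one "T" fragment contributed 5 extra duplicate reads.
calls_with_duplicates = unique_calls + ["T"] * 5

print(allele_fraction(unique_calls, "T"))           # 0.2
print(allele_fraction(calls_with_duplicates, "T"))  # 7/15, roughly 0.47
```

The underlying sample has not changed, but with the duplicates left in, the variant appears to be supported by 7 reads out of 15 instead of 2 out of 10.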