SEQanswers

06-05-2011, 05:46 PM   #1
slny
Member
 
Location: FL

Join Date: Mar 2011
Posts: 53
PCR duplicates questions

Hi,

I'm still confused about PCR duplicates removal and have some questions about it.

1. What are PCR duplicates? Can I say that all reads mapped to the same genome location are PCR duplicates?

2. Is PCR duplicate removal necessary for mRNA-Seq and genomic DNA-Seq?

Thanks a lot!
Slny
06-06-2011, 07:32 AM   #2
Heisman
Senior Member
 
Location: St. Louis

Join Date: Dec 2010
Posts: 535

PCR duplicates are sequences of DNA that arise from the same parent molecule throughout the course of many PCR cycles. Thus, after sequencing one of them, you do not learn any new biological information from sequencing more of them as you are just repetitively obtaining the sequence of the same parent molecule.

If two reads map to the exact same location and have the same sequence, that is evidence they are PCR duplicates. It's also possible for this to occur by random chance, especially at high coverage. If you use paired-end reads, it's easier to pick out duplicates, because both reads of the pair have to start at the same locations.

You should definitely remove PCR duplicates, as they do not yield more information. In fact, they artificially inflate the apparent evidence, possibly misrepresenting the actual sample.
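
To make the single-end versus paired-end point concrete, here is a minimal Python sketch. The read records and the `duplicate_groups` helper are made up for illustration and are not taken from any particular tool; real duplicate markers work on BAM alignments and also handle clipping and orientation.

```python
from collections import defaultdict

# Hypothetical read records: (name, chrom, strand, start, mate_start).
reads = [
    ("r1", "chr1", "+", 100, 350),
    ("r2", "chr1", "+", 100, 350),  # same start and same mate start as r1
    ("r3", "chr1", "+", 100, 420),  # same start, different mate start
    ("r4", "chr1", "+", 205, 470),
]

def duplicate_groups(reads, paired=True):
    """Group read names by mapping position; groups larger than 1 are candidate duplicates."""
    groups = defaultdict(list)
    for name, chrom, strand, start, mate_start in reads:
        # Paired-end data adds the mate's start to the key, so fewer reads collide.
        key = (chrom, strand, start, mate_start) if paired else (chrom, strand, start)
        groups[key].append(name)
    return [names for names in groups.values() if len(names) > 1]

single_end_groups = duplicate_groups(reads, paired=False)
paired_end_groups = duplicate_groups(reads, paired=True)
print(single_end_groups)  # [['r1', 'r2', 'r3']]
print(paired_end_groups)  # [['r1', 'r2']]
```

With the mate position ignored, r3 is lumped in with r1 and r2; with it included, r3 is recognized as a separate fragment.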
06-06-2011, 07:53 AM   #3
slny
Member
 
Location: FL

Join Date: Mar 2011
Posts: 53

For mRNA Seq, if we remove the PCR duplicates, which actually occurred by random chance, then we will get wrong read counts. Is removal of PCR duplicates also recommended in mRNA Seq?
06-06-2011, 11:09 AM   #4
JohnK
Senior Member
 
Location: Los Angeles, China.

Join Date: Feb 2010
Posts: 106

Quote:
Originally Posted by slny
For mRNA Seq, if we remove the PCR duplicates, which actually occurred by random chance, then we will get wrong read counts. Is removal of PCR duplicates also recommended in mRNA Seq?
Essentially, you're removing them because you can't tell whether a read came from a unique bead source or from PCR. I removed PCR dups based on start and stop positions alone, ignoring the sequence, because PCR has an inherent error rate. As before, you can't tell whether base differences (mismatches) came from independent events or from PCR error.
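
A small Python sketch of the difference this makes, using made-up alignments (the tuple layout and the `count_after_dedup` name are assumptions for illustration only): requiring identical sequence lets a copy carrying a single PCR error survive as if it were an independent fragment, while keying on start and stop alone collapses it.

```python
# Hypothetical single-end alignments: (name, chrom, start, end, sequence).
alignments = [
    ("a1", "chr2", 500, 507, "ACGTACGT"),
    ("a2", "chr2", 500, 507, "ACGTACGT"),
    ("a3", "chr2", 500, 507, "ACGTACCT"),  # one mismatch: PCR error or real difference?
    ("a4", "chr2", 900, 907, "TTGACCAA"),
]

def count_after_dedup(alignments, use_sequence):
    """Count the reads left after keeping one read per duplicate key."""
    seen = set()
    for name, chrom, start, end, seq in alignments:
        key = (chrom, start, end, seq) if use_sequence else (chrom, start, end)
        seen.add(key)
    return len(seen)

kept_position_only = count_after_dedup(alignments, use_sequence=False)
kept_with_sequence = count_after_dedup(alignments, use_sequence=True)
print(kept_position_only, kept_with_sequence)  # 2 3
```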
06-06-2011, 11:45 AM   #5
JohnK@Genome_Quest
Junior Member
 
Location: Worcester, MA

Join Date: Jun 2011
Posts: 7

Quote:
Originally Posted by slny
For mRNA Seq, if we remove the PCR duplicates, which actually occurred by random chance, then we will get wrong read counts. Is removal of PCR duplicates also recommended in mRNA Seq?
Also, I removed PCR duplicates for all applications, even RNA-Seq. Clearly, if you're trying to estimate transcript abundance or splicing efficiency, PCR duplicates will have some effect on your results. I'm not saying the effect will be terrible, but subtle, yes. This is of course a matter of opinion, and I've seen people put up 'ok' arguments both ways. I'd say you have to get down into the finer details of your experiment and see how much PCR was done.
06-06-2011, 03:37 PM   #6
kopi-o
Senior Member
 
Location: Stockholm, Sweden

Join Date: Feb 2008
Posts: 319

If you have paired-end data for RNA-seq, PCR duplicates should be removed. The probability of getting identically mapping paired-end reads by chance is very low, and the bias from leaving PCR duplicates in will almost certainly be worse than the loss of a few genuine fragments.
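
A rough back-of-the-envelope check of that probability, as a Python sketch. It assumes independent fragments are sampled uniformly from the possible positions, which real libraries do not strictly follow, and the numbers (100 fragments, 2000 start positions, 200 insert sizes) are invented for illustration.

```python
def expected_chance_duplicate_fraction(n_fragments, n_distinct_positions):
    """Expected fraction of independent fragments that coincide with another by chance,
    assuming uniform sampling over the possible positions (a simplification)."""
    m = n_distinct_positions
    # Expected number of distinct positions hit after n_fragments uniform draws.
    expected_distinct = m * (1.0 - (1.0 - 1.0 / m) ** n_fragments)
    return (n_fragments - expected_distinct) / n_fragments

n_fragments = 100       # genuine fragments from one transcript (made-up number)
start_positions = 2000  # possible start sites on the transcript (made-up number)
insert_sizes = 200      # possible fragment lengths (made-up number)

# Single-end: only the start position distinguishes fragments.
single_end_fraction = expected_chance_duplicate_fraction(n_fragments, start_positions)
# Paired-end: start position and fragment length both have to match.
paired_end_fraction = expected_chance_duplicate_fraction(
    n_fragments, start_positions * insert_sizes
)
print(f"single-end: {single_end_fraction:.4f}  paired-end: {paired_end_fraction:.6f}")
```

Under these assumptions the paired-end loss is far smaller than the single-end loss. For very highly expressed transcripts the number of fragments grows and chance collisions become more common, so the estimate depends on the experiment.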
06-06-2011, 06:34 PM   #7
slny
Member
 
Location: FL

Join Date: Mar 2011
Posts: 53

Does removal of PCR duplicates mean that all of the duplicate reads are removed, or that one read is kept?

If one read is kept, then the de novo assembly result won't be affected whether PCR duplicates are removed or not. If all the reads are removed, then bias is created.
06-06-2011, 08:47 PM   #8
JohnK@Genome_Quest
Junior Member
 
Location: Worcester, MA

Join Date: Jun 2011
Posts: 7

One read is kept from each set of duplicates.
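
As a sketch of what keeping one read can look like in Python: the records and the `keep_one_per_group` helper are hypothetical, and picking the copy with the highest total base quality is an assumption here, one reasonable rule among several.

```python
# Hypothetical records: (name, chrom, start, mate_start, base_qualities).
reads = [
    ("r1", "chr1", 100, 350, [30, 30, 20]),
    ("r2", "chr1", 100, 350, [35, 36, 34]),
    ("r3", "chr1", 100, 350, [10, 12, 11]),
    ("r4", "chr1", 205, 470, [25, 25, 25]),
]

def keep_one_per_group(reads):
    """Keep a single representative per duplicate group (highest summed base quality)."""
    best = {}
    for read in reads:
        name, chrom, start, mate_start, quals = read
        key = (chrom, start, mate_start)
        # Replace the stored representative only if this copy has better total quality.
        if key not in best or sum(quals) > sum(best[key][4]):
            best[key] = read
    return [read[0] for read in best.values()]

kept = keep_one_per_group(reads)
print(kept)  # ['r2', 'r4']
```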
06-07-2011, 04:06 AM   #9
slny
Member
 
Location: FL

Join Date: Mar 2011
Posts: 53

Got it. Thanks a lot for all the help.
All times are GMT -8.