09-17-2010, 08:53 PM   #2
malachig
Senior Member
 
Location: WashU

Join Date: Aug 2010
Posts: 117

Your first question, regarding duplicate/identical reads, is an interesting one that has been discussed at great length in this forum. The issues covered include how to identify duplicate reads, where they come from, how many to expect in particular types of libraries, whether they should be removed, and how to remove them.

One discussion of read duplicates, with links to other posts that discuss them further, can be found here:
Removing duplicates...

What type of libraries are you sequencing (whole genome, exome, ChIP-Seq, RNA-seq, miRNA, mitochondrial genomes, pooled PCR amplicons, etc.)? The type of library and the corresponding analysis goals (measuring expression, identifying mutations, peak finding, etc.) can have a major impact on the way duplicates are handled. If you provide specific details of your libraries, someone with similar data may respond...