05-28-2014, 07:54 AM   #2
Senior Member
Location: Sweden

Join Date: Mar 2008
Posts: 324

Originally Posted by Fernas
Hi all,

My question is about processing RNA-seq data in order to extract the normal gene expression count matrix:

If two reads map uniquely to the same genomic position and have identical alignment regions, should we count them as 2 reads or as just 1?

If you say that we should count them as 2 reads, then my question is: why, in ChIP-seq processing, do we count these 2 reads as only 1 to remove the PCR bias that may cause such duplicates?
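Count them as 2. De-duplication would not let you analyze highly expressed transcripts, which necessarily produce many identical reads: a transcript has only a limited number of possible alignment positions, so once the number of reads exceeds that, identical reads are expected even without any PCR duplication. If you think your RNA-seq sample has a high PCR duplication rate, it is better to downsample it.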
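To illustrate the point, here is a minimal sketch in Python. It is a toy simulation and not a real pipeline. It assumes single-end reads whose alignment is fully described by the start position, and the transcript size and fragment counts are made-up numbers. The helper names `simulate_starts` and `count_reads` are hypothetical.

```python
import random


def simulate_starts(n_fragments, n_possible_starts, seed=0):
    """Draw a random start position for each sequenced fragment (toy model)."""
    rng = random.Random(seed)
    return [rng.randrange(n_possible_starts) for _ in range(n_fragments)]


def count_reads(start_positions, deduplicate=False):
    """Count reads for one gene, optionally collapsing identical alignments."""
    if deduplicate:
        # Reads with the same start position are treated as duplicates.
        return len(set(start_positions))
    return len(start_positions)


# Hypothetical numbers: one transcript with 900 possible read start positions.
n_possible_starts = 900
low = simulate_starts(50, n_possible_starts, seed=1)       # lowly expressed
high = simulate_starts(50_000, n_possible_starts, seed=2)  # highly expressed

for name, starts in [("low", low), ("high", high)]:
    raw = count_reads(starts)
    dedup = count_reads(starts, deduplicate=True)
    print(f"{name}: raw={raw} deduplicated={dedup}")
```

In this toy model the deduplicated count can never exceed the number of possible start positions, so the 1000-fold difference between the two expression levels is compressed after de-duplication, while the raw counts keep it.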
Chipper