**simonandrews** · 09-15-2009, 11:42 PM

I'm not really clear on what you're saying here. Do you find that 99% of your reads are the same sequence, with exactly the same start and end positions? If that's the case I'd suspect that you may have just ended up sequencing a primer rather than your library. Sometimes these primer sequences can map to a reference genome and give a false impression that you're seeing a real genomic sequence.

Alternatively, are you saying that you have many clusters (if so, how many?), but that in each one you see just a single read duplicated many times, with no other overlapping reads? In this case I'd suspect a problem with your library preparation, probably in one of the PCR steps. This is assuming that your library was prepared using random fragmentation (sonication or similar). If your library was generated by restriction digestion then this is what you'd expect to see.

Have you checked the mapping efficiency of your sequence (i.e. what proportion of clusters could be mapped to your reference)? This might give a clue as to what's gone wrong.

**tec** · 09-16-2009, 12:21 AM

duplicate reads in ChIPSeq

Hello simonandrews,

-> Alternatively are you saying that you have many clusters (if so, how many?), but that in each one you see just a single read duplicated many times, with no other overlapping reads?

That's exactly what I see. I work with the human genome and can detect clusters on every chromosome. Using seqmap to map ~5 million single reads, I get only ~15,000 unique read locations; all other reads fall on these same locations (duplicates). The mapping efficiency is ~65%, as expected.
(Mapping with eland gives the same proportion.)
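To put a number on this, here is a rough back-of-the-envelope sketch. It assumes the ~5 million figure is the total read count and that ~65% of those reads mapped; if the ~5 million were already the mapped reads, the result would change only slightly. The function name `duplicate_fraction` is made up for illustration.

```python
def duplicate_fraction(total_reads, mapped_fraction, unique_positions):
    """Fraction of mapped reads that are duplicates of another read.

    Assumption: every mapped read beyond the first at a given
    position counts as a duplicate.
    """
    mapped_reads = total_reads * mapped_fraction
    return 1.0 - unique_positions / mapped_reads


# Approximate figures quoted in the post above.
frac = duplicate_fraction(total_reads=5_000_000,
                          mapped_fraction=0.65,
                          unique_positions=15_000)
print(f"Duplicate fraction: {frac:.1%}")
```

Under these assumptions well over 99% of the mapped reads are duplicates, or roughly 200 reads per unique position on average.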

The library was prepared using random fragmentation (sonication) and the initial fragment length is ~ 200 - 400 bp.

I have no idea what's gone wrong. What could have happened during the library preparation?

Thanks! tec

**simonandrews** · 09-16-2009, 12:33 AM

My immediate thought would be that you could have had a step in your library prep where you lost virtually all of your input material, and that a subsequent PCR step dramatically amplified what was left and produced a large number of duplicated reads.

**tec** · 09-16-2009, 12:49 AM

OK, but how could this happen? (A near-total loss of material?)

The library was prepared using the standard Illumina protocol and kit.
We sequenced another ChIPSeq experiment and there was no such problem.

Thanks! tec

**tec** · 10-06-2009, 07:46 AM

duplicate reads in ChIPSeq !?

Hello all,

The problem with duplicate reads is still keeping me busy.
We therefore checked the library by TOPO cloning and resequencing.
Surprisingly, over 75% of the clones were unique, which doesn't agree with the sequencing run!

Does anyone have an idea???

Thanks! tec

**dvh** · 10-07-2009, 04:35 AM

That's just a sampling issue.

Say there are only 1000 unique molecules in the library:

If you TOPO/Sanger sequence 100 clones, only a few will look like duplicates.

But if you next-gen sequence 10,000 reads, most will look like duplicates.

Make another library with more DNA input, less PCR...
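The sampling argument can be checked with a short sketch. It assumes all 1000 molecules are equally abundant after PCR and that each read is an independent draw with replacement; a real library will be less even, so treat the numbers as illustrative only. The function names are invented for this example.

```python
import random


def expected_unique(library_size, reads):
    """Expected number of distinct molecules seen when drawing `reads`
    times, with replacement, from `library_size` equally likely molecules."""
    return library_size * (1.0 - (1.0 - 1.0 / library_size) ** reads)


def simulate_unique(library_size, reads, seed=0):
    """Monte Carlo check of the same quantity using one random sample."""
    rng = random.Random(seed)
    return len({rng.randrange(library_size) for _ in range(reads)})


LIBRARY_SIZE = 1000  # unique molecules, as in the example above

for reads in (100, 10_000):  # Sanger-scale vs. next-gen-scale sampling
    exp_u = expected_unique(LIBRARY_SIZE, reads)
    sim_u = simulate_unique(LIBRARY_SIZE, reads)
    dup = 1.0 - exp_u / reads
    print(f"{reads:>6} reads: ~{exp_u:.0f} unique expected "
          f"({sim_u} simulated), ~{dup:.0%} duplicates")
```

With 100 draws only about 5% of the clones are expected to be duplicates, whereas with 10,000 draws roughly 90% of the reads are, even though the library is the same in both cases.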

**tec** · 10-08-2009, 04:23 AM

Originally posted by dvh:

That's just a sampling issue.

Say there are only 1000 unique molecules in the library:

If you TOPO/Sanger sequence 100 clones, only a few will look like duplicates.

But if you next-gen sequence 10,000 reads, most will look like duplicates.

Make another library with more DNA input, less PCR...

I agree!
But given that another library showed exactly the same distribution in the TOPO/Sanger sequencing, and its Illumina sequencing gave good results, I am confused.
Is it possible that something went wrong during the preparation of the flow cell (e.g. cluster generation) that could have led to this result?
