SEQanswers

Old 09-15-2009, 12:59 PM   #1
tec
Member
 
Location: germany

Join Date: Apr 2008
Posts: 14
duplicate reads in ChIPSeq

Hello community,

We have a problem concerning an Illumina-sequenced ChIPSeq experiment.
After mapping the reads and viewing them in the UCSC Genome Browser, we were surprised to see that 99% of the reads map to a small number of unique locations. The corresponding reads share the same start and end coordinates, and there is no additional cluster of reads around each location within the range of the original fragment length.

Does anyone have an idea? I would very much appreciate your assistance.

tec
Old 09-15-2009, 11:42 PM   #2
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871

I'm not really clear on what you're saying here. Do you find that 99% of your reads are the same sequence, with exactly the same start and end positions? If that's the case I'd suspect that you may have just ended up sequencing a primer rather than your library. Sometimes these primer sequences can map to a reference genome and give a false impression that you're seeing a real genomic sequence.

Alternatively, are you saying that you have many clusters (if so, how many?), but that in each one you see just a single read duplicated many times, with no other overlapping reads? In this case I'd suspect a problem with your library preparation, probably in one of the PCR steps. This is assuming that your library was prepared using random fragmentation (sonication or similar). If your library was generated by restriction digestion then this is what you'd expect to see.

Have you checked the mapping efficiency of your sequence (i.e. what proportion of clusters could be mapped to your reference)? This might give a clue as to what's gone wrong.
Old 09-16-2009, 12:21 AM   #3
tec
Member
 
Location: germany

Join Date: Apr 2008
Posts: 14
duplicate reads in ChIPSeq

Hello simonandrews,

-> Alternatively are you saying that you have many clusters (if so, how many?), but that in each one you see just a single read duplicated many times, with no other overlapping reads?

That's exactly what I see. I work with the human genome and can detect clusters on every chromosome. Mapping ~5 million single reads with seqmap gives ~15,000 unique locations; all other reads fall on these same locations (duplicates). The mapping efficiency is ~65%, as expected.
(Mapping with ELAND gives the same proportion.)
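
The duplication level described above can be quantified by counting how many mapped reads share a position. The following is only a minimal Python sketch: the function name `duplication_summary` and the `(chromosome, start, strand)` tuple format are assumptions made for illustration, not the output format of seqmap or ELAND.

```python
from collections import Counter


def duplication_summary(reads):
    """Summarise duplication among mapped reads.

    `reads` is an iterable of (chromosome, start, strand) tuples.
    This input format is hypothetical; adapt the parsing to the
    actual mapper output.
    """
    counts = Counter(reads)            # reads per distinct position
    total = sum(counts.values())       # all mapped reads
    unique_positions = len(counts)     # distinct mapping positions
    # Fraction of reads that are extra copies of an already-seen position.
    dup_fraction = 1 - unique_positions / total if total else 0.0
    return total, unique_positions, dup_fraction


# Toy input: 8 reads stacked on only 3 distinct positions.
toy = [("chr1", 100, "+")] * 4 + [("chr2", 500, "-")] * 3 + [("chr3", 42, "+")]
total, unique_positions, dup_fraction = duplication_summary(toy)
print(total, unique_positions, dup_fraction)
```

With the figures in this post, and assuming the ~65% mapping efficiency applies to the ~5 million reads (about 3.25 million mapped reads), ~15,000 unique locations would correspond to a duplicate fraction of roughly 99.5%.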

The library was prepared using random fragmentation (sonication) and the initial fragment length is ~ 200 - 400 bp.

I have no idea what's gone wrong. What could have happened during the library preparation?

Thanks! tec
Old 09-16-2009, 12:33 AM   #4
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871

My immediate thought would be that you could have had a step in your library prep where you lost virtually all of your input material, and that a subsequent PCR step dramatically amplified what was left and produced a large number of duplicated reads.
Old 09-16-2009, 12:49 AM   #5
tec
Member
 
Location: germany

Join Date: Apr 2008
Posts: 14

OK, but how could this happen? (A loss of virtually all the material?)

The library was prepared using the standard Illumina protocol and kit.
We sequenced another ChIPSeq experiment and there was no such problem.

Thanks! tec
Old 10-06-2009, 07:46 AM   #6
tec
Member
 
Location: germany

Join Date: Apr 2008
Posts: 14
duplicate reads in ChIPSeq !?

Hello all,

The problem with duplicate reads is still keeping me busy.
We therefore performed a TOPO cloning and resequencing check of the library.
Surprisingly, over 75% of the clones were unique, which doesn't agree with the sequencing run!

Does anyone have an idea?

Thanks! tec
Old 10-07-2009, 04:35 AM   #7
dvh
Member
 
Location: london, uk

Join Date: Jul 2008
Posts: 35

That's just a sampling issue.

Say there are only 1,000 unique molecules in the library:

If you TOPO/Sanger sequence 100 of them, only a few will look like duplicates.

But if you next-gen sequence 10,000, most will look like duplicates.

Make another library with more DNA input, less PCR...
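
The sampling argument can be checked with a short calculation. This is only a sketch under simplifying assumptions that are not stated in the thread: every unique molecule is equally abundant after PCR, and reads are drawn at random with replacement. The function name `expected_distinct` is made up for illustration. Under those assumptions, drawing n reads from N unique molecules is expected to hit N * (1 - (1 - 1/N)^n) distinct molecules.

```python
def expected_distinct(library_size, n_reads):
    """Expected number of distinct molecules observed when n_reads are
    drawn uniformly at random, with replacement, from a library of
    library_size equally abundant unique molecules (an idealisation)."""
    return library_size * (1 - (1 - 1 / library_size) ** n_reads)


# Compare a small Sanger check with a much deeper sequencing run
# on the same hypothetical library of 1,000 unique molecules.
for n_reads in (100, 10_000):
    distinct = expected_distinct(1000, n_reads)
    print(n_reads, round(distinct), round(distinct / n_reads, 2))
```

For 100 clones this gives about 95 distinct molecules, so nearly every clone looks unique. For 10,000 reads it gives about 1,000 distinct molecules, so only around 10% of the reads are unique and the rest look like duplicates.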
Old 10-08-2009, 04:23 AM   #8
tec
Member
 
Location: germany

Join Date: Apr 2008
Posts: 14

Quote:
Originally Posted by dvh
That's just a sampling issue.

Say there are only 1,000 unique molecules in the library:

If you TOPO/Sanger sequence 100 of them, only a few will look like duplicates.

But if you next-gen sequence 10,000, most will look like duplicates.

Make another library with more DNA input, less PCR...
I agree!
But taking into account that another library showed exactly the same distribution in the TOPO/Sanger sequencing, and that its Illumina sequencing gave nice results, I am confused.
Is it possible that something went wrong during preparation of the flow cell, e.g. cluster generation, that could have led to this result?
Tags
applications, chipseq, illumina sequencing, library generation, sample prep

All times are GMT -8.