SEQanswers

Old 01-11-2011, 06:08 AM   #1
pogaora
Member
 
Location: Dublin

Join Date: Oct 2008
Posts: 11
Default ChIP-seq peaks map to exons

Hi,

We have some ChIP-seq data which targeted a transcription factor in human cells. In a lot of cases we see peaks lining up with exons. Has anybody else found this, and does anyone know why it might happen?

Cheers.
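
One way to put a number on "a lot of cases" is to compute the fraction of called peaks that overlap an annotated exon. The following is a minimal sketch, assuming peaks and exons are available as (chrom, start, end) tuples with half-open coordinates, as in BED. The function names are hypothetical and nothing here comes from the original data.

```python
from bisect import bisect_left


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals on one chromosome."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def fraction_overlapping(peaks, exons):
    """Return the fraction of peaks that overlap at least one exon.

    peaks, exons: lists of (chrom, start, end), half-open coordinates.
    """
    by_chrom = {}
    for chrom, start, end in exons:
        by_chrom.setdefault(chrom, []).append((start, end))

    # Per chromosome: sorted, disjoint exon blocks split into start/end lists.
    index = {}
    for chrom, intervals in by_chrom.items():
        merged = merge_intervals(intervals)
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])

    hits = 0
    for chrom, start, end in peaks:
        if chrom not in index:
            continue
        starts, ends = index[chrom]
        # Last exon block that begins before the peak ends.
        i = bisect_left(starts, end) - 1
        if i >= 0 and ends[i] > start:
            hits += 1
    return hits / len(peaks) if peaks else 0.0
```

Comparing this fraction with the same figure for a set of randomly placed regions of matching size would indicate whether the exon overlap is more than expected by chance.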

Last edited by pogaora; 01-11-2011 at 06:39 AM.
Old 01-11-2011, 07:56 AM   #2
townway
Member
 
Location: Rockville

Join Date: May 2009
Posts: 40
Default

We saw the same thing when we used tagged proteins for the ChIP.

Quote:
Originally Posted by pogaora View Post
Hi,

We have some ChIP-seq data which targeted a transcription factor in human cells. In a lot of cases we see peaks lining up with exons. Has anybody else found this, and does anyone know why it might happen?

Cheers.
Old 01-12-2011, 12:24 AM   #3
mudshark
Senior Member
 
Location: Munich

Join Date: Jan 2009
Posts: 138
Default

did you perform a control sequencing reaction - input or mock? transcribed regions are in general over-represented in chromatin preps and tend to attract peak calls if controls are not included in the analysis.
Old 01-12-2011, 03:14 AM   #4
pogaora
Member
 
Location: Dublin

Join Date: Oct 2008
Posts: 11
Default

Quote:
Originally Posted by mudshark View Post
did you perform a control sequencing reaction - input or mock? transcribed regions are in general over-represented in chromatin preps and tend to attract peak calls if controls are not included in the analysis.
Hi,

Yes, we did an input control which left out the IP step.
Old 01-12-2011, 05:00 AM   #5
mudshark
Senior Member
 
Location: Munich

Join Date: Jan 2009
Posts: 138
Default

a) did you include the input as a background control in your peak calling procedure?

b) are you sure that your antibody is specific?

c) did you validate some of the exonic peaks using qPCR?

d) did you test if the peaks in exons are enriched for the consensus binding motif of your TF?
Old 01-12-2011, 05:23 AM   #6
pogaora
Member
 
Location: Dublin

Join Date: Oct 2008
Posts: 11
Default

Quote:
Originally Posted by mudshark View Post
a) did you include the input as a background control in your peak calling procedure?

b) are you sure that your antibody is specific?

c) did you validate some of the exonic peaks using qPCR?

d) did you test if the peaks in exons are enriched for the consensus binding motif of your TF?
Hi mudshark,

The input was included when we called the peaks, although having said that, this is a feature that we observe when visualising the data on UCSC browser (lots of reads piling up in the ChIP sample with nothing in the input) rather than being peaks called with a low FDR by the software. Did I mislead readers with the term peak?!

The antibody is ChIP certified by the vendors and specific so far as we can see.

We haven't done any validations yet - we're still trying to get a handle on what are likely to be 'real' enriched regions.

We don't have a reliable motif for this TF. Known targets are very scarce.
Old 01-12-2011, 06:52 AM   #7
mudshark
Senior Member
 
Location: Munich

Join Date: Jan 2009
Posts: 138
Default

just to get you right: you "see" the peak but the peak finding software does not find them?

if yes: don't trust your eyes

what is the scaling of input and IP signals when you look in the browser, what are your total read numbers in IP and input?
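
The scaling question matters because browser tracks of raw counts are not comparable when the two libraries differ in depth. A minimal sketch of scaling to reads per million (RPM) before comparing IP and input, assuming you have a raw read count for a region and the total mapped reads for each library. The function names and the pseudocount default are illustrative assumptions, not part of any peak caller.

```python
def reads_per_million(raw_count, total_mapped_reads):
    """Scale a raw read count to reads per million mapped reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return raw_count * 1_000_000 / total_mapped_reads


def depth_normalised_enrichment(ip_count, ip_total, input_count, input_total,
                                pseudocount=1.0):
    """Ratio of IP to input signal in a region after depth normalisation.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when the input has no reads in the region.
    """
    ip_rpm = reads_per_million(ip_count + pseudocount, ip_total)
    input_rpm = reads_per_million(input_count + pseudocount, input_total)
    return ip_rpm / input_rpm
```

A pile-up that looks striking on auto-scaled tracks can shrink considerably once both tracks are expressed in RPM and plotted on the same axis.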
Old 01-12-2011, 10:37 AM   #8
JohnK
Senior Member
 
Location: Los Angeles, China.

Join Date: Feb 2010
Posts: 106
Default

Are you sure your gene-model is for the right build you're using?
Are you sure the gene exists?
CCDS may be a good way of making sure you don't include erroneous gene predictions.
You may have a signal but have under-sequenced; what's your MSER?
You may not have a signal.
You could use SAMtools to build consensus sequences for your regions, crop them, and look for motifs with MEME.
I'd like to hear more about this as I am interested now.
Old 01-13-2011, 06:29 AM   #9
pogaora
Member
 
Location: Dublin

Join Date: Oct 2008
Posts: 11
Default

Quote:
Originally Posted by JohnK View Post
Are you sure your gene-model is for the right build you're using?
Are you sure the gene exists?
CCDS may be a good way of making sure you don't include erroneous gene predictions.
You may have a signal but have under-sequenced; what's your MSER?
You may not have a signal.
You could use SAMtools to build consensus sequences for your regions, crop them, and look for motifs with MEME.
I'd like to hear more about this as I am interested now.
Glad to have piqued your interest!
We're happy with the gene model and the existence/expression of the gene in these cells.
However, your point about under-sequencing is well taken. Our yield of alignable reads was low (total of about 5 million alignable reads from 2 replicates). What is MSER?

The attached pdf of UCSC tracks shows what we're talking about. I missed the scale on the left of the top track - it goes up to 6 (six) compared with the scale on the bottom track which goes to 73. The Input control was used in peak calling.
The peak calling software did in fact call the first peak (on the left of the fig) as a peak but none of the others.
We plan to use MEME to look at generating a motif but need to decide what regions to feed it first.

Cheers
Attached Files
File Type: pdf exon_peaks.pdf (32.8 KB, 95 views)
Old 01-13-2011, 07:10 AM   #10
Chipper
Senior Member
 
Location: Sweden

Join Date: Mar 2008
Posts: 324
Default

Looks like you have a contamination from an exome capture study?
Old 01-13-2011, 07:21 AM   #11
mudshark
Senior Member
 
Location: Munich

Join Date: Jan 2009
Posts: 138
Default

looks a little bit too systematic.

is the gene you show in the browser a gene that someone working close to you just maxi-prepped? could it be a very trivial wetlab contamination? for example, i work in Drosophila, and in whatever ChIP I generate, and whatever other labs IP, the white gene is always bound (exons only) as it is the favorite marker gene.

if not i would anyway immediately do a qPCR validation and check a different antibody in case one is available.
Old 01-13-2011, 07:51 AM   #12
pogaora
Member
 
Location: Dublin

Join Date: Oct 2008
Posts: 11
Default

Quote:
Originally Posted by Chipper View Post
Looks like you have a contamination from an exome capture study?
We haven't done any exome capture experiments, but there were RNAseq samples on the same flowcell. The picture shown is not typical of most genes though - even some which we know are highly expressed in these cells.
Old 01-13-2011, 07:56 AM   #13
JohnK
Senior Member
 
Location: Los Angeles, China.

Join Date: Feb 2010
Posts: 106
Default

Quote:
Originally Posted by pogaora View Post
Glad to have piqued your interest!
We're happy with the gene model and the existence/expression of the gene in these cells.
However, your point about under-sequencing is well taken. Our yield of alignable reads was low (total of about 5 million alignable reads from 2 replicates). What is MSER?

The attached pdf of UCSC tracks shows what we're talking about. I missed the scale on the left of the top track - it goes up to 6 (six) compared with the scale on the bottom track which goes to 73. The Input control was used in peak calling.
The peak calling software did in fact call the first peak (on the left of the fig) as a peak but none of the others.
We plan to use MEME to look at generating a motif but need to decide what regions to feed it first.

Cheers
MSER is the minimum saturated enrichment ratio: the lowest fold-enrichment over background at which the set of detected binding sites is saturated at your current sequencing depth. (Kharchenko et al., 2008)
Old 01-13-2011, 07:59 AM   #14
pogaora
Member
 
Location: Dublin

Join Date: Oct 2008
Posts: 11
Default

Quote:
Originally Posted by townway View Post
We saw the same thing when we used tagged proteins for the ChIP.
Sorry - forgot to say that this was antibody directed against the native protein - nothing tagged.
Old 01-13-2011, 08:00 AM   #15
pogaora
Member
 
Location: Dublin

Join Date: Oct 2008
Posts: 11
Default

Quote:
Originally Posted by mudshark View Post
looks a little bit too systematic.

is the gene you show in the browser a gene that someone working close to you just maxi-prepped? could it be a very trivial wetlab contamination? for example, i work in Drosophila, and in whatever ChIP I generate, and whatever other labs IP, the white gene is always bound (exons only) as it is the favorite marker gene.

if not i would anyway immediately do a qPCR validation and check a different antibody in case one is available.
The gene shown is one that we have an interest in but isn't one that is worked on in the lab specifically. Unlikely that anyone would have a prep of this floating around but I can check - I'm not the wet lab person.

If there is no obvious explanation, I guess the picture is intriguing enough to warrant some lab investigations to figure out if it's an artefact or not. I thought there might be some previous experience of such an occurrence.
Old 01-13-2011, 08:17 AM   #16
JohnK
Senior Member
 
Location: Los Angeles, China.

Join Date: Feb 2010
Posts: 106
Default

Quote:
Originally Posted by pogaora View Post
The gene shown is one that we have an interest in but isn't one that is worked on in the lab specifically. Unlikely that anyone would have a prep of this floating around but I can check - I'm not the wet lab person.

If there is no obvious explanation, I guess the picture is intriguing enough to warrant some lab investigations to figure out if it's an artefact or not. I thought there might be some previous experience of such an occurrence.
Did you de-dup?
Old 01-13-2011, 08:20 AM   #17
pogaora
Member
 
Location: Dublin

Join Date: Oct 2008
Posts: 11
Default

Quote:
Originally Posted by JohnK View Post
Did you de-dup?
Yes indeed.
Old 01-13-2011, 08:43 AM   #18
JohnK
Senior Member
 
Location: Los Angeles, China.

Join Date: Feb 2010
Posts: 106
Default

Quote:
Originally Posted by pogaora View Post
Yes indeed.
Possibility:

Bias towards GC-rich content in fragment selection both in library prep and amplification before and during sequencing.

(Park, October 2009)
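
A quick way to test this possibility is to compare the GC fraction of the sequence under the ChIP pile-ups with that of flanking or input-covered regions. A minimal sketch, assuming the region sequences have already been extracted as plain strings. The function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of called bases (A, C, G, T) that are G or C; N is ignored."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return float("nan")
    return (seq.count("G") + seq.count("C")) / called


def mean_gc(sequences):
    """Unweighted mean GC fraction over a collection of region sequences."""
    values = [gc_fraction(s) for s in sequences]
    return sum(values) / len(values) if values else float("nan")
```

If the pile-up regions are no more GC-rich than their surroundings, GC bias alone is an unlikely explanation.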
Old 01-13-2011, 11:24 PM   #19
mudshark
Senior Member
 
Location: Munich

Join Date: Jan 2009
Posts: 138
Default

as there are no such peaks in the input, I assume that it is not a systematic bias such as an over-representation of GC-rich fragments
Old 01-14-2011, 12:06 AM   #20
Chipper
Senior Member
 
Location: Sweden

Join Date: Mar 2008
Posts: 324
Default

Quote:
Originally Posted by pogaora View Post
We haven't done any exome capture experiments, but there were RNAseq samples on the same flowcell. The picture shown is not typical of most genes though - even some which we know are highly expressed in these cells.
If RNA-seq was strand-specific it may explain why you did not get any peak calls (if your peak caller requires peaks on both strands). You could perhaps calculate RPKMs and compare to the other samples, or compare SNP calls in exons to figure out where the reads come from...

Bias may be different in ChIP and input since you start with less material in the ChIP library, but GC-bias will not result in such enrichments over exons.
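
Both checks are simple to compute once you have per-exon read counts and read strands. A minimal sketch, assuming the counts and strands have already been extracted from the alignments. The function names are hypothetical, and RPKM here is the usual reads per kilobase of feature per million mapped reads.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total reads must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


def plus_strand_fraction(strands):
    """Fraction of reads on the '+' strand, given an iterable of '+'/'-'.

    ChIP fragments should give roughly 0.5 across a region; a value near
    0 or 1 over a gene would be consistent with strand-specific RNA-seq reads.
    """
    strands = [s for s in strands if s in ("+", "-")]
    if not strands:
        return float("nan")
    return sum(1 for s in strands if s == "+") / len(strands)
```

If per-exon RPKMs in the ChIP sample track those of the RNA-seq samples from the same flowcell, or the strand fraction over the gene is far from 0.5, cross-contamination becomes the more likely explanation.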
All times are GMT -8.