  • autocorrelation pattern in ChIP-seq alignments

    Hello,

    We have ChIP-seq data that was from a single-end run with 35 bp reads. There are a few samples, with a different antibody used in each one. We aligned the reads and created autocorrelation plots (sometimes called cross-correlation) using HOMER and SPP. The DNA fragment length is around 150 bp, so we expect to see a single large peak at 150 bp.

    Some of the samples look as we expect, but some have a large peak at 35 bp, and a small peak at 150 bp. Does this mean that something is wrong with these samples?
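For readers unfamiliar with these plots, here is a minimal sketch of how a strand cross-correlation profile can be computed. It is not how HOMER or SPP implement it; the function name `strand_cross_correlation`, its parameters, and the toy data are all assumptions made for illustration. The idea is to count read 5' ends per position on each strand, then correlate the plus-strand counts with the minus-strand counts at increasing shifts. For fragments of length L, the two 5' ends sit L - 1 bp apart, so the profile should peak near the fragment length.

```python
import numpy as np

def strand_cross_correlation(plus_5p, minus_5p, chrom_len, max_shift=300):
    """Correlate plus-strand 5'-end counts with minus-strand 5'-end counts
    for every shift from 0 to max_shift (hypothetical helper, one chromosome)."""
    plus = np.zeros(chrom_len)
    minus = np.zeros(chrom_len)
    # np.add.at accumulates correctly when several reads share a position
    np.add.at(plus, np.asarray(plus_5p), 1)
    np.add.at(minus, np.asarray(minus_5p), 1)

    shifts = np.arange(max_shift + 1)
    cc = np.empty(shifts.size)
    for i, d in enumerate(shifts):
        # compare plus[x] with minus[x + d]
        cc[i] = np.corrcoef(plus[:chrom_len - d], minus[d:])[0, 1]
    return shifts, cc

# Toy data (an idealized assumption): every fragment is exactly 150 bp
# and contributes one read from each end.
rng = np.random.default_rng(0)
chrom_len = 10_000
frag_len = 150
frag_starts = rng.integers(500, 9_500, size=50)
plus_5p = frag_starts                   # 5' end of the plus-strand read
minus_5p = frag_starts + frag_len - 1   # 5' end of the minus-strand read

shifts, cc = strand_cross_correlation(plus_5p, minus_5p, chrom_len)
best_shift = int(shifts[np.argmax(cc)])
print(best_shift)
```

On real data the peak is much broader and lower than in this toy case, and the toy data contains nothing that would produce a second peak at the read length.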

    Thanks!

  • #2
    In fact, it is a cross-correlation, not an autocorrelation.

    As regards your question: I have seen this before, and I don't think it is a problem in itself. It probably depends on the 'true' fragment size of your target-bound DNA, the signal-to-noise ratio, and the abundance of target sites. That is, if your signal-to-noise ratio is low and there are only a few target sites, you will get the average fragment size determined by the size-selection step. If you have a good signal-to-noise ratio and the target protein protects 35 bp of DNA, you might get a cross-correlation peak at 35 bp.



    • #3
      It's the other way around: a good signal-to-noise ratio gives the average fragment size; otherwise the correlation is dominated by a peak at exactly the read length. I'm not sure why, but it has nothing to do with protein-DNA protection.



      • #4
        This very insightful and helpful post by Anshul Kundaje on the MACS mailing list has a really good theory involving the mappability of the genome for why you see this pattern in non-enriched ChIP-seq data sets:



        • #5
          Thank you all for your responses!

          I've looked at the data again, and the best cross-correlation profiles are from the best antibodies, so your explanations make sense.

          I only have one lingering question: is the data from the not-as-good cross-correlation profiles still usable? That is, do we need to repeat those entire experiments, or will MACS be able to identify the real peaks?

          Many thanks!




          • #6
            In my experience, I have unfortunately not found realistic-looking or usable peaks in these types of data sets. I usually try to examine some of the peaks in a browser: you can tell pretty quickly whether they look like real ChIP-seq peaks, which are highly enriched compared to the background, or just like slightly elevated regions in a noisy background. Another way to check is to run your peaks through an annotation tool like CEAS and look for enrichment in promoter regions.

            My experience is with ChIP-seq for transcription factor binding sites, so that advice might not apply for other types of experiments like histone modifications, though.



            • #7
              Originally posted by cwhelan
              This very insightful and helpful post by Anshul Kundaje on the MACS mailing list has a really good theory involving the mappability of the genome for why you see this pattern in non-enriched ChIP-seq data sets:

              http://groups.google.com/group/macs-...595465a1f9b212
              Here is some more information from the same author: Phantom Peaks

              I've also noticed the same thing - that there are usually two peaks: one at the read length and one at the average fragment length. I have found that the strength of the fragment length peak compared to the read length peak is usually a good indicator of the signal-to-noise quality and one's ability to detect peaks in the data.
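As a rough sketch of that comparison, the snippet below takes a cross-correlation profile and reports the height of the fragment-length peak relative to the read-length peak, both raw and after subtracting the profile minimum as a crude background. The function name `fragment_to_read_peak_ratio`, the `window` parameter, and the two synthetic profiles are assumptions for illustration only, not output from any real tool or data set.

```python
import numpy as np

def fragment_to_read_peak_ratio(shifts, cc, read_len, window=5):
    """Compare the fragment-length peak to the read-length peak
    (hypothetical helper).

    The read-length peak is the maximum within +/- window of read_len.
    The fragment-length peak is the maximum at shifts beyond that window,
    so a very broad read-length peak could leak into it.
    """
    shifts = np.asarray(shifts)
    cc = np.asarray(cc, dtype=float)
    near_read = np.abs(shifts - read_len) <= window
    beyond_read = shifts > read_len + window
    if not near_read.any() or not beyond_read.any():
        raise ValueError("profile does not cover the read-length window and beyond")

    read_peak = cc[near_read].max()
    frag_idx = np.flatnonzero(beyond_read)[np.argmax(cc[beyond_read])]
    frag_peak = cc[frag_idx]
    background = cc.min()  # crude background estimate
    if read_peak == background:
        raise ValueError("read-length peak is indistinguishable from background")

    raw = frag_peak / read_peak
    adjusted = (frag_peak - background) / (read_peak - background)
    return int(shifts[frag_idx]), raw, adjusted

# Synthetic profiles (made up): flat background plus two Gaussian bumps,
# one at the 35 bp read length and one at the 150 bp fragment length.
s = np.arange(301)

def toy_profile(read_height, frag_height):
    return (0.01
            + read_height * np.exp(-((s - 35) / 3.0) ** 2)
            + frag_height * np.exp(-((s - 150) / 30.0) ** 2))

good_shift, good_raw, good_adj = fragment_to_read_peak_ratio(s, toy_profile(0.02, 0.06), read_len=35)
poor_shift, poor_raw, poor_adj = fragment_to_read_peak_ratio(s, toy_profile(0.06, 0.01), read_len=35)
print(good_shift, round(good_adj, 2), poor_shift, round(poor_adj, 2))
```

A background-adjusted ratio above 1 means the fragment-length peak dominates; a ratio well below 1 corresponds to the profiles described in this thread, where the read-length peak dominates. Any cutoff for "usable" data would have to be chosen empirically.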

              I've always been under the impression that those peaks at the read length might be caused by PCR duplication, but the above link also has a good idea about biases in mappability.

              Justin
