  • Finding significantly different ChIP-seq peak length?

    Hello,

    I have a simple data analysis question.

    Say I have ChIP-seq data for a protein, comparing three wild-type replicates with three mutant replicates. I have mapped the reads and normalized the data using spike-ins, so for the sake of argument let's assume the data are properly normalized.

    A typical pipeline would then call peaks, count the reads in those peaks, and compare WT vs. mutant with a statistical test (e.g. using MAnorm, DESeq2, or other software built to find differentially "expressed" ChIP-seq peaks).

    The ideal scenario looks like the example below: the WT and MUT peaks overlap nicely in a hotspot, so we can use the method above to find significantly up/down ChIP-seq peaks based on the signal.



    However, what if it's not the signal that we want to compare, but the length? Let's say that upon mutation this protein spreads out around the hotspots instead of staying within them. The overall signal therefore does not change, but the shape or length of the peak does. Worse, the change in length is not consistent: the peak can extend to the left, to the right, or both, like this:



    The wild-type peaks are very consistent in shape and length; only the mutant peaks change.

    The goal is really to test whether there is a significant increase/decrease in length compared to wild type. Whether the peak extends to the right, to the left, or both doesn't matter. What statistical test would be best?

    What I did was first call peaks, then merge peaks that are close together across the 6 samples (I call these merged groups hotspots). Then, for each sample, I summed the lengths of its peaks within each hotspot. When I looked at the distribution, the lengths for each sample appear to follow a Poisson/negative binomial distribution, so I used DESeq2 to find significantly different lengths. Would this be an acceptable method?
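
A minimal sketch of the hotspot length table described in this post, assuming peaks on a single chromosome given as half-open `(start, end)` tuples. The function names, the sample names, and the `max_gap` parameter are hypothetical and only illustrate the idea; the peak caller and the real merge distance are not stated in the post.

```python
def merge_intervals(intervals, max_gap=0):
    """Merge (start, end) intervals separated by at most max_gap bases."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous group: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(m) for m in merged]


def length_matrix(peaks_by_sample, max_gap=0):
    """Return {hotspot: {sample: summed peak length inside that hotspot}}."""
    # Hotspots are built from the peaks of all samples pooled together,
    # so every individual peak lies entirely inside exactly one hotspot.
    all_peaks = [p for peaks in peaks_by_sample.values() for p in peaks]
    hotspots = merge_intervals(all_peaks, max_gap)
    table = {}
    for h_start, h_end in hotspots:
        table[(h_start, h_end)] = {
            sample: sum(e - s for s, e in peaks if s >= h_start and e <= h_end)
            for sample, peaks in peaks_by_sample.items()
        }
    return table


# Made-up coordinates, for illustration only.
peaks = {
    "WT1": [(100, 200), (1000, 1100)],
    "MUT1": [(80, 150), (180, 260), (990, 1120)],
}
table = length_matrix(peaks)
```

Each row of the resulting table (one hotspot, one integer length per sample) is the kind of count-like matrix that was then passed to DESeq2 in place of read counts.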

  • #2
    Hi, it's a good question! When you use methods like DESeq2, you compress all the information in a peak into a single number: the read count (or the length, in your case). Have a look at this paper and the associated R package: "MMDiff: quantitative testing for shape changes in ChIP-Seq data sets".

    Dario



    • #3
      For something like this you might want to test for differences in the distribution, such as with a KS test.
      edit: MMDiff looks MUCH more interesting!
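
As a rough illustration of the KS idea, the two-sample KS statistic is the largest vertical gap between the empirical cumulative distributions of the two groups. The sketch below computes only that statistic in plain Python; the function name and the peak lengths are made up for illustration. In practice `scipy.stats.ks_2samp` returns both the statistic and a p-value.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: max |ECDF_a(x) - ECDF_b(x)| over all x."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    # The ECDFs only change at observed values, so checking those is enough.
    for x in a + b:
        fa = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        fb = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(fa - fb))
    return d


# Hypothetical peak lengths (bp), not taken from the thread.
wt_lengths = [198, 200, 205, 210]
mut_lengths = [300, 305, 310, 320]
d = ks_statistic(wt_lengths, mut_lengths)
```

A statistic near 0 means the two length distributions look alike, and a value near 1 means they barely overlap. Pooling peak lengths this way ignores which replicate each peak came from, so it is best read as an exploratory check.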



      • #4
        Originally posted by dpryan
        For something like this you might want to test for differences in the distribution, such as with a KS test.
        edit: MMDiff looks MUCH more interesting!
        Trying MMDiff out on both ChIP-seq and BS-seq data is on my to-do list, but I still haven't gotten around to it!



        • #5
          Thanks for the MMDiff suggestion! Will definitely try it out!

