
  • MeDIP-seq peak calling (with replicates)

    Hi all, I'm curious what people are using these days for finding peaks in MeDIP-seq data. I have two experimental conditions, each with replicates, and I'm interested in finding changes between them. I've browsed through a number of common tools, such as MACS, but they seem to suggest pooling biological replicates into a single .bed file for further analysis. I would assume that it's beneficial to exploit the replicates for judging noise, and I'm wondering if there's something else out there. Alternatively, if someone can point me to where this sort of issue is discussed in the MACS manual, I'd appreciate it.

    Alternatively, do people recommend using MACS or a similar tool as a first pass at finding peaks, which can then be used as regions of interest for other tools? (I suppose I could just write those myself, but I assume others have already done that.)

  • #2
    We wouldn't normally do peak detection for MeDIP, as you'll end up selecting for regions with high CpG content. We prefer to systematically analyse all regions (often split up by type: exons, promoters, etc.), which will still have some bias because of differing levels of observations, but does allow you to spot interesting things.
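    To illustrate the region-based approach described above, here is a minimal sketch that counts reads in pre-defined regions and scales the counts by region length. The region names, coordinates, and read positions are made up for illustration, and the function name `count_reads_per_region` is hypothetical. A real analysis would read regions from an annotation file and reads from an alignment file.

```python
from collections import Counter

def count_reads_per_region(regions, read_positions):
    """Count reads falling in each pre-defined region.

    regions: list of (name, chrom, start, end), half-open coordinates.
    read_positions: list of (chrom, pos), e.g. read midpoints.
    """
    counts = Counter({name: 0 for name, _, _, _ in regions})
    for chrom, pos in read_positions:
        for name, r_chrom, start, end in regions:
            if chrom == r_chrom and start <= pos < end:
                counts[name] += 1
    return counts

# Made-up example data (assumption, not from any real dataset).
regions = [
    ("promoter_A", "chr1", 1000, 2000),
    ("exon_A1", "chr1", 2000, 2500),
]
reads = [("chr1", 1500), ("chr1", 1999), ("chr1", 2000), ("chr2", 1500)]

counts = count_reads_per_region(regions, reads)

# Scale by region length so regions of different sizes are comparable.
reads_per_kb = {
    name: counts[name] / (end - start) * 1000
    for name, _, start, end in regions
}
```

    The nested loop is fine for a sketch but slow on real data, where an interval index or an existing counting tool would be the usual choice.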



    • #3
      We have analyzed MeDIP both using peak callers (MACS) and pre-defined regions (promoters, 3'UTR, introns, etc.). In any case, we keep the replicates separate, deriving different peaksets for each replicate in the peak-calling case.

      Once we have peaks, we use the Bioconductor package DiffBind, which allows you to derive a consensus peakset, and then uses the distributions of enrichment scores in the replicates to identify differentially methylated regions.
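      The consensus-peakset idea can be sketched roughly as follows. This is not DiffBind's implementation, just a minimal illustration under stated assumptions: peaks from all replicates are merged where they overlap, and a merged region is kept only if at least `min_overlap` distinct samples contributed to it. The function name, the `min_overlap` parameter, and the example peaks are all hypothetical.

```python
from collections import defaultdict

def consensus_peaks(peaksets, min_overlap=2):
    """Merge overlapping peaks across samples and keep merged regions
    supported by at least `min_overlap` distinct samples.

    peaksets: dict mapping sample name -> list of (chrom, start, end),
    half-open coordinates.
    """
    by_chrom = defaultdict(list)
    for sample, peaks in peaksets.items():
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end, sample))

    consensus = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end, cur_samples = None, None, set()
        for start, end, sample in sorted(by_chrom[chrom]):
            if cur_start is not None and start < cur_end:
                # Overlaps the region being built: extend it.
                cur_end = max(cur_end, end)
                cur_samples.add(sample)
            else:
                # Close the previous region if enough samples support it.
                if cur_start is not None and len(cur_samples) >= min_overlap:
                    consensus.append((chrom, cur_start, cur_end))
                cur_start, cur_end, cur_samples = start, end, {sample}
        if cur_start is not None and len(cur_samples) >= min_overlap:
            consensus.append((chrom, cur_start, cur_end))
    return consensus

# Made-up peaksets for three replicates (assumption for illustration).
peaksets = {
    "rep1": [("chr1", 100, 200), ("chr1", 500, 600)],
    "rep2": [("chr1", 150, 250)],
    "rep3": [("chr1", 900, 1000)],
}
result = consensus_peaks(peaksets, min_overlap=2)
```

      Here only the region covered by both rep1 and rep2 survives; the peaks seen in a single replicate are dropped.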



      • #4
        rory: Thanks, that makes sense and was what I figured needed to happen. I expect I'll just create a couple of .bed files and merge them with bedtools before running the normal count-based tools used for everything else.

        simon: Unfortunately, in my case we don't expect to see any systematic regional (i.e., exons, promoters, etc.) differences between the datasets. I'll run those analyses anyway, since that should be done regardless.

        Thanks all



        • #5
          I would like to use DiffBind for differential peak calling on peaksets generated using MACS2. However, I cannot find information on the required format for the input peakset. From what I've read, the fourth column contains a "confidence" value. My guess is that this should be either the p-value or the q-value from the peaks.xls files produced by MACS2. Can someone please enlighten me? Thanks.



          • #6
            The easiest way to read MACS peaks into DiffBind is to specify the .xls files in the sample sheet and specify "macs" as the peak caller (either in a column of the sample sheet, or using peakCaller="macs" in the call to dba()). This will use the p-value as the score. Alternatively, you can convert the .xls files to tab-separated text files, with the first three columns being the chromosome, start, and end of each peak, followed by as many score columns as you want. Then set scoreCol= to whichever column you want to use as the score.
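            As a rough illustration of the conversion route, the sketch below turns MACS-style .xls text (which is tab-delimited text with `#` comment lines and a header row) into a table with chromosome, start, and end followed by chosen score columns. The header names used here are assumed to follow the usual MACS2 layout, so check them against your own file. The function name and the example rows are made up.

```python
def macs_xls_to_peak_table(xls_text, score_columns):
    """Convert MACS-style .xls text to tab-separated peak lines:
    chrom, start, end, then the requested score columns.

    Column names are looked up in the file's own header row, so
    `score_columns` must match names that appear there.
    """
    header = None
    out_lines = []
    for line in xls_text.splitlines():
        # Skip blank lines and the '#' comment block at the top.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if header is None:
            header = fields  # first non-comment line is the header
            continue
        record = dict(zip(header, fields))
        row = [record["chr"], record["start"], record["end"]]
        row += [record[col] for col in score_columns]
        out_lines.append("\t".join(row))
    return "\n".join(out_lines)

# Made-up example content; header names assume the MACS2 layout.
example_xls = "\n".join([
    "# This file is generated by MACS",
    "",
    "\t".join(["chr", "start", "end", "length", "abs_summit", "pileup",
               "-log10(pvalue)", "fold_enrichment", "-log10(qvalue)", "name"]),
    "\t".join(["chr1", "1001", "1500", "500", "1250", "42.0",
               "12.5", "6.1", "9.8", "peak_1"]),
    "\t".join(["chr2", "300", "700", "401", "510", "18.0",
               "5.2", "3.4", "3.1", "peak_2"]),
])

table = macs_xls_to_peak_table(
    example_xls, ["-log10(pvalue)", "fold_enrichment"]
)
```

            With this layout the first score lands in the fourth column, which is the column number you would then pass as scoreCol.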

            Ultimately, which peak caller score you use shouldn't matter much, as the scores will be discarded after you call dba.count() to determine the enrichment for each merged peak in every sample.

            Cheers-
            Rory

