
  • MBD-seq (or any enrichment-seq) read number <-> biology?

    Let's assume we have the same cell type in conditions A and B.

    A non-sequencing experiment shows that there is overall more DNA methylation in condition A than in condition B.

    Then we perform MBD-seq on conditions A and B to see where in the genome this difference in methylation lies. (To ensure that we put exactly the same amount of starting DNA into A and B, we carefully quantified it using Qubit etc.) We align using BWA. Immediately we see that in all biological replicates there are more uniquely mapped reads (up to 25 %) in condition A than in condition B. (In some cases there were also more raw sequencing reads in condition A.) We are talking about 16 - 25 million uniquely mapped reads per sample, single end, single-sample sequencing on a GAII. The inputs are exactly the same. A first question is: does this higher number of uniquely mapped reads in A reflect the biology of A and B (more overall methylation in A)?

    However, after loading wig files into a browser, we can't see any differentially methylated regions: A and B look like perfect replicates of each other. The differential methylation analysis is currently being run (calling peaks with BALM and using MEDIPS to quantify methylation), but preliminary results show there is not much difference between A and B.

    Does anyone have an explanation for this, assuming that the non-sequencing experiment was valid, and that there indeed is quantitative overall difference in methylation between A and B?

    The only things that come to my mind for now are:

    1. Differential methylation between A and B occurs in repetitive DNA sequences, which are excluded from the analysis by BWA?

    2. Maybe the MBD protein used for enrichment recognizes other cytosine modifications, not only methylation, so the methylation that A has in excess could be present in B as another modification that is still recognized by MBD?

    3. Is there a normalisation step in the algorithms used that would divide peak height (= quantity) by the total number of reads (which is higher in A)? If more reads in A reflect the biological presence of more methylation, would dividing each peak quantity by this higher total number of reads diminish the difference between A and B?
    I'm not a computational person, but in my understanding this normalisation step wouldn't affect the quantitative analysis and identification of differentially methylated regions (DMRs) only if we assume that DMRs will behave like a microarray experiment: most of the regions don't change, and only a few do. But is it possible that the change in methylation is more uniformly distributed across the genome so this normalisation is affecting quantitative analysis?
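To make the concern in point 3 concrete, here is a minimal sketch with made-up bin counts (not data from this experiment). It assumes that condition A has exactly 25 % more reads than B in every bin and that normalisation is a plain division by library size; the actual normalisation inside BALM or MEDIPS may differ.

```python
def counts_per_million(counts):
    """Scale raw bin counts so that each library sums to one million."""
    total = sum(counts)
    return [c * 1_000_000 / total for c in counts]


# Hypothetical read counts in five genomic bins for condition B.
counts_b = [400, 1200, 40, 800, 2560]

# Hypothetical condition A: same profile, 25 % more reads in every bin.
counts_a = [500, 1500, 50, 1000, 3200]

# Before normalisation every bin shows the same 1.25-fold excess in A.
raw_ratios = [a / b for a, b in zip(counts_a, counts_b)]

# After dividing by library size the two profiles coincide.
cpm_a = counts_per_million(counts_a)
cpm_b = counts_per_million(counts_b)
normalised_ratios = [a / b for a, b in zip(cpm_a, cpm_b)]

print("raw ratios:       ", raw_ratios)
print("normalised ratios:", normalised_ratios)
```

Under these assumptions a genome-wide, uniform increase is removed entirely by library-size scaling, while a change confined to a few bins would survive it.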

    We are quite confused by this experiment, as two different experiments that both worked perfectly don't agree: one says there is more methylation overall in A than in B, but MBD-seq shows that A and B are identical, as if they were replicates of each other. The DNA used in both experiments is exactly the same, so inter-replicate variability can be ruled out.

    Many thanks!

  • #2
    The first thing that comes to mind is exactly what you mentioned: the differential methylation is happening in repetitive regions.

    What was the non-sequencing technology?



    • #3
      The non-sequencing experiment was a dot blot: immobilizing total DNA on a membrane and staining with an anti-methylC antibody. The experiment was very clean and clear. A has more total methylation than B.

      Another question I have, in case those changes occur in repetitive elements: does it mean anything that A has more uniquely mapped reads than B? If all the change were in repetitive regions, those reads would be excluded from the uniquely mapped reads by BWA, yet there is still a difference, in all biological replicates.



      • #4
        A couple comments ...

        How are you going about finding DMRs? It seems that "loading wig files into a browser" may not be very fruitful -- BALM and MEDIPS are geared to absolute methylation, which isn't necessary if you want to go direct to differential methylation. Have you produced an "MA" plot? I do a fair amount of DMR analyses in edgeR (being a co-author), simply taking read densities in bins along the genome or in regions close to TSSs and doing standard count analyses.
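As a rough illustration of the "counts in bins, then MA plot" idea, here is a minimal plain-Python sketch. It is not edgeR's method or API: the function names, the pseudocount, and the simple counts-per-million scaling are all assumptions made for the example, and read positions are taken to be 0-based start coordinates on a single chromosome.

```python
import math
from collections import Counter


def bin_counts(positions, bin_size, n_bins):
    """Count read start positions in fixed-width bins along one chromosome.

    Positions that fall beyond the last bin are ignored.
    """
    counts = Counter(pos // bin_size for pos in positions)
    return [counts.get(i, 0) for i in range(n_bins)]


def ma_values(counts_a, counts_b, pseudocount=0.5):
    """Return one (M, A) pair per bin from two count vectors.

    M is the log2 ratio of library-size-scaled counts (A over B) and
    A is their mean log2 abundance. The pseudocount avoids log(0).
    """
    lib_a = sum(counts_a)
    lib_b = sum(counts_b)
    pairs = []
    for a, b in zip(counts_a, counts_b):
        log_a = math.log2((a + pseudocount) / lib_a * 1e6)
        log_b = math.log2((b + pseudocount) / lib_b * 1e6)
        pairs.append((log_a - log_b, (log_a + log_b) / 2))
    return pairs


# Hypothetical read start positions for two samples, 100 bp bins.
reads_a = [5, 15, 17, 120, 130, 250]
reads_b = [8, 110, 140, 150, 260, 270]

binned_a = bin_counts(reads_a, bin_size=100, n_bins=3)
binned_b = bin_counts(reads_b, bin_size=100, n_bins=3)

for m, a in ma_values(binned_a, binned_b):
    print(f"M = {m:+.2f}  A = {a:.2f}")
```

Plotting M against A for all bins gives the MA plot; a cloud centred away from M = 0, or bent at high A, hints that simple library-size scaling is not adequate.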

        I think that normalization could be an issue here, especially if there are "uniformly distributed changes" along the genome. This is similar to comparing genomes, say, where one has 2 copies and one has 4 copies throughout most of the genome. Because of the way reads are sampled from the genomes, they'd look very similar (in relative read density).
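The copy-number analogy can be written out as a short derivation. The symbols are introduced here for illustration only: $m_i$ is the amount of captured DNA from region $i$ in one sample, $c$ is a genome-wide scaling factor in the other sample ($c = 2$ for 4 copies versus 2), and the expected share of reads from a region is assumed to be proportional to the amount of DNA it contributes.

```latex
p_i^{(1)} = \frac{m_i}{\sum_j m_j},
\qquad
p_i^{(2)} = \frac{c\,m_i}{\sum_j c\,m_j}
          = \frac{c\,m_i}{c \sum_j m_j}
          = \frac{m_i}{\sum_j m_j}
          = p_i^{(1)}.
```

Under that assumption the factor $c$ cancels, so relative read density alone cannot reveal a change that affects every region equally.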

        Do you have any positive/negative controls for your two conditions?

        Regards,
        Mark



        • #5
          Hi Unununium

          I agree with Mark: you need to use a method that is better geared to identifying differentially bound regions. I personally used the MACS peak finder to identify regions of differential binding. It automatically corrects for library size, which should get around your issue of different read numbers. I should mention that I always disable model building and set the shift size based on the fragmentation pattern of the initial sample, as MACS had some trouble building reliable shift models from my data.

          edgeR, as mentioned by Mark, will however probably give you the more statistically sound answers. Although it is a bit trickier to use, it is definitely worth a try. Maybe have a look at the Bioconductor DiffBind package, which as far as I understand aims to combine the two approaches.

          Cheers
          Seb



          • #6
            Thanks for the suggestions, that is exactly what we are trying now: a combination of BALM with DiffBind and the edgeR-like approach (counting reads in bins). However, for now we can't seem to find many (or any) differentially methylated regions between the samples.

            Yes we have a control for A and B, apart from Inputs. This control is a baseline state in which perturbations A and B are induced.

            More questions might follow when we finish the initial analysis, especially if we don't identify genomic regions where DMRs happen.

            Thanks!
