  • Juulluu21
    Junior Member
    • Jun 2016
    • 6

    Bismark Methyl-seq Analysis

    We have sequenced a genome using Illumina's TruSeq bisulfite sequencing kit. Now that we have the reads back, we are analyzing the methylation rate using Bismark. I need help with interpreting the results and with the proper way to normalize them.

    Before sequencing: Sample DNA was divided into 2 groups:
    1. Bisulfite treatment was carried out and DNA was subsequently sequenced (group 1, methylated group)
    2. DNA was sequenced without bisulfite treatment (group 2, control group)

    Both groups were sequenced in paired-end mode.

    I am using Bismark to analyze the reads and trying to get the methylation rate in this particular genome. After running Bismark on the Methylated files I got these final percentages:

    C methylated in CpG context: 0.6%

    C methylated in CHG context: 0.5%

    C methylated in CHH context: 0.7%

    Whereas after running Bismark on my Control files I got these percentages:

    C methylated in CpG context: 99.6%

    C methylated in CHG context: 99.3%

    C methylated in CHH context: 99.9%

    So, how would I interpret my data?

    a. Is 0.6% (CpG) the actual methylation percentage in my genome?

    b. I have read in the literature that if the CpG, CHG, and CHH percentages are very close, the genome is effectively unmethylated. Is that true?

    c. What was the purpose of using the control group (group 2)? Do I still need a spike-in control to normalize the data? If so, what could that be?

    Thank you very much for reading this long post!!

    Bests!!!
  • fkrueger
    Senior Member
    • Sep 2009
    • 627

    #2
    Hi Juulluu,

    Just briefly before answering your question: have you trimmed off 7-8 bp from the 5’ end of the reads, as is recommended for TruSeq libraries? If not, I would recommend doing so; please see also here:
    FelixKrueger/Bismark: A tool to map bisulfite converted sequence reads and determine cytosine methylation states
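To make the trimming step concrete, here is a minimal Python sketch of clipping a fixed number of bases from the 5’ end of each read. It assumes plain 4-line FASTQ records; the function name `clip_5prime` and the example read are made up for illustration, and in practice a dedicated trimming tool would normally do this job (including keeping paired-end files in sync).

```python
def clip_5prime(fastq_lines, n_bases=8):
    """Yield FASTQ lines with the first n_bases removed from each
    sequence line and from its matching quality line.

    Assumes strict 4-line records: header, sequence, '+', quality.
    """
    for i, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        # Within each record, index 1 is the sequence and index 3 the qualities.
        if i % 4 in (1, 3):
            yield line[n_bases:]
        else:
            yield line


# Hypothetical single read, for illustration only.
record = ["@read1", "ACGTACGTACGTTT", "+", "IIIIIIIIIIIIII"]
clipped = list(clip_5prime(record, n_bases=8))
print(clipped)
```

The sequence and the quality string are clipped by the same amount so that they stay the same length.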


    The statistics output at the end of a Bismark run are calculated from every single methylation call made during the entire run. While the numbers can be used as a proxy for the average genome methylation level, this comes with a caveat: if you had some kind of overrepresented sequence in there, e.g. a certain class of repeats or the like, it might contribute a larger share to the total average than the rest of the genome. It would probably be more informative to calculate the percentage methylation over shorter stretches of your genome (e.g. using a fixed number of CpG residues or, more crudely, maybe 3-5 kb running windows) and look at a bean plot representation of your genome. Just out of interest, which organism was it?
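The windowing idea above can be sketched roughly as follows. This is only an illustration in Python, assuming per-cytosine calls are already available as `(position, methylated_count, unmethylated_count)` tuples sorted by position; the function name `window_methylation` and the example numbers are hypothetical and not part of Bismark.

```python
def window_methylation(calls, cpgs_per_window=3):
    """Return (start, end, percent_methylated) for consecutive
    non-overlapping windows holding a fixed number of CpG positions.

    calls: list of (position, methylated_count, unmethylated_count),
           sorted by position. A trailing incomplete window is dropped.
    """
    windows = []
    last_start = len(calls) - cpgs_per_window + 1
    for start in range(0, last_start, cpgs_per_window):
        chunk = calls[start:start + cpgs_per_window]
        meth = sum(c[1] for c in chunk)
        unmeth = sum(c[2] for c in chunk)
        total = meth + unmeth
        if total == 0:
            continue  # no coverage in this window
        windows.append((chunk[0][0], chunk[-1][0], 100.0 * meth / total))
    return windows


# Made-up calls: a mostly unmethylated stretch followed by a methylated locus.
calls = [
    (10, 0, 10), (25, 1, 9), (40, 0, 10),
    (900, 9, 1), (915, 10, 0), (930, 8, 2),
]
windows = window_methylation(calls, cpgs_per_window=3)
for start, end, pct in windows:
    print(f"{start}-{end}: {pct:.1f}% methylated")
```

In this toy input the pooled average would sit somewhere in the middle, whereas the per-window values separate the unmethylated stretch from the methylated locus, which is the kind of localised signal a single genome-wide number cannot show.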

    But yes, from the numbers you are showing one could already say that:
    a) The overall methylation levels in your genome appear to be quite low. There is still a possibility of some localised methylation at specific loci in your genome (e.g. promoter regions), but you can’t tell this from a total average number; you need to start visualising and exploring your data in more detail.
    b) Good bisulfite conversion is normally > 99% efficient. Since you are seeing total methylation levels between 0.5 and 0.7%, you can say that the overall conversion must have been at least about 99.3% efficient (100% minus the highest per-context value of 0.7%).
    c) The methylated residues (0.5-0.7%) may be either true genomic methylation or bisulfite conversion errors, but more likely a mix of the two. In other words, the true methylation level may be even lower than that.
    d) Well, the three contexts appear to be very similar, so my first impression would be that there is at least no discernible increase in CpG methylation of the kind you see in mammals.
    e) I am not sure why you included a non-bisulfite-treated control. Unconverted DNA should ideally come back as 100% methylated, which is nearly true. The fact that the values are not exactly 100% can probably be explained by mis-alignments, or by alignments of reads from regions that were present in the library but are not in the genome build. It also shows that even here you are getting (conversion error) values between 0.1 and 0.7%, which is a very similar range to what your real methylation data shows. So unless you can convincingly show that there is some true, more localised methylation going on in your genome, it will be a little difficult to argue that the methylation you are seeing is not within the expected noise.
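The arithmetic behind points b), c) and e) can be written out as a small Python sketch using the percentages from the original post. The function names are hypothetical, and the lower bound relies on the conservative assumption that every remaining methylation call in the treated sample could be a conversion failure.

```python
def conversion_lower_bound(treated_pct):
    """Lower bound on bisulfite conversion efficiency (%), assuming every
    remaining 'methylated' call might be an unconverted cytosine."""
    return 100.0 - max(treated_pct.values())


def control_shortfall(control_pct):
    """How far (in %) each context in the unconverted control falls short
    of the ideal 100% methylated."""
    return {ctx: round(100.0 - pct, 1) for ctx, pct in control_pct.items()}


# Final Bismark percentages quoted in the first post.
treated = {"CpG": 0.6, "CHG": 0.5, "CHH": 0.7}
control = {"CpG": 99.6, "CHG": 99.3, "CHH": 99.9}

bound = conversion_lower_bound(treated)
shortfall = control_shortfall(control)

print(f"conversion efficiency >= {bound:.1f}%")
print("control shortfall from 100%:", shortfall)
```

The shortfall in the untreated control (0.1 to 0.7%) covers the same range as the apparent methylation in the treated sample (0.5 to 0.7%), which is why the global numbers alone cannot separate real methylation from background error.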
