  • Bisulfite sequencing - filtering by min. conversion rate

    Hi,

    I am looking into filtering bisulfite reads by a minimum conversion rate, something high like a 95% conversion rate in both CpG and non-CpG contexts. I've been working with Bismark and really like it.

    I know that I get conversion rates in the end results, but I would like to also filter individual reads by conversion rate, either before or after methylation calling / mapping.

    Are there any available tools that could do it, or otherwise a suggested way to do this?

    Many thanks!

  • #2
    It should be fairly simple to do this by parsing the methylation call string in each line of the Bismark output, but I would also ask why you want to do this. I'm aware that some groups have applied this filter in the past, though only ever in non-CpG context; requiring full conversion in CpG context would definitely be a mistake. That filter rested on the assumption that there is effectively no non-CpG methylation, which increasingly appears not to be the case. By removing highly (or even moderately) methylated reads you run the risk of biasing your results and potentially removing interesting data. If non-conversion of specific reads does happen in your library, it should only be a problem if it is targeted in some way; otherwise a few randomly distributed methylated base calls shouldn't bias your results too much.
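For readers who still want to see what parsing the call string involves, here is a minimal Python sketch. It assumes the standard Bismark call-string alphabet (z/Z for CpG, x/X for CHG, h/H for CHH, lowercase meaning unmethylated, i.e. converted). The function names, the 0.95 threshold and the `min_sites` cutoff are hypothetical choices for illustration, not part of Bismark, and the caveats above about biasing the data still apply.

```python
from collections import Counter

# Bismark methylation call characters:
#   z/Z = CpG, x/X = CHG, h/H = CHH
#   lowercase = unmethylated (converted), uppercase = methylated (unconverted)
#   '.' = position is not a cytosine; u/U (unknown context) is ignored here.
NON_CPG_CONVERTED = "xh"
NON_CPG_UNCONVERTED = "XH"


def non_cpg_conversion_rate(call_string):
    """Fraction of non-CpG cytosines in one read that were converted.

    Returns None when the read contains no non-CpG cytosines.
    """
    counts = Counter(call_string)
    converted = sum(counts[c] for c in NON_CPG_CONVERTED)
    unconverted = sum(counts[c] for c in NON_CPG_UNCONVERTED)
    total = converted + unconverted
    if total == 0:
        return None
    return converted / total


def passes_filter(call_string, min_rate=0.95, min_sites=5):
    """Hypothetical per-read filter on non-CpG conversion only.

    Reads with fewer than min_sites non-CpG cytosines are kept, since
    there is too little evidence to judge them either way.
    """
    counts = Counter(call_string)
    sites = sum(counts[c] for c in NON_CPG_CONVERTED + NON_CPG_UNCONVERTED)
    if sites < min_sites:
        return True
    return non_cpg_conversion_rate(call_string) >= min_rate
```

In a Bismark SAM/BAM file the call string is carried in the `XM:Z:` tag of each alignment, so the functions above would be applied to that field for every read. CpG calls are deliberately left out of the rate, in line with the point that requiring full conversion in CpG context would be a mistake.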

    What we do filter for in our analyses is regions which show unusually high coverage. Mismapping of repetitive regions does happen, and can produce odd results, but this type of filtering removes a region of the genome from the analysis, rather than individual reads.
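The coverage-based filtering described above could be sketched roughly as follows. The post does not say how "unusually high" is defined, so the fixed-size windows, the median baseline and the `fold` multiplier are all assumptions made for illustration, and `flag_high_coverage` is a hypothetical helper rather than an existing tool.

```python
from statistics import median


def flag_high_coverage(window_counts, fold=10.0):
    """Return the windows whose read count exceeds fold x the median.

    window_counts maps a window key, e.g. (chromosome, start), to the
    number of reads overlapping it. The median is taken over non-empty
    windows only, so unmappable regions do not drag the baseline down.
    """
    nonzero = [n for n in window_counts.values() if n > 0]
    if not nonzero:
        return set()
    cutoff = fold * median(nonzero)
    return {window for window, n in window_counts.items() if n > cutoff}


# Example with made-up counts: one window is far above the rest.
counts = {
    ("chr1", 0): 20,
    ("chr1", 1000): 25,
    ("chr1", 2000): 22,
    ("chr1", 3000): 900,
    ("chr1", 4000): 0,
}
excluded = flag_high_coverage(counts)
```

Flagged windows would then be dropped from downstream analysis as whole regions, rather than removing individual reads.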


    • #3
      Dear Simon,

      What you wrote here made a lot of sense, and I have since adopted the practice of not filtering reads to correct for conversion rate. There are several options I'm exploring, including correcting observed methylation levels and filtering for unusual coverage peaks. I'd like to argue for this approach in future papers, on the grounds that it does not make sense to throw away significant amounts of data because of suspected low conversion.

      Are there any studies that support the approach of not filtering for highly converted reads? Otherwise, would you agree to a reference to our correspondence?

      Thanks


      • #4
        I'm not aware of any studies which have looked in a systematic way at the dynamics of bisulphite conversion so I'm not sure there's a fixed conclusion one way or the other. We therefore stick with the more conservative approach of not biasing our data by systematically removing parts of the data which we have no actual evidence are wrong.

        Removing overrepresented sequences is easily justified since we know that we can't get that many correct sequences from a region - therefore we must be mis-measuring our data in those regions and we can therefore ignore them.
