  • Strong CpG methylation bias between R1 and R2

    Hello,
    I'm new to bioinformatics, and as a first training exercise I've been given a set of WGBS 100 bp PE reads from a few human cancer tissues.
    I filtered the reads with prinseq, sorted them, and aligned them with bismark in PE mode to hg38 from UCSC (genome prepared with bismark).
    Mapping efficiency is ~20%, with ~80% of C's methylated in CpG context.
    OK, the low mappability of reads from BS-treated DNA has been mentioned many times.
    Then I tried mapping reads 1 and 2 separately in SE mode.
    Read 1: mapping efficiency ~60%, with ~80% of C's methylated in CpG context.
    Read 2: mapping efficiency ~50%, with ~40% of C's methylated in CpG context.
    Additional trimming of 10-20 nt from either end of read 2 slightly increases mappability but doesn't affect the methylation rate.
    This result seems extremely odd to me.
    If the DNA was treated with BS, how can only read 2 of the pair show half as much methylation in CpG context?
    Could anybody take a fresh look?
    Thank you in advance.
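
For anyone wanting to re-check the per-mate percentages quoted above, here is a minimal sketch. It assumes you have already pulled the methylation call strings (Bismark's `XM` tag, where `Z` is a methylated CpG call and `z` an unmethylated one) out of the alignments for each mate. The function name and the example strings are made up for illustration and are not from the original data.

```python
from collections import Counter


def cpg_methylation_percent(call_strings):
    """Percent of CpG calls that are methylated, or None if there are no CpG calls.

    Each item is a Bismark-style methylation call string:
    'Z' = methylated C in CpG context, 'z' = unmethylated C in CpG context.
    All other characters (CHG/CHH calls, '.') are ignored here.
    """
    counts = Counter()
    for calls in call_strings:
        counts.update(ch for ch in calls if ch in "Zz")
    total = counts["Z"] + counts["z"]
    if total == 0:
        return None
    return 100.0 * counts["Z"] / total


# Made-up call strings, chosen only to illustrate the calculation
read1_calls = ["..Z...Z..h..", "Z....z..x.Z."]
read2_calls = ["..z...Z..h..", "z....z..x.Z."]

print(cpg_methylation_percent(read1_calls))  # 4 of 5 CpG calls methylated
print(cpg_methylation_percent(read2_calls))  # 2 of 5 CpG calls methylated
```

Running the same count separately on read 1 and read 2 alignments is one way to confirm that the difference is in the calls themselves and not in how a summary report was read.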

  • #2
    Could you provide the following information:
    1- Kit or method used for library prep
    2- Read length
    3- Library peak size
    4- FastQC output for reads


    • #3
      This is what I could get from the core lab personnel:

      1- Kit or method used for library prep

      Genomic DNA was extracted from tissue, BS treated, sonicated, end repaired, and dA-tailed. Then standard Illumina adaptors were used for PE sequencing.

      2- Read length

      100 bases (adaptors already trimmed)

      3- Library peak size

      ~200nt

      4- FastQC output for reads

      Sorry, I can't attach a picture right now, but the FastQC report is good for all reads: median quality is 30 at the 5' end and ~15 at the 3' end. I also performed quality trimming with a quality threshold of 15.


      • #4
        Generally, there are three WGBS library prep methods:
        1- Post-ligation bisulfite conversion: DNA fragmentation and standard library preparation with methylated adapters, followed by bisulfite conversion and amplification.
        2- Post-bisulfite conversion library preparation: second-strand synthesis from the converted ssDNA, followed by standard end repair, A-tailing, adapter ligation, and PCR amplification of the double-stranded DNA.
        3- Post-bisulfite conversion library preparation: synthesis of the second strand with random primers appended with one partial Illumina adapter sequence, then tagging the 3’ end of the new strand with a Terminal Tagging Oligo appended with the other partial Illumina adapter, followed by PCR amplification.

        I assume your library was prepared with method 1. A peak size of 200 would, on average, give an insert size of about 75 nt once the adapters are subtracted, so I would expect a large number of reads to have been trimmed at the 5’ end.
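
The insert-size estimate above can be sketched as simple arithmetic. This is only an illustration: the combined adapter length of 125 nt is an assumption inferred from the 200 nt peak and 75 nt insert figures, and the function name is hypothetical.

```python
def insert_and_readthrough(peak_size, adapter_total, read_length):
    """Estimate the insert size and how many bases of each read fall outside the insert.

    peak_size      library peak size in nt (insert plus both adapters)
    adapter_total  combined adapter length in nt (ASSUMED value, not from the thread)
    read_length    sequenced read length in nt
    """
    insert = peak_size - adapter_total
    # Bases per read that extend past the insert and would need trimming
    readthrough = max(0, read_length - insert)
    return insert, readthrough


# 200 nt peak, assumed 125 nt of adapter in total, 100-base reads
insert, readthrough = insert_and_readthrough(200, 125, 100)
print(insert, readthrough)  # 75 25
```

Under these assumptions, a typical 100-base read would carry about 25 bases that do not come from the insert, which is consistent with many reads having been trimmed.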

        It would be interesting to see the FastQC “per base sequence content” plot for the reads; it should show a similar proportion of converted Cs in both. For an example, see the following plots for a low-diversity RRBS library, which show low %C in R1 and correspondingly low %G in R2. If your plots show similar C and G, then the issue could be in the analysis step.

        RRBS.pdf
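
If the FastQC plots are not at hand, the same comparison (%C along R1 versus %G along R2) can be approximated directly from the FASTQ files. The following is a rough sketch, not a replacement for FastQC; the function names and the file names in the usage comment are hypothetical.

```python
import gzip
from collections import Counter


def read_fastq_sequences(path, max_reads=100_000):
    """Yield up to max_reads sequence lines from a plain or gzipped 4-line FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number // 4 >= max_reads:
                break
            if line_number % 4 == 1:  # the second line of each record is the sequence
                yield line.strip().upper()


def base_fraction_by_position(sequences, base):
    """Fraction of reads carrying `base` at each read position (0-based)."""
    per_position = []  # one Counter of bases per read position
    for seq in sequences:
        while len(per_position) < len(seq):
            per_position.append(Counter())
        for pos, nucleotide in enumerate(seq):
            per_position[pos][nucleotide] += 1
    return [counts[base] / sum(counts.values()) for counts in per_position]


# Usage with hypothetical file names:
# c_in_r1 = base_fraction_by_position(read_fastq_sequences("sample_R1.fastq.gz"), "C")
# g_in_r2 = base_fraction_by_position(read_fastq_sequences("sample_R2.fastq.gz"), "G")
```

If the two profiles look similar, as in the RRBS example, the conversion itself is probably consistent between mates and the analysis step becomes the more likely suspect.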


        • #5
          Something in this description seems wrong. After bisulfite conversion the DNA should be (mostly) single stranded (since the bisulfite conversion requires single stranded DNA). Thus the standard end-repair, A-tailing and Illumina adapter ligation with Y-adapters will not work.

          Originally posted by zubr
          This is what I could extract from core lab personnel:

          1- Kit or method used for library prep

          Genomic DNA was extracted from tissue, BS treated, sonicated, end repaired, dA-tailed. Then standard illumina adaptors were used for PE sequencing.
