  • Me-DIP-seq troubleshooting, several questions

    Hi all,

    We are in the process of optimizing our Me-DIP-seq protocol using TruSeq multiplexing on the Illumina HiSeq2000 and we have been puzzled by (or cursed with) a variety of problems.

    I am on the bioinformatics side, but in general terms we have tried two methods for library prep:

    1 - shear the DNA (ourselves and outsourced), run a gel, cut out the area with fragments of appropriate size/elute, immuno-precipitation, TruSeq protocol, sequencing

    2 - shear the DNA (ourselves and outsourced), TruSeq protocol, immuno-precipitation, sequencing

    Method 1's problems led to Method 2, but Method 2 has rotten immunoprecipitation. We are planning on 100bp PE runs with probably 4 samples multiplexed in each lane, but in some of our trials we've sequenced 36bp SE on the GA IIx because we had a spare lane on a flow cell or something and we were just trying things out.

    Method 1

    After library prep there are always two distinct bands on the gels/Agilent traces, etc. The top band is about 20% as dense as the bottom band. We've sequenced the bands separately and together; both contain genomic DNA, but the bottom band definitely seems to be the one we want. The top band sequences with low quality but it still seems to be primarily genomic DNA. Presently I am still trying to figure out why this is happening and what the differences between the two bands are. So Question 1 - has anyone else run across this multiple-band issue when you do the TruSeq library prep last?

    On the second, better (lower) band, the FastQC reports almost always come out strange in one way or another, and there is mediocre (40-60%) alignment to the genome with the 100bp PE runs and better alignment (80-90%) with the 36bp SE runs. The biggest stand-out on the FastQC report is the 100bp PE %GC distribution across the reads. Instead of one peak like the theoretical distribution, we end up with two peaks: one mostly coinciding with the theoretical one (top of the peak around 38-40 bases in) and a second, equally large peak around 54-70 bases. Question 2 - What does this even mean?

    Question 3 - with the same alignment stringency, would you expect to have such a difference in the number of matches in 100bp PE reads in comparison to 36bp SE? This is something I can pursue elsewhere; it's just a question I started pondering, so I thought I would throw it in.

    With method 1 we do see GC enrichment (according to the program MEDIPS) that corresponds roughly to what the authors of the program saw with their Me-DIP-seq data.

    Method 2

    In this case (adapters before antibody) we get beautiful sequence data with beautiful alignments, quality, etc. Great coverage, no double bands to pick from, everything looks fantastic and the process is easier in the lab BUT there is no GC enrichment. It's just a lot of pretty evenly distributed genomic DNA. Question 4 - has anyone else seen this? What might explain it? My working hypothesis is that the sequencing adapters have enough GC that the antibody is pulling everything down pretty equally. But others seem to have had success with this method.

    The lack of enrichment has occurred every time with both the longer PE reads and the shorter SE reads. Explanations? Suggestions? HELP?!?!

    Question 5 - BATMAN? MEDIPS? Are there any other Me-DIP-seq analysis methods someone might recommend for use once we get our library prep handled?

    Thanks for ANY input or commentary you may be able to provide about any of these questions. I am still somewhat new to NGS data but learning.
    Last edited by NearyJL78; 09-28-2011, 10:17 AM.

  • #2
    If you're creating 100bp sequences you're going to have a reasonable portion of your library which runs through the insert into the adapters on the other end. The reason for your double peaks in your GC plot could be the reads which did and didn't run into adapter. This would also explain why your data mapped much better for the 36bp reads.
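One rough way to check the read-through idea is to count how many reads contain the start of the adapter and see what is left after clipping. The sketch below is only an illustration: the 13-base `ADAPTER_PREFIX` is assumed to be the start of the TruSeq adapter read-through for the kit in use, the function names are made up here, and exact string matching is far cruder than a dedicated trimmer (which tolerates mismatches and partial overlaps at the 3' end).

```python
# Assumption: first 13 bases of Illumina TruSeq adapter read-through.
# Check the adapter sequences for your own kit before relying on this.
ADAPTER_PREFIX = "AGATCGGAAGAGC"


def trim_adapter(read, adapter=ADAPTER_PREFIX):
    """Return (trimmed_read, had_adapter) using an exact match only."""
    pos = read.find(adapter)
    if pos == -1:
        return read, False
    # Keep only the insert sequence that precedes the adapter.
    return read[:pos], True


def adapter_fraction(reads, adapter=ADAPTER_PREFIX):
    """Fraction of reads that contain the adapter prefix somewhere."""
    reads = list(reads)
    if not reads:
        return 0.0
    hits = sum(trim_adapter(r, adapter)[1] for r in reads)
    return hits / len(reads)


# Toy reads, made up for illustration: the first runs into adapter, the second does not.
reads = [
    "ACGTTGCAACGTTGCA" + ADAPTER_PREFIX + "ACACGT",
    "TTGACCATGGTACCAGTTGACC",
]
print(adapter_fraction(reads))
print(trim_adapter(reads[0]))
```

If a large share of the 100bp reads carry the adapter and the 36bp reads mostly do not, that would be consistent with the mapping difference described above.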

    The other thing we often see in MeDIP data is a separate peak on the GC plot which comes from major satellite sequences. This tends to be a sharper defined peak in the middle of the broader genomic GC distribution. In extreme cases we've seen 40% of the reads coming from these satellite sequences.
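To look at the shape of the per-read GC distribution outside FastQC, something like the following could be used to bin reads by GC%. It is a minimal sketch with hypothetical function names and a made-up bin width; a narrow spike in one bin on top of a broad distribution would be the kind of pattern described above for satellite-derived reads.

```python
from collections import Counter


def gc_percent(seq):
    """GC content of one read as a percentage (0.0 for an empty read)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def gc_histogram(reads, bin_width=5):
    """Count reads per GC% bin; keys are the lower edge of each bin."""
    hist = Counter()
    for read in reads:
        pct = gc_percent(read)
        # Reads at exactly 100% GC go into the last bin rather than a bin of their own.
        bin_start = min(int(pct // bin_width) * bin_width, 100 - bin_width)
        hist[bin_start] += 1
    return dict(sorted(hist.items()))


# Toy reads, made up for illustration.
toy_reads = ["AAAA", "GGCC", "ACGT", "ACGA"]
for bin_start, count in gc_histogram(toy_reads, bin_width=10).items():
    print(f"{bin_start:3d}-{bin_start + 10:3d}%  {'#' * count}")
```

Running the same histogram on reads before and after adapter clipping is one way to see whether a second peak is driven by adapter sequence or by the inserts themselves.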



    • #3
      Question 1: One source of multiple bands is over-amplification of your library. See this thread: http://seqanswers.com/forums/showthr...t=daisy+chains
      I also address the issue in my ChIP-seq TruSeq protocol as well as some other issues pertaining to how the TruSeq adapters run on an agarose gel.
      I got the data back from the ChIP-seq samples I sent out using my TruSeq library preparation protocol. I can definitely say the library preparation protocol works great, clearly better than what Il…

      Now that we are getting 100 million reads/lane on the HiSeq2000, it’s time to barcode even for histone ChIPs. From what I see Bioo has some Illumina barcoded adaptors. There is also a good meth…

      Are you quantitating your samples with qPCR? The aberrantly migrating fragments will not quantitate well with methods that measure dsDNA.

      Question 3: My guess is that on your 100bp reads you are reading through your fragments and into adapter sequence, and that is keeping them from aligning.

      Question 4: Really no idea here, but you should consider for this protocol that the Y-shaped TruSeq adapters do not run true to size on an agarose gel. You may be missing the bulk of your DNA at this step. See the above links for more thoughts on this.
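The arithmetic behind that guess is simple enough to write down. This is only a worked illustration: the 80bp fragment length is a made-up number, not something measured in this thread, and the function name is hypothetical.

```python
def adapter_bases_in_read(read_length, fragment_length):
    """Number of bases at the 3' end of a read that fall past the insert."""
    return max(0, read_length - fragment_length)


# Hypothetical 80bp fragment, sequenced at the two read lengths mentioned in the thread.
for read_length in (36, 100):
    n = adapter_bases_in_read(read_length, fragment_length=80)
    print(f"{read_length}bp read: {n} adapter bases ({100 * n / read_length:.0f}% of the read)")
```

With a fragment shorter than 100bp, the 36bp read stays entirely inside the insert while the 100bp read picks up non-genomic bases at its end, which an aligner at the same stringency may reject.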
      Last edited by ETHANol; 09-27-2011, 12:05 AM.
      --------------
      Ethan



      • #4
        One thought about Method 1 is that the library prep kit requires dsDNA for the ligation reaction - and after MeDIP your DNA is single-stranded, so perhaps you are selecting for fast reannealing sequences?

        As for Method 2 - according to http://seqanswers.com/forums/showthread.php?t=11509 - TruSeq adaptors are methylated, so you are in effect just pulling down the adaptors, not the methylated regions of the genome.



        • #5
          Wow, thank you all so much for your assistance, especially about the TruSeq adapters being methylated!



          • #6
            Originally posted by simonandrews

            The other thing we often see in MeDIP data is a separate peak on the GC plot which comes from major satellite sequences. This tends to be a sharper defined peak in the middle of the broader genomic GC distribution. In extreme cases we've seen 40% of the reads coming from these satellite sequences.
            Thanks Simon. Since you've run these types of experiments I have one more question for you. Overall, do you find a much higher GC% in the NGS data after immunoprecipitation than in the control?

            For human samples we are seeing about 39% in controls and 43-45% in Me-DIP data, but my boss thinks maybe this % should be even higher? I haven't found any information for comparison from anyone else to know for sure.
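For anyone wanting to compute the same kind of overall GC% for comparison, here is a minimal sketch. It assumes plain 4-line FASTQ records (no wrapped sequence lines), excludes N calls from the denominator, and uses a hypothetical function name and made-up toy records.

```python
def pooled_gc_percent(fastq_lines):
    """Pooled GC% over all called bases in FASTQ text given as an iterable of lines."""
    gc = total = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 != 1:  # the sequence is the 2nd line of each 4-line record
            continue
        seq = line.strip().upper()
        gc += seq.count("G") + seq.count("C")
        total += len(seq) - seq.count("N")  # ignore uncalled bases
    return 100.0 * gc / total if total else 0.0


# Toy records, made up for illustration.
control = ["@r1", "AATT", "+", "IIII", "@r2", "ACGT", "+", "IIII"]
medip = ["@r1", "GCGC", "+", "IIII", "@r2", "ACGT", "+", "IIII"]
print(pooled_gc_percent(control), pooled_gc_percent(medip))
```

An open file handle can be passed in place of the toy lists, since iterating over a text file yields its lines.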

            Thanks again (EVERYONE!)
            JN



            • #7
              Originally posted by ETHANol
              I also address the issue in my ChIP-seq TruSeq protocol as well as some other issues pertaining to how the TruSeq adapters run on an agarose gel.
              I got the data back from the ChIP-seq samples I sent out using my TruSeq library preparation protocol. I can definitely say the library preparation protocol works great, clearly better than what Il…

              Now that we are getting 100 million reads/lane on the HiSeq2000, it’s time to barcode even for histone ChIPs. From what I see Bioo has some Illumina barcoded adaptors. There is also a good meth…

              Are you quantitating your samples with qPCR? The aberrantly migrating fragments will not quantitate well with methods that measure dsDNA.
              I'm not sure if they are quantitating with qPCR, but I am forwarding all of this to my boss. I just wanted to thank you personally for the links to your blog, which are exceedingly helpful.



              • #8
                Originally posted by frozenlyse
                One thought about Method 1 is that the library prep kit requires dsDNA for the ligation reaction - and after MeDIP your DNA is single-stranded, so perhaps you are selecting for fast reannealing sequences?

                As for Method 2 - according to http://seqanswers.com/forums/showthread.php?t=11509 - TruSeq adaptors are methylated, so you are in effect just pulling down the adaptors, not the methylated regions of the genome.
                I really wish I'd asked this question here sooner! THANK YOU for the info about the TruSeq adapters being methylated!
