  • What "high duplication rate" means

    Hi all,
    Could someone define "high duplication rate" in ChIP-seq data analysis?
    Thanks

  • #2
    Am I correct in guessing that you ran FastQC on your reads and it showed a really high duplication rate? If so, I would typically consider that normal unless whatever you're pulling down is rather non-specific in where it's located/binds.


    • #3
      Actually, the bioinformatician who ran the analysis said that, so I'm just trying to understand it, since he couldn't explain it to me.


      • #4
        That probably refers to PCR duplicates... that is, even though you may have 90 reads at a location, they are likely to be 90 copies of the same original DNA fragment and so should not be considered independent binding events of your protein to DNA. This happens when low-complexity libraries are heavily amplified, which is common for ChIP-Seq.

        You can tell whether duplicates are a problem by looking at where the reads start. Because the DNA was sheared randomly, you would expect reads to start at many different locations across a genomic region where your protein was cross-linked. If the reads start at only a few locations and there are multiple reads at each of those starts, then those are likely to be duplicates.
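To make the start-position check above concrete, here is a minimal Python sketch. It is not from any particular tool: the function name `start_site_summary` and the `(chrom, start, strand)` tuple layout are assumptions for illustration, and real data would first need to be parsed out of a BAM file.

```python
from collections import Counter

def start_site_summary(reads):
    """Summarize how reads are distributed over start sites.

    reads: iterable of (chrom, start, strand) tuples (hypothetical layout).
    """
    counts = Counter(reads)          # reads per distinct start site
    total = sum(counts.values())
    distinct = len(counts)
    return {
        "total_reads": total,
        "distinct_starts": distinct,
        # fraction of reads that repeat an already-seen start site
        "duplicate_fraction": 1 - distinct / total if total else 0.0,
        # tallest pile of reads sharing one start site
        "max_stack": max(counts.values(), default=0),
    }

# 90 reads that all start at exactly the same place (looks like PCR duplication)
stacked = [("chr1", 1000, "+")] * 90
# 90 reads with 90 different starts (looks like randomly sheared fragments)
spread = [("chr1", 1000 + i, "+") for i in range(90)]

print(start_site_summary(stacked))
print(start_site_summary(spread))
```

Both toy regions have 90 reads, but only the first one has them piled on a single start site, which is the pattern to look for.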

        As dpryan mentioned, with ChIP-Seq perfectly good data can look highly duplicated, since there might be very high coverage of reads in a constrained space. So it takes some actual examination of how the coverage looks to tell if it is just stacked high or duplicated.
        Last edited by SNPsaurus; 01-30-2014, 11:23 PM.
        Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com


        • #5
          Just to add to SNPsaurus' response, keep in mind that gauging the duplication rate is difficult if you have single-end reads. With single-end data, duplicates can only be identified by start position and strand, so after removing what appear to be PCR duplicates, the maximum coverage of any single position is twice your read length. Of course, for ChIP-seq such a cap is unrealistic, so unless you have paired-end reads you're probably better off ignoring PCR duplicates.
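A small sketch of where the "twice the read length" ceiling comes from, assuming duplicates are defined as single-end reads sharing the same 5' position and strand. The helper names `dedup_single_end` and `coverage_at`, and the `(five_prime, strand)` tuple layout, are made up for this example.

```python
def dedup_single_end(reads):
    """Keep one read per (five_prime, strand); reads are hypothetical tuples."""
    return set(reads)

def coverage_at(pos, reads, read_length):
    """Count reads whose aligned span covers pos."""
    n = 0
    for five_prime, strand in reads:
        if strand == "+":
            lo, hi = five_prime, five_prime + read_length - 1
        else:
            # a minus-strand read extends leftwards from its 5' end
            lo, hi = five_prime - read_length + 1, five_prime
        if lo <= pos <= hi:
            n += 1
    return n

read_length = 50
pos = 1000

# Saturated toy region: 10 copies of a read at every start on both strands
reads = [(s, strand)
         for s in range(900, 1101)
         for strand in ("+", "-")
         for _ in range(10)]

before = coverage_at(pos, reads, read_length)
after = coverage_at(pos, dedup_single_end(reads), read_length)
print(before, after, 2 * read_length)
```

However deep the original pile is, only `read_length` distinct starts per strand can cover a given position, so the deduplicated coverage cannot exceed `2 * read_length`.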


          • #6
            Thank you dpryan and SNPsaurus, that's very helpful.
