  • Good ChIP-seq finder?

    Hi all of you, I was wondering whether there are any peak finders for transcription factors out there that qualify as good under the following criteria:

    * can distinguish between closely adjacent peaks
    * identifies peaks with high spatial resolution
    * high sensitivity and specificity
    * accepts aligned reads in .bed format
    * does not confuse the user with millions of useless options
    * reasonably fast (processes more than 2 million Drosophila reads per minute)
    * makes full use of input sequencing by subtracting background before peak finding
    * free and open source

    Have wasted quite some time testing various peak finders and have been very disappointed by everything I looked at. The ones that I have tested include MACS, FindPeaks and PICS.

    Wrote my own peak finder today that appears to fulfill all of the above by using a simple strand-specific double-window scanning approach on a background-subtracted sample. It appears to work really well, and I am wondering right now why nobody has done this before.

    Would love to hear your opinion on this.
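For readers wondering what such a scan might look like, here is a minimal Python sketch of one possible reading of a strand-specific double-window scan on background-subtracted data. It is not the actual code described above. It assumes per-base counts of read 5' ends for each strand; the function name `double_window_scores`, the `window` and `scale` parameters, and the choice of taking the minimum of the two window sums are all illustrative assumptions.

```python
import numpy as np

def double_window_scores(fwd_sample, rev_sample, fwd_input, rev_input,
                         window=100, scale=1.0):
    """Score each position from two strand-specific windows (hypothetical sketch).

    All four inputs are 1-D arrays of per-base read 5' end counts.
    `scale` is an assumed factor for normalising input depth to sample depth.
    """
    # Background subtraction per strand; negative values are clipped to zero.
    fwd = np.clip(np.asarray(fwd_sample, dtype=float)
                  - scale * np.asarray(fwd_input, dtype=float), 0.0, None)
    rev = np.clip(np.asarray(rev_sample, dtype=float)
                  - scale * np.asarray(rev_input, dtype=float), 0.0, None)

    # Prefix sums let every window sum be computed in constant time.
    cum_fwd = np.concatenate(([0.0], np.cumsum(fwd)))
    cum_rev = np.concatenate(([0.0], np.cumsum(rev)))

    n = fwd.size
    pos = np.arange(n)
    left_start = np.maximum(pos - window, 0)
    right_end = np.minimum(pos + window, n)

    # Forward-strand signal in [pos - window, pos).
    upstream_fwd = cum_fwd[pos] - cum_fwd[left_start]
    # Reverse-strand signal in [pos, pos + window).
    downstream_rev = cum_rev[right_end] - cum_rev[pos]

    # Assumed scoring rule: both windows must be enriched for a high score.
    return np.minimum(upstream_fwd, downstream_rev)
```

The idea behind the minimum is that a real binding site should show forward-strand reads piling up just upstream and reverse-strand reads just downstream, so a pile-up on only one strand scores low.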

  • #2
    Hi there!

    Originally posted by steinmann
    Hi all of you, I was wondering whether there are any peak finders for transcription factors out there that qualify as good under the following criteria:

    * can distinguish between closely adjacent peaks
    * identifies peaks with high spatial resolution
    * high sensitivity and specificity
    * accepts aligned reads in .bed format
    * does not confuse the user with millions of useless options
    * reasonably fast (processes more than 2 million Drosophila reads per minute)
    * makes full use of input sequencing by subtracting background before peak finding
    * free and open source

    Have wasted quite some time testing various peak finders and have been very disappointed by everything I looked at. The ones that I have tested include MACS, FindPeaks and PICS.
    Well, you are asking for the perfect software! I believe there's no general solution to your problem. I've tried CisGenome, MACS and FP4 and, depending on the biological problem, I think you'll have to tune your parameters. All the available software relies on different statistics and different assumptions, so each may perform better on certain analyses...
    Generally speaking, all of those are able to find TF binding sites in a reliable way... things change when you're looking for histone modifications or, possibly, megabase-wide phenomena.

    Originally posted by steinmann
    Wrote my own peak finder today that appears to fulfill all of the above by using a simple strand-specific double-window scanning approach on a background-subtracted sample. It appears to work really well, and I am wondering right now why nobody has done this before.

    Would love to hear your opinion on this.
    Well, I'm working on something similar right now :-)

    d


    • #3
      Originally posted by dawe
      Generally speaking, all of those are able to find TF binding sites in a reliable way... things change when you're looking for histone modifications or, possibly, megabase-wide phenomena.
      d
      Have not managed to do proper peak finding with those. MACS simply cannot distinguish between closely adjacent peaks, and I cannot get FP4 to stop missing a whole lot of obvious peaks.

      Originally posted by dawe
      Well, I'm working on something similar right now :-)
      d
      Interesting

      Which language?
      Do you intend to publish a paper on it?
      Will you make it freely available?
      How did you solve the problem of splitting closely adjacent peaks? Similar to FP4?

      Am not exactly sure what would be the best way to identify and separate peaks that are so close that they overlap. Would want the function to be as simple and robust as possible.


      • #4
        Originally posted by steinmann
        Interesting

        Which language?
        Do you intend to publish a paper on it?
        Will you make it freely available?
        How did you solve the problem of splitting closely adjacent peaks? Similar to FP4?
        I'm using Python, especially for the numpy/scipy modules (which are pretty fast). Hopefully there will be a paper; it very much depends on how it performs on the real data I'm working on :-)
        About the license... well, I've included the BSD license, but there's no code ready for release yet.
        About the adjacent peaks... There's no ready solution for that; I'm still thinking about it.
        BTW, FP4 has a couple of options which could help for that (trim and subpeaks), give those a try.

        Originally posted by steinmann
        Am not exactly sure what would be the best way to identify and separate peaks that are so close that they overlap. Would want the function to be as simple and robust as possible.
        Again, that would depend on the biological effect you are studying... There are cases in which two peaks should be considered as part of the same effect (e.g. pH2AX)...


        • #5
          My application would be the identification of transcription factor binding sites. Am aware of the subpeaks function in FP4 and it appears to work fairly well.

          Have now implemented something similar to analyze the enriched regions from the double window scanning. What I essentially get from my scanning is a merging and smoothing of the double peaks (see attachment for transformation of two closely adjacent peaks). For these regions I then take the first derivative and look for sign changes to identify all possible maxima. The maximum with the highest enrichment score I then define as my first peak. I then test whether I have a valley of a certain depth between the first peak and the second highest maximum. If this is the case I return both peaks; if not, I try the third highest maximum, and so on.

          Seems to be reasonably robust and simple, but does not take care of triple peaks.
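The splitting procedure described above can be sketched roughly as follows in Python with numpy. This is a sketch under assumptions, not the attached implementation: the function name `split_peaks` and the `min_valley_depth` parameter are hypothetical, the valley depth is measured from the lower of the two maxima (the post does not say which reference is used), and plateaus in the score are ignored because the score is assumed to be smoothed.

```python
import numpy as np

def split_peaks(score, min_valley_depth):
    """Return one or two peak positions within an enriched region (sketch).

    `score` is the smoothed enrichment score along the region.
    `min_valley_depth` is an assumed threshold for the dip between two peaks.
    """
    score = np.asarray(score, dtype=float)
    slope = np.diff(score)  # discrete first derivative

    # A maximum sits where the slope changes sign from positive to negative.
    maxima = np.where((slope[:-1] > 0) & (slope[1:] < 0))[0] + 1
    if maxima.size == 0:
        return []

    # Rank maxima by enrichment score, highest first.
    ranked = maxima[np.argsort(score[maxima])[::-1]]
    first = int(ranked[0])

    # Try the second highest maximum, then the third, and so on.
    for candidate in ranked[1:]:
        candidate = int(candidate)
        lo, hi = sorted((first, candidate))
        valley = score[lo:hi + 1].min()
        # Assumption: depth is measured from the lower of the two maxima.
        if score[candidate] - valley >= min_valley_depth:
            return sorted([first, candidate])

    # No sufficiently deep valley found: report a single peak.
    return [first]
```

As in the description, this returns at most two peaks per region, so a triple peak would still need an extra pass, for example by re-running the function on the sub-regions on either side of the valley.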
          Attached Files
