  • How to get TF binding sites from ChIP-Seq

    Hello all,

    I have the following question: suppose I have ChIP-Seq data for a protein, and I have called peaks, e.g. with MACS, and obtained, say, 40,000 peaks, each around 1000 bp wide. How do I proceed to determine the precise coordinates of all the bound proteins? (By precise I mean at least 10 bp resolution.)

    Thanks!

  • #2
    The coordinates of the peak summits (MACS writes these to a separate file, afaik) should give you a fairly good starting point.
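
As a rough illustration of using the summits as a starting point, here is a minimal Python sketch that turns summit records into small windows around each summit. It assumes the summits file is a BED file with one 1-bp interval per summit (chrom, start, end, name, score); the function name `summit_windows` and the `flank` parameter are made up for this example.

```python
def summit_windows(bed_lines, flank=50):
    """Return (chrom, start, end, name) windows of +/- flank bp around each summit.

    Assumes each BED record is a 1-bp interval whose start is the summit.
    """
    windows = []
    for line in bed_lines:
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        chrom, summit = fields[0], int(fields[1])
        name = fields[3] if len(fields) > 3 else "."
        # Clamp the left edge so windows never start before the chromosome.
        windows.append((chrom, max(0, summit - flank), summit + 1 + flank, name))
    return windows


if __name__ == "__main__":
    example = ["chr3L\t12092620\t12092621\tMACS_peak_1\t212"]
    for window in summit_windows(example, flank=50):
        print(*window, sep="\t")
```

With `flank=5` the windows would be 11 bp wide, which is in the range of the 10 bp resolution asked about, although how well the summit actually matches the binding site depends on the data.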

    • #3
      Are the summits exactly in the middle between the peak start and end?

      • #4
        That is usually not the case (it depends on the peak caller, though).

        • #5
          Hi,
          I mapped my ChIP-seq data to the dm3 reference genome using bowtie. Once I had the BAM file, I used MACS to call peaks and got a little over 15,000 of them. I also got a NAME_summits.bed file. What does "summit" mean, and how can I use this file for further downstream analysis? Secondly, how can I get confident and enriched peaks from the peaks.xls file from MACS?
          Please let me know your views and suggestions.
          Anurag

          • #6
            Originally posted by anurag.gautam
            Hi
            What does "summit" mean, and how can I use this file for further downstream analysis?
            Summits are the positions with maximum enrichment within a larger peak area. Most likely they map to the exact binding position of your target protein. However, this will depend on whether your target protein binds DNA directly.

            Downstream analysis depends on the question you are asking. If you want, e.g., to discover a binding motif for your TF, the summit position can be used to isolate DNA pieces for motif enrichment analysis (-> MEME).
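
To make "isolate DNA pieces" around the summit concrete, here is a minimal sketch that cuts a fixed window around each summit and writes the result as FASTA text, which motif tools such as MEME take as input. The genome is passed as a plain dictionary of chromosome name to sequence purely for illustration; the function name `summit_fasta` and the `flank` parameter are assumptions of this example, and in practice you would read the sequences from the dm3 FASTA file.

```python
def summit_fasta(genome, summits, flank=50):
    """Build FASTA text with the sequence +/- flank bp around each summit.

    genome  : dict mapping chromosome name -> sequence string (illustrative)
    summits : iterable of (chrom, position) with 0-based positions
    """
    records = []
    for chrom, pos in summits:
        seq = genome[chrom]
        # Clamp the window to the chromosome boundaries.
        start = max(0, pos - flank)
        end = min(len(seq), pos + flank + 1)
        records.append(f">{chrom}:{start}-{end}\n{seq[start:end]}")
    return "\n".join(records) + "\n"


if __name__ == "__main__":
    toy_genome = {"chrT": "ACGTACGTAC"}  # toy sequence, not real data
    print(summit_fasta(toy_genome, [("chrT", 4)], flank=2), end="")
```

Windows of about 100 bp in total are a common choice for TF motif discovery, but the right width is a judgment call.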

            Originally posted by anurag.gautam
            Secondly, how can I get confident and enriched peaks from the peaks.xls file from MACS?
            Please rephrase the question, as I cannot understand it. Sorry.

            • #7
              Originally posted by mudshark
              summits are the positions with maximum enrichment within a larger peak area. most likely they map to the exact binding position of your target protein. however, this will depend on whether your target protein binds to DNA directly..

              downstream analysis depends on the question you are asking. if you e.g. want to discover a binding motif or your TF, the summit position might be used to isolate DNA pieces for motif enrichment analysis (-> MEME)

              please re-phrase the question as I cannot understand it. sorry.
              MACS calls peaks using p-value and mfold cutoffs. Can I directly use the coordinates of the called peaks to get the DNA sequences for motif analysis, or can I apply further cutoffs or filtering criteria, based on the summit and fold-enrichment values, to find the strong peaks I am interested in? That would in turn give me confident and enriched peaks.
              I hope that clarifies my question. Let me know.

              • #8
                Originally posted by anurag.gautam
                Can I directly use the coordinates of the called peaks to get my DNA sequence for motif analysis, or can I apply further cutoffs or filtering criteria, based on the summit and fold-enrichment values, to find the strong peaks I am interested in?
                Sure, it can make sense to filter the peaks further, in particular if you have lots of them (>1000). A straightforward approach would be to look at the top 100 or 200 (ranked by enrichment or p-value).
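
The ranking step can be sketched in a few lines of Python. This assumes peaks.xls rows are whitespace-separated with eight columns: chr, start, end, length, summit, tags, -10*log10(pvalue), fold_enrichment. The names `parse_peaks`, `top_peaks` and the key `neg10log10p` are invented for this example.

```python
def parse_peaks(lines):
    """Parse peaks.xls-style rows into dictionaries, skipping comments and the header."""
    peaks = []
    for line in lines:
        fields = line.split()
        if len(fields) != 8 or line.startswith("#") or fields[0] == "chr":
            continue
        peaks.append({
            "chr": fields[0],
            "start": int(fields[1]),
            "end": int(fields[2]),
            "length": int(fields[3]),
            "summit": int(fields[4]),
            "tags": int(fields[5]),
            "neg10log10p": float(fields[6]),
            "fold_enrichment": float(fields[7]),
        })
    return peaks


def top_peaks(peaks, key="neg10log10p", n=100):
    """Return the n peaks with the largest value of `key` (larger = stronger)."""
    return sorted(peaks, key=lambda p: p[key], reverse=True)[:n]


if __name__ == "__main__":
    rows = [
        "chr start end length summit tags -10*log10(pvalue) fold_enrichment",
        "chr3L 12092327 12092827 501 294 212 1372.6 41.16",
        "chrX 18215330 18215683 354 249 147 947.81 35.95",
        "chrX 587408 587798 391 234 171 1134.01 35.17",
    ]
    for peak in top_peaks(parse_peaks(rows), key="fold_enrichment", n=2):
        print(peak["chr"], peak["start"], peak["fold_enrichment"])
```

Ranking by p-value score and by fold enrichment will generally give similar but not identical orders, so it is worth checking both.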

                • #9
                  Originally posted by mudshark
                  Sure it can make sense to further filter the peaks in particular if you have a lots of them (>1000). Straight forward approach would be to look at the top 100 or 200 ones (ranked by enrichment or p-value).
                  OK. I sorted the peaks in descending order by -10*log10(pvalue). But what about the summit value? I also want to use it to rank my highly confident or enriched peaks. Can you suggest a criterion that uses -10*log10(pvalue), fold_enrichment and the summit value to rank my peaks?

                  For example:
                  chr    start     end       length  summit  tags  -10*log10(pvalue)  fold_enrichment
                  chr3L  12092327  12092827  501     294     212   1372.6             41.16
                  chrX   18215330  18215683  354     249     147   947.81             35.95
                  chrX   587408    587798    391     234     171   1134.01            35.17
                  chr3L  3348888   3349361   474     259     171   1004.39            34.13
                  chrX   8385843   8386276   434     180     143   793.24             33.87
                  chr3R  2225145   2225813   669     396     212   1117.08            33.43

                  • #10
                    Does the summit value also tell me about the height of my peak?

                    • #11
                      The summit is just a position within the peak area, not a height.
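
A small sketch of what "position within the peak area" means in practice, assuming (as the numbers posted above suggest) that the summit column in this peaks.xls is an offset from the peak start rather than a genomic coordinate. The function names are made up for this example, and newer MACS versions may report an absolute summit column instead, so check your own output.

```python
def absolute_summit(start, summit_offset):
    """Genomic summit coordinate, assuming the summit column is an offset from start."""
    return start + summit_offset


def shift_from_midpoint(start, end, summit_offset):
    """Signed distance (bp) of the summit from the middle of the peak region."""
    midpoint = (start + end) / 2
    return absolute_summit(start, summit_offset) - midpoint


if __name__ == "__main__":
    # First row of the table posted above: chr3L 12092327 12092827, summit 294.
    print(absolute_summit(12092327, 294))
    print(shift_from_midpoint(12092327, 12092827, 294))
```

The shift is usually non-zero, which is why the summit, and not the midpoint of the peak region, is the better anchor for extracting sequence.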

                      • #12
                        If I also want to use the summit value to rank my highly confident or enriched peaks, is there a criterion you can suggest that uses -10*log10(pvalue), fold_enrichment and the summit value together?

                        • #13
                          I suggest you just sort by p-value (or enrichment), keep the top 100, and then extract the sequence around the summit position.

                          • #14
                            Thanks for your quick replies, mudshark.
                            I was able to rank my strong peaks based on p-value and a peak length >1000 bp (depending on the peaks called). Could you give some further ideas about motif analysis? I also used the MEME suite for de novo motif analysis. Once I have the motifs, what kind of significant biological information can be drawn from them?
