  • How to get TF binding sites from ChIP-Seq

    Hello all,

    I have the following question: suppose I have ChIP-Seq data for a protein, ran peak calling (e.g. with MACS), and obtained, say, 40,000 peaks, each around 1000 bp wide. How do I proceed to find the precise coordinates of all bound proteins? (By precise I mean a resolution of 10 bp or better.)

    Thanks!

  • #2
    The coordinates of the peak summits (MACS writes these to a separate file, AFAIK) should give you a fairly good starting point.
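
As a rough illustration of reading such a summits file, here is a minimal sketch. It assumes a BED-like layout (chrom, start, end, name, score) in which each summit is a 1 bp interval; the example lines are made up, and the exact columns may differ between MACS versions, so check your own file first.

```python
def read_summits(lines):
    """Parse BED-style summit lines into (chrom, position, name) tuples."""
    summits = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and header-like lines
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        chrom, start = fields[0], int(fields[1])
        name = fields[3] if len(fields) > 3 else ""
        summits.append((chrom, start, name))
    return summits


# Hypothetical example lines, not real MACS output
example = [
    "chr3L\t12092620\t12092621\tMACS_peak_1\t212",
    "chrX\t18215578\t18215579\tMACS_peak_2\t147",
]
summits = read_summits(example)
for chrom, pos, name in summits:
    print(chrom, pos, name)
```

With a real file you would pass an open file handle instead of the `example` list.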


    • #3
      Are the summits exactly in the middle between the peak start and end?


      • #4
        That is usually not the case (it depends on the peak caller, though).
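
One way to check this on your own data is to compare each summit offset with half the peak length. The sketch below uses the `length` and `summit` columns of the six peaks posted in #9 of this thread, and assumes the summit value is an offset counted from the peak start.

```python
# (length, summit offset from peak start) for the six peaks listed in post #9
peaks = [(501, 294), (354, 249), (391, 234), (474, 259), (434, 180), (669, 396)]


def summit_shift(length, summit):
    """Distance in bp of the summit from the peak midpoint (positive = right of centre)."""
    return summit - length / 2


shifts = [summit_shift(length, summit) for length, summit in peaks]
for (length, summit), shift in zip(peaks, shifts):
    print(f"length={length} summit={summit} shift={shift:+.1f} bp")
```

For these six peaks every summit sits more than 20 bp away from the midpoint, so using the peak centre instead of the summit would miss the 10 bp resolution asked for in the first post.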


        • #5
          Hi
          I mapped my ChIP-seq data to the dm3 reference genome using Bowtie. Once I had the BAM file, I used MACS to call peaks and got a little over 15,000 of them. I also got a NAME_summits.bed file. What does "summit" mean, and how can I use this file for downstream analysis? Secondly, how can I get confident, enriched peaks from the peaks.xls file from MACS?
          Please let me know your views and suggestions.
          Anurag


          • #6
            Originally posted by anurag.gautam
            Hi
            What does "summit" mean, and how can I use this file for downstream analysis?
            Summits are the positions of maximum enrichment within a larger peak region. Most likely they map to the binding position of your target protein. However, this depends on whether your target protein binds DNA directly.
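
To make "position of maximum enrichment" concrete, here is a toy sketch that picks the highest point of a per-base coverage profile. This is not how MACS computes summits internally (the function name, the tie-breaking rule and the coverage values are all assumptions made for illustration); it only shows the idea.

```python
def find_summit(coverage, peak_start):
    """Return the genomic position with the highest coverage in a peak.

    coverage: per-base read counts for the peak region.
    peak_start: genomic coordinate of coverage[0].
    Ties are resolved by taking the middle of the tied positions.
    """
    highest = max(coverage)
    tied = [i for i, depth in enumerate(coverage) if depth == highest]
    return peak_start + tied[len(tied) // 2]


# Made-up coverage profile for a 10 bp region starting at position 1000
toy = [1, 2, 4, 7, 9, 9, 9, 6, 3, 1]
summit = find_summit(toy, peak_start=1000)
print(summit)
```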

            Downstream analysis depends on the question you are asking. If, for example, you want to discover a binding motif of your TF, the summit positions can be used to extract DNA sequences for motif enrichment analysis (e.g. with MEME).

            Originally posted by anurag.gautam
            Secondly, how can I get confident, enriched peaks from the peaks.xls file from MACS?
            Please rephrase the question, as I cannot understand it. Sorry.


            • #7
              Originally posted by mudshark
              Summits are the positions of maximum enrichment within a larger peak region. Most likely they map to the binding position of your target protein. However, this depends on whether your target protein binds DNA directly.

              Downstream analysis depends on the question you are asking. If, for example, you want to discover a binding motif of your TF, the summit positions can be used to extract DNA sequences for motif enrichment analysis (e.g. with MEME).

              Please rephrase the question, as I cannot understand it. Sorry.
              MACS calls peaks based on the p-value and mfold cutoffs you provide. Can I directly use the coordinates of the called peaks to extract DNA sequences for motif analysis, or should I apply additional cutoffs or filtering criteria, based on the summit and fold-enrichment values, to find the strong peaks I am interested in, which in turn would give me confident, enriched peaks?
              I hope that clarifies my question. Let me know.


              • #8
                Originally posted by anurag.gautam
                Can I directly use the coordinates of the called peaks to extract DNA sequences for motif analysis, or should I apply additional cutoffs or filtering criteria, based on the summit and fold-enrichment values, to find the strong peaks I am interested in, which in turn would give me confident, enriched peaks?
                Sure, it can make sense to filter the peaks further, particularly if you have a lot of them (>1000). A straightforward approach would be to look at the top 100 or 200 (ranked by enrichment or p-value).
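
A minimal sketch of this ranking step, using the six rows posted further down in this thread (#9). The dictionary keys `score` (for -10*log10(pvalue)) and `fold` (for fold_enrichment) and the helper `top_peaks` are names chosen here for illustration; with real data you would parse the full peaks.xls table and use n=100 or n=200.

```python
from operator import itemgetter

# The six example rows from post #9: score = -10*log10(pvalue), fold = fold_enrichment
rows = [
    {"chr": "chr3L", "start": 12092327, "end": 12092827, "summit": 294, "score": 1372.6, "fold": 41.16},
    {"chr": "chrX", "start": 18215330, "end": 18215683, "summit": 249, "score": 947.81, "fold": 35.95},
    {"chr": "chrX", "start": 587408, "end": 587798, "summit": 234, "score": 1134.01, "fold": 35.17},
    {"chr": "chr3L", "start": 3348888, "end": 3349361, "summit": 259, "score": 1004.39, "fold": 34.13},
    {"chr": "chrX", "start": 8385843, "end": 8386276, "summit": 180, "score": 793.24, "fold": 33.87},
    {"chr": "chr3R", "start": 2225145, "end": 2225813, "summit": 396, "score": 1117.08, "fold": 33.43},
]


def top_peaks(peaks, key="score", n=100):
    """Return the n peaks with the largest value of `key`, best first."""
    return sorted(peaks, key=itemgetter(key), reverse=True)[:n]


by_score = top_peaks(rows, key="score", n=3)
by_fold = top_peaks(rows, key="fold", n=3)
print([p["start"] for p in by_score])
print([p["start"] for p in by_fold])
```

Note that the two rankings need not agree: in this small example the top three by p-value score and the top three by fold enrichment share only two peaks, so it is worth deciding up front which measure you rank on.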


                • #9
                  Originally posted by mudshark
                  Sure, it can make sense to filter the peaks further, particularly if you have a lot of them (>1000). A straightforward approach would be to look at the top 100 or 200 (ranked by enrichment or p-value).
                  OK. I sorted the peaks in descending order of -10*log10(pvalue). But what about the summit value? I would also like to use it to rank my highly confident or enriched peaks. Can you suggest a criterion that uses -10*log10(pvalue), fold_enrichment and the summit value together to rank my peaks?

                  For example:
                  chr    start     end       length  summit  tags  -10*log10(pvalue)  fold_enrichment
                  chr3L  12092327  12092827  501     294     212   1372.6             41.16
                  chrX   18215330  18215683  354     249     147   947.81             35.95
                  chrX   587408    587798    391     234     171   1134.01            35.17
                  chr3L  3348888   3349361   474     259     171   1004.39            34.13
                  chrX   8385843   8386276   434     180     143   793.24             33.87
                  chr3R  2225145   2225813   669     396     212   1117.08            33.43


                  • #10
                    Does the summit value also tell me the height of my peak?


                    • #11
                      No, the summit value is just the position of the summit within the peak region.
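
Because the summit column is a position inside the peak, it can be turned into a genomic coordinate by adding it to the peak start. The sketch below assumes the offset is counted from the `start` column; whether that count is 0-based or 1-based can shift the result by one base, so compare one or two values against your summits.bed file before relying on it.

```python
def absolute_summit(start, summit_offset):
    """Genomic coordinate of the summit, assuming the offset is counted from the peak start."""
    return start + summit_offset


# First row of the table in post #9: start 12092327, summit 294
abs_pos = absolute_summit(12092327, 294)
print(abs_pos)
```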


                      • #12
                        What if I also want to use the summit value to rank my highly confident or enriched peaks? Can you suggest a criterion that uses -10*log10(pvalue), fold_enrichment and the summit value together to rank my peaks?


                        • #13
                          I suggest you just sort by p-value (or enrichment), keep the top 100, and then extract the sequence around each summit position.
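
The extraction step could look roughly like the sketch below. It assumes the chromosome sequences are already in memory as plain strings, that summit positions are 0-based, and that a fixed flank on each side is wanted; the toy genome, the flank size and the helper names are all made up for illustration. The resulting FASTA text is the kind of input a motif finder such as MEME takes.

```python
def summit_window(chrom_seq, summit, flank=50):
    """Return the sequence from summit - flank to summit + flank, clamped to the chromosome ends."""
    left = max(0, summit - flank)
    right = min(len(chrom_seq), summit + flank + 1)
    return chrom_seq[left:right]


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# Toy genome and summits (hypothetical; real sequences would come from a genome FASTA)
genome = {"chrToy": "ACGT" * 50}  # 200 bp
summits = [("chrToy", 100), ("chrToy", 5)]

records = [
    (f"{chrom}:{pos}", summit_window(genome[chrom], pos, flank=10))
    for chrom, pos in summits
]
fasta = to_fasta(records)
print(fasta)
```

The second toy summit lies close to the start of the chromosome, so its window is shorter than the others; with real data you may prefer to drop such truncated windows.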


                          • #14
                            Thanks for your quick replies, mudshark.
                            I was able to rank my strong peaks based on p-value and a peak length of >1000 bp (a cutoff chosen based on my called peaks). Could you give me some further ideas about motif analysis? I also used the MEME Suite for de novo motif discovery. Once I have the motifs, what kind of meaningful biological information can be drawn from them?
