  • Promoter Analysis

    I did ChIP-seq on a TF.

    Now I have a consensus binding site. ATGNNNCGCNNNCAT (whatever)

    What I want to do is predict where else this consensus binding site occurs, aside from the ChIP sites. I used Virtual Footprint (I put in the sequences that made up the consensus site) and got 400 possible matches for where else such DNA sequences exist within my genome. I have the start and stop locations with respect to my fasta file.
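For readers without access to Virtual Footprint, the genome-wide search for a consensus such as ATGNNNCGCNNNCAT can be roughly approximated with a regular expression. The following is only a minimal sketch, not what Virtual Footprint does: it assumes exact matches with N as the only wildcard, it reports 0-based, half-open coordinates (as in BED), and the function names are made up for illustration.

```python
import re

# Complement table; only A, C, G, T and the wildcard N are handled (assumption).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def consensus_to_regex(consensus):
    """Turn a consensus with N wildcards into a compiled regex.

    The lookahead wrapper lets overlapping matches be reported too.
    """
    body = consensus.upper().replace("N", "[ACGT]")
    return re.compile("(?=(" + body + "))")


def find_sites(sequence, consensus):
    """Return sorted (start, end, strand) hits on both strands.

    Coordinates are 0-based and half-open, relative to the given sequence.
    """
    seq = sequence.upper()
    hits = set()
    motifs = {"+": consensus, "-": reverse_complement(consensus)}
    for strand, motif in motifs.items():
        for match in consensus_to_regex(motif).finditer(seq):
            start = match.start(1)
            hits.add((start, start + len(motif), strand))
    return sorted(hits)


if __name__ == "__main__":
    toy = "TTATGAAACGCTTTCATGG"  # made-up sequence, not real data
    for start, end, strand in find_sites(toy, "ATGNNNCGCNNNCAT"):
        print("Chromosome", start, end, strand, sep="\t")
```

A real binding-site predictor would normally score a position weight matrix rather than demand exact matches, so treat the hits from this sketch as a crude first pass.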

    Now I want to take those locations and identify which genes are in the vicinity of each binding site (+/- 50 bp from these sites). I don't know how to do this except by manually looking through IGV.

    How can I automate this process? Thank you for your help.

    I tried to use ChIP anno with R, but there are issues just loading the libraries. Any Perl scripts or something similar would be useful. Thank you.

  • #2
    I tried using Excel and quickly realized I need to run loops.


    • #3
      This sounds like a problem you can use Galaxy for.


      • #4
        So you have the chromosome and start/stop for your ~400 positions? Put them in BED format (tab-separated lines with "chromosome start stop"), get a GFF/GTF file with the genes for your genome (possibly filtering it with grep for the features you are interested in), and use BEDTools (a Swiss Army knife for all annotation comparison needs), e.g. the "closest" command.
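To make the idea concrete, here is a minimal pure-Python sketch of a nearest-gene lookup and a +/- 50 bp window filter over BED-like rows. It is not how BEDTools is implemented, and its distance convention (number of bases between two intervals, 0 if they touch or overlap) is a simplifying assumption that may differ from what BEDTools reports. The names `Interval`, `parse_bed`, `gap`, `closest_gene` and `genes_within` are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int
    name: str


def parse_bed(text: str) -> List[Interval]:
    """Parse 4-column BED-like text: chrom, start, end, name.

    Any whitespace is accepted as a separator (assumption: names have no spaces).
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or line.startswith("#"):
            continue  # skip blank, short, or comment lines
        rows.append(Interval(fields[0], int(fields[1]), int(fields[2]), fields[3]))
    return rows


def gap(a: Interval, b: Interval) -> int:
    """Bases between two intervals on one chromosome; 0 if they touch or overlap."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def closest_gene(site: Interval, genes: List[Interval]) -> Optional[Tuple[Interval, int]]:
    """Return (nearest gene, gap) on the site's chromosome, or None if there is none."""
    same_chrom = [g for g in genes if g.chrom == site.chrom]
    if not same_chrom:
        return None
    best = min(same_chrom, key=lambda g: gap(site, g))
    return best, gap(site, best)


def genes_within(site: Interval, genes: List[Interval], window: int = 50) -> List[Interval]:
    """Return all genes on the site's chromosome whose gap to the site is <= window."""
    return [g for g in genes if g.chrom == site.chrom and gap(site, g) <= window]


if __name__ == "__main__":
    # Rows copied from the example files posted further down in this thread.
    sites = parse_bed("Chromosome 2985 2998 Site1\nChromosome 6738 6751 Site2\n")
    genes = parse_bed("Chromosome 351 1724 Gene1\nChromosome 1828 2946 Gene2\n")
    for site in sites:
        nearby = [g.name for g in genes_within(site, genes, window=50)]
        print(site.name, closest_gene(site, genes), nearby)
```

The linear scan is fine for a few hundred sites and a few thousand genes; for anything larger, sorted input or an interval index (which is what dedicated tools provide) would be the better choice.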


        • #5
          I did as "sarvidsson" suggested.

          Both files contain the chromosome name, start position, stop position, and name of the feature/gene, with no header row.

          Here is an example.

          My list of 400 positions is in a file called "toanno.bed", in the following format:
          Chromosome 2985 2998 Site1
          Chromosome 6738 6751 Site2

          The list of genes I want to match them with is in a file called "genome.bed", in the following format:
          Chromosome 351 1724 Gene1
          Chromosome 1828 2946 Gene2


          When I use the command
          closestBed -a toanno.bed -b genome.bed > features.bed

          I get a concatenated file containing both files head to tail... basically a long concatenate command...

          I figured out that I was not putting the data into proper .bed format. Basically, the problem is with the Unicode encoding.

          Save your data with Excel, which only does UTF-16, then re-save it as UTF-8. Wow, ridiculous.
          Last edited by TheSeqGeek; 02-15-2015, 02:39 PM.
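For anyone hitting the same encoding problem, the re-saving step can also be scripted. This is a minimal sketch that assumes the Excel export really is UTF-16 text as described above, that no field contains spaces, and that plain tab-separated UTF-8 with Unix line endings is what the downstream tool expects. The function name and the input file name are hypothetical.

```python
from pathlib import Path


def reencode_to_utf8_bed(src, dst, src_encoding="utf-16"):
    """Rewrite a UTF-16 text export as tab-separated UTF-8 with "\n" line endings.

    Returns the number of non-empty lines written.
    """
    text = Path(src).read_bytes().decode(src_encoding)
    text = text.lstrip("\ufeff")  # drop a stray byte-order mark if one survived decoding
    lines = []
    for line in text.splitlines():  # also handles Windows "\r\n" endings
        fields = line.split()  # assumption: fields themselves contain no whitespace
        if fields:
            lines.append("\t".join(fields))
    # Write bytes directly so no platform newline translation sneaks back in.
    Path(dst).write_bytes(("\n".join(lines) + "\n").encode("utf-8"))
    return len(lines)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python reencode.py toanno_excel.txt toanno.bed
    if len(sys.argv) == 3:
        print(reencode_to_utf8_bed(sys.argv[1], sys.argv[2]), "lines written")
```

If the export turns out to use a different encoding, pass it via `src_encoding` rather than relying on the default.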


          • #6
            You could probably annotate the base pair positions using a tool that annotates lists of variants from NGS: if a position is near a gene, it would get annotated as upstream, downstream, intronic, etc. That would probably work for some of the positions. You could also align the bp positions to annotation information from 1000 Genomes to find out if the site is in or near a gene.


            • #7
              Originally posted by AlliCox
              You could probably annotate the base pair positions using a tool that annotates lists of variants from NGS.
              So what's the tool?


              • #8
                Originally posted by TheSeqGeek
                So what's the tool?
                You could use SnpEff, but then you'd need to fake some VCF to get there. BEDTools is the tool for the job.


                • #9
                  Originally posted by sarvidsson
                  You could use SnpEff, but then you'd need to fake some VCF to get there. BEDTools is the tool for the job.
                  Yeah, I already got it to work with the BEDTools closestBed command. The only issue was the type of text editor I was using to generate the .bed file, as I described above for anyone else having similar issues.
