SEQanswers

Old 01-21-2010, 05:45 AM   #1
RockChalkJayhawk
Senior Member
 
Location: Rochester, MN

Join Date: Mar 2009
Posts: 191
Functional Annotation

What tool does everyone use to annotate ChIP-Seq peaks (e.g. nearest gene)? Is there a Linux source I could download somewhere for it?
Old 01-21-2010, 07:26 AM   #2
quinlana
Senior Member
 
Location: Charlottesville

Join Date: Sep 2008
Posts: 119

I suggest either:

a. Galaxy's "Operate on Genomic Intervals": http://main.g2.bx.psu.edu/

or

b. BEDTools (admittedly my own command-line software). You would download genes and whatever annotations you are interested in (BED format) and then use the tools to find closest genes (closestBed), etc.
bedtools.googlecode.com

Aaron
Old 01-29-2010, 08:30 AM   #3
RockChalkJayhawk
Senior Member
 
Location: Rochester, MN

Join Date: Mar 2009
Posts: 191
Aaron,
Can BEDTools find intersections among more than 2 BED files? For example, if I am doing ChIP-Seq for Factors A, B, C, and D, I want a single BED file telling me all the places enriched for all of the factors, or for 3 out of 4, etc.
Old 01-29-2010, 12:32 PM   #4
quinlana
Senior Member
 
Location: Charlottesville

Join Date: Sep 2008
Posts: 119
Hi,
BEDTools cannot do what you ask in a single command. However, there are multiple ways to do this with a couple of commands. I demonstrate two possible solutions below (assuming I understood you correctly).

Based on your example, let's assume you have four BED files, each representing regions of enrichment for A, B, C, and D, respectively.

The following command will return all of the regions enriched for A that overlap (by at least 1bp) with regions enriched for B, C, and D. The "-u" option returns each A entry only once, even when multiple overlaps are found.
Code:
$ intersectBed -a A.bed -b B.bed -u | \
  intersectBed -a stdin -b C.bed -u | \
  intersectBed -a stdin -b D.bed -u > ABCD.bed
You could then mix and match commands like this to capture all possible situations.
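
As one illustration of mixing and matching, the sketch below keeps regions enriched for A and B but for neither C nor D. It assumes intersectBed's "-v" option, which reports only those A entries that have no overlap in B. The function name in_first_two_only and the output file name are made up for this example.

```shell
# Hypothetical helper: regions of the first file that overlap the second
# but overlap neither the third nor the fourth (e.g. A and B, not C or D).
in_first_two_only() {
    intersectBed -a "$1" -b "$2" -u | \
      intersectBed -a stdin -b "$3" -v | \
      intersectBed -a stdin -b "$4" -v
}

# Usage, with the file names from the example above:
# in_first_two_only A.bed B.bed C.bed D.bed > AB_notCD.bed
```

Swapping "-u" and "-v" at each step gives the other combinations, always reported as intervals from the first file.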

An alternate and perhaps simpler way is to count the number of overlaps between A/B, A/C, and A/D. The example below assumes each BED file has 3 columns (chrom, start, end), so the fourth column (hence the cut -f 4) is the count of overlaps between A and B that the "-c" option appends.

# Count the overlaps between A and the others. Every entry in A will have a count. It will be 0 if there were no overlaps.
Code:
$ intersectBed -a A.bed -b B.bed -c | cut -f 4 > AtoB.counts
$ intersectBed -a A.bed -b C.bed -c | cut -f 4 > AtoC.counts
$ intersectBed -a A.bed -b D.bed -c | cut -f 4 > AtoD.counts
# Now, let's paste the counts to the end of the A entries
Code:
$ paste A.bed AtoB.counts AtoC.counts AtoD.counts > AwithCounts.bed
Now you will have something that looks like this:
Code:
chr1	100	200	0	2	1
chr1	200	300	1	1	2
...
chrY	100	200	0	0	0
The first entry says that this A interval was also enriched in C and D, but not B.
The second entry says that this A interval was also enriched in all 3 others.
The third entry says that this A interval was not enriched in any others.

You would repeat for B, C and D and could then write a basic awk or perl script to ask your questions with such an output.
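
A minimal sketch of such an awk script follows, assuming the 6-column AwithCounts.bed layout shown above (chrom, start, end, then the counts against B, C, and D). The helper name min_others is hypothetical, and the three rows are only the illustrative ones from the example output.

```shell
#!/bin/sh
# Illustrative input only: the three example rows in the AwithCounts.bed layout
# (chrom, start, end, then overlap counts against B, C and D).
printf 'chr1\t100\t200\t0\t2\t1\nchr1\t200\t300\t1\t1\t2\nchrY\t100\t200\t0\t0\t0\n' > AwithCounts.bed

# min_others (hypothetical helper): print the A intervals that overlap at
# least N of the other three factors, with that number as a fourth column.
#   $1 = minimum number of other factors (1-3), $2 = counts file
min_others() {
    awk -F'\t' -v min="$1" 'BEGIN { OFS = "\t" }
        { n = ($4 > 0) + ($5 > 0) + ($6 > 0) }   # how many of B, C, D overlap
        n >= min { print $1, $2, $3, n }' "$2"
}

min_others 3 AwithCounts.bed   # A intervals enriched for all four factors
min_others 2 AwithCounts.bed   # A intervals enriched for at least 3 of the 4
```

Because every row is an A interval, a threshold of 2 others means at least 3 of the 4 factors. Regions enriched for B, C, and D but not A would only show up in the equivalent files built from B, C, or D.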

There are other ways to tackle this and obviously subtleties to the questions asked, but I hope this helps you get the ball rolling, as it were.

Aaron

Last edited by quinlana; 01-29-2010 at 01:26 PM. Reason: Corrected filenames in the second example.

Tags
annotation, chip-seq
