#1 |
Senior Member
Location: Rochester, MN Join Date: Mar 2009
Posts: 191
What tool does everyone use to annotate ChIP-Seq peaks (e.g. nearest gene)? Is there a Linux source I could download somewhere for it?
#2 |
Senior Member
Location: Charlottesville Join Date: Sep 2008
Posts: 119
I suggest either:

a. Galaxy's "Operate on Genomic Intervals": http://main.g2.bx.psu.edu/

or

b. BEDTools (admittedly my own command-line software). You would download genes and whatever annotations you are interested in (BED format) and then use the tools to find closest genes (closestBed), etc. bedtools.googlecode.com

Aaron
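For readers following option (b), the closestBed step might look roughly like the sketch below. The file names `peaks.bed` and `genes.bed` and their toy contents are assumptions made up for illustration; in practice you would substitute your own peak calls and a gene annotation downloaded in BED format. The call is guarded so the snippet does nothing harmful if BEDTools is not installed.

```shell
# Hypothetical toy inputs (tab-separated BED: chrom, start, end, name).
printf 'chr1\t1000\t1100\tpeak1\nchr1\t5000\t5100\tpeak2\n' > peaks.bed
printf 'chr1\t1500\t2500\tgeneA\nchr1\t6000\t7000\tgeneB\n' > genes.bed

# For each peak in -a, report the closest feature in -b.
# Only runs if closestBed is available on the PATH.
if command -v closestBed >/dev/null 2>&1; then
    closestBed -a peaks.bed -b genes.bed > peaks.nearest_gene.bed
fi
```

Each output line is the peak followed by the gene record nearest to it; newer BEDTools releases expect both inputs to be sorted by chromosome and start position, which the toy files already are.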
#3 | |
Senior Member
Location: Rochester, MN Join Date: Mar 2009
Posts: 191
Can BEDTools find intersections in more than 2 BED files? For example, if I am doing ChIP-Seq for Factors A, B, C, and D and I want a single BED file telling me all the places enriched for all of the factors, or 3 out of 4, etc.
#4 | |
Senior Member
Location: Charlottesville Join Date: Sep 2008
Posts: 119
BEDTools cannot do what you ask in a single command. However, there are multiple ways to do this with a couple of commands. I demonstrate two possible solutions below (assuming I understood you correctly).

Based on your example, let's assume you have four BED files, each representing regions of enrichment for A, B, C, and D, respectively. The following command will return all of the regions enriched for A that overlap (by at least 1bp) with regions enriched for B, C and D. The "-u" option returns a unique entry even when multiple overlaps are found.

Code:

    $ intersectBed -a A.bed -b B.bed -u | \
      intersectBed -a stdin -b C.bed -u | \
      intersectBed -a stdin -b D.bed -u > ABCD.bed

An alternate and perhaps simpler way is to count the number of overlaps between A/B, A/C, and A/D. The example below assumes each BED file has 3 columns (chrom, start, end), so the fourth column of the output (hence the cut -f 4) is the count of overlaps b/w A and B, which is returned by the "-c" option.

Code:

    # Count the overlaps b/w A and the others. Every entry in A will have a count.
    # It will be 0 if there were no overlaps.
    $ intersectBed -a A.bed -b B.bed -c | cut -f 4 > AtoB.counts
    $ intersectBed -a A.bed -b C.bed -c | cut -f 4 > AtoC.counts
    $ intersectBed -a A.bed -b D.bed -c | cut -f 4 > AtoD.counts

Code:

    # Attach the three count columns to the original A intervals.
    $ paste A.bed AtoB.counts AtoC.counts AtoD.counts > AwithCounts.bed

Code:

    # Example contents of AwithCounts.bed
    chr1 100 200 0 2 1
    chr1 200 300 1 1 2
    ...
    chrY 100 200 0 0 0

The second entry says that this A interval was also enriched in all 3 others. The last entry says that this A interval was not enriched in any others. You would repeat for B, C and D and could then write a basic awk or perl script to ask your questions with such an output.

There are other ways to tackle this and obviously subtleties to the questions asked, but I hope this helps you get the ball rolling, as it were.

Aaron

Last edited by quinlana; 01-29-2010 at 02:26 PM. Reason: Corrected filenames in the second example.
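As a rough illustration of the "basic awk script" idea, the sketch below filters a counts file like the one above. It is only one possible approach, not necessarily what the poster had in mind. The file name `AwithCounts.sample.bed` and its three rows are made-up sample data mirroring the example output; columns 4 to 6 are assumed to hold the overlap counts against B, C and D.

```shell
# Hypothetical sample data (tab-separated): chrom, start, end,
# then overlap counts against B, C and D.
printf 'chr1\t100\t200\t0\t2\t1\nchr1\t200\t300\t1\t1\t2\nchrY\t100\t200\t0\t0\t0\n' \
    > AwithCounts.sample.bed

# A intervals overlapped by all three other factors (enriched in 4 of 4).
awk -F'\t' '$4 > 0 && $5 > 0 && $6 > 0' AwithCounts.sample.bed > A_all4.bed

# A intervals overlapped by at least two of the other three factors
# (enriched in at least 3 of 4, counting A itself).
awk -F'\t' '{ n = ($4 > 0) + ($5 > 0) + ($6 > 0) } n >= 2' \
    AwithCounts.sample.bed > A_atleast3.bed
```

Because this starts from A, it only finds "3 out of 4" regions that include A; regions enriched for B, C and D but not A would need the same counting repeated with another file as the `-a` input.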
Tags |
annotation, chip-seq |