**quinlana** · 01-21-2010, 08:26 AM

I suggest either:

a. Galaxy's "Operate on Genomic Intervals": http://main.g2.bx.psu.edu/

or

b. BEDTools (admittedly my own command-line software). You would download genes and whatever annotations you are interested in (BED format) and then use the tools to find closest genes (closestBed), etc.
bedtools.googlecode.com

Aaron

**RockChalkJayhawk** · 01-29-2010, 09:30 AM

Aaron,
Can BEDTools find intersections among more than 2 BED files? For example, if I am doing ChIP-Seq for Factors A, B, C, and D, I want a single BED file telling me all the places enriched for all of the factors, or for 3 out of 4, etc.

**quinlana** · 01-29-2010, 01:32 PM

Hi,
BEDTools cannot do what you ask in a single command. However, there are multiple ways to do this with a couple of commands. I demonstrate two possible solutions below (assuming I understood you correctly).

Based on your example, let's assume you have four BED files, each representing regions of enrichment for A, B, C, and D, respectively.

The following command will return all of the regions enriched for A that overlap (by at least 1bp) with regions enriched for B, C, and D. The "-u" option reports each A entry once, even when multiple overlaps are found.

Code:

$ intersectBed -a A.bed -b B.bed -u | \
  intersectBed -a stdin -b C.bed -u | \
  intersectBed -a stdin -b D.bed -u > ABCD.bed

You could then mix and match commands like this to capture all possible situations.

An alternate and perhaps simpler way is to count the number of overlaps between A/B, A/C, and A/D. The example below assumes each BED file has 3 columns (chrom, start, end), so the fourth column of the output (hence the cut -f 4) is the count of overlaps b/w A and the other file, which is what the "-c" option appends.

# Count the overlaps b/w A and the others. Every entry in A will have a count. It will be 0 if there were no overlaps

Code:

$ intersectBed -a A.bed -b B.bed -c | cut -f 4 > AtoB.counts
$ intersectBed -a A.bed -b C.bed -c | cut -f 4 > AtoC.counts
$ intersectBed -a A.bed -b D.bed -c | cut -f 4 > AtoD.counts

# Now, let's paste the counts to the end of the A entries

Code:

$ paste A.bed AtoB.counts AtoC.counts AtoD.counts > AwithCounts.bed

Now you will have something that looks like this:

Code:

chr1	100	200	0	2	1
chr1	200	300	1	1	2
...
chrY	100	200	0	0	0

The first entry says that this A interval was also enriched in C (2 overlaps) and D (1 overlap), but not B.
The second entry says that this A interval was also enriched in all 3 others.
The last entry says that this A interval was not enriched in any others.

You would repeat for B, C and D and could then write a basic awk or perl script to ask your questions with such an output.
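
As a minimal sketch of that last awk step (not part of BEDTools itself), the filter below assumes AwithCounts.bed has exactly the layout shown above: chrom, start, end, then the overlap counts against B, C, and D in columns 4 to 6. The sample file it writes and the output names A_all4.bed and A_3of4.bed are hypothetical, chosen only for illustration. If your A.bed has more than 3 columns, the count columns shift accordingly.

```shell
# Hypothetical sample mirroring the AwithCounts.bed layout above:
# chrom, start, end, then overlap counts against B, C and D.
printf 'chr1\t100\t200\t0\t2\t1\nchr1\t200\t300\t1\t1\t2\nchrY\t100\t200\t0\t0\t0\n' > AwithCounts.bed

# A intervals that overlap all three other factors (enriched for A, B, C and D).
awk 'BEGIN { FS = OFS = "\t" }
     $4 > 0 && $5 > 0 && $6 > 0 { print $1, $2, $3 }' AwithCounts.bed > A_all4.bed

# A intervals that overlap at least two of the three others (at least 3 of 4 factors).
# The extra output column is the number of factors, counting A itself.
awk 'BEGIN { FS = OFS = "\t" }
     { n = ($4 > 0) + ($5 > 0) + ($6 > 0) }
     n >= 2 { print $1, $2, $3, n + 1 }' AwithCounts.bed > A_3of4.bed
```

Note that this only finds combinations that include A. Regions enriched for B, C, and D but not A would come from the equivalent table built with B (or C, or D) as the "-a" file.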

There are other ways to tackle this and obviously subtleties to the questions asked, but I hope this helps you get the ball rolling, as it were.

Aaron
