**SylvainL** · 03-20-2015, 02:48 AM

Hi,
the easiest (and fastest) way would be to use either the R packages Rsamtools (including GRanges) and rtracklayer, or bedtools...

**WhatsOEver** · 03-23-2015, 02:32 AM

Although I agree with SylvainL that R should be faster than awk in this case, you should not underestimate the power of Linux tools for tasks like this. Here is a simple awk-based script showing how this could look (in its current version it expects BED format):

#!/bin/sh

# Example call: thisScriptName.sh coordinatesFile dataFile outputFile

# $1 (coordinatesFile): the file with the coordinates you want to select for
# $2 (dataFile): the file containing your data
# $3 (outputFile): the file you want to save your results to

tab=$(printf '\t') # split each coordinate line on tabs
# Loop through the lines of your coordinatesFile and, for each line, look in your data file.
# The "|| [ -n ... ]" part keeps a last line that has no trailing line break.
while IFS="$tab" read -r chrom lowLimit highLimit rest || [ -n "$chrom" ]
do
    # Print data lines on the same chromosome whose interval overlaps lowLimit..highLimit.
    # This test also catches features that span the whole interval.
    awk -F'\t' -v chrom="$chrom" -v low="$lowLimit" -v high="$highLimit" \
        '$1 == chrom && $2 < high && $3 > low' "$2" >> "$3"
done < "$1"
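
The loop above reads the whole data file once for every coordinate line, which gets slow for long coordinate lists. Below is a minimal sketch of a single-pass alternative, assuming both files are tab-separated BED and the coordinates file fits in memory. The helper name `select_overlaps` and the sample rows are made up for illustration, and the overlap test is the usual half-open BED one (data start < interval end and data end > interval start, on the same chromosome).

```shell
#!/bin/sh
# select_overlaps is a hypothetical helper: print each data line that overlaps
# at least one interval in the coordinates file.
# Usage: select_overlaps coordinatesFile dataFile
select_overlaps() {
    awk -F'\t' '
        NR == FNR {            # first file: remember every coordinate interval
            n++
            chrom[n] = $1; low[n] = $2 + 0; high[n] = $3 + 0
            next
        }
        {                      # second file: test each data line once
            for (i = 1; i <= n; i++)
                if ($1 == chrom[i] && $2 < high[i] && $3 > low[i]) {
                    print
                    break      # report a data line only once
                }
        }
    ' "$1" "$2"
}

# Made-up sample data for illustration
workdir=$(mktemp -d)
printf 'chr1\t100\t200\nchr2\t50\t80\n' > "$workdir/coords.bed"
printf 'chr1\t150\t160\tgeneA\nchr1\t300\t400\tgeneB\nchr1\t50\t500\tgeneC\nchr2\t100\t200\tgeneD\n' > "$workdir/data.bed"

result=$(select_overlaps "$workdir/coords.bed" "$workdir/data.bed")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Unlike the loop, this prints matches in data-file order and reports each data line at most once, even if it overlaps several intervals. It also assumes the coordinates file is not empty.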

**AlexReynolds** · 03-25-2015, 11:20 AM

BEDOPS offers bedmap for mapping features from one BED file to another. It is fast and efficient, and supported on Linux. It offers operations that work with numerical and categorical features.

More information is available here:

6.2.1. bedmap — BEDOPS v2.4.41

http://bedops.readthedocs.org/en/latest/content/reference/statistics/bedmap.html
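
A sketch of what a bedmap call could look like, assuming BEDOPS is installed and both inputs are BED files. The file names and sample rows are made up. bedmap expects sorted input, so both files go through sort-bed first; the example skips the mapping step if the tools are not on the PATH.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Made-up sample data: regions of interest, and annotations with IDs in column 4
printf 'chr1\t100\t200\nchr2\t50\t80\n' > "$workdir/regions.bed"
printf 'chr1\t150\t160\tgeneA\nchr1\t300\t400\tgeneB\nchr2\t60\t70\tgeneC\n' > "$workdir/annotations.bed"

mapped=""
if command -v bedmap >/dev/null 2>&1 && command -v sort-bed >/dev/null 2>&1; then
    sort-bed "$workdir/regions.bed" > "$workdir/regions.sorted.bed"
    sort-bed "$workdir/annotations.bed" > "$workdir/annotations.sorted.bed"
    # For each region, print the region followed by the IDs of overlapping annotations
    mapped=$(bedmap --echo --echo-map-id --delim '\t' \
        "$workdir/regions.sorted.bed" "$workdir/annotations.sorted.bed")
    printf '%s\n' "$mapped"
else
    echo "bedmap/sort-bed not found; install BEDOPS to run this example" >&2
fi
rm -rf "$workdir"
```

Here `--echo` prints each region, `--echo-map-id` appends the IDs of the annotations that overlap it, and `--delim` replaces the default `|` separator between the two.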

Be careful when converting Excel output to BED: Microsoft adds special end-of-line characters that interfere with bedmap and other Unix tools. The "dos2unix" application is useful here for cleaning Excel-sourced text files:

http://bedops.readthedocs.org/en/latest/content/reference/statistics/bedmap.html#endlines
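
If dos2unix is not installed, the carriage returns can also be stripped with standard tools. Below is a minimal sketch, assuming the file was saved with Windows-style (CRLF) line endings. The file names are made up, and `tr -d '\r'` stands in for dos2unix here; it is not part of BEDOPS.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Simulate a tab-delimited export with CRLF line endings
printf 'chr1\t100\t200\r\nchr2\t50\t80\r\n' > "$workdir/excel_export.bed"

# Delete every carriage return, leaving Unix (LF-only) line endings
tr -d '\r' < "$workdir/excel_export.bed" > "$workdir/clean.bed"

# Count the lines that still contain a carriage return, before and after
cr=$(printf '\r')
before=$(grep -c "$cr" "$workdir/excel_export.bed")
after=$(grep -c "$cr" "$workdir/clean.bed")
echo "lines with CR before: $before, after: $after"
rm -rf "$workdir"
```

Running the same `grep -c` check on your own file is a quick way to find out whether it needs cleaning before you pass it to bedmap.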

Mapping annotation to genome positions
