SEQanswers

Old 04-25-2013, 08:34 AM   #1
dietmar13
Senior Member
 
Location: Vienna

Join Date: Mar 2010
Posts: 107
intersect (actually: filter) a gtf file with coordinates from a bed-file

-- SOLVED -- bedtools intersect

hello,

does someone know a script which filters a gtf-file with coordinates from a bed-file?

i have a bed-file with regions and want to filter out all features from a gtf-file which overlap (completely or largely) with these regions. is there a script or program? bedtools intersect does not work?

i know, an awk-script would not be very difficult, but why re-invent the wheel...

thank you,

dietmar

Last edited by dietmar13; 04-25-2013 at 10:37 PM.
Old 04-25-2013, 08:35 PM   #2
kmcarr
Senior Member
 
Location: USA, Midwest

Join Date: May 2008
Posts: 1,135
Default

Quote:
Originally Posted by dietmar13 View Post
bedtools intersect does not work?
Why do you say this? Seems like bedtools intersect with the '-v' option is exactly what you are looking for.
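
For anyone reading later: with bedtools installed, the call would look roughly like `bedtools intersect -v -a annotations.gtf -b regions.bed > filtered.gtf` (file names are placeholders). To see what '-v' is expected to keep without bedtools at hand, here is a minimal awk sketch that emulates the same "report A entries with no overlap in B" logic on made-up toy data. It is only an illustration for small files, not a replacement for bedtools.

```shell
# Toy data; file names and coordinates are invented for illustration.
printf 'chr1\t100\t200\n' > regions.bed   # BED: 0-based start, half-open end
printf 'chr1\tsrc\texon\t150\t180\t.\t+\t.\tgene_id "g1";\n'  > annotations.gtf
printf 'chr1\tsrc\texon\t300\t400\t.\t+\t.\tgene_id "g2";\n' >> annotations.gtf

# Keep only GTF features that overlap none of the BED regions.
awk -F'\t' '
    # First file (BED): remember every region.
    NR == FNR { c[NR] = $1; s[NR] = $2 + 0; e[NR] = $3 + 0; n = NR; next }
    # Second file (GTF): print the feature unless some region overlaps it.
    {
        keep = 1
        for (i = 1; i <= n; i++)
            # GTF start is 1-based inclusive, so subtract 1 to compare with BED.
            if ($1 == c[i] && ($4 - 1) < e[i] && ($5 + 0) > s[i]) { keep = 0; break }
        if (keep) print
    }
' regions.bed annotations.gtf > filtered.gtf
```

In this toy case the g1 exon falls inside the region and is dropped, while the g2 exon is kept in filtered.gtf.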
Old 04-25-2013, 10:36 PM   #3
dietmar13
Senior Member
 
Location: Vienna

Join Date: Mar 2010
Posts: 107
@kmcarr

thank you - you are right. i was misled by all the examples where only bed and bam files were used for bedtools intersect examples...

dietmar
Old 05-17-2013, 03:29 AM   #4
vishal.rossi
Member
 
Location: Bonn Germany

Join Date: Apr 2013
Posts: 25
Default

Hi,

I am comparing 2 different files. 1st file has 113 entries and the 2nd one has 88 entries.
I use the following command to get the differences
intersectBed -v -a 1.bed -b 2.bed or
intersectBed -v -wa -wb -a 1.bed -b 2.bed

But in both cases it shows that only 3 entries don't match, which is false.
Does anyone have an idea why?

Thanks
Old 05-17-2013, 04:45 AM   #5
syfo
Just a member
 
Location: Southern EU

Join Date: Nov 2012
Posts: 103
Default

Quote:
Originally Posted by vishal.rossi View Post
I am comparing 2 different files. [...] I use the following command to get the differences [...] only 3 entries don't match
What are you looking for exactly?
IntersectBed with the "-v" parameter will show you the intervals from "1.bed" that have nothing in common with the ones in "2.bed". Entries from "2.bed" are not supposed to be reported.
Also, one common nucleotide is enough by default to define an overlap between two intervals. For a more stringent criterion you might want to consider "-f" and "-r".
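
As a rough illustration of what these two options change: "-f" sets the minimum fraction of the A interval that must be covered, and "-r" requires the same fraction of the B interval as well, so a call could look like `intersectBed -v -f 0.5 -r -a 1.bed -b 2.bed`. The helper below (overlap_ok is a hypothetical name, not part of bedtools) is a small awk sketch of that test for a single pair of half-open intervals, under the assumption that any fraction threshold still needs at least one shared base.

```shell
# overlap_ok A_START A_END B_START B_END MIN_FRACTION RECIPROCAL(0|1)
# Hypothetical helper: prints "yes" if the pair counts as overlapping, else "no".
overlap_ok() {
    awk -v a1="$1" -v a2="$2" -v b1="$3" -v b2="$4" -v f="$5" -v r="$6" 'BEGIN {
        lo = (a1 + 0 > b1 + 0) ? a1 + 0 : b1 + 0    # larger of the two starts
        hi = (a2 + 0 < b2 + 0) ? a2 + 0 : b2 + 0    # smaller of the two ends
        ov = hi - lo; if (ov < 0) ov = 0            # shared bases
        ok = (ov > 0 && ov >= f * (a2 - a1))        # fraction of A, like -f
        if (r + 0) ok = ok && (ov >= f * (b2 - b1)) # also fraction of B, like -r
        print (ok ? "yes" : "no")
    }'
}

overlap_ok 100 200 199 300 0   0   # one shared base, no fraction required
overlap_ok 100 200 199 300 0.5 0   # 1 of 100 bases of A is far below 50%
overlap_ok 100 200 120 400 0.5 0   # 80 of 100 bases of A are covered
overlap_ok 100 200 120 400 0.5 1   # but only 80 of 280 bases of B
```

The first and third calls count as overlaps, the second and fourth do not, which is why a stricter threshold makes "-v" report more entries from "1.bed".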
Old 05-21-2013, 12:42 PM   #6
AlexReynolds
Member
 
Location: Seattle, WA

Join Date: Feb 2013
Posts: 45
Default

Another option is the BEDOPS suite: bedops does set operations on BED data, and gtf2bed does a lossless conversion of GTF data into BED format, so that the result can be used with the BEDOPS tools.

Let's assume that your regions-of-interest are in a file called myRegions.bed and your GTF-formatted annotations are in a file called myAnnotations.gtf.

First, we sort myRegions.bed:

$ sort-bed myRegions.bed > mySortedRegions.bed

Next, we convert the annotations to BED format:

$ gtf2bed < myAnnotations.gtf > myAnnotations.bed

Finally, we apply a --not-element-of set operation to show elements of the annotations file which do not overlap mySortedRegions.bed, if there is one or more bases of overlap (i.e., any overlap at all):

$ bedops --not-element-of -1 myAnnotations.bed mySortedRegions.bed > myAnswer.bed

As the gtf2bed conversion step was lossless, it is easy to convert myAnswer.bed back to GTF:

$ awk '{print $1"\t"$7"\t"$8"\t"($2+1)"\t"$3"\t"$5"\t"$6"\t"$9"\t"(substr($0, index($0,$10)))}' myAnswer.bed > myAnswer.gtf

Last edited by AlexReynolds; 05-21-2013 at 01:05 PM.
Old 07-13-2017, 06:20 AM   #7
eieneg
Junior Member
 
Location: Europe

Join Date: Feb 2017
Posts: 5
To intersect coordinates

Just use GFF-Intersector

https://github.com/PriceJon/GFF_Intersector

It can intersect GFF files with multiple other coordinates. Do you have R? If so, it takes just 2 commands and you don't have to worry about the visualisation issue.