SEQanswers

Old 10-26-2011, 10:26 AM   #1
efoss
Member
 
Location: Seattle

Join Date: Jul 2011
Posts: 98
Converting GFF to GTF

I want to convert a GFF file from FlyBase into a GTF file, and I'm having a lot of trouble figuring out how to do this. Here is a description of the difference between GTF and GFF:

http://genome.ucsc.edu/FAQ/FAQformat.html#format4

Does anyone know of an easy way to convert GFF to GTF? There appears to be a Perl script for this, but I was unable to install a module that it requires, the script looks a bit complex, and I saw comments claiming that it is not dependable.
Old 10-26-2011, 10:35 AM   #2
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

I sent you a private email, but I suggest getting GTFs from Ensembl: ftp://ftp.ensembl.org/pub/current_gtf

Or if you are doing Illumina work then iGenomes is a good resource.

The GFF to GTF conversion is a pain.
Old 10-26-2011, 10:46 AM   #3
efoss
Member
 
Location: Seattle

Join Date: Jul 2011
Posts: 98

Thanks very much. I see that they have a Drosophila GTF file on Ensembl. It's 5.25 rather than 5.41, but I assume that, as long as it's 5 point something, it will be compatible with my other files.

Thanks again for the help.

Eric
Old 10-26-2011, 05:49 PM   #4
upendra_35
Senior Member
 
Location: USA

Join Date: Apr 2010
Posts: 102

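Use the Perl script below to convert GFF to GTF:

Code:
#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;    # not used by the conversion itself; handy for inspecting features

use File::Basename;
use Bio::FeatureIO;

# The input GFF3 file is the first command-line argument
my $inFile = shift;
# Split off the .gff suffix so the output file gets the same base name
my ($name, $path, $suffix) = fileparse($inFile, qr/\.gff/);
my $outFile = $path . $name . ".gtf";

# Read GFF version 3, write GFF version 2.5 (GTF)
my $inGFF = Bio::FeatureIO->new( '-file' => "$inFile",
    '-format' => 'GFF',
    '-version' => 3 );
my $outGTF = Bio::FeatureIO->new( '-file' => ">$outFile",
    '-format' => 'GFF',
    '-version' => 2.5);

# Copy every feature from the input to the output
while (my $feature = $inGFF->next_feature() ) {
    $outGTF->write_feature($feature);
}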
Old 10-26-2011, 07:58 PM   #5
kmcarr
Senior Member
 
Location: USA, Midwest

Join Date: May 2008
Posts: 1,165

While it looks good on paper, in my experience this has rarely worked, almost always because of some non-standard formatting of the attributes column (column 9) of the input GFF. But, as always, YMMV.
Old 10-26-2011, 08:31 PM   #6
efoss
Member
 
Location: Seattle

Join Date: Jul 2011
Posts: 98

Hi Upendra,

Thanks for the suggestion. I just tried to run this and got this error message:


------------- EXCEPTION -------------
MSG: don't know what do do with directive: '##species'
STACK Bio::FeatureIO::gff::_handle_directive /Library/Perl/5.10.0/Bio/FeatureIO/gff.pm:537
STACK Bio::FeatureIO::gff::_initialize /Library/Perl/5.10.0/Bio/FeatureIO/gff.pm:115
STACK Bio::FeatureIO::new /Library/Perl/5.10.0/Bio/FeatureIO.pm:277
STACK Bio::FeatureIO::new /Library/Perl/5.10.0/Bio/FeatureIO.pm:297
STACK toplevel /Users/efoss/sequencing/Aida/RNAseq/test_GFF_to_GTF_script_102611_3.pl:16
-------------------------------------

I imagine that this is related to kmcarr's comment. I find it a bit odd that converting GFF to GTF is so hard: GTF only changes the 9th column of GFF, so if the GFF file contains all the necessary information, the conversion should be trivial. Is all the information that GTF requires necessarily included in a GFF file?

Eric
Seattle
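
One thing that might be worth trying for the '##species' exception above is to strip the directives the parser does not recognise before running the conversion script. The following is only a sketch under assumptions: the sub name strip_unknown_directives is made up, it assumes the ##gff-version line is the only directive worth keeping, and it drops any embedded ##FASTA section because GTF has no place for sequence. Whether Bio::FeatureIO then accepts the rest of the file is not something this sketch can promise.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy a GFF3 stream from $in to $out, keeping feature lines, plain "#"
# comments and the ##gff-version directive, and dropping every other "##"
# directive (for example ##species). Copying stops at ##FASTA.
sub strip_unknown_directives {
    my ( $in, $out ) = @_;
    while ( my $line = <$in> ) {
        last if $line =~ /^##FASTA/;    # sequence section starts here
        next if $line =~ /^##/ && $line !~ /^##gff-version\b/;
        print {$out} $line;
    }
    return;
}

# Usage: perl strip_directives.pl input.gff cleaned.gff
if ( @ARGV == 2 ) {
    my ( $in_path, $out_path ) = @ARGV;
    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    strip_unknown_directives( $in, $out );
    close $out or die "Cannot close $out_path: $!";
}
```

The cleaned file can then be passed to the conversion script in place of the original. The script name strip_directives.pl in the usage comment is just a placeholder.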
Old 10-26-2011, 10:27 PM   #7
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

The whole point of the GTF format was to standardise certain aspects that are left open in GFF. Hence, there are many different valid ways to encode the same information in a valid GFF format, and any parser or converter needs to be written specifically for the choices the author of the GFF file made. For example, a GTF file requires the gene ID attribute to be called "gene_id", while in GFF files, it may be "ID", "Gene", something different, or completely missing. Hence, a general GFF-to-GTF converter (as opposed to one converting only GFF files from a very specific source) needs to guess this from the data, which is non-trivial.
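
To make that concrete, here is a minimal Perl sketch of a source-specific conversion of the attributes column. It is not a general converter: it assumes you already know which GFF3 attribute keys hold the gene and transcript IDs in your particular file. The %key_for mapping, the sub names and the example line are all invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: in THIS file the gene ID lives under "Parent" and the
# transcript ID under "ID". Other sources will need a different mapping.
my %key_for = ( gene_id => 'Parent', transcript_id => 'ID' );

# Parse a GFF3 attribute column ("key=value;key=value") into a hash.
sub parse_gff3_attributes {
    my ($column) = @_;
    my %attr;
    for my $pair ( split /;/, $column ) {
        my ( $key, $value ) = split /=/, $pair, 2;
        next unless defined $key && defined $value;
        $attr{$key} = $value;
    }
    return %attr;
}

# Rewrite one tab-separated GFF3 feature line as a GTF line.
# Returns undef when the line is malformed or a required ID is missing.
sub gff3_line_to_gtf {
    my ( $line, $key_for ) = @_;
    chomp $line;
    my @col = split /\t/, $line;
    return unless @col == 9;
    my %attr = parse_gff3_attributes( $col[8] );
    my $gene = $attr{ $key_for->{gene_id} };
    my $tx   = $attr{ $key_for->{transcript_id} };
    return unless defined $gene && defined $tx;
    $col[8] = qq{gene_id "$gene"; transcript_id "$tx";};
    return join "\t", @col;
}

# Made-up example line, not taken from any real annotation file
my $example = join "\t", 'chr2L', 'example', 'mRNA', 100, 200, '.', '+', '.',
    'ID=tx1;Parent=gene1';
my $gtf = gff3_line_to_gtf( $example, \%key_for );
print defined $gtf ? "$gtf\n" : "skipped: required ID attribute not found\n";
```

Changing %key_for is the "guessing" step: with a mapping that does not match the file, every line is skipped. The sketch also ignores URL-encoded values, multiple comma-separated parents, and the fact that exon lines usually point to a transcript rather than a gene, so a real converter would have to follow the Parent chain as well.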
Old 10-27-2011, 06:47 AM   #8
efoss
Member
 
Location: Seattle

Join Date: Jul 2011
Posts: 98

Thank you. That's an excellent explanation.

Eric
Old 10-15-2013, 05:06 AM   #9
unique379
Member
 
Location: Italy

Join Date: Aug 2012
Posts: 27

Hello all,

I would like to have the genome coordinates of miRNAs in .gtf format. Is it possible to convert the .gff3 file from miRBase? If yes, how?


Thanks, and looking forward to your kind reply.

Last edited by unique379; 10-15-2013 at 05:12 AM.
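
For a single, known source such as this, a small dedicated script may be enough. The sketch below is a guess at what that could look like, not a tested miRBase converter. It assumes each feature line carries an ID attribute, that mature miRNA lines additionally carry a Derives_from attribute naming their hairpin precursor, and that Name holds the miRNA name. The sub names and the choice of gene_id, transcript_id and gene_name are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse a GFF3 attribute column ("key=value;key=value") into a hash.
sub parse_attributes {
    my ($column) = @_;
    my %attr;
    for my $pair ( split /;/, $column ) {
        my ( $key, $value ) = split /=/, $pair, 2;
        $attr{$key} = $value if defined $key && defined $value;
    }
    return %attr;
}

# Convert miRBase-style GFF3 lines to GTF lines.
# gene_id is the precursor ID (Derives_from if present, otherwise the
# feature's own ID); transcript_id is the feature's own ID.
sub mirbase_gff3_to_gtf {
    my (@lines) = @_;
    my @gtf;
    for my $line (@lines) {
        chomp( my $copy = $line );
        next if $copy =~ /^#/ || $copy !~ /\S/;    # skip comments and blanks
        my @col = split /\t/, $copy;
        next unless @col == 9;
        my %attr = parse_attributes( $col[8] );
        next unless defined $attr{ID};
        my $gene = defined $attr{Derives_from} ? $attr{Derives_from} : $attr{ID};
        $col[8] = qq{gene_id "$gene"; transcript_id "$attr{ID}";};
        $col[8] .= qq{ gene_name "$attr{Name}";} if defined $attr{Name};
        push @gtf, join "\t", @col;
    }
    return @gtf;
}

# Usage: perl mirbase_to_gtf.pl input.gff3 > output.gtf
if ( @ARGV == 1 ) {
    open my $in, '<', $ARGV[0] or die "Cannot read $ARGV[0]: $!";
    print "$_\n" for mirbase_gff3_to_gtf(<$in>);
    close $in;
}
```

The feature type in column 3 is left unchanged, so tools that only look at "exon" features may still ignore the result. Whether to rename it depends on the downstream program.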
All times are GMT -8.