SEQanswers

Old 05-10-2012, 04:41 PM   #1
ErinL
Junior Member
 
Location: California

Join Date: Jun 2011
Posts: 7
GenBank to .tbl (Sequin format)

Hi everyone,

I'm working on submitting a set of whole genome shotgun sequencing projects to GenBank/NCBI. For this set of genomes, I have annotations which were generated using the RAST system (in GenBank and FFF format). However, in order to submit to GenBank/NCBI, these annotations need to be converted to what NCBI calls a 'feature table' (Sequin format/.tbl file). The file format is detailed here: http://www.ncbi.nlm.nih.gov/Sequin/table.html

I've searched the web for parsers that create the required table format from either GenBank or FFF formatted files, and have asked the NCBI support staff whether they know of one, but I have not been able to find any. Does anyone know where I can find something to convert from GenBank or FFF to the NCBI feature table format?

Thanks in advance!

Sincerely,
Erin
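
For readers unfamiliar with the feature table layout described at the link above, here is a minimal Python sketch of what a converter has to emit: a `>Feature` header per sequence, then tab-separated start/stop/feature-key lines, with each qualifier on a following line indented by three tabs. The function name, the dictionary layout and the locus tags are made up for illustration; partial features, multi-interval locations and NCBI's validation rules are not handled.

```python
def format_feature_table(seq_id, features):
    """Return Sequin feature-table (.tbl) text for one sequence.

    `features` is a hypothetical list of dicts with the keys
    key, start, end, strand (+1 or -1) and qualifiers (list of pairs).
    """
    lines = [">Feature " + seq_id]
    for feat in features:
        start, end = feat["start"], feat["end"]
        # Minus-strand features are written with the larger coordinate first.
        if feat["strand"] == -1:
            start, end = end, start
        lines.append("{}\t{}\t{}".format(start, end, feat["key"]))
        # Qualifier lines leave the first three columns empty.
        for name, value in feat["qualifiers"]:
            lines.append("\t\t\t{}\t{}".format(name, value))
    return "\n".join(lines) + "\n"


# Illustrative input only: these features do not come from a real genome.
example = [
    {"key": "gene", "start": 1, "end": 300, "strand": 1,
     "qualifiers": [("locus_tag", "ABC_0001")]},
    {"key": "CDS", "start": 1, "end": 300, "strand": 1,
     "qualifiers": [("product", "hypothetical protein")]},
    {"key": "tRNA", "start": 5000, "end": 5076, "strand": -1,
     "qualifiers": [("product", "tRNA-Ala")]},
]
print(format_feature_table("contig1", example), end="")
```

The sequence identifier after `>Feature` has to match the identifier of the corresponding FASTA record, which is why a multi-contig genome needs one `>Feature` block per contig.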
Old 05-11-2012, 03:37 AM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,541

I thought you could give them GenBank/EMBL files too? Maybe I'm thinking of EMBL not the NCBI...

P.S. What is this "FFF format"? I thought it was a typo for GFF, but you did it three times.
Old 05-11-2012, 03:43 AM   #3
nickloman
Senior Member
 
Location: Birmingham, UK

Join Date: Jul 2009
Posts: 356

Quote:
Originally Posted by maubp View Post
I thought you could give them GenBank/EMBL files too? Maybe I'm thinking of EMBL not the NCBI...
Unfortunately not. Bit crazy. But it's easy to write a conversion script between the two. I've got one somewhere.
Old 05-11-2012, 08:10 AM   #4
ErinL
Junior Member
 
Location: California

Join Date: Jun 2011
Posts: 7

I asked and they won't accept GenBank files. It seems a bit crazy, since that's what they're going to make out of the .tbl/Sequin file anyway.

I'm sure I could write my own conversion script, but I'm a bit new to this whole scripting business, so it may take me a while. I thought it was worth checking with the community to see if someone had one handy before I go to the trouble.

And yes, FFF was a typo for GFF. Guess my thinking cap was a bit loose at the end of the day. Sorry for the confusion.
Old 05-11-2012, 08:17 AM   #5
ErinL
Junior Member
 
Location: California

Join Date: Jun 2011
Posts: 7

Just found one parser that claims to convert between GenBank and Sequin, but it appears to work for only one contig at a time (created table ends after the last gene of the first contig) and ignores tRNAs.

http://lfz.corefacility.ca/gbk2tbl/
Old 05-11-2012, 08:33 AM   #6
nickloman
Senior Member
 
Location: Birmingham, UK

Join Date: Jul 2009
Posts: 356

I'll try and dig out my script.

If it's any help, Torsten Seemann's automated annotation pipeline can output sequin and/or table format:
http://bioinformatics.net.au/prokka-manual.html
Old 05-11-2012, 08:51 AM   #7
ErinL
Junior Member
 
Location: California

Join Date: Jun 2011
Posts: 7

Thanks nickloman, we've thought about just re-doing the annotations through NCBI's pipeline, but the problem is we already used the annotations we have for all of our analyses and want to have them associated with the genomes when we submit them. I'm working on seeing if I can use the parser I posted above if I pre-split the files into contigs and add the tRNAs/rRNAs by hand, but I'll keep an eye out in case you find your script first!
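
The pre-splitting step mentioned here can be sketched in a few lines of Python, relying only on the fact that every GenBank record ends with a line containing just `//`. The function name and the toy records are assumptions for illustration; this works on plain text and does not validate the records.

```python
def split_genbank_records(text):
    """Split multi-record GenBank text into a list with one string per record."""
    records, current = [], []
    for line in text.splitlines(keepends=True):
        current.append(line)
        # A line holding only "//" terminates the current record.
        if line.strip() == "//":
            records.append("".join(current))
            current = []
    # Keep a trailing record even if its terminator line is missing.
    if any(line.strip() for line in current):
        records.append("".join(current))
    return records


# Toy two-contig input, not real data.
demo = (
    "LOCUS       contig1   10 bp\nORIGIN\n        1 acgtacgtac\n//\n"
    "LOCUS       contig2   10 bp\nORIGIN\n        1 ttgcattgca\n//\n"
)
for number, record in enumerate(split_genbank_records(demo), start=1):
    print(number, record.splitlines()[0])
```

Each returned string can then be written to its own .gbk file and passed to a converter that only handles one contig at a time.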
Old 05-11-2012, 09:10 AM   #8
nickloman
Senior Member
 
Location: Birmingham, UK

Join Date: Jul 2009
Posts: 356

Found it! Hope it's vaguely useful:

https://gist.github.com/2660685
Old 05-11-2012, 09:17 AM   #9
ErinL
Junior Member
 
Location: California

Join Date: Jun 2011
Posts: 7

Great! Thanks!

Erin
Old 05-11-2012, 12:52 PM   #10
ErinL
Junior Member
 
Location: California

Join Date: Jun 2011
Posts: 7

Hey nickloman,

Just as an FYI and a note for potential future users of your script, the code you linked to broke at the first CDS feature in my GBK. I made a couple of minor changes and it seems to work now, although it doesn't pick up the annotations for the tRNAs/rRNAs. At this point I figure it's relatively trivial to go through and add those in by hand for a small number of genomes. In the future I will be submitting an additional ~70 genomes, and will (hopefully) post an updated script with that feature fixed.

I've attached my edits as a plain text file (the forum won't accept a .py file).

Thank you again!

Erin
Attached Files
File Type: txt genbank_to_tbl.txt (3.0 KB, 360 views)
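
One plausible reason for a converter to skip tRNAs/rRNAs is that it only acts on CDS features. That is an assumption, not a diagnosis of the linked script. The Python sketch below shows the general idea of collecting several feature keys from the FEATURES block of a GenBank flat file. The regular expression, the `WANTED` set and the function name are illustrative, and only simple `start..end` and `complement(start..end)` locations are recognised (no `join(...)`).

```python
import re

# Feature lines start with exactly five spaces, then the key, then the location.
FEATURE_LINE = re.compile(
    r"^ {5}(\S+)\s+(complement\()?<?(\d+)\.\.>?(\d+)\)?\s*$"
)
# Feature keys to keep; extend as needed (illustrative choice).
WANTED = {"gene", "CDS", "tRNA", "rRNA"}


def parse_feature_lines(lines):
    """Yield (key, start, end, strand) for wanted features with simple locations."""
    for line in lines:
        match = FEATURE_LINE.match(line)
        if not match:
            continue  # qualifier lines and unsupported locations are skipped
        key, complement, start, end = match.groups()
        if key in WANTED:
            yield key, int(start), int(end), -1 if complement else 1


# Toy feature block, not taken from a real annotation.
sample = [
    "     source          1..9000",
    "     gene            1..300",
    '                     /locus_tag="ABC_0001"',
    "     tRNA            complement(5000..5076)",
    "     rRNA            6000..7500",
]
for feature in parse_feature_lines(sample):
    print(feature)
```

Qualifier lines are indented further than five spaces, so they never match the pattern and are passed over.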
Old 05-11-2012, 02:19 PM   #11
nickloman
Senior Member
 
Location: Birmingham, UK

Join Date: Jul 2009
Posts: 356

Ah OK, well it's like most scripts - you get it working for your problem and then you forget about it. But glad you could make it run for you!
Old 08-10-2012, 04:19 PM   #12
oudacontrol
Junior Member
 
Location: Bay Area

Join Date: Jan 2011
Posts: 3

Have either of you found a GFF to Sequin format/.tbl file converter?
Old 08-27-2012, 08:38 AM   #13
ErinL
Junior Member
 
Location: California

Join Date: Jun 2011
Posts: 7

Quote:
Originally Posted by oudacontrol View Post
Have either of you found a GFF to Sequin format/.tbl file converter?
nickloman's script works fine for the format conversion itself, but then there are a myriad of changes that must be made to your original annotations to conform with GenBank naming conventions. For the number of genomes I'm submitting, I found it easier to just submit the fasta files for re-submission through NCBI's pipeline, which spits out Sequin formatted files.
Old 11-27-2013, 08:49 AM   #14
seb.lees
Member
 
Location: France, Poitiers

Join Date: Sep 2012
Posts: 12

Hi everyone.

For people who have Artemis installed on their computer: you can also open the .gbk file in Artemis and use the 'SAVE AS' menu to save it in Sequin/.tbl format.

All features are kept, including tRNA and rRNA information.

Hope this helps.

seb.
Old 06-15-2015, 12:35 AM   #15
wanyu
Junior Member
 
Location: China and Australia

Join Date: May 2015
Posts: 4

Thanks, that helped, but Artemis can only read and convert the first contig in a multi-record GenBank file.

Quote:
Originally Posted by seb.lees View Post
Hi everyone.

For people who have Artemis installed on their computer: you can also open the .gbk file in Artemis and use the 'SAVE AS' menu to save it in Sequin/.tbl format.

All features are kept, including tRNA and rRNA information.

Hope this helps.

seb.

Last edited by wanyu; 06-15-2015 at 04:26 AM.
Old 06-26-2015, 04:33 PM   #16
kproxy
Junior Member
 
Location: Bellingham WA

Join Date: Oct 2012
Posts: 1

Quote:
Originally Posted by ErinL View Post
Hey nickloman,

Just as an FYI and a note for potential future users of your script, the code you linked to broke at the first CDS feature in my GBK. I made a couple of minor changes and it seems to work now, although it doesn't pick up the annotations for the tRNAs/rRNAs. At this point I figure it's relatively trivial to go through and add those in by hand for a small number of genomes. In the future I will be submitting an additional ~70 genomes, and will (hopefully) post an updated script with that feature fixed.

I've attached my edits as a plain text file (the forum won't accept a .py file).

Thank you again!

Erin
I was just wondering if you've updated this script to catch the tRNAs/rRNAs. I have been using Artemis to work with my genomes, but now I would like to submit my 10 genomes with my annotations to GenBank and have found no good way to generate the .tbl file.

Thanks!
Old 07-11-2015, 06:55 AM   #17
wanyu
Junior Member
 
Location: China and Australia

Join Date: May 2015
Posts: 4

Hi,

I have written a revised version of the script for converting a GenBank file into a Sequin feature table, based on the scripts by nickloman and ErinL. It worked well for me and my submission to GenBank was accepted.

gbk2tbl.py

Please let me know if you run into any problems using it.

Thanks.

Quote:
Originally Posted by kproxy View Post
I was just wondering if you've updated this script to catch the tRNAs/rRNAs. I have been using Artemis to work with my genomes, but now I would like to submit my 10 genomes with my annotations to GenBank and have found no good way to generate the .tbl file.

Thanks!
Old 11-13-2015, 10:32 AM   #18
Bioinform
Member
 
Location: US

Join Date: May 2013
Posts: 17

Hi Wanyu,

I get the following error.

python gbk2tbl.py --modifiers modifier.txt PST.gbk

gbk2tbl.py: error: unrecognized arguments: PST.gbk

Thanks
Old 11-13-2015, 10:47 AM   #19
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,856

Quote:
Originally Posted by Bioinform View Post
Hi Wanyu,

I get the following error.

python gbk2tbl.py --modifiers modifier.txt PST.gbk

gbk2tbl.py: error: unrecognized arguments: PST.gbk

Thanks
From the GitHub page:
Quote:
Usage: python gbk2tbl.py --mincontigsize 200 --prefix <prefix> --modifiers <modifier file> < annotation.gbk 2> stderr.txt
Your command should be

Code:
$ python gbk2tbl.py --modifiers modifier.txt < PST.gbk 2> stderr.txt
Old 11-27-2015, 11:33 PM   #20
wanyu
Junior Member
 
Location: China and Australia

Join Date: May 2015
Posts: 4

Quote:
Originally Posted by GenoMax View Post
From the GitHub page:


Your command should be

Code:
$ python gbk2tbl.py --modifiers modifier.txt < PST.gbk 2> stderr.txt
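Yes, you are right: the script reads the GenBank file from STDIN and takes its options as command-line arguments, so the file has to be redirected in with `<` rather than given as an argument. I have changed the description of my script on GitHub to make this clearer.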

Last edited by wanyu; 12-02-2015 at 03:04 PM.
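
To show why the file name was rejected, here is a minimal Python sketch of a command-line tool that declares only options and reads its data from standard input. The option names are copied from the usage line quoted above, but everything else (the program name, defaults, and the record counting) is an illustrative assumption and not the actual gbk2tbl.py code.

```python
import argparse
import io
import sys


def build_parser():
    # Only options are declared; there is no positional argument for a file,
    # so any bare file name on the command line is "unrecognized".
    parser = argparse.ArgumentParser(prog="gbk2tbl_sketch.py")
    parser.add_argument("--mincontigsize", type=int, default=0)
    parser.add_argument("--prefix", default="")
    parser.add_argument("--modifiers", required=True)
    return parser


def main(argv=None, stream=None):
    """Parse options from argv and count GenBank records read from stream."""
    args = build_parser().parse_args(argv)
    # With stream=None the data comes from standard input (e.g. "< PST.gbk").
    stream = sys.stdin if stream is None else stream
    n_records = sum(1 for line in stream if line.startswith("LOCUS"))
    return args, n_records


# Simulates: python gbk2tbl_sketch.py --modifiers modifier.txt < PST.gbk
fake_stdin = io.StringIO("LOCUS       contig1\n//\nLOCUS       contig2\n//\n")
args, n_records = main(["--modifiers", "modifier.txt"], fake_stdin)
print(args.modifiers, n_records)
```

On a real command line, `main()` would be called with no arguments so that it picks up `sys.argv` and `sys.stdin`. Appending a file name such as `PST.gbk` to the options makes argparse exit with an "unrecognized arguments" error, which is the behaviour reported above.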