  • GenBank to .tbl (Sequin format)

    Hi everyone,

    I'm working on submitting a set of whole genome shotgun sequencing projects to GenBank/NCBI. For this set of genomes, I have annotations which were generated using the RAST system (in GenBank and FFF format). However, in order to submit to GenBank/NCBI, these annotations need to be converted to what NCBI calls a 'feature table' (Sequin format/.tbl file). The file format is detailed here: http://www.ncbi.nlm.nih.gov/Sequin/table.html

    I've searched the web for parsers to create the required table format using either GenBank or FFF formatted files, and have asked the NCBI support staff if they know of such a parser. However, I have not been able to find one. Does anyone know where I can find something to convert between GenBank or FFF and the NCBI feature table format?

    Thanks in advance!

    Sincerely,
    Erin

  • #2
    I thought you could give them GenBank/EMBL files too? Maybe I'm thinking of EMBL not the NCBI...

    P.S. What is this "FFF format"? I thought it was a typo for GFF, but you did it three times.


    • #3
      Originally posted by maubp View Post
      I thought you could give them GenBank/EMBL files too? Maybe I'm thinking of EMBL not the NCBI...
      Unfortunately not. Bit crazy. But it's easy to write a conversion script between the two. I've got one somewhere.


      • #4
        I asked and they won't accept GenBank files. It seems a bit crazy, since that's what they're going to make out of the .tbl/Sequin file anyway.

        I'm sure I could write my own conversion script, but I'm a bit new to this whole scripting business, so it may take me a while. I thought it was worth checking with the community to see if someone had one handy before I go through the trouble.

        And yes, FFF was a typo for GFF. Guess my thinking cap was a bit loose at the end of the day. Sorry for the confusion.


        • #5
          Just found one parser that claims to convert between GenBank and Sequin, but it appears to work for only one contig at a time (created table ends after the last gene of the first contig) and ignores tRNAs.
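
The two gaps mentioned here (only the first contig is written, and tRNAs are skipped) can be avoided by looping over every contig and treating all feature types the same way. A minimal sketch, assuming the features have already been read from the GenBank file into simple objects with 1-based inclusive coordinates; the `Feature` class and `to_tbl` function are hypothetical names for illustration, not part of any existing parser:

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """One annotated feature; coordinates are 1-based and inclusive."""
    kind: str                 # feature key, e.g. "gene", "CDS", "tRNA", "rRNA"
    start: int
    end: int
    strand: int = 1           # 1 for forward, -1 for reverse
    qualifiers: dict = field(default_factory=dict)


def feature_lines(feat):
    """Format one feature as feature-table lines."""
    # Reverse-strand features are written with start greater than stop.
    if feat.strand >= 0:
        left, right = feat.start, feat.end
    else:
        left, right = feat.end, feat.start
    lines = [f"{left}\t{right}\t{feat.kind}"]
    # Qualifier lines begin with three tabs, then the key and its value.
    for key, value in feat.qualifiers.items():
        lines.append(f"\t\t\t{key}\t{value}")
    return lines


def to_tbl(contigs):
    """Build one table covering every contig, not just the first.

    `contigs` maps a sequence ID to its list of Feature objects.
    """
    out = []
    for seq_id, features in contigs.items():
        out.append(f">Feature {seq_id}")
        for feat in features:
            # No filtering by feature type, so tRNA and rRNA are kept.
            out.extend(feature_lines(feat))
    return "\n".join(out) + "\n"
```

Calling `to_tbl({"contig_1": [Feature("CDS", 10, 300, 1, {"product": "hypothetical protein"})]})` returns the table as a single string. Features spanning several intervals and partial features are not handled in this sketch.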


          • #6
            I'll try and dig out my script.

            If it's any help, Torsten Seemann's automated annotation pipeline can output sequin and/or table format:


            • #7
              Thanks nickloman, we've thought about just re-doing the annotations through NCBI's pipeline, but the problem is we already used the annotations we have for all of our analyses and want to have them associated with the genomes when we submit them. I'm working on seeing if I can use the parser I posted above if I pre-split the files into contigs and add the tRNAs/rRNAs by hand, but I'll keep an eye out in case you find your script first!


              • #8
                Found it! Hope it's vaguely useful:

                genbank_to_tbl.py (GitHub Gist)


                • #9
                  Great! Thanks!

                  Erin


                  • #10
                    Hey nickloman,

                    Just as an fyi and a note for potential future users of your script, the code you linked to broke at the first CDS feature in my GBK. I made a couple of minor changes and it seems to work now, although it doesn't pick up the annotations for the tRNAs/rRNAs. At this point I figure it's relatively trivial to go through and add those in by hand for a small number of genomes. In the future I will be submitting an additional ~70 genomes, and will (hopefully) post an updated script with that feature fixed.

                    I've attached my edits as a plain text file (the forum won't accept a .py file).

                    Thank you again!

                    Erin

                    • #11
                      Ah OK, well it's like most scripts - you get it working for your problem and then you forget about it. But glad you could make it run for you!


                      • #12
                        Have either of you found a gff to the Sequin format/.tbl file converter?


                        • #13
                          Originally posted by oudacontrol View Post
                          Have either of you found a gff to the Sequin format/.tbl file converter?
                          nickloman's script works fine for the format conversion itself, but then there are a myriad of changes that must be made to your original annotations to conform with GenBank naming conventions. For the number of genomes I'm submitting, I found it easier to just submit the FASTA files for re-annotation through NCBI's pipeline, which spits out Sequin formatted files.


                          • #14
                            Hi everyone.

                            For people who have Artemis installed on their computer, you can also open the .gbk with the software and use the 'SAVE AS' menu to save it in the Sequin/.tbl format.

                            All features are kept, as well as tRNA and rRNA information.

                            hope it may help.

                            seb.


                            • #15
                              Thanks, it helps me, but Artemis can only read and convert the first contig in a multi-GenBank file.

                              Originally posted by seb.lees View Post
                              Hi everyone.

                              For people who have Artemis installed on their computer, you can also open the .gbk with the software and use the 'SAVE AS' menu to save it in the Sequin/.tbl format.

                              All features are kept, as well as tRNA and rRNA information.

                              hope it may help.

                              seb.
                              Last edited by wanyu; 06-15-2015, 03:26 AM.
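
One possible workaround for the first-contig limitation is to split the multi-GenBank file into one file per contig before opening each in Artemis. A minimal sketch, relying only on the fact that each record in a GenBank flat file ends with a line containing `//`; the function name `split_genbank` is a hypothetical choice for illustration:

```python
def split_genbank(text):
    """Split a multi-record GenBank flat file into one string per record."""
    records, current = [], []
    for line in text.splitlines(keepends=True):
        current.append(line)
        # A line holding only "//" terminates a GenBank record.
        if line.strip() == "//":
            records.append("".join(current))
            current = []
    # Keep a trailing record that lacks a terminator; skip blank leftovers.
    if "".join(current).strip():
        records.append("".join(current))
    return records
```

Each returned string can then be written to its own .gbk file, for example named after the LOCUS line of the record, and converted separately.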
