SEQanswers

11-22-2013, 03:03 PM   #1
condomitti
Member
 
Location: São Paulo - Brazil

Join Date: Sep 2013
Posts: 33
augustus gene finder

Hello fellows,

I've been trying to train Augustus on the Anolis carolinensis dataset (available through the Ensembl website). I downloaded the GenBank files and ran the etraining command, specifying the training file and the species name to train for. I'm getting the following error message:


mRNA contains character c
GBProcessor::getGeneList(): GBProcessor::getJoin( ): failed!!!
Encountered error after reading 0 annotations.

etraining: ERROR
No genbank sequences found.


Has anyone run into something similar?

This is what the GenBank file looks like: http://s13.postimg.org/nlg8peh6f/Cap...2_20_58_58.png

I've been trying to find a solution for almost three hours now, with no luck.
Any help will be greatly appreciated!

Thank you in advance!
11-23-2013, 05:57 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,080

Based on the example data provided, it appears that Augustus may be expecting the training sequences to be in this format:

Code:
LOCUS       HS04636   9453 bp  DNA
FEATURES             Location/Qualifiers
     source          1..9453
     CDS             join(966..1017,1818..1934,2055..2198,2852..2995,3426..3607,
                     4340..4423,4543..4789,5072..5358,5860..6007,6494..6903)
BASE COUNT     2937 a   1716 c  1710 g   3090 t
ORIGIN
        1 gagctcacat taactattta cagggtaact gcttaggacc agtattatga ggagaattta
       61 cctttcccgc ctctctttcc aagaaacaag gagggggtga aggtacggag aacagtattt
      121 cttctgttga aagcaactta gctacaaaga taaattacag ctatgtacac tgaaggtagc
      181 tatttcattc cacaaaataa gagtttttta aaaagctatg tatgtatgtg ctgcatatag
      241 agcagatata cagcctatta agcgtcgtca ctaaaacata aaacatgtca gcctttctta
      301 accttactcg ccccagtctg tcccgacgtg acttcctcga ccctctaaag acgtacagac
      361 cagacacggc ggcggcggcg ggagagggga ttccctgcgc ccccggacct cagggccgct
      421 cagattcctg gagaggaagc caagtgtcct tctgccctcc cccggtatcc catccaaggc
      481 gatcagtcca gaactggctc tcggaagcgc tcgggcaaag actgcgaaga agaaaagaca
      541 tctggcggaa acctgtgcgc ctggggcggt ggaactcggg gaggagaggg agggatcaga

(and so on, until the end of the record)

     9241 acactgttca ctgttttttt taaaaaaaaa acttgatttg ttattaacat tgatctgctg
     9301 acaaaacctg ggaatttggg ttgtgtatgc gaatgtttca gtgcctcaga caaatgtgta
     9361 tttaacttat gtaaaagata agtctggaaa taaatgtctg tttatttttg tactatttaa
     9421 aaaaaaaaaa aaaaatcgat gtcgactcga gtc
//
LOCUS       HS08198   2344 bp  DNA
FEATURES             Location/Qualifiers
     source          1..2344
     CDS             join(445..582,758..894,1053..1123,1208..1315,1587..1688,177
                     2..1810,1890..1903)
BASE COUNT     400 a   730 c  778 g   436 t
ORIGIN
        1 agcgggcggc ggtcgtgggc ggggttgcag gcgaggctca acgaacgctg gtctgaccgt
       61 cggcgctccc tgttgccggg ccctgagcaa gtggcttcat gaaccccgtg acgttggcca
      121 tggagataag accactgggt gatggtttaa ggaagataac gtgtaaaggg ctaaggactg
      181 tcggtggaaa tcaggggtgc aggagaaatg gataaacagc cagaggtcaa ctcggacttt
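
For what it's worth, the layout above is simple enough to write out by hand. Below is a minimal Python sketch, not an official Augustus tool: the function name `format_training_record` and its arguments are hypothetical, and the spacing is simply copied from the example records. Whether etraining needs exactly this spacing is an assumption I have not verified. It only handles forward-strand CDS coordinates and does not wrap long join(...) lines.

```python
from collections import Counter


def format_training_record(locus, seq, cds_exons):
    """Render one record in the minimal GenBank-like layout shown above.

    locus     -- record name (hypothetical input)
    seq       -- DNA sequence as a string
    cds_exons -- list of 1-based, inclusive (start, end) exon coordinates
    """
    seq = seq.lower()
    counts = Counter(seq)

    # A multi-exon CDS is written as join(a..b,c..d); a single exon as a..b.
    spans = ",".join(f"{start}..{end}" for start, end in cds_exons)
    location = f"join({spans})" if len(cds_exons) > 1 else spans

    lines = [
        f"LOCUS       {locus}   {len(seq)} bp  DNA",
        "FEATURES             Location/Qualifiers",
        f"     source          1..{len(seq)}",
        f"     CDS             {location}",
        "BASE COUNT     {} a   {} c  {} g   {} t".format(
            counts["a"], counts["c"], counts["g"], counts["t"]
        ),
        "ORIGIN",
    ]

    # Sequence lines: right-aligned start position, then six blocks of 10 bases.
    for offset in range(0, len(seq), 60):
        chunk = seq[offset:offset + 60]
        blocks = " ".join(chunk[i:i + 10] for i in range(0, len(chunk), 10))
        lines.append(f"{offset + 1:>9} {blocks}")

    lines.append("//")  # record terminator
    return "\n".join(lines) + "\n"
```

Calling `format_training_record("TEST1", "ACGT" * 20, [(1, 10), (21, 30)])` returns one record as a string; concatenating the strings for all genes would give a multi-record training file in the same shape as the example.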
11-23-2013, 06:24 AM   #3
condomitti
Member
 
Location: São Paulo - Brazil

Join Date: Sep 2013
Posts: 33

GenoMax, do you know of any tool I could use to convert my file to the format in the example?

Thanks.
All times are GMT -8.

