I have a fairly large number of Sequin annotation files generated by GenBank's automated PGAP pipeline that I want to convert into BED format. I've been able to use the asn2gb program from the NCBI toolkit to generate GenBank-format files from those Sequin files, but so far I haven't had any luck converting from GenBank into GFF. Once the annotations are in GFF, conversion into BED won't be a problem.
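For context, the GFF-to-BED step I consider easy is roughly the following. This is only a minimal sketch, assuming tab-delimited GFF3 input and BED6 output; the function name `gff3_line_to_bed` is made up for illustration and is not part of any toolkit. The one real subtlety is that GFF coordinates are 1-based and inclusive while BED is 0-based and half-open, so only the start shifts by one.

```python
def gff3_line_to_bed(line):
    """Convert one GFF3 feature line to a BED6 line.

    Returns None for comments, directives, blank lines, and
    lines that do not have the nine GFF3 columns.
    """
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        return None
    seqid, _source, ftype, start, end, score, strand, _phase, attrs = cols

    # Parse the key=value;key=value attribute column.
    attr = {}
    for field in attrs.split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attr[key.strip()] = value.strip()

    # Assumption: use ID, then Name, then the feature type as the BED name.
    name = attr.get("ID") or attr.get("Name") or ftype

    # GFF is 1-based inclusive; BED is 0-based half-open.
    bed_start = int(start) - 1
    bed_end = int(end)

    # BED expects an integer score in 0..1000; a missing GFF score becomes 0.
    # Real GFF scores are passed through unchanged and may need rescaling.
    bed_score = "0" if score == "." else score
    bed_strand = strand if strand in ("+", "-") else "."

    return "\t".join([seqid, str(bed_start), str(bed_end), name, bed_score, bed_strand])
```

Applied line by line to a GFF3 file, this skips the header directives and emits one BED record per feature.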
I am trying to use bp_genbank2gff3.pl, but am getting this error message:
--------------------- WARNING ---------------------
MSG: Bad LOCUS name? Changing [Staphylococcus_spHMPREF3292-1.0_Cont0>11789] to 'unknown' and length to Staphylococcus_spHMPREF3292-1.0_Cont0>11789
---------------------------------------------------
I thought maybe it was unhappy because of the greater-than symbol in the LOCUS line. So I tried removing it, and then I get a new error message:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: asking for tag value that does not exist date
STACK: Error::throw
STACK: Bio::Root::Root::throw /gsc/scripts/opt/genome/current/user/lib/perl/Bio/Root/Root.pm:357
STACK: Bio::SeqFeature::Generic::get_tag_values /gsc/scripts/opt/genome/current/user/lib/perl/Bio/SeqFeature/Generic.pm:498
STACK: main::gff_header /gsc/bin/bp_genbank2gff3.pl:895
STACK: /gsc/bin/bp_genbank2gff3.pl:406
-----------------------------------------------------------
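For reference, the LOCUS edit I made before hitting this exception amounts to something like the sketch below. It assumes that "removing" the greater-than symbol means deleting the character outright, and only on lines that begin with LOCUS; the function name and the file names in the usage note are placeholders.

```python
def strip_gt_from_locus(lines):
    """Yield lines unchanged, except that '>' is deleted from LOCUS lines."""
    for line in lines:
        if line.startswith("LOCUS"):
            # Only the LOCUS header is touched; feature locations such as
            # partial-end markers elsewhere in the record are left alone.
            line = line.replace(">", "")
        yield line


def clean_file(in_path, out_path):
    """Write a copy of a GenBank flat file with '>' removed from LOCUS lines."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(strip_gt_from_locus(src))
```

For example, `clean_file("input.gbk", "input.cleaned.gbk")` would write the edited copy next to the original.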
All my input files were generated by GenBank itself, and I used a GenBank script to generate the .gbk files I now have, so I would think my .gbk files are correctly formatted. Can anyone advise me on what I'm doing wrong with the bp_genbank2gff3.pl script? Or, if there is a better way to convert from .gbk into .gff, that would work too.
Thanks,
John Martin