06-10-2020, 08:28 AM   #2
GenoMax

Kai Blin's NCBI Genome downloader tool (ncbi-genome-download) has no GUI, so I am not sure what is crashing for you. It is a command-line tool: install it and run it from the command line. There are thousands of bacterial genomes, so be careful with the downloads.

The gbff file format is actually the GenBank flat file format.

NCBI provides assembly summary report files. Here is the file that you can parse for bacteria. Look at the ftp_path field to get the FTP download path for a particular genome. The GenBank format files will be inside that folder.

Code:
#   See ftp://ftp.ncbi.nlm.nih.gov/genomes/README_assembly_summary.txt for a description of the columns in this file.
# assembly_accession    bioproject      biosample       wgs_master      refseq_category taxid   species_taxid   organism_name   infraspecific_name      isolate version_status  assembly_level  release_type    genome_rep      seq_rel_date    asm_name        submitter       gbrs_paired_asm paired_asm_comp ftp_path        excluded_from_refseq    relation_to_type_material
GCA_003023565.1 PRJNA351262     SAMN06020791    PXSA00000000.1  na      1919191 1919191 Halobacteriales archaeon QS_9_68_17              QS_9_68_17      latest  Scaffold        Major   Full    2018/03/27      ASM302356v1     None    na      na       ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/003/023/565/GCA_003023565.1_ASM302356v1      derived from metagenome
GCA_003023575.1 PRJNA351262     SAMN06020787    PXRW00000000.1  na      1919185 1919185 Halobacteriales archaeon QS_7_69_60              QS_7_69_60      latest  Scaffold        Major   Full    2018/03/27      ASM302357v1     None    na      na       ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/003/023/575/GCA_003023575.1_ASM302357v1      derived from metagenome
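
If you would rather script this than read the file by eye, here is a minimal Python sketch of the parsing step. It assumes the real file is tab-delimited (the listing above shows the tabs as spaces) and that the last "#" line before the data holds the column names. The function names parse_assembly_summary and gbff_url are made up for illustration. The "_genomic.gbff.gz" file name is an assumption based on how files in those FTP folders are usually named, so check one folder by hand before relying on it.

```python
def parse_assembly_summary(text):
    """Yield one dict per assembly from tab-delimited assembly summary text.

    Assumes the last line starting with '#' before the data rows
    contains the column names (as in the listing above).
    """
    header = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            # Later comment lines overwrite earlier ones, so the README
            # pointer on the first line is replaced by the real header.
            header = line.lstrip("#").strip().split("\t")
            continue
        if header is None:
            raise ValueError("no header line found before the first data row")
        # Do not strip data lines: empty trailing fields are meaningful.
        yield dict(zip(header, line.split("\t")))


def gbff_url(ftp_path):
    """Build the URL of the GenBank flat file inside an assembly folder.

    Assumption: the folder holds a file named
    <folder name>_genomic.gbff.gz. Returns None when ftp_path is 'na'.
    """
    if not ftp_path or ftp_path == "na":
        return None
    base = ftp_path.rstrip("/")
    folder_name = base.rsplit("/", 1)[-1]
    return f"{base}/{folder_name}_genomic.gbff.gz"
```

To use it, read the downloaded summary file into a string, loop over parse_assembly_summary(text), filter on whatever columns you care about (assembly_level or organism_name, for example), and pass each row's "ftp_path" value to gbff_url. Nothing here touches the network, so you can inspect the list of URLs before downloading anything.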

Last edited by GenoMax; 06-10-2020 at 08:33 AM.