SEQanswers How to create a BLAST database

02-15-2011, 01:16 AM #1 aliealexandre Junior Member Location: Japan Join Date: Feb 2011 Posts: 8
How to create a BLAST database

Hi everybody,

I'm new on this forum and, of course, I come here because I have a problem... sorry.

I use BLAST on Mac OS X and I want to create a protein database named NveProt. I created a FASTA file, NveProt.fas, and saved it in the ncbi-blast-2.2.24+/db folder. Then I used this command to create a new database:

./makeblastdb -in NveProt.fas

and I got the following message:

Building a new DB, current time: 02/15/2011 19:11:39
New DB name: NveProt.fas
New DB title: NveProt.fas
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1073741824B
BLAST Database error: No alias or index file found for protein database [NveProt.fas] in search path [/Users/aliealexandre/ncbi-blast-2.2.24+/bin::/Users/aliealexandre/ncbi-blast-2.2.24+/db:]

What's happening? Do I have to modify the PATH variable? How do I do that?

Thank you for your help,
Alex
02-15-2011, 03:02 AM #2 maubp Peter (Biopython etc) Location: Dundee, Scotland, UK Join Date: Jul 2009 Posts: 1,542
Are you running makeblastdb from the ncbi-blast-2.2.24+/db folder, i.e. are you in the same directory as the FASTA file?
02-15-2011, 07:10 AM #3 MDonlin Member Location: St. Louis, MO Join Date: May 2010 Posts: 14
I've always specified the output names for my BLAST db:

./makeblastdb -in NveProt.fas -dbtype 'prot' -out NveProt -title NveProt

Paths are defined in the .profile file located in your home directory. Use a text editor or vi to edit the .profile file. There may be more paths than this, but to define the paths in your email, the file should read

export PATH=$PATH:/Users/aliealexandre/ncbi-blast-2.2.24+/bin:/Users/aliealexandre/ncbi-blast-2.2.24+/db

After editing .profile, type

source .profile

Or you can set the PATH at the command line:

export PATH=$PATH:/Users/aliealexandre/ncbi-blast-2.2.24+/bin
export PATH=$PATH:/Users/aliealexandre/ncbi-blast-2.2.24+/db

Maureen
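A minimal sketch of the .profile setup discussed in this post, assuming BLAST+ was unpacked in the home directory as the paths in Alex's error message suggest. BLAST_HOME is a variable name introduced here purely for illustration. The shell uses PATH only to find executables; BLAST+ itself consults the BLASTDB environment variable when it looks for databases, so the sketch puts bin on PATH and points BLASTDB at the db folder.

```shell
#!/bin/sh
# Assumed install location, modelled on the paths quoted in this thread.
# BLAST_HOME is a hypothetical helper variable; adjust it to your system.
BLAST_HOME="$HOME/ncbi-blast-2.2.24+"

# Executables (makeblastdb, blastp, blastn, ...) are located through PATH.
PATH="$PATH:$BLAST_HOME/bin"
export PATH

# Databases are located through BLASTDB, which BLAST+ reads at run time.
BLASTDB="$BLAST_HOME/db"
export BLASTDB

# Report whether the shell can now resolve makeblastdb by its bare name.
if command -v makeblastdb >/dev/null 2>&1; then
    echo "makeblastdb found at: $(command -v makeblastdb)"
else
    echo "makeblastdb not found; check that $BLAST_HOME/bin exists"
fi
```

Placing these lines in .profile applies them to every new login shell; running `source .profile` applies them to the current one.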
02-17-2011, 12:25 AM #4 aliealexandre Junior Member Location: Japan Join Date: Feb 2011 Posts: 8
Thank you @maubp.

I copied the makeblastdb binary into the same folder as my database (NveProt) and it worked. In the same way, when I want to launch blastp, for instance, I have to copy the blastp binary into the same folder as my query sequence and my database file (actually the ./db folder in this case).

In sum, everything must be in the same folder... which sounds strange to me. I thought that binary files go in the ./bin folder and databases in the ./db folder. Am I doing something wrong?

In any case, thanks to all of you for your prompt answers.
Alex
02-17-2011, 02:15 AM #5 maubp Peter (Biopython etc) Location: Dundee, Scotland, UK Join Date: Jul 2009 Posts: 1,542
I think you need to learn a bit more about the basics of working at the Unix/Linux command line. If you do this:
Code:
./makeblastdb -in NveProt.fas -dbtype 'prot' -out NveProt -title NveProt
then you are telling the operating system to look for makeblastdb in the current directory (a single dot means the current directory, a double dot means the parent directory).

Assuming BLAST+ is installed properly, you should just do this, and the operating system will look on the PATH for the installed copy of makeblastdb:
Code:
makeblastdb -in NveProt.fas -dbtype 'prot' -out NveProt -title NveProt
If you haven't installed BLAST+ at the system level (e.g. you are not an administrator) then you can configure your PATH to include your BLAST tools, or just give the path explicitly, e.g.
Code:
/home/alex/blast/bin/makeblastdb -in NveProt.fas -dbtype 'prot' -out NveProt -title NveProt

Last edited by maubp; 02-17-2011 at 02:16 AM. Reason: typo
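The difference between `./tool` and a bare `tool` can be seen without BLAST installed. The sketch below is only an illustration: demo_tool is a hypothetical stand-in for makeblastdb or blastp, and both directories are throwaway temporary folders.

```shell
#!/bin/sh
# Create one directory holding a stand-in executable and a separate
# working directory. "demo_tool" is a hypothetical placeholder for a
# BLAST+ binary such as makeblastdb.
tooldir=$(mktemp -d)
workdir=$(mktemp -d)
printf '#!/bin/sh\necho "demo_tool ran"\n' > "$tooldir/demo_tool"
chmod +x "$tooldir/demo_tool"

cd "$workdir"

# "./demo_tool" looks only in the current directory, so it fails here.
if ./demo_tool 2>/dev/null; then
    dot_result="found"
else
    dot_result="not found"
fi

# A bare command name is resolved through PATH, so after adding the
# tool directory the command works from any working directory.
PATH="$tooldir:$PATH"
export PATH
path_result=$(demo_tool)
resolved=$(command -v demo_tool)

echo "./demo_tool from the work directory: $dot_result"
echo "demo_tool via PATH: $path_result"
echo "resolved to: $resolved"
```

This is why copying the binaries next to the data made `./makeblastdb` work for Alex, and why it is not necessary once the bin folder is on the PATH.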
02-17-2011, 05:53 PM #6 aliealexandre Junior Member Location: Japan Join Date: Feb 2011 Posts: 8
maubp, you're right, I should learn more about UNIX. I'm on my way now... Following your advice, I succeeded in running the makeblastdb command.
Thank you very much,
Alex
02-18-2011, 12:08 AM #7 maubp Peter (Biopython etc) Location: Dundee, Scotland, UK Join Date: Jul 2009 Posts: 1,542
Great - well done
03-03-2013, 06:33 PM #8 geneart Member Location: DC area Join Date: Sep 2011 Posts: 42
local blast database

Hi, I am trying to set up a local BLAST, creating a database of miRNA sequences from miRBase, essentially so that BLAST can use the miRBase sequences as a database. Everything worked fine in creating a database with the mature.fa sequences from miRBase. Do I need to format this database, or can I use it directly to perform searches as is? Can anyone please guide me?
Thanks,
geneart
03-04-2013, 01:12 AM #9 maubp Peter (Biopython etc) Location: Dundee, Scotland, UK Join Date: Jul 2009 Posts: 1,542
Hi geneart,

I'm having trouble understanding your question. Are you saying you've downloaded a FASTA file from miRBase and want to turn this into a database? If so, yes, you should use the makeblastdb command.

You can search directly against a FASTA file, but it is slower (it only uses one CPU) and it will give you pairwise e-values, which look more impressive than they really are. See: http://blastedbio.blogspot.co.uk/201...ize-for-e.html
03-05-2013, 10:09 AM #10 geneart Member Location: DC area Join Date: Sep 2011 Posts: 42
BLAST database

Yes, maubp, you are correct! I did set up the database using the miRBase mature.fa file. I was wondering if I needed to format that database at all in doing that. It does not matter now, as it worked.

But I have another question. I have short RNA sequences from an Illumina sequencer and am trying to find matches in miRBase through standalone BLAST (which uses mature.fa from miRBase as its database). When I use miRBase directly with SSEARCH I get hits, but when I BLAST locally I don't get any hits. I have used the default parameters, which I would like to tweak to see if the results change. My problem is: how do I run blastn-short on the command line, or tweak the parameters in standalone BLAST? Any help? I looked in the BLAST manual that comes with the standalone package but could not find it. I am not that command-line savvy, so I am hoping someone can suggest ways of doing it.

Thanks in advance,
Geneart
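On the blastn-short question: in BLAST+ the short-query mode is selected with the `-task` option of blastn. The sketch below is a hedged illustration, not a tested recipe for miRBase. The helper run_short_blast and the file and database names (reads.fasta, mirbase_mature, reads_vs_mirbase.tsv) are hypothetical; when blastn or the query file is missing, it prints the command rather than running it.

```shell
#!/bin/sh
# Run blastn in short-query mode.
# Arguments: query FASTA file, database name, output file.
# If blastn is not on PATH or the query file does not exist, the
# command is printed instead of executed.
run_short_blast() {
    if command -v blastn >/dev/null 2>&1 && [ -f "$1" ]; then
        blastn -task blastn-short -query "$1" -db "$2" -outfmt 6 -out "$3"
    else
        echo "would run: blastn -task blastn-short -query $1 -db $2 -outfmt 6 -out $3"
    fi
}

# Hypothetical file and database names, for illustration only.
run_short_blast reads.fasta mirbase_mature reads_vs_mirbase.tsv
```

Other search parameters, such as `-evalue` or `-word_size`, are passed on the same command line; `blastn -help` lists them.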
04-26-2013, 08:18 AM #11 utagenomics Junior Member Location: Texas Join Date: Apr 2013 Posts: 4
Hey everyone,

So I am also having some similar issues. I successfully made my database in the correct folder, but then when I actually try to run my blast, it isn't working...

#I used the following to make the db
makeblastdb -in supercontigs.fasta.txt -dbtype 'nucl' -out p.full

#I then tried to run this the next step
query=NGF.fasta -db=p.full -outfmt="6" -out=blast

Any ideas how I can change my second step to successfully run the blast search?

Thanks,
Kyle
04-29-2013, 03:19 AM #12 maubp Peter (Biopython etc) Location: Dundee, Scotland, UK Join Date: Jul 2009 Posts: 1,542
Kyle - you've left out at least part of the command you ran, and more importantly you left out important details like what the error message was. That makes it almost impossible to guess what you've done wrong.
04-29-2013, 05:03 AM   #13
rhinoceros
Senior Member

Location: sub-surface moon base

Join Date: Apr 2013
Posts: 372

Quote:
 Originally Posted by utagenomics
Hey everyone,
#I used the following to make the db
makeblastdb -in supercontigs.fasta.txt -dbtype 'nucl' -out p.full
#I then tried to run this the next step
query=NGF.fasta -db=p.full -outfmt="6" -out=blast
Any ideas how I can change my second step to successfully run the blast search?

Well for one you haven't specified a program, which should probably be blastn in your case. Second, the correct syntax would be:

blastn -query NGF.fasta -db p.full -outfmt 6 -out blast

If that still doesn't work, then I'd guess that there are problems with your environment variables. You could get around this by specifying the paths of your program, query file, and db, i.e.

/where/is/blast/bin/blastn -query /where/is/this/file/NGF.fasta -db /where/is/this/db/p.full -outfmt 6 -out blast

Last edited by rhinoceros; 04-29-2013 at 05:33 AM.
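Before running a search with an explicit `-db` path, it can help to confirm that the database files are really where that path points; a missing database is what produces the "No alias or index file found" error seen in the first post. The sketch below is a small check under stated assumptions: nucl_db_present is a hypothetical helper, and the demonstration uses empty placeholder files in a temporary folder rather than a real database.

```shell
#!/bin/sh
# Return success if a nucleotide BLAST database with the given path
# prefix appears to exist. It looks for the index file (.nin) or, for
# databases split into volumes, the alias file (.nal).
nucl_db_present() {
    [ -f "$1.nin" ] || [ -f "$1.nal" ]
}

# Demonstration with empty placeholder files named the way makeblastdb
# names the files of a nucleotide database called p.full.
demo=$(mktemp -d)
touch "$demo/p.full.nhr" "$demo/p.full.nin" "$demo/p.full.nsq"

if nucl_db_present "$demo/p.full"; then found_good="yes"; else found_good="no"; fi
if nucl_db_present "$demo/missing"; then found_bad="yes"; else found_bad="no"; fi

echo "p.full present: $found_good"
echo "missing present: $found_bad"
```

For a protein database the corresponding extensions are .pin and .pal.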

05-05-2013, 03:53 AM #14 ahmadsam Junior Member Location: best Join Date: Dec 2011 Posts: 9
Creating a BLAST db in Windows

After downloading and installing BLAST+:

1- Open a command prompt and go to the bin directory. To create a database, such as a protein database, you need a simple multi-FASTA file.
2- Use this command:
Code:
makeblastdb -in D:\\ref.fasta -dbtype prot -out Plant
With the above command, makeblastdb generates three files with the extensions .pin, .phr and .psq in the bin directory. Plant is the name of the output database.
3- To use this database with a sample query:
Code:
blastp -query D:\\in.txt -db Plant -out D:\\Out.txt
10-30-2014, 08:27 AM #15 poudap Junior Member Location: Norway Join Date: Oct 2014 Posts: 3
How can I make a custom database in batch?

I am wondering if you could tell me how I can make a database from different files in batch. Commands like entry_batch do not respond. I have also used the command below, but it gives the error stated afterwards:

makeblastdb -in Strains/*.fasta -dbtype 'nucl' -out db/stec_samples

Error: Too many positional arguments (1), the offending value: Strains/Sample_1.fasta

P.S. A single database can be made from Sample_1.fasta with no error.
11-02-2014, 11:13 PM #16 rhinoceros Senior Member Location: sub-surface moon base Join Date: Apr 2013 Posts: 372
This might work:
Code:
makeblastdb -in $(cat Strains/*.fasta) -dbtype 'nucl' -out db/stec_samples

11-03-2014, 01:09 AM #17 maubp Peter (Biopython etc) Location: Dundee, Scotland, UK Join Date: Jul 2009 Posts: 1,542
Quote:
 Originally Posted by poudap
I am wondering if you could tell me how I can make a database from different files in batch?

BLAST does not like this kind of command, where the second FASTA file is considered to be a positional argument:
Code:
makeblastdb -in Strains/example1.fasta Strains/example2.fasta -dbtype nucl -out db/stec_samples

From past experimentation, I know it will work if you quote the list of filenames, making them space separated (filenames with spaces are a problem):
Code:
makeblastdb -in "Strains/example1.fasta Strains/example2.fasta" -dbtype nucl -out db/stec_samples

11-03-2014, 06:48 AM #18 poudap Junior Member Location: Norway Join Date: Oct 2014 Posts: 3
Unfortunately the $(cat Strains/*.fasta) command did not work. The space-separated command also gave an error:
BLAST options error: File Sample_no2.fasta does not exist.
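The "Too many positional arguments" error comes from the shell, not from BLAST: the `*.fasta` pattern is expanded into separate words before makeblastdb starts, so only one file name is attached to `-in` and the rest arrive as extra arguments. The sketch below shows this with a stand-in function instead of makeblastdb, so it runs without BLAST installed. count_args and the three sample files are hypothetical and created only for the demonstration.

```shell
#!/bin/sh
# Stand-in for a command: prints how many arguments it received.
count_args() { echo "$#"; }

# Three small placeholder FASTA files in a temporary Strains folder.
demo=$(mktemp -d)
mkdir "$demo/Strains"
for n in 1 2 3; do
    printf '>sample%s\nACGT\n' "$n" > "$demo/Strains/Sample_$n.fasta"
done
cd "$demo"

# Unquoted glob: the shell passes "-in" plus one word per matching file.
unquoted=$(count_args -in Strains/*.fasta)

# Quoted list: the file names are joined with spaces into one string
# first, so the command receives "-in" plus a single argument.
list=$(echo Strains/*.fasta)
quoted=$(count_args -in "$list")

echo "arguments with unquoted glob: $unquoted"
echo "arguments with quoted list:   $quoted"
```

The second form matches the quoted, space-separated list suggested in post #17. The `$(cat ...)` form cannot work, because it substitutes the contents of the files, not their names, into the command line.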
11-03-2014, 06:49 AM #19 maubp Peter (Biopython etc) Location: Dundee, Scotland, UK Join Date: Jul 2009 Posts: 1,542
The simple brute-force solution is to make a single merged FASTA file (e.g. using the cat command), and then build a BLAST database out of that.
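A minimal sketch of the merge-then-build approach, assuming every input file ends with a newline (otherwise a header line can be glued onto the last sequence line of the previous file). The two sample files and the merged file name all_strains.fasta are hypothetical and live in a temporary folder; makeblastdb is only called if it is found on the PATH.

```shell
#!/bin/sh
# Placeholder input files standing in for the Strains/*.fasta files.
demo=$(mktemp -d)
mkdir "$demo/Strains" "$demo/db"
printf '>sample1\nACGTACGT\n' > "$demo/Strains/Sample_1.fasta"
printf '>sample2\nTTGGCCAA\n' > "$demo/Strains/Sample_2.fasta"

# Merge all FASTA files into one. "all_strains.fasta" is a hypothetical name.
cat "$demo"/Strains/*.fasta > "$demo/all_strains.fasta"

# Count the header lines as a quick sanity check on the merge.
n_records=$(grep -c '^>' "$demo/all_strains.fasta")
echo "merged records: $n_records"

# Build a single nucleotide database from the merged file.
if command -v makeblastdb >/dev/null 2>&1; then
    makeblastdb -in "$demo/all_strains.fasta" -dbtype nucl -out "$demo/db/stec_samples"
else
    echo "makeblastdb not on PATH; merged file is $demo/all_strains.fasta"
fi
```

Comparing the record count with the sum over the individual files is a cheap way to catch a merge that went wrong.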
02-01-2015, 11:47 PM #20 kaps Member Location: Uganda Join Date: Jan 2015 Posts: 71
Dear all,

In creating a local BLAST database, I downloaded FASTA files from ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/ and http://www.ncbi.nlm.nih.gov/sites/nu...=%22Viruses%22[PORG]+AND+srcdb_refseq[PROP]. The former appeared larger than the latter. Which of them is better? Do both contain nr sequences? Are they different?

Thanks