 02-15-2011, 01:16 AM
#1
How to create a BLAST database

Hi everybody,

I'm new on this forum and of course, I come here because I have a problem... sorry.

I use BLAST on Mac OS X.
I want to create a protein database named NveProt.
So I created a FASTA file, NveProt.fas, and saved it in the ncbi-blast-2.2.24+/db folder.
Then I used this command to create a new database:
./makeblastdb -in NveProt.fas
and I get the following message :

Building a new DB, current time: 02/15/2011 19:11:39
New DB name: NveProt.fas
New DB title: NveProt.fas
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1073741824B
BLAST Database error: No alias or index file found for protein database [NveProt.fas] in search path [/Users/aliealexandre/ncbi-blast-2.2.24+/bin::/Users/aliealexandre/ncbi-blast-2.2.24+/db:]

What's happening? Do I have to modify the PATH variable? How do I do that?

Thank you for your help
Alex
 02-15-2011, 03:02 AM
#2

Are you running makeblastdb from the ncbi-blast-2.2.24+/db folder? i.e. are you in the same directory as the FASTA file?
 02-15-2011, 07:10 AM
#3

I've always specified the output names for my blast db:

./makeblastdb -in NveProt.fas -dbtype 'prot' -out NveProt -title NveProt

Paths are defined in the .profile file located in your home directory. Use a text editor or vi to edit the .profile file. There may be more paths than this, but to define the paths in your post, the file should read

export PATH=$PATH:/Users/aliealexandre/ncbi-blast-2.2.24+/bin:/Users/aliealexandre/ncbi-blast-2.2.24+/db

After editing this .profile, type
>source .profile

Or you can set a PATH at the command line:
>export PATH=$PATH:/Users/aliealexandre/ncbi-blast-2.2.24+/bin
>export PATH=$PATH:/Users/aliealexandre/ncbi-blast-2.2.24+/db
>source .profile
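
For anyone following along, here is a minimal sketch of the two settings involved, assuming BLAST+ was unpacked in the home folder as in the paths above (BLAST_HOME is only an illustrative variable name). PATH is what the shell uses to find programs, while the BLASTDB environment variable is what the BLAST+ programs use to find databases, so the db folder arguably belongs in BLASTDB rather than PATH:

```shell
# Assumed install location, based on the paths quoted in this thread.
BLAST_HOME="$HOME/ncbi-blast-2.2.24+"

# PATH tells the shell where to find programs such as makeblastdb and blastp.
export PATH="$PATH:$BLAST_HOME/bin"

# BLASTDB tells the BLAST+ programs where to look for databases.
export BLASTDB="$BLAST_HOME/db"

# Show which makeblastdb the shell would run, if any.
command -v makeblastdb || echo "makeblastdb not found on PATH"
```

Putting these lines in .profile makes them apply to every new terminal session.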

Maureen
 02-17-2011, 12:25 AM
#4

Thank you

@maubp
I copied the makeblastdb binary into the same folder as my database (NveProt) and it worked.

In the same way, when I want to launch blastp, for instance, I have to copy the blastp binary into the same folder as my query sequence and my database file (actually the ./db folder in this case).

In sum, everything must be in the same folder... which sounds strange to me. I thought that binaries go in the ./bin folder and databases in the ./db folder.

Am I doing something wrong?

In any case, thanks to all of you for your prompt answers

Alex
 02-17-2011, 02:15 AM
#5

I think you need to learn a bit more about the basics of working at the Unix/Linux command line.

If you do this:

Code:
./makeblastdb -in NveProt.fas -dbtype 'prot' -out NveProt -title NveProt
then you are telling the operating system to look for makeblastdb in the current directory (the single period or dot means the current directory, a double dot means the parent directory).

Assuming BLAST+ is installed properly, you should just do this, and the operating system will look on the path for the installed copy of makeblastdb:

Code:
makeblastdb -in NveProt.fas -dbtype 'prot' -out NveProt -title NveProt
If you haven't installed BLAST+ at the system level (e.g. you are not an administrator) then you can configure your PATH to include your BLAST tools, or just give the path explicitly, e.g.

Code:
/home/alex/blast/bin/makeblastdb -in NveProt.fas -dbtype 'prot' -out NveProt -title NveProt
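
To see the lookup rules in action without touching a BLAST install, here is a small sketch using a throwaway program. The name demo_blast_tool and the temporary folder are made up for illustration; the same behaviour applies to makeblastdb and blastp:

```shell
# Create a throwaway folder holding a hypothetical program called demo_blast_tool.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "demo_blast_tool ran"\n' > "$demo_dir/demo_blast_tool"
chmod +x "$demo_dir/demo_blast_tool"

# From another directory the bare name is not found, because demo_dir is not on PATH.
cd /
command -v demo_blast_tool || echo "demo_blast_tool: not found on PATH"

# Giving the path explicitly works from any directory.
"$demo_dir/demo_blast_tool"

# After adding the folder to PATH, the bare name works as well.
PATH="$PATH:$demo_dir"
demo_blast_tool
```

This is why copying the binaries next to the data appeared to help: ./makeblastdb only ever looks in the current folder.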
 02-17-2011, 05:53 PM
#6

maubp, you're right, I should learn more about UNIX. I'm working on it now...

Following your advice, I succeeded in running the makeblastdb command.

Thank you very much

Alex
 02-18-2011, 12:08 AM
#7

Great - well done
 03-03-2013, 06:33 PM
#8
local blast database

Hi,

I am trying to set up a local BLAST with a database of miRNA sequences from miRBASE, essentially using the miRBASE sequences as a BLAST database.

Everything worked fine in creating a database with the mature.fa sequences from miRBASE.

Do I need to format this database, or can I use it directly to perform searches as is?

Please can anyone guide me?

Thanks
geneart
 03-04-2013, 01:12 AM
#9

Hi geneart,

I'm having trouble understanding your question. Are you saying you've downloaded a FASTA file from miRBASE and want to turn this into a database? If so yes, you should use the makeblastdb command.

You can search directly against a FASTA file, but it is slower (and only uses one CPU), and it will give you pairwise e-values which look more impressive than they really are, see:
http://blastedbio.blogspot.co.uk/201...ize-for-e.html
 03-05-2013, 10:09 AM
#10
BLAST database

Yes, maubp, you are correct! I did set up the database using the miRBASE mature.fa file. I was wondering whether I needed to format that database at all in doing that. It does not matter now, as it worked.

But I have another question. I have short RNA sequences from an Illumina sequencer and am trying to find matches in miRBASE through standalone BLAST (with mature.fa from miRBASE as the database). When I use miRBASE directly with SSEARCH I get hits, but when I BLAST locally I don't get any hits. I have used the default parameters, which I would like to tweak to see if the results change.

My problem is: how do I run blastn-short on the command line, or tweak the parameters in standalone BLAST? Any help? I looked in the BLAST manual that comes with the standalone package but could not find it. I am not that command-line savvy, so I am hoping someone can suggest ways of doing it.

Thanks in advance
Geneart.
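
For reference, blastn-short is selected with the -task option of blastn rather than being a separate program. A rough sketch follows, in which the file name reads.fa, the database name mirbase_mature and the output name are all assumptions, and the -word_size and -evalue values are only starting points to experiment with:

```shell
# Hypothetical names: reads.fa holds the short query sequences and mirbase_mature
# is a nucleotide database built from mature.fa with makeblastdb.
query=reads.fa
db=mirbase_mature

# -task blastn-short selects settings intended for short queries;
# -word_size and -evalue can then be adjusted further.
if command -v blastn >/dev/null 2>&1 && [ -f "$query" ]; then
    blastn -task blastn-short -query "$query" -db "$db" \
           -word_size 7 -evalue 10 -outfmt 6 -out reads_vs_mirbase.tsv
else
    echo "blastn or $query not available; nothing was run"
fi
```

Running blastn -help lists every option together with its default value.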
 04-26-2013, 08:18 AM
#11

Hey everyone,

So I am also having some similar issues. I successfully made my database in the correct folder, but then when I actually try to run my blast, it isn't working...

#I used the following to make the db
makeblastdb -in supercontigs.fasta.txt -dbtype 'nucl' -out p.full

#I then tried to run this the next step
query=NGF.fasta -db=p.full -outfmt="6" -out=blast

Any ideas how I can change my second step to successfully run the blast search?

Thanks,
Kyle
 04-29-2013, 03:19 AM
#12

Kyle - you've left out at least part of the command you ran, and more importantly you left out important details like what the error message was. That makes it almost impossible to guess what you've done wrong.
04-29-2013, 05:03 AM
#13

Well for one you haven't specified a program, which should probably be blastn in your case. Second, the correct syntax would be:

blastn -query NGF.fasta -db p.full -outfmt 6 -out blast

If that still doesn't work, then I'd guess that there are problems with your environment variables. You could get around this by specifying the paths of your program, query file, and db, i.e.

/where/is/blast/bin/blastn -query /where/is/this/file/NGF.fasta -db /where/is/this/db/p.full -outfmt 6 -out blast
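
Before blaming environment variables, it can help to confirm that the database files are where you think they are. The sketch below uses check_nucl_db, a made-up helper that is not part of BLAST+, and assumes a small single-volume nucleotide database, for which makeblastdb writes .nhr, .nin and .nsq files next to the name given to -out:

```shell
# check_nucl_db is a hypothetical helper, not part of BLAST+.
# It reports whether the three core files of a single-volume nucleotide
# database (.nhr, .nin, .nsq) exist for the given database prefix.
check_nucl_db() {
    prefix=$1
    for ext in nhr nin nsq; do
        if [ ! -f "$prefix.$ext" ]; then
            echo "missing: $prefix.$ext"
            return 1
        fi
    done
    echo "found database files for $prefix"
}

# Example: look for the p.full database relative to the current directory.
check_nucl_db p.full || echo "run blastn from the database folder or give -db the full path"
```

Very large databases are split into several volumes with different file names, so treat a "missing" report there with caution.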

 05-05-2013, 03:53 AM
#14
Creating blastdb in windows

After downloading and installing BLAST+:
1- Open a command prompt and go to the bin directory.
To create a database such as a protein database, you need a simple multi-FASTA file.
2- Use this command:
Code:
makeblastdb -in D:\ref.fasta -dbtype prot -out Plant
With the above command, makeblastdb generates 3 files with the .pin, .phr and .psq extensions in the bin directory.
Plant is the name of the output database.

3- To use this database with a sample query:
Code:
blastp -query D:\in.txt -db Plant -out D:\Out.txt
 10-30-2014, 08:27 AM
#15
How I can make custom database in batch?

I am wondering if you could tell me how I can make a database from several files in one batch?

Options like entry_batch do not work. I also tried the command below, but it gives the error shown after it:

makeblastdb -in Strains/*.fasta -dbtype 'nucl' -out db/stec_samples

Error: Too many positional arguments (1), the offending value: Strains/Sample_1.fasta

P.S. A single database can be made from Sample_1.fasta with no error.
 11-02-2014, 11:13 PM
#16

This might work:

Code:
makeblastdb -in $(cat Strains/*.fasta) -dbtype 'nucl' -out db/stec_samples

11-03-2014, 01:09 AM
#17

BLAST does not like this kind of command, where the second FASTA file is considered to be a positional argument:

Code:
makeblastdb -in Strains/example1.fasta Strains/example2.fasta -dbtype nucl -out db/stec_samples
From past experimentation, I know it will work if you quote the list of filenames making them space separated (filenames with spaces are a problem):

Code:
makeblastdb -in "Strains/example1.fasta Strains/example2.fasta" -dbtype nucl -out db/stec_samples

11-03-2014, 06:48 AM
#18

Unfortunately the $(cat Strains/*.fasta) command did not work. The space-separated command also gave an error:

BLAST options error: File Sample_no2.fasta does not exist.
 11-03-2014, 06:49 AM
#19

The simple brute-force solution is to make a single merged FASTA file (e.g. using the cat command), and then build a BLAST database out of that.
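
A minimal sketch of that approach follows. It runs in a temporary folder with two tiny made-up FASTA files standing in for the real Strains/*.fasta, and the merged file name stec_samples_all.fasta is an arbitrary choice:

```shell
# Work in a temporary folder so no real data is overwritten.
work=$(mktemp -d)
cd "$work"

# Demo input: two tiny hypothetical FASTA files standing in for Strains/*.fasta.
mkdir -p Strains db
printf '>sample_1\nACGTACGTAC\n' > Strains/Sample_1.fasta
printf '>sample_2\nTTGACCGTAA\n' > Strains/Sample_2.fasta

# Merge all FASTA files into one. Each file must end with a newline,
# otherwise the next header line is glued onto the previous sequence.
cat Strains/*.fasta > db/stec_samples_all.fasta

# Count the sequences that ended up in the merged file.
grep -c '^>' db/stec_samples_all.fasta

# Build a single nucleotide database from the merged file (skipped if BLAST+ is absent).
if command -v makeblastdb >/dev/null 2>&1; then
    makeblastdb -in db/stec_samples_all.fasta -dbtype nucl -out db/stec_samples
fi
```

With real data, only the cat and makeblastdb lines are needed, run from the folder that contains Strains/ and db/.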
 02-01-2015, 11:47 PM
#20

Dear all,

In creating a local blast database, I downloaded fasta files from ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/ and http://www.ncbi.nlm.nih.gov/sites/nu...=%22Viruses%22[PORG]+AND+srcdb_refseq[PROP].

The former appeared larger than the latter. Which of them is better? Do both contain nr sequences? Are they different?

Thanks