SEQanswers

Old 02-15-2011, 01:16 AM   #1
aliealexandre
Junior Member
 
Location: Japan

Join Date: Feb 2011
Posts: 8
Default How to create a BLAST database

Hi everybody,

I'm new to this forum and, of course, I've come here because I have a problem... sorry.

I use BLAST+ on Mac OS X.

I want to create a protein database named NveProt, so I created a FASTA file, NveProt.fas, and saved it in the ncbi-blast-2.2.24+/db folder.

Then I ran this command to create the new database: ./makeblastdb -in NveProt.fas

and I got the following message:

Building a new DB, current time: 02/15/2011 19:11:39
New DB name: NveProt.fas
New DB title: NveProt.fas
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1073741824B
BLAST Database error: No alias or index file found for protein database [NveProt.fas] in search path [/Users/aliealexandre/ncbi-blast-2.2.24+/bin::/Users/aliealexandre/ncbi-blast-2.2.24+/db:]

What's happening?
Do I have to modify the PATH variable? How do I do that?

Thank you for your help

Alex
Old 02-15-2011, 03:02 AM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,542
Default

Are you running makeblastdb from the ncbi-blast-2.2.24+/db folder, i.e. are you in the same directory as the FASTA file?
Old 02-15-2011, 07:10 AM   #3
MDonlin
Member
 
Location: St. Louis, MO

Join Date: May 2010
Posts: 14
Default

I've always specified the output names for my BLAST databases:
./makeblastdb -in NveProt.fas -dbtype 'prot' -out NveProt -title NveProt

Paths are defined in the .profile file located in your home directory. Use a text editor such as vi to edit the .profile file. There may be more paths than this, but to define the paths from your post, the file should read
export PATH=$PATH:/Users/aliealexandre/ncbi-blast-2.2.24+/bin:/Users/aliealexandre/ncbi-blast-2.2.24+/db

After editing .profile, type
>source .profile

Or you can set a PATH at the command line:
>export PATH=$PATH:/Users/aliealexandre/ncbi-blast-2.2.24+/bin
>export PATH=$PATH:/Users/aliealexandre/ncbi-blast-2.2.24+/db
>source .profile

Maureen
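
The PATH advice above can be sketched as a few lines for .profile. This is a minimal sketch, not the poster's exact setup: BLAST_HOME is a hypothetical variable introduced here for readability, and the install location is assumed to match the path in the error message. PATH is only used to find executables, so the sketch puts the db folder in the BLASTDB environment variable instead, which BLAST+ programs consult when looking up a database by name.

```shell
# Assumed install location (taken from the error message in post #1); adjust as needed.
# BLAST_HOME is a hypothetical helper variable, not something BLAST+ itself reads.
BLAST_HOME="$HOME/ncbi-blast-2.2.24+"

# PATH is searched for executables, so only the bin directory is added to it.
export PATH="$PATH:$BLAST_HOME/bin"

# BLASTDB tells BLAST+ programs where to look for databases given by name.
export BLASTDB="$BLAST_HOME/db"

# Show which makeblastdb the shell would run, or say that none was found.
command -v makeblastdb || echo "makeblastdb is not on PATH yet"
```

With this in place, makeblastdb can be run from the folder that holds the FASTA file, and later searches can refer to the database as -db NveProt from any directory.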
Old 02-17-2011, 12:25 AM   #4
aliealexandre
Junior Member
 
Location: Japan

Join Date: Feb 2011
Posts: 8
Default

Thank you

@maubp I copied the makeblastdb binary into the same folder as my database (NveProt) and it worked.

In the same way, when I want to launch blastp, for instance, I have to copy the blastp binary into the same folder as my query sequence and my database file (the ./db folder in this case).

In short, everything must be in the same folder... which sounds strange to me.

I thought that binaries belonged in the ./bin folder and databases in the ./db folder.

Am I doing something wrong?

In any case, thanks to all of you for your prompt answers

Alex
Old 02-17-2011, 02:15 AM   #5
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,542
Default

I think you need to learn a bit more about the basics of working at the Unix/Linux command line. If you do this:

Code:
./makeblastdb -in NveProt.fas -dbtype 'prot' -out NveProt -title NveProt
then you are telling the operating system to look for makeblastdb in the current directory (the single period or dot means the current directory, a double dot means the parent directory).

Assuming BLAST+ is installed properly, you should just do this, and the operating system will look on the path for the installed copy of makeblastdb:

Code:
makeblastdb -in NveProt.fas -dbtype 'prot' -out NveProt -title NveProt
If you haven't installed BLAST+ at the system level (e.g. you are not an administrator) then you can configure your PATH to include your BLAST tools, or just give the path explicitly, e.g.

Code:
/home/alex/blast/bin/makeblastdb -in NveProt.fas -dbtype 'prot' -out NveProt -title NveProt
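
The difference between ./name and a bare name can be tried out without BLAST+ at all. The sketch below is only an illustration: mytool is a made-up stand-in script, and the bin and db folders are created in a scratch directory to mimic the layout discussed in this thread.

```shell
# Build a scratch layout with a bin/ and a db/ folder (all names are hypothetical).
workdir=$(mktemp -d)
mkdir "$workdir/bin" "$workdir/db"

# "mytool" is a stand-in for makeblastdb so the demo runs anywhere.
printf '#!/bin/sh\necho "mytool ran"\n' > "$workdir/bin/mytool"
chmod +x "$workdir/bin/mytool"

cd "$workdir/db"

# "./mytool" means "the file mytool in the current directory", so this fails from db/.
./mytool 2>/dev/null || echo "not found in current directory"

# A bare command name is looked up in the directories listed in PATH.
PATH="$workdir/bin:$PATH"
result=$(mytool)
echo "$result"
```

The same reasoning applies to makeblastdb and blastp: once their bin folder is on PATH, they can be run from the folder that contains the data, and nothing needs to be copied.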

Last edited by maubp; 02-17-2011 at 02:16 AM. Reason: typo
Old 02-17-2011, 05:53 PM   #6
aliealexandre
Junior Member
 
Location: Japan

Join Date: Feb 2011
Posts: 8
Default

maubp, you're right, I should learn more about UNIX. I'm working on it now... Following your advice, I succeeded in running the makeblastdb command.

Thank you very much

Alex
Old 02-18-2011, 12:08 AM   #7
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,542
Default

Great - well done
Old 03-03-2013, 06:33 PM   #8
geneart
Member
 
Location: DC area

Join Date: Sep 2011
Posts: 42
Default local blast database

Hi,
I am trying to set up a local BLAST search, using mature miRNA sequences from miRBase as the database. Everything worked fine in creating a database from the mature.fa sequences from miRBase. Do I need to format this database, or can I use it directly to perform searches as is? Can anyone guide me, please?
Thanks
geneart
Old 03-04-2013, 01:12 AM   #9
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,542
Default

Hi geneart, I'm having trouble understanding your question. Are you saying you've downloaded a FASTA file from miRBase and want to turn this into a database? If so, yes, you should use the makeblastdb command.

You can search directly against a FASTA file, but it is slower (and only uses one CPU), and it will also give you pairwise e-values, which look more impressive than they really are; see:
http://blastedbio.blogspot.co.uk/201...ize-for-e.html
Old 03-05-2013, 10:09 AM   #10
geneart
Member
 
Location: DC area

Join Date: Sep 2011
Posts: 42
Default BLAST database

Yes, maubp, you are correct! I did set up the database using the miRBase mature.fa file. I was wondering whether I needed to format that database at all in doing so. It does not matter now, as it worked. But I have another question.
I have short RNA sequences from an Illumina sequencer and am trying to find matches in miRBase through standalone BLAST (with mature.fa from miRBase as the database). When I use miRBase directly with SSEARCH I get hits, but when I run BLAST locally I don't get any hits.
I have used the default parameters, which I would like to tweak to see whether the results change. My problem is: how do I run blastn-short on the command line, or tweak the parameters in standalone BLAST? Any help? I looked in the BLAST manual that comes with the standalone package but could not find it.
I am not that command-line savvy, so I'm hoping someone can suggest ways of doing it.
Thanks in advance
Geneart.
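
For reference, blastn-short is not a separate program; in BLAST+ it is selected with the -task option of blastn. The sketch below shows one way the command might be assembled. The file name reads.fasta, the database name mature_db, the output name hits.tsv, and the -word_size and -evalue values are all assumptions chosen for illustration, not recommended settings.

```shell
# Hypothetical names: a FASTA file of short reads and a database built from mature.fa.
query="reads.fasta"
db="mature_db"
task="blastn-short"   # blastn variant intended for short query sequences

# Assemble the command as positional parameters so quoting stays intact.
# -word_size and -evalue are example values to experiment with.
set -- blastn -task "$task" -query "$query" -db "$db" \
    -word_size 7 -evalue 10 -outfmt 6 -out hits.tsv

# Run it only when blastn is installed and the query file exists; otherwise just print it.
if command -v blastn >/dev/null 2>&1 && [ -f "$query" ]; then
    "$@"
else
    echo "Would run: $*"
fi
```

Running blastn -help lists the other parameters that can be changed in the same way.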
Old 04-26-2013, 08:18 AM   #11
utagenomics
Junior Member
 
Location: Texas

Join Date: Apr 2013
Posts: 4
Default

Hey everyone,

So I am also having some similar issues. I successfully made my database in the correct folder, but then when I actually try to run my blast, it isn't working...

#I used the following to make the db

makeblastdb -in supercontigs.fasta.txt -dbtype 'nucl' -out p.full

#I then tried to run this as the next step

query=NGF.fasta -db=p.full -outfmt="6" -out=blast

Any ideas how I can change my second step to successfully run the blast search?

Thanks,
Kyle
Old 04-29-2013, 03:19 AM   #12
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,542
Default

Kyle - you've left out at least part of the command you ran, and more importantly you left out important details like what the error message was. That makes it almost impossible to guess what you've done wrong.
Old 04-29-2013, 05:03 AM   #13
rhinoceros
Senior Member
 
Location: sub-surface moon base

Join Date: Apr 2013
Posts: 372
Default

Quote:
Originally Posted by utagenomics
Hey everyone,

#I used the following to make the db

makeblastdb -in supercontigs.fasta.txt -dbtype 'nucl' -out p.full

#I then tried to run this as the next step

query=NGF.fasta -db=p.full -outfmt="6" -out=blast

Any ideas how I can change my second step to successfully run the blast search?
Well, for one, you haven't specified a program, which should probably be blastn in your case. Second, the correct syntax would be:

blastn -query NGF.fasta -db p.full -outfmt 6 -out blast

If that still doesn't work, then I'd guess that there are problems with your environment variables. You could work around this by specifying the full paths of your program, query file, and db, i.e.

/where/is/blast/bin/blastn -query /where/is/this/file/NGF.fasta -db /where/is/this/db/p.full -outfmt 6 -out blast

An optional flag you should consider: -num_threads INSERT_NUMBER_OF_CORES_IN_YOUR_SYSTEM, e.g. -num_threads 2
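
Before changing the blastn command, it can help to confirm that the database files are where blastn will look for them. The helper below is a hypothetical sketch (check_blast_db is not part of BLAST+). It assumes a small, single-volume nucleotide database, for which makeblastdb writes files ending in .nhr, .nin and .nsq next to the name given to -out.

```shell
# check_blast_db PREFIX
# Hypothetical helper: succeeds if the core files of a single-volume
# nucleotide BLAST database exist for the given prefix.
check_blast_db() {
    prefix=$1
    status=0
    for ext in nhr nin nsq; do
        if [ ! -f "$prefix.$ext" ]; then
            echo "missing: $prefix.$ext"
            status=1
        fi
    done
    return "$status"
}

# Example with the database name used in this thread.
if check_blast_db "p.full"; then
    echo "database files found for p.full"
else
    echo "run blastn from the folder holding p.full.*, or pass the full path to -db"
fi
```

If files are reported missing, either change into the folder where makeblastdb was run or give -db the full path, as suggested above.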

Last edited by rhinoceros; 04-29-2013 at 05:33 AM.
Old 05-05-2013, 03:53 AM   #14
ahmadsam
Junior Member
 
Location: best

Join Date: Dec 2011
Posts: 9
Default Creating blastdb in windows

After downloading and installing BLAST+:
1- Open a command prompt and go to the bin directory.
To create a database, such as a protein database, you need a simple multi-FASTA file.

2- Use this command:
Code:
makeblastdb -in D:\ref.fasta -dbtype prot -out Plant
With the above command, makeblastdb generates three files, with .pin, .phr and .psq extensions, in the bin directory.
Plant is the name of the output database.

3- To use this database with a sample query:

Code:
blastp -query D:\in.txt -db Plant -out D:\Out.txt
Old 10-30-2014, 08:27 AM   #15
poudap
Junior Member
 
Location: Norway

Join Date: Oct 2014
Posts: 3
Default How I can make custom database in batch?

I am wondering if you could tell me how I can make a database from several different files in one batch. Commands like entry_batch do not work.
I have also tried the command below, but it gives the error shown after it:

makeblastdb -in Strains/*.fasta -dbtype 'nucl' -out db/stec_samples

Error: Too many positional arguments (1), the offending value: Strains/Sample_1.fasta

P.S. A single database can be made from Sample_1.fasta with no error.
Old 11-02-2014, 11:13 PM   #16
rhinoceros
Senior Member
 
Location: sub-surface moon base

Join Date: Apr 2013
Posts: 372
Default

This might work:

Code:
makeblastdb -in $(cat Strains/*.fasta) -dbtype 'nucl' -out db/stec_samples
__________________
savetherhino.org
Old 11-03-2014, 01:09 AM   #17
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,542
Default

Quote:
Originally Posted by poudap
I am wondering if you could tell me how I can make a database from different files in batch?
BLAST does not like this kind of command, where the second FASTA file is considered to be a positional argument:
Code:
makeblastdb -in Strains/example1.fasta Strains/example2.fasta -dbtype nucl -out db/stec_samples
From past experimentation, I know it will work if you quote the list of filenames, separated by spaces (which means filenames containing spaces are a problem):

Code:
makeblastdb -in "Strains/example1.fasta Strains/example2.fasta" -dbtype nucl -out db/stec_samples
Old 11-03-2014, 06:48 AM   #18
poudap
Junior Member
 
Location: Norway

Join Date: Oct 2014
Posts: 3
Default

Unfortunately, the $(cat Strains/*.fasta) command did not work.

The space-separated command also gave an error:
BLAST options error: File Sample_no2.fasta does not exist.
Old 11-03-2014, 06:49 AM   #19
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,542
Default

The simple brute-force solution is to make a single merged FASTA file (e.g. using the cat command), and then build a BLAST database out of that.
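
A minimal sketch of that merge-then-build approach follows. It creates two tiny made-up FASTA files in a scratch directory so it can be run as is; the Strains and db folder names follow this thread, while all_strains.fasta is a hypothetical name for the merged file. It assumes every input file ends with a newline, since otherwise the last sequence line of one file would run into the header of the next.

```shell
# Scratch layout with two tiny demo FASTA files (contents are made up).
work=$(mktemp -d)
mkdir "$work/Strains" "$work/db"
printf '>sample1_contig1\nACGTACGT\n' > "$work/Strains/Sample_1.fasta"
printf '>sample2_contig1\nTTGACCGA\n' > "$work/Strains/Sample_2.fasta"

# The shell expands the glob; cat writes all files into one merged FASTA file.
cat "$work"/Strains/*.fasta > "$work/all_strains.fasta"

# Sanity check: count header lines in the merged file.
n_seqs=$(grep -c '^>' "$work/all_strains.fasta")
echo "merged $n_seqs sequences"

# Build the database from the single merged file, if BLAST+ is installed.
if command -v makeblastdb >/dev/null 2>&1; then
    makeblastdb -in "$work/all_strains.fasta" -dbtype nucl -out "$work/db/stec_samples"
fi
```

If the same sequence identifier appears in more than one input file, it is worth making the identifiers unique before merging so that hits can be traced back to the right sample.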
Old 02-01-2015, 11:47 PM   #20
kaps
Member
 
Location: Uganda

Join Date: Jan 2015
Posts: 71
Default

Dear all, in creating a local BLAST database, I downloaded FASTA files from ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/ and http://www.ncbi.nlm.nih.gov/sites/nu...=%22Viruses%22[PORG]+AND+srcdb_refseq[PROP]. The former appeared larger than the latter. Which of them is better? Do both contain nr sequences? Are they different?

Thanks
All times are GMT -8.