SEQanswers

Old 07-29-2013, 11:06 AM   #1
bossanova352
Junior Member
 
Location: United States

Join Date: Mar 2013
Posts: 9
No alias file for nr database?

Hey all,

I've been searching for anyone else with this problem, but I can't find an answer. I've installed BLAST+ and used update_blastdb.pl to download a local copy of the nr database. This is my command:

blastx -query ./2500_SFB_109258_length_12402_cov_5.509837.fasta -db /usr/local/Programs/ncbi-blast-2.2.28+/db/nr -out ./Scaffold_of_interest_Blastx.xml -evalue 1e-5 -outfmt 5

But I keep getting this error:

BLAST Database error: No alias or index file found for protein database [/usr/local/Programs/ncbi-blast-2.2.28+/db/nr] in search path [/usr/local/Programs/ncbi-blast-2.2.28+/db:]

I've added the database folder path to the .ncbirc file, which I have in the home directory, and I know it works because I've added the refseq_protein and cdd_delta databases and they work just fine. Oddly enough, when I specify the path to the nr database in the command above, I get the same error. All nr database files are unzipped, and in the same folder as refseq_protein and cdd_delta databases. I'm stumped!
Old 07-29-2013, 11:26 AM   #2
atcghelix
Member
 
Location: CA

Join Date: Jul 2013
Posts: 74

Do you have a file called nr.pal in /usr/local/Programs/ncbi-blast-2.2.28+/db/ ?
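That check can be scripted. The following is only a minimal sketch: it assumes (my understanding, not something stated in this thread) that for a protein database BLAST+ is satisfied by either an alias file '<prefix>.pal' or a single-volume index file '<prefix>.pin'. The function name has_protein_db is made up for illustration.

```shell
# Succeed if an alias (.pal) or index (.pin) file exists for a protein
# database prefix such as /path/to/db/nr (assumed lookup rule, see above).
has_protein_db() {
    prefix=$1
    [ -f "$prefix.pal" ] || [ -f "$prefix.pin" ]
}

# Usage with the prefix from the original question
if has_protein_db /usr/local/Programs/ncbi-blast-2.2.28+/db/nr; then
    echo "alias or index file found"
else
    echo "no nr.pal or nr.pin next to the volume files"
fi
```

Note that volume files such as nr.00.pin do not count here: with a multi-volume database only the alias file carries the bare prefix.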
Old 07-29-2013, 11:39 AM   #3
bossanova352
Junior Member
 
Location: United States

Join Date: Mar 2013
Posts: 9

No, I don't! I see I have a .pal file for the refseq database, so that must be the issue. Where should this file come from? One of the zipped archives on the FTP site?
Old 07-29-2013, 11:41 AM   #4
atcghelix
Member
 
Location: CA

Join Date: Jul 2013
Posts: 74

It might not have downloaded completely. I would try using update_blastdb.pl to re-download it, with the --decompress flag so you don't need to unzip everything manually. The nr.pal file should be in one of the nr archives (the last one?).

perl update_blastdb.pl nr --decompress
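If the archives were fetched by hand and are still on disk, an incomplete download can also be spotted before re-downloading everything. This is a rough sketch under two assumptions: each archive has a companion '<archive>.tar.gz.md5' file in the usual 'md5sum' format (hash, two spaces, file name), and GNU 'md5sum' is installed. The function name verify_archives is hypothetical.

```shell
# Check every *.tar.gz.md5 file in a directory against its archive.
# Prints one line per failing archive; returns non-zero if any failed.
verify_archives() {
    dir=$1
    status=0
    for sum in "$dir"/*.tar.gz.md5; do
        [ -e "$sum" ] || continue    # glob matched nothing
        # md5sum -c resolves file names relative to the current directory,
        # so run it from inside the download directory (in a subshell)
        if ! (cd "$dir" && md5sum -c "$(basename "$sum")" >/dev/null 2>&1); then
            echo "FAILED: ${sum%.md5}"
            status=1
        fi
    done
    return "$status"
}
```

Calling 'verify_archives /path/to/db' prints nothing when every archive matches its checksum; any archive it reports is a candidate for a fresh download.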
Old 01-23-2014, 07:52 AM   #5
OTU
Member
 
Location: Utah

Join Date: May 2013
Posts: 44

Hi everyone!
My problem is somewhat related to the issue described here. I am using the refseq_protein database, downloaded from the NCBI FTP site, which consists of 9 folders in total with files .pni, .pnd, .pog and so on. When I use the command
$blastp -query ~/IIa.orfs.hmm.faa.db -db ~/refseq_protein -evalue 1e-5 -num_threads 60 -max_target_seqs 5 -outfmt 5 -out IIa.orfs.hmm.blast.xml

I am getting an error:
>>BLAST Database error: No alias or index file found for protein database [.../db/refseq_protein] in search path [.../software/multi-metagenome/R.data.generation::]

Any ideas what is happening?
I even tried the makeblastdb command to format my databases, but that doesn't work either.
$ makeblastdb -in ~/refseq_protein.*.* -dbtype prot -out ~/db/refseq_protein.db
>>Error: Too many positional arguments (1), the offending value: ~/db/refseq_protein.01.phr

Need help!!!!

otu
Old 01-23-2014, 07:58 AM   #6
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

1) I suggest using full path names instead of '~'.

2) To help troubleshoot cases of 'no file found' it is handy for us to see an 'ls' of the directory in question, just to make sure you haven't made a mistake such as specifying the wrong directory.

3) Pre-formatted refseq_protein should be 81 files -- no folders involved.
Old 01-23-2014, 08:02 AM   #7
OTU
Member
 
Location: Utah

Join Date: May 2013
Posts: 44

That is what I did: full paths were specified everywhere (I just removed them from the question). And yes, there are 81 files in the db folder in my home directory...
Old 01-23-2014, 08:14 AM   #8
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

Well, once again not seeing an 'ls' of your directory and not seeing the actual program line you are using (since you edited it), it becomes hard to troubleshoot the problem. Almost all of the time when someone posts about a file not being found it is because they are not using the correct path for the file despite what they think. In other words the file simply isn't there. I've done it about a zillion times myself.

Going by your statement "there 81 files in the folder db in home directory" then your original blastp line is incorrect since you are *not* using the folder 'db in home directory'. Instead you are just using your home directory.

Please check your paths. If nothing else do an:

ls -l ~/refseq_protein* | head --lines=2

And post that.
Old 01-23-2014, 08:27 AM   #9
OTU
Member
 
Location: Utah

Join Date: May 2013
Posts: 44

No problem.
So, here is the command from my previous post with the full paths included:
$ blastp -query /home/bwawrik/software/multi-metagenome/R.data.generation/IIa.orfs.hmm.faa.db -db /home/bwawrik/db/refseq_protein.* -evalue 1e-5 -num_threads 60 -max_target_seqs 5 -outfmt 5 -out IIa.orfs.hmm.blast.xml

And when I used the command:
$ls -l ~/refseq_protein* | head --lines=2

I got an error:
>>ls: cannot access home/bwawrik/db/refseq_protein.*: No such file or directory

What I cannot understand: if it doesn't "see" the database files, how did makeblastdb produce this error:
$makeblastdb -in /home/bwawrik/db/refseq_protein.* -dbtype prot -out /home/bwawrik/db/refseq_protein.db
>>Error: Too many positional arguments (1), the offending value: /home/bwawrik/db/refseq_protein.01.phr

Because from this, it seems that it CAN actually read the file, but is simply not "happy" with it.
Old 01-23-2014, 08:42 AM   #10
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

Try just using

-db /home/bwawrik/db/refseq_protein

the full path, but only the prefix of the database name
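To make "only the prefix" concrete, here is a small sketch that derives the -db value from the name of any one database file. It assumes file names of the form '<prefix>.<volume>.<ext>' or '<prefix>.<ext>', as in the listings in this thread; the helper name db_prefix is invented for the example, and a prefix that itself ends in a dot plus digits would be cut too short.

```shell
# Strip the file extension (.phr, .pin, .psq, .pal, ...) and then,
# if present, the numeric volume suffix (.00, .01, ...).
db_prefix() {
    printf '%s\n' "$1" | sed -E 's/\.[a-z]+$//; s/\.[0-9]+$//'
}

db_prefix /home/bwawrik/db/refseq_protein.01.phr
# prints /home/bwawrik/db/refseq_protein
```

The printed value is what goes after -db, with no extension and no wildcard.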
Old 01-23-2014, 08:46 AM   #11
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

Don't use a star (*) in your -db name. It should be:

-db /home/bwawrik/db/refseq_protein

Otherwise you are telling blastp that there are 81 (or so) files to use as the DB. It wants the overall database name, not the individual files. Your initial blastp line did not have the star and thus seemed correct except for the path problem. Your current blastp line is obviously incorrect.

As for your makeblastdb error ... you are doing it wrong. I was going to mention that but it is not relevant to why blastp is not working. Once again you are telling the program to use 81 files. The program is basically seeing:

makeblastdb -in /home/bwawrik/db/refseq_protein.00.phr /home/bwawrik/db/refseq_protein.00.pin /home/bwawrik/db/refseq_protein.00.pnd ... etc.

Which of course ruins the one (1) parameter that should be after '-in' and brings up the 'too many positional arguments' error.
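The expansion is done by the shell before the program ever starts, which is easy to see without BLAST installed. In this sketch show_args is a made-up stand-in for the real program and the three empty files are dummies created only for the demonstration.

```shell
# Stand-in for a program that expects exactly one value after -in
show_args() {
    echo "received $# arguments"
}

demo=$(mktemp -d)
touch "$demo/refseq_protein.00.phr" "$demo/refseq_protein.00.pin" "$demo/refseq_protein.00.psq"

# Unquoted star: the shell replaces it with every matching file name
show_args -in "$demo"/refseq_protein.*
# prints: received 4 arguments

# Prefix only: the program sees -in plus a single value
show_args -in "$demo/refseq_protein"
# prints: received 2 arguments

rm -rf "$demo"
```

With 81 real files the first call would hand the program 82 arguments, and everything after the first file name is treated as an unexpected positional argument.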

But as I said that is neither here nor there for running blastp. Let's not be concerned with makeblastdb.

Going on ... are you sure you ran that 'ls' that I gave you? I specified 'refseq_protein*' not the 'refseq_protein.*' (with a dot) that ls complained about.

Try the blastp without a star in the -db. And post the results of:

ls /home/bwawrik/db/refseq_protein* | head --lines=2
Old 01-23-2014, 09:04 AM   #12
OTU
Member
 
Location: Utah

Join Date: May 2013
Posts: 44

Ok, so far:
$ ls -l /home/bwawrik/db/refseq_protein* | head --lines=2
-rw-rw-r-- 1 bwawrik bwawrik 534122462 Dec 15 18:37 /home/bwawrik/db/refseq_protein.01.phr
-rw-rw-r-- 1 bwawrik bwawrik 23105152 Dec 15 18:37 /home/bwawrik/db/refseq_protein.01.pin

and when using blastp without star:
BLAST Database error: No alias or index file found for protein database [/home/bwawrik/db/refseq_protein] in search path [/home/bwawrik/software/multi-metagenome/R.data.generation::]
Old 01-23-2014, 09:15 AM   #13
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

OK. Now we are getting somewhere -- at least I can be sure that the paths look correct. What I find strange is that your database files begin with *.01.* -- mine begin with *.00.*; e.g.,

Quote:
/group/diagrid/databases/ncbi/week-04-2014/refseq_protein.00.phr
/group/diagrid/databases/ncbi/week-04-2014/refseq_protein.00.pin

More importantly we need to make sure that the overall index file is in place. Mine is at the bottom of the listing so that if I do a 'tail --lines=2' instead of using 'head' I get:

Quote:
/group/diagrid/databases/ncbi/week-04-2014/refseq_protein.09.psq
/group/diagrid/databases/ncbi/week-04-2014/refseq_protein.pal

Or using 'ls -l':

Quote:
-rw-r--r-- 1 braub diagrid-apps 275 Dec 15 20:12 /group/diagrid/databases/ncbi/week-04-2014/refseq_protein.pal

What do you get? I am trying to see if the '*.pal' file is present.
Old 01-23-2014, 09:24 AM   #14
OTU
Member
 
Location: Utah

Join Date: May 2013
Posts: 44

I know why there is no .00 file: I accidentally deleted it.
I am downloading it now.
As for checking for '*.pal', we have a problem:
$ ls -l /home/bwawrik/db/refseq_protein* | tail --lines=2
-rw-r--r-- 1 bwawrik bwawrik 59 Jan 21 11:39 /home/bwawrik/db/refseq_protein.2.08.tar.gz.md5
-rw-r--r-- 1 bwawrik bwawrik 59 Jan 21 11:39 /home/bwawrik/db/refseq_protein.2.09.tar.gz.md5
Old 01-23-2014, 09:28 AM   #15
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

Looks like your directory has extraneous files in it. They probably do not hurt. How about doing a

ls -l /home/bwawrik/db/refseq_protein*pal

Let's see if you have the overall index file.
Old 01-23-2014, 09:30 AM   #16
OTU
Member
 
Location: Utah

Join Date: May 2013
Posts: 44

$ ls -l /home/bwawrik/db/refseq_protein*pal
-rw-rw-r-- 1 bwawrik bwawrik 275 Dec 15 19:12 /home/bwawrik/db/refseq_protein.pal
Old 01-23-2014, 09:34 AM   #17
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

OK, the pal file is there and looking good. Let's see what happens once you have the *.00.* files back in place. The lack of them *may* cause problems although I would have expected a different error message.
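One way to test that suspicion directly is to compare the alias file with what is actually on disk. The sketch below assumes the .pal file names its volumes on a line starting with 'DBLIST' (which is how the pre-formatted alias files I have seen look, but treat it as an assumption) and that each volume should have a '<volume>.pin' file next to the alias file. The function name missing_volumes is hypothetical.

```shell
# Print every volume listed in a protein alias file (.pal) whose
# index file (<volume>.pin) is absent from the same directory.
missing_volumes() {
    pal=$1
    dir=$(dirname "$pal")
    # Keep the DBLIST line, drop the keyword and any quotes,
    # then split the remaining names onto one line each
    grep '^DBLIST' "$pal" | sed 's/^DBLIST//' | tr -d '"' | tr -s ' \t' '\n\n' |
    while read -r vol; do
        [ -n "$vol" ] || continue
        [ -f "$dir/$vol.pin" ] || echo "$vol"
    done
}
```

Running 'missing_volumes /home/bwawrik/db/refseq_protein.pal' should print nothing once all volumes are back in place; any name it does print is a volume the alias file refers to but that cannot be found.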
Old 01-23-2014, 11:34 AM   #18
OTU
Member
 
Location: Utah

Join Date: May 2013
Posts: 44

Yeah! It works!
Thank you so much!
It took a really long time to compute this!
Old 01-23-2014, 11:36 AM   #19
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

Ditto ... yeah!

All times are GMT -8.