  • No alias file for nr database?

    Hey all,

    I've been searching for anyone else with this problem, but I can't quite find the answer. I've installed Blast+ and I've used update_blastdb.pl to add a local version of the nr database. This is my command:

    blastx -query ./2500_SFB_109258_length_12402_cov_5.509837.fasta -db /usr/local/Programs/ncbi-blast-2.2.28+/db/nr -out ./Scaffold_of_interest_Blastx.xml -evalue 1e-5 -outfmt 5

    But I keep getting this error:

    BLAST Database error: No alias or index file found for protein database [/usr/local/Programs/ncbi-blast-2.2.28+/db/nr] in search path [/usr/local/Programs/ncbi-blast-2.2.28+/db:]

    I've added the database folder path to the .ncbirc file, which I have in the home directory, and I know it works because I've added the refseq_protein and cdd_delta databases and they work just fine. Oddly enough, when I specify the path to the nr database in the command above, I get the same error. All nr database files are unzipped, and in the same folder as refseq_protein and cdd_delta databases. I'm stumped!

  • #2
    Do you have a file called nr.pal in /usr/local/Programs/ncbi-blast-2.2.28+/db/ ?

    • #3
      No I don't! I see I have a .pal file for the refseq database, so that must be the issue. Where should this file be coming from? One of the zipped folders on the ftp site?

      • #4
        It might not have downloaded perfectly--I would try using update_blastdb.pl to redownload, and use the --decompress flag so you don't need to unzip them all manually. It should be in one of the nr files (the last one?)

        perl update_blastdb.pl nr --decompress
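
Once the download finishes, a small shell helper along these lines could confirm whether the file asked about in #2 is actually present. This is only a sketch: `check_protein_db` is a hypothetical name, and it assumes that a multi-volume protein database comes with a `<name>.pal` alias file while a single-volume one has a `<name>.pin` index.

```shell
# Hypothetical helper: report whether a protein BLAST database prefix
# has an alias file (.pal) or a single-volume index (.pin).
# Assumption: these are the files the "No alias or index file found" error refers to.
check_protein_db() {
    prefix="$1"
    if [ -f "${prefix}.pal" ]; then
        echo "alias file found: ${prefix}.pal"
    elif [ -f "${prefix}.pin" ]; then
        echo "index file found: ${prefix}.pin"
    else
        echo "no alias or index file for ${prefix}"
        return 1
    fi
}
```

It would be called with the same value passed to -db, for example `check_protein_db /usr/local/Programs/ncbi-blast-2.2.28+/db/nr`.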

        • #5
          Hi everyone!
          My problem is somewhat related to the issue described here. I am using the refseq_protein database, downloaded from the NCBI FTP site, which consists of 9 folders in total with .pni, .pnd, .pog files and so on. When I use the command
          $blastp -query ~/IIa.orfs.hmm.faa.db -db ~/refseq_protein -evalue 1e-5 -num_threads 60 -max_target_seqs 5 -outfmt 5 -out IIa.orfs.hmm.blast.xml

          I am getting an error:
          >>BLAST Database error: No alias or index file found for protein database [.../db/refseq_protein] in search path [.../software/multi-metagenome/R.data.generation::]

          Any ideas what is happening?
          I even tried the makeblastdb command to format my databases, but that doesn't work either.
          $ makeblastdb -in ~/refseq_protein.*.* -dbtype prot -out ~/db/refseq_protein.db
          >>Error: Too many positional arguments (1), the offending value: ~/db/refseq_protein.01.phr

          Need help!!!!

          otu

          • #6
            1) I suggest using full path names instead of '~'.

            2) To help troubleshoot cases of 'no file found', it is handy for us to see an 'ls' of the directory in question, just to make sure you haven't made a mistake such as specifying the wrong directory.

            3) Pre-formatted refseq_protein should be 81 files -- no folders involved.

            • #7
              That is what I did: full paths were specified everywhere (I just removed them from the question). And yes, there are 81 files in the db folder in my home directory...

              • #8
                Well, once again not seeing an 'ls' of your directory and not seeing the actual program line you are using (since you edited it), it becomes hard to troubleshoot the problem. Almost all of the time when someone posts about a file not being found it is because they are not using the correct path for the file despite what they think. In other words the file simply isn't there. I've done it about a zillion times myself.

                Going by your statement "there 81 files in the folder db in home directory" then your original blastp line is incorrect since you are *not* using the folder 'db in home directory'. Instead you are just using your home directory.

                Please check your paths. If nothing else do an:

                ls -l ~/refseq_protein* | head --lines=2

                And post that.

                • #9
                  No problem.
                  So, from the previous post, here is the command with the full paths included:
                  $ blastp -query /home/bwawrik/software/multi-metagenome/R.data.generation/IIa.orfs.hmm.faa.db -db /home/bwawrik/db/refseq_protein.* -evalue 1e-5 -num_threads 60 -max_target_seqs 5 -outfmt 5 -out IIa.orfs.hmm.blast.xml

                  And when I used the command:
                  $ls -l ~/refseq_protein* | head --lines=2

                  I got an error:
                  >>ls: cannot access home/bwawrik/db/refseq_protein.*: No such file or directory

                  I just cannot understand: if it doesn't "see" the database files, why did makeblastdb give this error when I ran it:
                  $makeblastdb -in /home/bwawrik/db/refseq_protein.* -dbtype prot -out /home/bwawrik/db/refseq_protein.db
                  >>Error: Too many positional arguments (1), the offending value: /home/bwawrik/db/refseq_protein.01.phr

                  Because from this, it seems that it CAN actually read the file, but is simply not "happy" with it.

                  • #10
                    Try just using

                    -db /home/bwawrik/db/refseq_protein

                    the full path, but only the prefix of the database name

                    • #11
                      Don't use a star (*) in your -db name. It should be:

                      -db /home/bwawrik/db/refseq_protein

                      Otherwise you are telling blastp that there are 81 (or so) files to use as the DB. It wants the database base name, not the individual files. Your initial blastp line did not have the star and thus it seemed correct except for the pathing problem. Your current blastp is obviously incorrect.

                      As for your makeblastdb error ... you are doing it wrong. I was going to mention that but it is not relevant to why blastp is not working. Once again you are telling the program to use 81 files. The program is basically seeing:

                      makeblastdb -in /home/bwawrik/db/refseq_protein.00.phr /home/bwawrik/db/refseq_protein.00.pin /home/bwawrik/db/refseq_protein.00.pnd ... etc.

                      Which of course ruins the one (1) parameter that should be after '-in' and brings up the 'too many positional arguments' error.

                      But as I said that is neither here nor there for running blastp. Let's not be concerned with makeblastdb.

                      Going on ... are you sure you ran that 'ls' that I gave you? I specified 'refseq_protein*' not the 'refseq_protein.*' (with a dot) that ls complained about.

                      Try the blastp without a star in the -db. And post the results of:

                      ls /home/bwawrik/db/refseq_protein* | head --lines=2
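
The glob expansion described in this post can be reproduced without BLAST installed. The sketch below uses a throwaway directory with three empty placeholder files; `count_args` and `demo_dir` are hypothetical names, and the function merely stands in for a program such as blastp or makeblastdb by printing how many arguments it received.

```shell
# Hypothetical stand-in for a program: prints the number of arguments it was given.
count_args() { echo "$#"; }

# Throwaway directory with empty placeholder files (not a real database).
demo_dir=$(mktemp -d)
touch "$demo_dir/refseq_protein.00.phr" \
      "$demo_dir/refseq_protein.00.pin" \
      "$demo_dir/refseq_protein.pal"

# With a star, the shell expands the pattern before the program starts,
# so the program sees -db followed by one argument per matching file.
count_args -db "$demo_dir"/refseq_protein.*    # prints 4

# With the bare prefix, the program sees -db followed by a single value.
count_args -db "$demo_dir/refseq_protein"      # prints 2
```

The program never sees the star itself; the shell replaces it with the list of matching file names first, which is why the error message names an actual file.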

                      • #12
                        Ok, so far:
                        $ ls -l /home/bwawrik/db/refseq_protein* | head --lines=2
                        -rw-rw-r-- 1 bwawrik bwawrik 534122462 Dec 15 18:37 /home/bwawrik/db/refseq_protein.01.phr
                        -rw-rw-r-- 1 bwawrik bwawrik 23105152 Dec 15 18:37 /home/bwawrik/db/refseq_protein.01.pin

                        and when using blastp without star:
                        BLAST Database error: No alias or index file found for protein database [/home/bwawrik/db/refseq_protein] in search path [/home/bwawrik/software/multi-metagenome/R.data.generation::]

                        • #13
                          OK. Now we are getting somewhere -- at least I can be sure that the paths look correct. What I find strange is that your database files begin with *.01.* -- mine begin with *.00.*; e.g.,

                          /group/diagrid/databases/ncbi/week-04-2014/refseq_protein.00.phr
                          /group/diagrid/databases/ncbi/week-04-2014/refseq_protein.00.pin

                          More importantly we need to make sure that the overall index file is in place. Mine is at the bottom of the listing so that if I do a 'tail --lines=2' instead of using 'head' I get:

                          /group/diagrid/databases/ncbi/week-04-2014/refseq_protein.09.psq
                          /group/diagrid/databases/ncbi/week-04-2014/refseq_protein.pal

                          Or using 'ls -l'

                          -rw-r--r-- 1 braub diagrid-apps 275 Dec 15 20:12 /group/diagrid/databases/ncbi/week-04-2014/refseq_protein.pal

                          What do you get? I am trying to see if the '*.pal' file is present.
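
To see at a glance which volumes are present, rather than reading through a long 'ls' listing, something like the following could be used. It is a sketch only: `list_volumes` is a hypothetical name, and it assumes volumes are numbered with two digits and that each one has a `.pin` file, as in the listings above.

```shell
# Hypothetical helper: print the two-digit volume numbers that have a .pin
# file for the given database prefix, one per line, in sorted order.
list_volumes() {
    prefix="$1"
    for f in "${prefix}".[0-9][0-9].pin; do
        # If nothing matches, the pattern is left as-is; skip it.
        [ -e "$f" ] || continue
        v="${f%.pin}"        # strip the .pin extension
        echo "${v##*.}"      # keep only the volume number after the last dot
    done
}
```

Running `list_volumes /home/bwawrik/db/refseq_protein` on a directory like the one in #12 would be expected to start at 01 rather than 00, which points to a missing first volume.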

                          • #14
                            I know why there is no .00 file: I accidentally deleted it.
                            I am downloading it now.
                            As for checking for '*.pal', we have a problem:
                            $ ls -l /home/bwawrik/db/refseq_protein* | tail --lines=2
                            -rw-r--r-- 1 bwawrik bwawrik 59 Jan 21 11:39 /home/bwawrik/db/refseq_protein.2.08.tar.gz.md5
                            -rw-r--r-- 1 bwawrik bwawrik 59 Jan 21 11:39 /home/bwawrik/db/refseq_protein.2.09.tar.gz.md5

                            • #15
                              Looks like your directory has extraneous files in it. They probably do not hurt. How about doing a

                              ls -l /home/bwawrik/db/refseq_protein*pal

                              Let's see if you have the overall index file.
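
If the .pal file does turn up, it can also be used to check that every volume it refers to is actually on disk. The sketch below rests on an assumption about the file format: that the alias file is plain text with a line starting with `DBLIST` followed by the volume names (optionally quoted), located in the same directory. `check_alias_volumes` is a hypothetical name.

```shell
# Hypothetical helper: read the DBLIST line of a protein alias file (.pal)
# and report any listed volume that has no .pin file next to it.
# Assumption: the alias file contains a line like
#   DBLIST refseq_protein.00 refseq_protein.01 ...
check_alias_volumes() {
    pal="$1"
    dir=$(dirname "$pal")
    rc=0
    for vol in $(sed -n 's/^DBLIST[[:space:]]*//p' "$pal"); do
        # Strip optional surrounding double quotes from the volume name.
        vol=${vol#\"}
        vol=${vol%\"}
        if [ ! -f "$dir/$vol.pin" ]; then
            echo "missing volume: $vol"
            rc=1
        fi
    done
    return $rc
}
```

For the case in this thread it would be run as `check_alias_volumes /home/bwawrik/db/refseq_protein.pal`; a deleted .00 volume should then show up as a "missing volume" line.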
