  • How to create a db for NCBI BLASTN

    Hello, I have some metagenomic contigs and I want to run blastn against the Silva SSU and LSU databases to get taxonomic information.

    I downloaded the Silva RDP database from here in FASTA format

    I tried to run blastn against it directly, but it didn't work. Someone told me I have to convert it to NCBI format, but I don't know how to do that.

    Also, the Silva database sequences are RNA sequences. I think I need to convert them to DNA first, but I don't know how.

    I know that once I have DNA sequences, I can use "makeblastdb" to build the database. However, I don't know how to set the "makeblastdb" parameters.

    Can anyone help me? Please post the scripts I should use.

    Thanks,
    Ben
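
If the RNA-to-DNA conversion mentioned above turns out to be needed, it could be sketched in Python roughly as follows. This is only an illustration, assuming a plain FASTA file in which header lines start with ">" and every other line holds sequence; the function names and the output file name are hypothetical and not part of any tool discussed in this thread.

```python
def rna_fasta_to_dna(lines):
    """Yield FASTA lines with U/u replaced by T/t in sequence lines only."""
    table = str.maketrans("Uu", "Tt")
    for line in lines:
        if line.startswith(">"):
            # Header lines carry IDs and taxonomy strings; leave them untouched.
            yield line
        else:
            yield line.translate(table)


def convert_file(rna_path, dna_path):
    """Stream rna_path to dna_path so a large file need not fit in memory."""
    with open(rna_path) as src, open(dna_path, "w") as dst:
        dst.writelines(rna_fasta_to_dna(src))


# Small in-memory demonstration; the header contains a "U" that must survive.
example = [">seq1 Bacteria;Unclassified\n", "ACGUUGCA\n", "acguu\n"]
converted = list(rna_fasta_to_dna(example))
print("".join(converted), end="")
```

For a real file, a call such as `convert_file("LSURef_111_tax_silva.fasta", "LSURef_111_tax_silva.dna.fasta")` would write the converted copy; the second file name is made up for this example.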

  • #2
    See this page for the BLAST help: http://www.ncbi.nlm.nih.gov/books/NBK1763/. If you search for "makeblastdb" (you will need to find the 3rd or 4th occurrence) then you will see this command line example:

    Code:
    $ makeblastdb -in hs_chr -input_type blastdb -dbtype nucl -parse_seqids \
     -mask_data hs_chr_mask.asnb -out hs_chr -title \
     "Human Chromosome, Ref B37.1"
    If you type
    Code:
    makeblastdb -help
    in your terminal window you will be able to get detailed program options.

    What OS are you using?



    • #3
      Here is an example of the makeblastdb command,

      $ makeblastdb -in hs_chr.fa -dbtype nucl -parse_seqids \
      -out hs_chr -title "Human chromosomes, Ref B37.1"

      taken from the 'Cookbook' section of the BLAST Command Line Applications User Manual



      You may also want to look at some of the previous threads here on SEQanswers, for example,

      Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc




      Best wishes,
      Maria



      • #4
        I am using blastn on Windows 7.

        I tried this:

        $ makeblastdb -in hs_chr -input_type blastdb -dbtype nucl -parse_seqids \
        -mask_data hs_chr_mask.asnb -out hs_chr -title \
        "Human Chromosome, Ref B37.1"

        It didn't work. I need to convert the Silva RDP FASTA database to an NCBI database.

        I don't think I need a title. What is mask_data for?

        I then ran this:


        D:\ME>makeblastdb -in LSURef_111_tax_silva.fasta -dbtype nucl -parse_seqids -out LSURef_111_tax_silva.ncbi.db


        Building a new DB, current time: 04/04/2013 10:41:10
        New DB name: LSURef_111_tax_silva.ncbi.db
        New DB title: LSURef_111_tax_silva.fasta
        Sequence type: Nucleotide
        Keep Linkouts: T
        Keep MBits: T
        Maximum file size: 1000000000B
        Adding sequences from FASTA; added 29306 sequences in 3.68226 seconds.

        It looks successful, but I can't use the output file with blastn.

        If anyone can give me their Dropbox account, I can share my file with them.

        Then you can try it and see that it doesn't work.

        Here is the link where I downloaded the FASTA-format database from Silva RDP

        LSURef_111_tax_silva.fasta.tgz is the file that I want to convert to NCBI format.

        Ben
        Last edited by SDPA_Pet; 04-04-2013, 11:17 AM.



        • #5
          Originally posted by SDPA_Pet

          It looks successful, but I can't use the output file for blastn.


          Ben
          What exactly is happening? Are you getting no results/error messages?



          • #6
            Originally posted by GenoMax
            What exactly is happening? Are you getting no results/error messages?
            I ran "makeblastdb -in LSURef_111_tax_silva.fasta -dbtype nucl -out LSURef_111_tax_silva.ncbi.db"

            I got three output files:

            LSURef_111_tax_silva.ncbi.db.nhr
            LSURef_111_tax_silva.ncbi.db.nin
            LSURef_111_tax_silva.ncbi.db.nsq

            I tried each of them with blastn.

            Run:
            blastn -query test.fna -db LSURef_111_tax_silva.ncbi.db.nhr -out out_test -evalue 1e-6

            I got this:

            BLAST Database error: No alias or index file found for nucleotide database [LSURef_111_tax_silva.ncbi.db.nhr] in search path [D:\ME;;]

            I don't know what happened.

            Ben



            • #7
              You should run this command:
              blastn -query test.fna -db LSURef_111_tax_silva.ncbi.db -out out_test -evalue 1e-6
              Try it and see if it works.
              Regards



              • #8
                Originally posted by SDPA_Pet
                Run:
                blastn -query test.fna -db LSURef_111_tax_silva.ncbi.db.nhr -out out_test -evalue 1e-6

                I got this

                BLAST Database error: No alias or index file found for nucleotide database [LSURef_111_tax_silva.ncbi.db.nhr] in search path [D:\ME;;]

                I don't know what happened.

                Ben
                Don't add ".nhr" to the name of your blastdb in the command; just use the base blastdb name, which in this case would be LSURef_111_tax_silva.ncbi.db.

                BLASTN will find the specific files it needs from the basename.
                Last edited by kmcarr; 04-04-2013, 12:13 PM. Reason: Crossed replies with yzzhang
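
The naming rule described above can be illustrated with a short Python sketch. It assumes only the three file extensions that appear in this thread (.nhr, .nin, .nsq); the helper name `blast_db_basename` is hypothetical and is not part of BLAST.

```python
# Extensions of the three nucleotide database files listed in this thread.
NUCL_DB_EXTENSIONS = (".nhr", ".nin", ".nsq")


def blast_db_basename(name):
    """Strip one trailing database-file extension, if present."""
    for ext in NUCL_DB_EXTENSIONS:
        if name.endswith(ext):
            return name[: -len(ext)]
    # Already a base name (or an unrelated name): return it unchanged.
    return name


print(blast_db_basename("LSURef_111_tax_silva.ncbi.db.nhr"))
print(blast_db_basename("LSURef_111_tax_silva.ncbi.db"))
```

Both calls give the same base name, which is the value to pass to `-db`.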



                • #9
                  Originally posted by yzzhang
                  you should run command
                  blastn -query test.fna -db LSURef_111_tax_silva.ncbi.db -out out_test -evalue 1e-6
                  try it to see if it works
                  Regards
                  The point is that I don't have an output file named "LSURef_111_tax_silva.ncbi.db".

                  I only got three output files:
                  LSURef_111_tax_silva.ncbi.db.nhr
                  LSURef_111_tax_silva.ncbi.db.nin
                  LSURef_111_tax_silva.ncbi.db.nsq

                  It looks like I didn't convert my db correctly in the first step.



                  • #10
                    Originally posted by SDPA_Pet
                    The point is that I don't have this output "LSURef_111_tax_silva.ncbi.db"

                    I only got three output files:
                    LSURef_111_tax_silva.ncbi.db.nhr
                    LSURef_111_tax_silva.ncbi.db.nin
                    LSURef_111_tax_silva.ncbi.db.nsq

                    It looks I didn't convert my db correctly at the first step.
                    The point is you aren't supposed to provide a full filename as the blastdb, only its base file name which in your case is "LSURef_111_tax_silva.ncbi.db". BLAST figures out the rest.

                    Trust us, we know what we're talking about.
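
To make the relationship between the base name and the files on disk concrete, here is a rough Python check. It is a sketch that assumes a small nucleotide database made of exactly the three files seen in this thread; other BLAST versions or larger databases may produce additional or differently named files. The function name `missing_db_files` is hypothetical.

```python
import os
import tempfile

# The three files makeblastdb produced in this thread for a nucleotide database.
EXPECTED_EXTENSIONS = (".nhr", ".nin", ".nsq")


def missing_db_files(base, directory="."):
    """Return the expected database files that are absent for this base name."""
    return [
        base + ext
        for ext in EXPECTED_EXTENSIONS
        if not os.path.isfile(os.path.join(directory, base + ext))
    ]


# Demonstration in a scratch directory using empty stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    base = "LSURef_111_tax_silva.ncbi.db"
    for ext in (".nhr", ".nin"):
        open(os.path.join(tmp, base + ext), "w").close()
    demo_missing = missing_db_files(base, tmp)
    print(demo_missing)
```

An empty list means all three files sit next to each other under the same base name, which is the situation described in this thread.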



                    • #11
                      Originally posted by kmcarr
                      The point is you aren't supposed to provide a full filename as the blastdb, only its base file name which in your case is "LSURef_111_tax_silva.ncbi.db". BLAST figures out the rest.

                      Trust us, we know what we're talking about.
                      Thank you. You are right.

                      BTW, are the parameters that I used enough to create a good database file?

                      makeblastdb -in LSURef_111_tax_silva.fasta -dbtype nucl -out LSURef_111_tax_silva.ncbi.db

                      I just used "-dbtype nucl". Is there anything else I should include?

                      Ben



                      • #12
                        Originally posted by SDPA_Pet

                        BTW, the parameters that I used are enough to create a good database file?

                        Ben
                        If you did get output that makes sense then those parameters were adequate.

                        Did you manage to get the search to work?
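
For anyone who wants to script the two steps from this thread, a minimal Python sketch might look like the following. It uses only the options that appear in the posts above (-in, -dbtype, -out, -query, -db, -evalue); the helper names are hypothetical, and the programs are run only if they are found on PATH.

```python
import shutil
import subprocess


def makeblastdb_args(fasta, out_base):
    """Argument list for building a nucleotide database, as done in this thread."""
    return ["makeblastdb", "-in", fasta, "-dbtype", "nucl", "-out", out_base]


def blastn_args(query, db_base, out_file, evalue="1e-6"):
    """Argument list for a search; db_base is the base name, without .nhr/.nin/.nsq."""
    return ["blastn", "-query", query, "-db", db_base,
            "-out", out_file, "-evalue", evalue]


def run_if_available(args):
    """Run the command only when the program is on PATH; return True if it ran."""
    if shutil.which(args[0]) is None:
        return False
    subprocess.run(args, check=True)
    return True


build = makeblastdb_args("LSURef_111_tax_silva.fasta", "LSURef_111_tax_silva.ncbi.db")
search = blastn_args("test.fna", "LSURef_111_tax_silva.ncbi.db", "out_test")
print(" ".join(build))
print(" ".join(search))
```

Calling `run_if_available(build)` and then `run_if_available(search)` in the directory that holds the input files would perform both steps. Passing the same base name to `-out` and `-db` avoids the file-name confusion discussed earlier in the thread.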
