  • BioTalk
    Member
    • Feb 2010
    • 43

    BLAST Help

    Hello all,

    I am trying to compare two FASTA files, each containing many sequences, using BLAST. For some reason BLAST only considers the first sequence from each file. I am not sure which parameter I should use (for standalone BLAST) to compare all the sequences in both files.

    Please let me know if anyone knows how to do it.

    Thank you!
  • maubp
    Peter (Biopython etc)
    • Jul 2009
    • 1544

    #2
    It should "just work".

    Which flavour and version of standalone BLAST do you have (the NCBI "legacy" version in C, the new NCBI BLAST+ written in C++, or one of the 3rd party BLAST implementations)?

    What BLAST command line are you using?

    Have you checked that your FASTA files use the right newline characters for your OS? The commands unix2dos and dos2unix can help here. Also try this to count the entries:

    grep -c "^>" your_file.fasta
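
As a minimal sketch, the count can be wrapped in a small shell function so it is easy to run on both the query file and the database file. The name `count_fasta_records` is a hypothetical helper, not part of BLAST, and the file names in the usage comments are placeholders.

```shell
# count_fasta_records: hypothetical helper, not part of BLAST.
# Prints the number of FASTA header lines (lines starting with ">") in file $1.
count_fasta_records() {
    # grep -c prints the match count but exits non-zero when the count is 0,
    # so "|| true" stops that case from looking like a failure.
    grep -c '^>' "$1" || true
}

# Example (placeholder file names):
#   count_fasta_records inputfile.fa
#   count_fasta_records knownsequences.fa
```

If this prints 1 for a file that should hold many sequences, the file itself is a more likely culprit than BLAST.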

    • BioTalk

      #3
      Thank you for your prompt response!

      I am using Blast for Linux 64 bit downloaded from:


      The command line I am using is: blastall -p blastn -F F -W 16 -i <inputfile.fa> -d <knownsequences.fa> -o <outputfile.fa>

      • BioTalk

        #4
        I am not sure about this question: Have you checked your FASTA files are using the right new line characters for your OS?

        • maubp

          #5
          Originally posted by BioTalk:
          I am not sure about this question: Have you checked your FASTA files are using the right new line characters for your OS?
          Unix/Linux/Mac OS X etc. all use an LF character for a new line, while DOS/Windows uses the two characters CR LF. This incompatibility is a common problem when dealing with files created on another OS.
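
A minimal sketch of how the line-ending difference could be detected and removed from the shell. `has_cr` and `strip_cr` are hypothetical helper names, and the file names in the usage comment are placeholders; `strip_cr` is only meant as a fallback for systems where dos2unix is not installed.

```shell
# has_cr and strip_cr are hypothetical helpers, not standard commands.

# Succeeds (exit status 0) if any line of file $1 ends in a CR character,
# which is how a DOS/Windows CR LF line ending shows up on Unix.
has_cr() {
    cr=$(printf '\r')          # a literal carriage return
    grep -q "${cr}\$" "$1"
}

# Writes file $1 to standard output with every CR character removed.
strip_cr() {
    tr -d '\r' < "$1"
}

# Example (placeholder file names):
#   if has_cr inputfile.fa; then strip_cr inputfile.fa > inputfile.unix.fa; fi
```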

          • maubp

            #6
            Originally posted by BioTalk:
            Thank you for your prompt response!

            I am using Blast for Linux 64 bit downloaded from:


            The command line I am using is: blastall -p blastn -F F -W 16 -i <inputfile.fa> -d <knownsequences.fa> -o <outputfile.fa>
            If you are using "blastall" then you are using the old legacy BLAST executables based on the NCBI C Toolkit.

            If you are using the new BLAST+ suite written in C++ then the command here would be "blastn" instead (and all the options have been renamed).
            Last edited by maubp; 07-26-2010, 07:28 AM. Reason: blastn vs blastp typo
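
For the command line quoted in this post, a rough mapping of the legacy options to their BLAST+ names can be sketched as follows. The helper `legacy_to_blastplus` is hypothetical and only prints a command line; it covers just the options used in this thread. The `-task blastn` part is an assumption about intent: BLAST+ `blastn` runs megablast by default, so the task is set explicitly to stay closer to `blastall -p blastn`.

```shell
# legacy_to_blastplus: hypothetical helper that only PRINTS a command line.
# Mapping used (legacy blastall -> BLAST+ blastn):
#   -p blastn    ->  blastn -task blastn
#   -F F         ->  -dust no        (low-complexity filtering off)
#   -W 16        ->  -word_size 16
#   -i, -d, -o   ->  -query, -db, -out
# Arguments: $1 = query FASTA, $2 = database name, $3 = output file.
legacy_to_blastplus() {
    printf 'blastn -task blastn -dust no -word_size 16 -query %s -db %s -out %s\n' \
        "$1" "$2" "$3"
}

# Example (placeholder file names):
#   legacy_to_blastplus inputfile.fa knownsequences.fa outputfile.txt
```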

            • BioTalk

              #7
              Originally posted by maubp:
              Unix/Linux/Mac OS X etc. all use an LF character for a new line, while DOS/Windows uses the two characters CR LF. This incompatibility is a common problem when dealing with files created on another OS.
              http://en.wikipedia.org/wiki/Newline
              Oh okay, I don't think this is a problem with the FASTA files, as they were all created on Linux and are being used on Linux. Also, I opened both files and they look like normal FASTA files.

              Please let me know if you know how I should deal with the "blank output file" problem!

              • maubp

                #8
                Originally posted by BioTalk:
                Please let me know if you know how I should deal with the "blank output file" problem!
                You never mentioned a "blank output file" until now. I thought you said you were having trouble getting BLAST to use multiple input query sequences.


                Does blastall give any error messages?

                Did you remember to create a BLAST database first using formatdb?
                Last edited by maubp; 07-26-2010, 07:21 AM.

                • BioTalk

                  #9
                  I am sorry for the confusion! It gives an almost blank output file containing the following details instead of alignment results.

                  BLASTN 2.2.21 [Jun-14-2009]


                  Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
                  Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
                  "Gapped BLAST and PSI-BLAST: a new generation of protein database search
                  programs", Nucleic Acids Res. 25:3389-3402.

                  Query= Cluster_573384 1
                  (23 letters)

                  So, I thought BLAST is not using multiple input query sequences.

                  • BioTalk

                    #10
                    Originally posted by maubp:
                    If you are using "blastall" then you are using the old legacy BLAST executables based on the NCBI C Toolkit.

                    If you are using the new BLAST+ suite written in C++ then the command here would be "blastn" instead (and all the options have been renamed).
                    I tried the blastn command as you suggested, and got an indexing error.
                    @biocomp:~/Desktop/Blast/bin$ ./blastn -word_size 16 -query <inputseq.fa> -db <sequencetocompare.fa> -out <outputfile.fa>
                    BLAST Database error: No alias or index file found for nucleotide database [/home/Desktop/sequencetocompare.fa] in search path [/home/Desktop/Blast/bin::]
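
That error most likely means BLAST+ was pointed at a FASTA file for which no database has been built yet. A minimal sketch of a pre-flight check, assuming the database name is the FASTA path itself and that a nucleotide database has either a `.nin` index or a `.nal` alias file next to it. `suggest_makeblastdb` is a hypothetical helper that only prints a message; `makeblastdb -in <file> -dbtype nucl` is the BLAST+ counterpart of `formatdb -p F`.

```shell
# suggest_makeblastdb: hypothetical helper, not part of BLAST+.
# $1 = database name as it would be passed to "blastn -db".
# Looks for the nucleotide index (.nin) or alias (.nal) file that the
# "No alias or index file found" error is complaining about.
suggest_makeblastdb() {
    if [ -e "$1.nin" ] || [ -e "$1.nal" ]; then
        echo "index found for $1"
    else
        echo "no index for $1; try: makeblastdb -in $1 -dbtype nucl"
    fi
}

# Example (placeholder file name):
#   suggest_makeblastdb sequencetocompare.fa
```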

                    • maubp

                      #11
                      You should have one line starting "Query=" for each query sequence.

                      If that is a full file, it looks like BLAST is crashing or failing to finish.

                      If I recall correctly, the next output would have been information about the BLAST database - are you sure that it is set up correctly using formatdb? For example, can you do single queries against this database?
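
The "one Query= line per query sequence" rule can be checked mechanically. A minimal sketch, assuming the default text report format shown earlier in the thread; `compare_query_counts` is a hypothetical helper and both arguments are placeholders.

```shell
# compare_query_counts: hypothetical helper, not part of BLAST.
# $1 = query FASTA file, $2 = BLAST text report.
# Compares the number of FASTA headers with the number of "Query=" lines.
compare_query_counts() {
    # "|| true" because grep -c exits non-zero when it counts 0 matches.
    expected=$(grep -c '^>' "$1" || true)
    reported=$(grep -c '^Query=' "$2" || true)
    echo "$reported of $expected queries reported"
}

# Example (placeholder file names):
#   compare_query_counts inputfile.fa outputfile.txt
```

A report that stops after the first "Query=" line, as in the output posted above, would show up here as 1 of N.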

                      • rglover
                        rg
                        • Dec 2008
                        • 51

                        #12
                        It could be that when you've formatted the blast database you didn't set it for nucleotide sequences - formatdb defaults to protein if it finds no command to specify nucleotide.
                        Try "formatdb -i <yourfasta.fasta> -p F"
                        The -p F turns protein off and nucleotide on
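
One way to see which kind of database formatdb actually built is to look at the index files it left beside the FASTA file: a nucleotide database gets `.nhr`, `.nin` and `.nsq` files, a protein database gets `.phr`, `.pin` and `.psq`. A small sketch of that check, assuming a single-volume database; `db_kind` is a hypothetical helper name.

```shell
# db_kind: hypothetical helper, not part of BLAST.
# $1 = FASTA file that was passed to "formatdb -i".
# Prints "nucleotide", "protein", "both" or "missing", depending on which
# index files (.nin for nucleotide, .pin for protein) sit next to it.
db_kind() {
    kind=missing
    if [ -e "$1.pin" ]; then
        kind=protein
    fi
    if [ -e "$1.nin" ]; then
        if [ "$kind" = protein ]; then
            kind=both
        else
            kind=nucleotide
        fi
    fi
    echo "$kind"
}

# Example (placeholder file name):
#   db_kind knownsequences.fa
```

"both" would suggest formatdb was run once with the protein default and once with -p F, which is harmless for blastn but worth knowing about.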

                        • BioTalk

                          #13
                          I just tried inputting one query sequence with the command: blastall -p blastn -F F -W 16 -i <inputfile.fa> -d <knownsequences.fa> -o <outputfile.fa>

                          and I got almost the same output:

                          BLASTN 2.2.21 [Jun-14-2009]


                          Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
                          Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
                          "Gapped BLAST and PSI-BLAST: a new generation of protein database search
                          programs", Nucleic Acids Res. 25:3389-3402.

                          Query= 1-72342
                          (20 letters)
                          Do you think there is an installation problem, or is the command I am using incorrect?

                          • BioTalk

                            #14
                            Originally posted by rglover:
                            It could be that when you've formatted the blast database you didn't set it for nucleotide sequences - formatdb defaults to protein if it finds no command to specify nucleotide.
                            Try "formatdb -i <yourfasta.fasta> -p F"
                            The -p F turns protein off and nucleotide on
                            I tried "formatdb -i <yourfasta.fasta> -p F" and then my previous command, but it gave me the same output as before.

                            • rglover

                              #15
                              What are the names of the database files that formatdb is creating? Could you list them here? You could also try putting "-o T" on the end of your formatdb command. Other than that I'm not really sure!
