  • hlyates
    Member
    • Mar 2015
    • 29

    Why is this blast script giving me empty tabular output?

    Instead of using an ncbi database, I ran the makeblastdb command on a fasta file I downloaded from ncbi.

    The following script will produce output for me:
    Code:
    #!/bin/bash
    /homes/bioinfo/ncbi-blast-2.2.29+/bin/blastn -db ~/databases/tbd -query ~/databases/even_tribolium.fasta -out output.txt

    However, when I modify the script as follows, I get absolutely no output.
    Code:
    #!/bin/bash
    /homes/bioinfo/ncbi-blast-2.2.29+/bin/blastn -db ~/databases/tbd -query ~/databases/even_tribolium.fasta -out output.txt -evalue 10E-30 -outfmt "6 qseqid sgi sseqid pident length mismatch qstart qend sstart send evalue bitscore staxids"
    Why? I'm really at a loss. I tried removing -evalue 10E-30 (thinking I might be too stringent), but the output is still blank. I would be extremely grateful for any help with this problem.
  • dschika
    Member
    • Mar 2010
    • 56

    #2
    Were there any hits reported in the output of the first command? If there were no hits at all, I guess the tabular output format will not produce a file.


    • GenoMax
      Senior Member
      • Feb 2008
      • 7142

      #3
      Were there any errors reported when you made the database (DB files are non-zero bytes)? Are query/db composed of nucleotides?
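One way to run the "non-zero bytes" check is sketched below. It assumes a single-volume database, for which makeblastdb typically writes .nhr, .nin and .nsq files for nucleotide input (.phr, .pin and .psq for protein). The function name check_db_files is made up for this illustration and is not part of BLAST+.

```shell
#!/bin/bash
# Report whether each expected BLAST database file exists and is non-empty.
# check_db_files is a hypothetical helper, not a BLAST+ command.
# Usage: check_db_files <db_prefix> [nucl|prot]
check_db_files() {
    local prefix=$1
    local dbtype=${2:-nucl}
    local letter ext file
    local status=0
    if [ "$dbtype" = "prot" ]; then letter=p; else letter=n; fi
    for ext in hr in sq; do
        file="${prefix}.${letter}${ext}"
        # -s is true only if the file exists and has a size greater than zero
        if [ -s "$file" ]; then
            echo "OK $file"
        else
            echo "MISSING_OR_EMPTY $file"
            status=1
        fi
    done
    return "$status"
}

# Example call, using the database prefix from this thread:
# check_db_files ~/databases/tbd nucl
```

A non-zero return status means at least one of the three files is missing or empty, which would point at the makeblastdb step rather than at the blastn options.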


      • hlyates
        Member
        • Mar 2015
        • 29

        #4
        Here are the answers to your questions:
        • The first script works with no errors. It produces output.
        • The makeblastdb run did produce some errors, but I don't believe they are causing the problem. I have tried this on a protein database I made as well and see exactly the same behaviour: the first script gives me output, while the tabular output is blank.



        Code:
        Building a new DB, current time: 06/12/2015 13:25:46
        New DB name:   tbd
        New DB title:  /databases/tribolium-est-transcripts.fasta
        Sequence type: Nucleotide
        Keep Linkouts: T
        Keep MBits: T
        Maximum file size: 1000000000B
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 46% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 44% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 48% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 48% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 44% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 48% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 44% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 44% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 54% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 44% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 48% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 46% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 48% ambiguous nucleotides (shouldn't be over 40%)
        Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 46% ambiguous nucleotides (shouldn't be over 40%)
        Adding sequences from FASTA; added 18611 sequences in 2.75514 seconds.
        In summary, the first script works and produces the default output. The second script creates the output.txt file, but it is blank. Weird. No error message is produced when I run the second script.


        • hlyates
          Member
          • Mar 2015
          • 29

          #5
          Originally posted by GenoMax View Post
          Were there any errors reported when you made the database (DB files are non-zero bytes)? Are query/db composed of nucleotides?
          The first script would work and produce output. The second script would produce no errors, but a file called output.txt that is empty. There were errors reported when I made the database. The query is a fasta and the db is nucleotide.

          The output is given below:
          Code:
          Building a new DB, current time: 06/12/2015 22:10:13
          New DB name:   tbd
          New DB title:  /databases/tribolium-est-transcripts.fasta
          Sequence type: Nucleotide
          Deleted existing BLAST database with identical name.
          Keep Linkouts: T
          Keep MBits: T
          Maximum file size: 1000000000B
          Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
          Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
          Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 46% ambiguous nucleotides (shouldn't be over 40%)
          Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 44% ambiguous nucleotides (shouldn't be over 40%)
          Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 48% ambiguous nucleotides (shouldn't be over 40%)
          Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 42% ambiguous nucleotides (shouldn't be over 40%)
          Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 48% ambiguous nucleotides (shouldn't be over 40%)
          Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 44% ambiguous nucleotides (shouldn't be over 40%)
          Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 48% ambiguous nucleotides (shouldn't be over 40%)
          I also tested the same script on a protein db, and I have the exact same problem.


          • dschika
            Member
            • Mar 2010
            • 56

            #6
            Originally posted by hlyates View Post
            Here are the answers to your questions:
            • The first script works with no errors. It produces output.
            I understood that it produces output. But I would like to know if you have any hits at all. If you don't have hits, you won't get any output in the second case.

            How often can you find the phrase "producing significant alignments" in the first, default output file?
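A quick way to count this is sketched below, assuming the report is in the default pairwise format, where queries with hits are followed by a "Sequences producing significant alignments:" line and queries without hits by "***** No hits found *****". The function name summarise_hits is invented for this example.

```shell
#!/bin/bash
# Count queries with and without hits in a default-format BLAST report.
# summarise_hits is a hypothetical helper, not part of BLAST+.
# Usage: summarise_hits <report_file>
summarise_hits() {
    local report=$1
    local with_hits no_hits
    # grep -c prints the number of matching lines; it exits with status 1
    # when that number is 0, so "|| true" keeps the script going under set -e.
    with_hits=$(grep -c "producing significant alignments" "$report" || true)
    no_hits=$(grep -c "No hits found" "$report" || true)
    echo "queries_with_hits=${with_hits} queries_without_hits=${no_hits}"
}

# Example call, using the file name from the first script in this thread:
# summarise_hits output.txt
```

If queries_with_hits is 0, an empty tabular file is the expected result and the -outfmt option is not the problem.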


            • GenoMax
              Senior Member
              • Feb 2008
              • 7142

              #7
              You need to enclose the outfmt options in single quotes (') and not double quotes like you have in your script. Give this a try.

              Code:
              -outfmt '6 qseqid sgi sseqid pident length mismatch qstart qend sstart send evalue bitscore staxids'


              • SylvainL
                Senior Member
                • Feb 2012
                • 180

                #8
                Hi GenoMax, are you sure about the quotes? I'm using double quotes and it works... My blast version is quite old (2.2.25+), but it works with double quotes...

                Actually, do you get any output if you remove staxids? I do not have this option in my blast version (I don't know if it exists in blast 2.2.29)...
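To try that without retyping the whole format string, something like the sketch below could be used. drop_field is a made-up helper name, and output_no_staxids.txt in the commented usage is a hypothetical file name chosen for this example.

```shell
#!/bin/bash
# Remove one field name from a space-separated -outfmt specification.
# drop_field is a hypothetical helper, not part of BLAST+.
# Usage: drop_field "<format string>" <field>
drop_field() {
    local fmt=$1
    local drop=$2
    local word
    local out=""
    # $fmt is left unquoted on purpose so the shell splits it into words;
    # this assumes the format string contains no glob characters.
    for word in $fmt; do
        if [ "$word" != "$drop" ]; then
            out="${out:+$out }$word"
        fi
    done
    echo "$out"
}

fmt=$(drop_field "6 qseqid sgi sseqid pident length mismatch qstart qend sstart send evalue bitscore staxids" staxids)
echo "$fmt"
# prints: 6 qseqid sgi sseqid pident length mismatch qstart qend sstart send evalue bitscore

# Possible use with the command from the first post:
# blastn -db ~/databases/tbd -query ~/databases/even_tribolium.fasta -out output_no_staxids.txt -outfmt "$fmt"
```

If the run without staxids produces rows while the full format string does not, that would suggest the field is the cause rather than the quoting.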
                Last edited by SylvainL; 06-15-2015, 02:20 AM.


                • hlyates
                  Member
                  • Mar 2015
                  • 29

                  #9
                  Originally posted by dschika View Post
                  I understood that it produces output. But I would like to know if you have any hits at all. If you don't have hits, you won't get any output in the second case.

                  How often can you find the phrase "producing significant alignments" in the first, default output file?

                  I'm sorry I misunderstood your question earlier. Thank you for the elaboration. Since I used an extremely small fasta file as the query, I would not expect many alignments, and the results were indeed small: the phrase "producing significant alignments" appears only once in the default output.

                  That being said, I would at least expect to see that hit in the tabular output. I'm still not clear why the second script fails. I hope I have answered the questions you had about my approach. If not, please feel free to ask follow-up questions.


                  • hlyates
                    Member
                    • Mar 2015
                    • 29

                    #10
                    Originally posted by GenoMax View Post
                    You need to enclose the outfmt options in single quotes (') and not double quotes like you have in your script. Give this a try.

                    Code:
                    -outfmt '6 qseqid sgi sseqid pident length mismatch qstart qend sstart send evalue bitscore staxids'
                    Code:
                    $ head lab8output_tabular.txt
                    gi|645685058:1370684-1379063    0       CL3111Contig1   97.78   45      1       5808    5852    1       45      2e-13   78.7    N/A
                    gi|645685058:1370684-1379063    0       CL3747Contig1   89.66   58      5       7069    7126    859     803     1e-11   73.1    N/A
                    gi|645685058:1370684-1379063    0       gi|75720587|gb|DT788599.1|DT788599      94.29   35      2       7069    7103    47      81      4e-06   54.7       N/A
                    This worked. My question is simple. Why? I thought " and ' were equivalent in ncbi blast? Thanks sir.
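As far as bash itself is concerned, the two quoting styles should be equivalent for this particular string: it contains no $, backtick or backslash, so either style hands blastn the whole format specification as a single argument. The sketch below shows what the shell passes in each case; show_args is a throwaway function written for this illustration. It does not explain why the single-quoted run behaved differently here, which the thread leaves open.

```shell
#!/bin/bash
# Print the number of arguments received, then each argument in brackets.
# show_args is a hypothetical stand-in for blastn, used only to inspect
# what the shell passes after quote processing.
show_args() {
    echo "argc=$#"
    local arg
    for arg in "$@"; do
        echo "[$arg]"
    done
}

show_args -outfmt "6 qseqid sgi staxids"   # double quotes: argc=2
show_args -outfmt '6 qseqid sgi staxids'   # single quotes: argc=2
show_args -outfmt 6 qseqid sgi staxids     # no quotes: argc=5, the string is split
```

The first two calls print identical output, so any difference between the two scripts would have to come from something other than how bash treats the quotes.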


                    • SylvainL
                      Senior Member
                      • Feb 2012
                      • 180

                      #11
                      Funny it worked. Good news at least... Maybe it depends on the version?
