  • BLAST+ strange results

    I am running command line blastn with a local copy of a microbial database. My command looks like

    blastn -query query.txt -db 16SMicrobial -out newres.txt -word_size 11 -evalue 10 -gapextend 2 -gapopen 5 -penalty -3 -reward 1

    This gives me 81 hits, and they all have the same score, such as

    Sequences producing significant alignments: Score (Bits) E Value

    ref|NR_042397.1| Dokdonella fugitiva strain : A3 16S ribosomal R... 28.2 6.3
    ref|NR_041093.1| Streptomyces rubiginosohelvolus strain NBRC 129... 28.2 6.3
    ref|NR_041539.1| Kitasatospora kazusensis strain SK60 16S riboso... 28.2 6.3
    ref|NR_041538.1| Kitasatospora saccharophila strain SK15 16S rib... 28.2 6.3
    ref|NR_044150.1| Streptomyces omiyaensis strain NRRL B-1587 16S ... 28.2 6.3
    ...

    Isn't that strange?

    Further, I repeated the query (same parameters) on the NCBI website, and that gave 162 hits, e.g.

    NC_017551.1 Leptospira interrogans serovar Lai str. IPAV chromosome chromosome 1, complete sequence 398 398 100% 1e-108 100%
    NC_008463.1 Pseudomonas aeruginosa UCBPP-PA14 chromosome, complete genome 398 398 100% 1e-108 100%
    NC_002516.2 Pseudomonas aeruginosa PAO1 chromosome, complete genome 398 398 100% 1e-108 100%

    I tried to compare the two, e.g. I looked for "Kitasatospora" from the former in the latter set of results, and could not find any matches. What am I doing wrong?

  • #2
    Originally posted by nupurgupta
    I am running command line blastn with a local copy of a microbial database. My command looks like

    blastn -query query.txt -db 16SMicrobial -out newres.txt -word_size 11 -evalue 10 -gapextend 2 -gapopen 5 -penalty -3 -reward 1

    This gives me 81 hits, and they all have the same score, such as

    Sequences producing significant alignments: Score (Bits) E Value

    ref|NR_042397.1| Dokdonella fugitiva strain : A3 16S ribosomal R... 28.2 6.3
    ref|NR_041093.1| Streptomyces rubiginosohelvolus strain NBRC 129... 28.2 6.3
    ref|NR_041539.1| Kitasatospora kazusensis strain SK60 16S riboso... 28.2 6.3
    ref|NR_041538.1| Kitasatospora saccharophila strain SK15 16S rib... 28.2 6.3
    ref|NR_044150.1| Streptomyces omiyaensis strain NRRL B-1587 16S ... 28.2 6.3
    ...

    Isn't that strange?
    Not to me. Those scores are very poor and probably the worst that can be reported by BLAST for your database. So, although they are *hits* I doubt they are meaningful.
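To put rough numbers on "very poor", here is a minimal sketch assuming the standard bit-score relation E = N * 2^(-S'), where N is the effective search space and S' is the bit score. The function names are hypothetical, and the Karlin-Altschul parameters (lambda = 1.374, K = 0.711) are an assumption: they are the values blastn typically prints in the report footer for reward 1 / penalty -3, so check them against your own output.

```python
import math


def evalue_from_bits(bit_score: float, search_space: float) -> float:
    """Expected number of chance hits: E = N * 2^(-S')."""
    return search_space * 2.0 ** (-bit_score)


def search_space_from_hit(bit_score: float, evalue: float) -> float:
    """Invert the relation to recover the effective search space N."""
    return evalue * 2.0 ** bit_score


def raw_score_from_bits(bit_score: float, lam: float, k: float) -> float:
    """Invert S' = (lambda * S - ln K) / ln 2 to get the raw score S."""
    return (bit_score * math.log(2) + math.log(k)) / lam


# Values taken from the hit list quoted above.
BITS, EVALUE = 28.2, 6.3

# ASSUMED parameters for reward 1 / penalty -3; verify in your report footer.
LAMBDA, K = 1.374, 0.711

space = search_space_from_hit(BITS, EVALUE)
raw = raw_score_from_bits(BITS, LAMBDA, K)

print(f"effective search space ~ {space:.3g}")
print(f"raw score ~ {raw:.1f}")
```

Under these assumptions the raw score comes out close to 14, which with a reward of 1 per matching base corresponds to roughly 14 identical bases. Because E depends only on the bit score and the search space, every hit with the same bit score in the same search necessarily shows the same E-value.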

    Further, I repeated the query (same parameters) on the NCBI website, and that gave 162 hits, e.g.

    NC_017551.1 Leptospira interrogans serovar Lai str. IPAV chromosome chromosome 1, complete sequence 398 398 100% 1e-108 100%
    NC_008463.1 Pseudomonas aeruginosa UCBPP-PA14 chromosome, complete genome 398 398 100% 1e-108 100%
    NC_002516.2 Pseudomonas aeruginosa PAO1 chromosome, complete genome 398 398 100% 1e-108 100%

    I tried to compare the two, e.g. I looked for "Kitasatospora" from the former in the latter set of results, and could not find any matches. What am I doing wrong?
    I suspect you're searching against different databases, as the NCBI hits are complete genomes and chromosomes, not 16S records like your local hits.
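One quick way to check whether two result sets came from the same kind of database is to tally the RefSeq accession prefixes. The sketch below is illustrative only: `accession_prefixes` is a hypothetical helper, and the sample lines are abbreviated from the hits quoted in this thread. In RefSeq, `NR_` accessions are non-coding RNA records (such as 16S rRNA) and `NC_` accessions are complete genomic molecules.

```python
import re
from collections import Counter

# RefSeq accession prefixes seen in this thread.
PREFIX_MEANING = {
    "NR": "RefSeq non-coding RNA record (e.g. 16S rRNA)",
    "NC": "RefSeq complete genomic molecule (chromosome/genome)",
}

# Two-letter prefix, underscore, digits, version: e.g. NR_042397.1
ACCESSION = re.compile(r"\b([A-Z]{2})_(\d+\.\d+)")


def accession_prefixes(report_lines):
    """Count the RefSeq prefix of the first accession found on each line."""
    counts = Counter()
    for line in report_lines:
        match = ACCESSION.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Abbreviated from the local (command line) results.
local_hits = [
    "ref|NR_042397.1| Dokdonella fugitiva strain : A3 16S ribosomal R... 28.2 6.3",
    "ref|NR_041539.1| Kitasatospora kazusensis strain SK60 16S riboso... 28.2 6.3",
    "ref|NR_044150.1| Streptomyces omiyaensis strain NRRL B-1587 16S ... 28.2 6.3",
]

# Abbreviated from the NCBI website results.
web_hits = [
    "NC_017551.1 Leptospira interrogans serovar Lai str. IPAV chromosome 1 398 1e-108",
    "NC_008463.1 Pseudomonas aeruginosa UCBPP-PA14 chromosome, complete genome 398 1e-108",
    "NC_002516.2 Pseudomonas aeruginosa PAO1 chromosome, complete genome 398 1e-108",
]

for name, hits in (("local", local_hits), ("web", web_hits)):
    for prefix, n in accession_prefixes(hits).items():
        print(f"{name}: {n} x {prefix}_ = {PREFIX_MEANING.get(prefix, 'unknown')}")
```

If the two searches share no prefix at all, as here, they almost certainly ran against different sequence collections, so the hit lists are not expected to overlap.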


    • #3
      Thanks for the perspective. But I downloaded the microbial database from NCBI: ftp://ftp.ncbi.nlm.nih.gov/blast/db/16SMicrobial.tar.gz
      I assumed that was the entire microbial database that the online NCBI BLAST was also using. Maybe not; let me see if I can find a FASTA file for all bacteria/microbes that I can use to build a database.
      Thanks very much. This is a huge help.


      • #4
        It would help if you stated the name of the database you used on NCBI. Then a judgement can be made as to whether you're comparing like with like.


        • #5
          I was using NCBI BLAST on microbial data.


          • #6
            I am not surprised at all. My guess is that your query has a very short chance match to a conserved region of the 16S ribosomal sequence. Have you tried looking at the alignments? I predict that every alignment will be the same.
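The prediction above can be checked without reading 81 alignments by eye. A minimal sketch, assuming the search is rerun with BLAST+ tabular output (`-outfmt 6`, whose default columns are listed in the code) and that the hits are then grouped by the query coordinates they cover. The helper name and the sample rows, including all coordinates, are hypothetical and only illustrate the shape of the data.

```python
from collections import defaultdict

# Default column order of BLAST+ tabular output (-outfmt 6).
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def hits_by_query_range(tabular_text):
    """Group subject IDs by the (qstart, qend) span of the query they match."""
    groups = defaultdict(list)
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(OUTFMT6_COLUMNS, line.split("\t")))
        groups[(int(row["qstart"]), int(row["qend"]))].append(row["sseqid"])
    return dict(groups)


# HYPOTHETICAL rows: accessions are from the thread, coordinates are invented.
sample_rows = [
    ["query1", "NR_042397.1", "100.000", "14", "0", "0", "120", "133", "501", "514", "6.3", "28.2"],
    ["query1", "NR_041093.1", "100.000", "14", "0", "0", "120", "133", "498", "511", "6.3", "28.2"],
    ["query1", "NR_041539.1", "100.000", "14", "0", "0", "120", "133", "499", "512", "6.3", "28.2"],
]
sample_text = "\n".join("\t".join(row) for row in sample_rows)

groups = hits_by_query_range(sample_text)
for (qstart, qend), subjects in groups.items():
    print(f"query {qstart}-{qend}: {len(subjects)} hits")
```

If nearly all hits fall into a single query range, as in this invented sample, the same short stretch of the query is matching a conserved region in every subject, which is what identical scores across unrelated genera would suggest.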


            • #7
              Yes, you are right! Sorry, I am a newbie to this. I have found a bacterial database:

              ftp://ftp.ncbi.nih.gov/gene/DATA/GEN...haea_Bacteria/

              Am going to get it now.
