  • Problem with mpiBLAST output subject sequence names

    Hi all,

    I'm experiencing an issue when running mpiBLAST 1.6.0 on our cluster. It runs fine; however, the results (from a TBLASTX run) do not show the complete name of the subject sequence in the output. Here's an example of part of the output:

    Code:
    Sequences producing significant alignments:                  (bits) Value  N
    
    13893_/work1/xxmwebb/mpi_blast/databas                             73   1e-14  1 
    
    >13893_/work1/xxmwebb/mpi_blast/databas 
              Length = 9798
    
     Score = 73.0 bits (162), Expect = 1e-14
     Identities = 32/35 (91%), Positives = 33/35 (94%)
     Frame = -3 / -3
    
                                                   
    Query: 105  LACQTLKSGYTESSRGSRVYFLVAFSLFLCTILTF 1
                LACQTLKSGYTESSRGS VYFLVAFSLFLC+IL F
    Sbjct: 8944 LACQTLKSGYTESSRGSSVYFLVAFSLFLCSILAF 8840
    The problem is that the subject title is being truncated (and having the full path included in the subject name isn't helping the issue). Does anyone know how to get around this?

  • #2
    Given no updates since 2012, mpiBLAST is effectively dead. I guess that your best plan would be to rebuild your database using shorter names.

    However, I would look at moving to NCBI BLAST+ instead. It has built-in multi-threading, which works well on multi-core machines. In terms of exploiting a cluster (which is what mpiBLAST was for), splitting your input by query is by far the easiest approach (assuming you are running searches with multiple-sequence FASTA files as input).
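
    As a rough illustration of splitting the input by query, here is a minimal Python sketch. The function name `split_fasta`, the `chunk_NNN.fasta` file naming, and the round-robin distribution are all just assumptions for the example, not part of BLAST+ or mpiBLAST. It also assumes a plain multi-sequence FASTA file where every record starts with a `>` header line.

```python
from pathlib import Path


def split_fasta(fasta_path, out_dir, n_chunks):
    """Distribute FASTA records round-robin into n_chunks files.

    Hypothetical helper for illustration; returns the list of chunk paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = [out_dir / f"chunk_{i:03d}.fasta" for i in range(n_chunks)]
    handles = [p.open("w") for p in paths]
    try:
        record = -1  # index of the current record; -1 until the first header
        with open(fasta_path) as src:
            for line in src:
                if line.startswith(">"):
                    record += 1  # a header line starts a new record
                if record >= 0:
                    # keep all lines of a record in the same chunk
                    handles[record % n_chunks].write(line)
    finally:
        for handle in handles:
            handle.close()
    return paths
```

    Each chunk could then be submitted as its own cluster job, for example a BLAST+ `tblastx` run using `-query` for the chunk, `-db` for the database, and `-num_threads` for the cores on that node, with the per-chunk outputs concatenated afterwards. If there are fewer sequences than chunks, some chunk files will simply be empty.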

    • #3
      mpiBLAST is inserting the path into the subject name for some reason, so switching to a shorter path for the input might be required. Hmm... I don't recall this being an issue before. Oh well, BLAST+ it is then.

      Thanks for the advice, maubp.
