SEQanswers

01-04-2015, 03:44 PM   #1
quokka
Member
 
Location: oz

Join Date: Apr 2010
Posts: 12
Problem with mpiBLAST output subject sequence names

Hi all,

I'm experiencing an issue when running mpiBLAST 1.6.0 on our cluster. It runs fine; however, the results (from a TBLASTX run) do not show the complete name of the subject sequence in the output. Here's an example of part of the output:

Code:
Sequences producing significant alignments:                  (bits) Value  N

13893_/work1/xxmwebb/mpi_blast/databas                             73   1e-14  1 

>13893_/work1/xxmwebb/mpi_blast/databas 
          Length = 9798

 Score = 73.0 bits (162), Expect = 1e-14
 Identities = 32/35 (91%), Positives = 33/35 (94%)
 Frame = -3 / -3

                                               
Query: 105  LACQTLKSGYTESSRGSRVYFLVAFSLFLCTILTF 1
            LACQTLKSGYTESSRGS VYFLVAFSLFLC+IL F
Sbjct: 8944 LACQTLKSGYTESSRGSSVYFLVAFSLFLCSILAF 8840
The problem is that the subject title is being truncated (and having the full path included in the subject name isn't helping the issue). Does anyone know how to get around this?
01-04-2015, 06:22 PM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Given no updates since 2012, mpiBLAST is effectively dead. I guess that your best plan would be to rebuild your database using shorter names.
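
As a rough illustration of "shorter names", the FASTA headers could be replaced with short IDs before the database is rebuilt, keeping a lookup table so the full names can be restored afterwards. This is only a sketch: the function name `shorten_fasta_headers`, the `seq` prefix, and the sample records are all hypothetical, and it is not confirmed that this avoids the path being inserted by mpiBLAST.

```python
def shorten_fasta_headers(lines, prefix="seq"):
    """Replace each FASTA header with a short ID.

    Returns (new_lines, mapping), where mapping goes from the short ID
    back to the original header text (without the leading '>').
    """
    new_lines = []
    mapping = {}
    count = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            count += 1
            short_id = f"{prefix}{count}"
            mapping[short_id] = line[1:].strip()
            new_lines.append(">" + short_id)
        else:
            # Sequence lines are passed through unchanged.
            new_lines.append(line)
    return new_lines, mapping


# Hypothetical sample records, for illustration only.
example = [">13893 some long description", "ACGT", ">13894 another one", "GGCC"]
new_lines, mapping = shorten_fasta_headers(example)
```

The `mapping` dictionary can then be written out as a two-column table and used to translate the short IDs in the BLAST report back to the original names.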

However, I would look at moving to NCBI BLAST+ instead. This has built-in multi-threading, which works well on multi-core machines. In terms of exploiting a cluster (which is what mpiBLAST was for), splitting your input by query is by far the easiest approach (assuming you are running searches with multiple-sequence FASTA files as input).
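
Splitting by query can be sketched in a few lines of Python. This is a minimal example under stated assumptions, not a tested pipeline: the helper names `read_fasta_records` and `split_records` and the sample queries are made up for illustration, and records are dealt out round-robin so each chunk gets a similar number of queries.

```python
def read_fasta_records(lines):
    """Group FASTA lines into records; each record is a list of lines."""
    records = []
    current = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(">") and current:
            records.append(current)
            current = []
        current.append(line)
    if current:
        records.append(current)
    return records


def split_records(records, n_chunks):
    """Distribute records round-robin into at most n_chunks non-empty chunks."""
    chunks = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return [chunk for chunk in chunks if chunk]


# Hypothetical query sequences, for illustration only.
queries = [">q1", "AAAA", ">q2", "CC", "GG", ">q3", "TTTT"]
records = read_fasta_records(queries)
chunks = split_records(records, 2)
```

Each chunk would then be written to its own FASTA file and submitted as a separate cluster job, for example one `tblastx -query chunk.fasta -db ... -num_threads ...` run per chunk, with the per-chunk reports concatenated at the end.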
01-05-2015, 12:02 AM   #3
quokka
Member
 
Location: oz

Join Date: Apr 2010
Posts: 12

mpiBLAST is inserting the path into the subject name for some reason, so switching to a shorter path for the input might be required. Hmm, I don't recall this being an issue before. Oh well, BLAST+ it is then.

Thanks for the advice, maubp.
Tags
mpiblast, subject, title, truncated

All times are GMT -8.