SEQanswers

Old 03-18-2014, 03:28 AM   #1
cdes79
Junior Member
 
Location: Scotland

Join Date: Sep 2013
Posts: 9
obtaining sequence names after local blast

Dear all,

I am trying to create an annotation for a custom array of a non-model species. I am running the latest version of BLAST+ (2.2.29+) and using blastx. My output does not contain the gene names, and I would like to have those.

I know this topic has been covered in a previous thread (http://seqanswers.com/forums/showthread.php?t=14031) and that it is not possible to obtain gene information using BLAST+.

I am wondering if there are alternative ways by which other people achieve this task (I am not keen on using blast2GO as it runs very slowly). I would appreciate some hints.
Old 03-18-2014, 03:51 AM   #2
rhinoceros
Senior Member
 
Location: sub-surface moon base

Join Date: Apr 2013
Posts: 372

See the manual. You could try, e.g., the stitle (subject title) flag. Alternatively, it shouldn't be very hard to link subject GIs or accessions to other information with Entrez Direct.
__________________
savetherhino.org
Old 03-18-2014, 03:58 AM   #3
cdes79
Junior Member
 
Location: Scotland

Join Date: Sep 2013
Posts: 9

Thanks rhinoceros,

I was looking at the options via blastx -help in the terminal, and for some reason stitle is not among them. I will certainly try it.

Also, thanks for pointing out Entrez Direct; I wasn't aware of it. I am new to BLAST and to bioinformatics in general, so I apologize if the question was very basic. I guess we all have to start somewhere. :-)
Old 03-18-2014, 04:01 AM   #4
rhinoceros
Senior Member
 
Location: sub-surface moon base

Join Date: Apr 2013
Posts: 372

Quote:
Originally Posted by cdes79 View Post
Thanks rhinoceros,
I was looking at the options via blastx -help in the terminal, and for some reason stitle is not among them. I will certainly try it.
It's an option within -outfmt, e.g. -outfmt '6 std stitle' gives you the standard tabular output + stitle as the last column.
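
For anyone who wants to work with that output afterwards, here is a minimal Python sketch of reading a tab-separated file produced with -outfmt '6 std stitle'. The column order comes from the 'std' definition in the blastx help plus stitle as the 13th column. The function name read_hits and the sample line are made up for illustration; the sample is not real BLAST output.

```python
import csv
import io

# Column names for -outfmt '6 std stitle': the 12 'std' fields plus stitle.
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore", "stitle",
]


def read_hits(handle):
    """Yield one dict per hit from tab-separated BLAST output."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines and the '#' comment lines that -outfmt 7 adds.
        if not row or row[0].startswith("#"):
            continue
        yield dict(zip(COLUMNS, row))


# Made-up example line, for illustration only (not real BLAST output).
SAMPLE = (
    "probe_001\tsp|P00001|EXAMPLE\t87.50\t120\t15\t0\t1\t360\t10\t129\t"
    "1e-60\t215\thypothetical example protein [Example species]\n"
)

hits = list(read_hits(io.StringIO(SAMPLE)))
print(hits[0]["qseqid"], "->", hits[0]["stitle"])
```

For a real run you would pass an open file handle instead of the io.StringIO sample. All values are kept as strings here; convert evalue or bitscore with float() if you need to filter on them.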
__________________
savetherhino.org
Old 03-18-2014, 04:12 AM   #5
cdes79
Junior Member
 
Location: Scotland

Join Date: Sep 2013
Posts: 9

Quote:
Originally Posted by rhinoceros View Post
It's an option within -outfmt, e.g. -outfmt '6 std stitle' gives you the standard tabular output + stitle as the last column.
I know, I read the terminal manual, but again it is not there. I have pasted the relevant section below. Anyway, I am running it as we speak and it is running fine. I'll see the output when it comes out.

*** Formatting options
-outfmt <String>
alignment view options:
0 = pairwise,
1 = query-anchored showing identities,
2 = query-anchored no identities,
3 = flat query-anchored, show identities,
4 = flat query-anchored, no identities,
5 = XML Blast output,
6 = tabular,
7 = tabular with comment lines,
8 = Text ASN.1,
9 = Binary ASN.1,
10 = Comma-separated values,
11 = BLAST archive format (ASN.1)

Options 6, 7, and 10 can be additionally configured to produce
a custom format specified by space delimited format specifiers.
The supported format specifiers are:
qseqid means Query Seq-id
qgi means Query GI
qacc means Query accesion
qaccver means Query accesion.version
qlen means Query sequence length
sseqid means Subject Seq-id
sallseqid means All subject Seq-id(s), separated by a ';'
sgi means Subject GI
sallgi means All subject GIs
sacc means Subject accession
saccver means Subject accession.version
sallacc means All subject accessions
slen means Subject sequence length
qstart means Start of alignment in query
qend means End of alignment in query
sstart means Start of alignment in subject
send means End of alignment in subject
qseq means Aligned part of query sequence
sseq means Aligned part of subject sequence
evalue means Expect value
bitscore means Bit score
score means Raw score
length means Alignment length
pident means Percentage of identical matches
nident means Number of identical matches
mismatch means Number of mismatches
positive means Number of positive-scoring matches
gapopen means Number of gap openings
gaps means Total number of gaps
ppos means Percentage of positive-scoring matches
frames means Query and subject frames separated by a '/'
qframe means Query frame
sframe means Subject frame
btop means Blast traceback operations (BTOP)
When not provided, the default value is:
'qseqid sseqid pident length mismatch gapopen qstart qend sstart send
evalue bitscore', which is equivalent to the keyword 'std'
Default = `0'
Old 03-18-2014, 05:11 AM   #6
Birdman
Member
 
Location: Montreal

Join Date: Jan 2014
Posts: 21

Just use the XML output (-outfmt 5) and parse it to obtain gene names.
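
A rough sketch of what that parsing could look like with Python's standard library. It assumes the classic BLAST XML layout, where each Iteration element holds one query and each Hit element carries Hit_accession and Hit_def. The embedded document is a heavily trimmed, made-up imitation of that layout, and hit_titles is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed, made-up document imitating the layout of -outfmt 5 output.
SAMPLE_XML = """
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>probe_001</Iteration_query-def>
      <Iteration_hits>
        <Hit>
          <Hit_id>gnl|BL_ORD_ID|0</Hit_id>
          <Hit_def>hypothetical example protein [Example species]</Hit_def>
          <Hit_accession>EXAMPLE_0001</Hit_accession>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def hit_titles(xml_text):
    """Return (query, accession, title) tuples for every hit in the XML."""
    root = ET.fromstring(xml_text.strip())
    results = []
    for iteration in root.iter("Iteration"):
        # One Iteration per query sequence.
        query = iteration.findtext("Iteration_query-def")
        for hit in iteration.iter("Hit"):
            results.append(
                (query, hit.findtext("Hit_accession"), hit.findtext("Hit_def"))
            )
    return results


for query, accession, title in hit_titles(SAMPLE_XML):
    print(query, accession, title, sep="\t")
```

For a large result file, ET.parse on the file (or ET.iterparse to stream it) would be a better fit than loading the whole text into a string.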
Old 03-18-2014, 05:57 AM   #7
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

@cdes79: You need a more recent version of blast. Mine is blastx: 2.2.29+
Package: blast 2.2.29, build Dec 10 2013 14:41:40 and has 'stitle' in it.

@Birdman: IMHO parsing XML is not that easy. Oh, you and I can do it but the casual user will have more problems. Did you see the recent note from NCBI saying that they want input on how to make their XML more standard/parsable?
Old 03-18-2014, 09:24 AM   #8
cdes79
Junior Member
 
Location: Scotland

Join Date: Sep 2013
Posts: 9

Quote:
Originally Posted by westerman View Post
@cdes79: You need a more recent version of blast. Mine is blastx: 2.2.29+
Package: blast 2.2.29, build Dec 10 2013 14:41:40 and has 'stitle' in it.

@Birdman: IMHO parsing XML is not that easy. Oh, you and I can do it but the casual user will have more problems. Did you see the recent note from NCBI saying that they want input on how to make their XML more standard/parsable?
Thanks, westerman, for the support. I think it is easy to forget how daunting this field can be, particularly for people who are not dedicated bioinformaticians but biologists trying to use new tools. Anyway, back to the problem: I think I have figured out what it might be. I said before that I could not find "stitle", and in fact when I used it, it did not add the sequence info to the output.

Then I noticed that although I installed the latest version, 2.2.29+, running blastx -h reports 2.2.25+ in the description. I had a previous BLAST version installed, which is probably why I am experiencing the problem.

I am now running the command with the full path to the right blastx to see what happens (it is running now). Do you know how I can fix this problem and make sure blastx runs from the right folder? I assume I should change the PATH? How so?

Thanks, Christian
Old 03-18-2014, 10:22 AM   #9
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

Quote:
Originally Posted by cdes79 View Post
I am now running the command with the full path to the right blastx to see what happens (it is running now). Do you know how I can fix this problem and make sure blastx runs from the right folder? I assume I should change the PATH? How so?
Changing the PATH is a good idea. I am sure there are many tutorials on how to do so out there. In general, from Bash (where /new/path stands for the directory that holds the new blastx binary),

export PATH=/new/path:$PATH
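
To confirm that the change took effect, one option is a small Python check of which blastx the PATH resolves to and what version it reports. This is only a sketch: it assumes blastx accepts -version and prints a banner like "blastx: 2.2.29+", and it takes 2.2.28 as the minimum for stitle, as stated elsewhere in this thread. The helper names are made up.

```python
import re
import shutil
import subprocess

# Minimum release with the stitle field, as stated in this thread.
MIN_VERSION = (2, 2, 28)


def parse_version(banner):
    """Extract (major, minor, patch) from text such as 'blastx: 2.2.29+'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError(f"no version number found in {banner!r}")
    return tuple(int(part) for part in match.groups())


def supports_stitle(version):
    """True if the version tuple is at least MIN_VERSION."""
    return version >= MIN_VERSION


def describe_blastx():
    """Report which blastx is first on PATH and whether it is new enough."""
    path = shutil.which("blastx")
    if path is None:
        return "blastx was not found on PATH"
    try:
        result = subprocess.run(
            [path, "-version"], capture_output=True, text=True, check=False
        )
        version = parse_version(result.stdout)
    except (OSError, ValueError) as error:
        return f"{path}: could not determine version ({error})"
    verdict = "new enough for stitle" if supports_stitle(version) else "too old for stitle"
    return f"{path}: version {'.'.join(map(str, version))}, {verdict}"


print(describe_blastx())
```

If the printed path still points at the old installation, the new directory is either missing from PATH or listed after the old one.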
Old 03-18-2014, 10:55 PM   #10
rhinoceros
Senior Member
 
Location: sub-surface moon base

Join Date: Apr 2013
Posts: 372

Quote:
Originally Posted by cdes79 View Post
I am now running the command with the full path to the right blastx to see what happens (it is running now). Do you know how I can fix this problem and make sure blastx runs from the right folder? I assume I should change the PATH? How so?
At the command line:

Code:
which blastx
Go ahead and delete the whole directory it reports (unless it's something like /usr/bin or /usr/local/bin, in which case just delete the BLAST binaries). Then change the paths in your .bashrc (or the equivalent for your shell and OS).
__________________
savetherhino.org

Last edited by rhinoceros; 03-19-2014 at 05:59 AM.
Old 03-19-2014, 05:28 AM   #11
cdes79
Junior Member
 
Location: Scotland

Join Date: Sep 2013
Posts: 9

Thanks all of you for the fantastic help! Everything worked fine!!!
Old 03-20-2014, 07:16 AM   #12
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,541

You need BLAST+ 2.2.28 or later for the stitle field and related new columns, see:
http://blastedbio.blogspot.co.uk/201...criptions.html
Tags
annotation, blast+
