SEQanswers

Old 07-26-2010, 08:37 AM   #21
BioTalk
Member
 
Location: Kansas

Join Date: Feb 2010
Posts: 43
Default

Thank you! I will certainly have a look at the BLAST+ manual.
Old 07-27-2010, 07:17 AM   #22
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104
Default

Also, it looks like your query sequences are very short (20 bp). You will probably have to take this into account via non-default command-line parameters.
Old 07-27-2010, 07:29 AM   #23
BioTalk
Member
 
Location: Kansas

Join Date: Feb 2010
Posts: 43
Default

Quote:
Originally Posted by westerman View Post
Also, it looks like your query sequences are very short (20 bp). You will probably have to take this into account via non-default command-line parameters.
Yes, my query sequences are shorter than 20 bp. Which non-default options do I need to use?
Old 07-27-2010, 07:51 AM   #24
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104
Default

For short sequences with BLAST+, the 'blastn-short' task is preferable to the defaults. It is not a separate program: run 'blastn' with the command-line option '-task blastn-short'. Plain 'blastn' defaults to '-task megablast', whose word size of 28 is longer than a 20-bp query, so it is a poor fit here.

There may be other options that I am unaware of, since I do not do many short-sequence alignments. The most important point is simply to be aware that BLAST is generally used to align longer sequences and that at 20 bp you are getting close to the word sizes that BLAST uses. BLAST, like many tools, is not something to use without some thought.
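
As a rough illustration of the '-task blastn-short' suggestion, here is a minimal Python sketch that only assembles the command line. The function name `short_query_command` and the file names are placeholders invented for this example, not anything from the thread; the database name RNA.fa is borrowed from the output posted later.

```python
def short_query_command(query, db, out):
    """Build a blastn argument list for short (roughly 20 bp) queries.

    Hypothetical helper for illustration; it does not run BLAST itself.
    """
    return [
        "blastn",
        "-task", "blastn-short",  # short-query preset instead of the default task
        "-query", query,
        "-db", db,
        "-out", out,
    ]


# Placeholder file names; substitute your own query file and database.
cmd = short_query_command("queries.fa", "RNA.fa", "hits.txt")
print(" ".join(cmd))
```

Once BLAST+ is installed and on the PATH, the list can be handed to `subprocess.run(cmd)`. Keeping the arguments as a list avoids shell-quoting problems with paths that contain spaces.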
Old 07-27-2010, 12:02 PM   #25
robs
Senior Member
 
Location: San Diego, CA

Join Date: May 2010
Posts: 116
Default

If you expect errors in your sequences or want to look for more distant relationships, you might want to lower the seed (word) length (default of 11 for the 'blastn' task; try 6-8; parameter -word_size in BLAST+, -W in legacy blastall). BLAST(+) also filters regions of low complexity, and your short sequences might be filtered out before any alignment. You can turn off the filtering and see whether it makes any difference (-dust no for blastn in BLAST+, -F F in legacy blastall).
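
To make the two knobs above concrete: in BLAST+ the word size is set with -word_size and low-complexity (DUST) filtering for blastn with -dust, whereas -W and -F were the legacy blastall spellings. The sketch below only builds the option list; `sensitivity_options` and the file names are hypothetical and used purely for illustration.

```python
def sensitivity_options(word_size=7, dust=False):
    """Return blastn options for a smaller seed and optional DUST filtering.

    Hypothetical helper; blastn itself rejects word sizes below 4.
    """
    if word_size < 4:
        raise ValueError("blastn requires -word_size of at least 4")
    return ["-word_size", str(word_size), "-dust", "yes" if dust else "no"]


# Placeholder base command; the query and database names are assumptions.
base = ["blastn", "-task", "blastn-short", "-query", "queries.fa", "-db", "RNA.fa"]
cmd = base + sensitivity_options(word_size=6)
print(" ".join(cmd))
```

Comparing the hit counts with and without `-dust no` on the same query file is a quick way to see whether filtering was hiding any short matches.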
Old 07-28-2010, 07:24 AM   #26
BioTalk
Member
 
Location: Kansas

Join Date: Feb 2010
Posts: 43
Default

Does anyone know how to get a BLAST output file that contains only the details of the aligned regions?

I am comparing two files that each contain many FASTA sequences, and I am getting a huge file with both matched and unmatched regions.
Old 07-28-2010, 07:29 AM   #27
rglover
rg
 
Location: uk

Join Date: Dec 2008
Posts: 51
Default

If you use these options you'll only get the alignments:
-num_descriptions 0 -num_alignments <however-many-you-want>

You'll still get an output for the sequences where no matches have been found though. You could also try using BioPerl to process the Blast results.
Old 07-28-2010, 07:45 AM   #28
BioTalk
Member
 
Location: Kansas

Join Date: Feb 2010
Posts: 43
Default

Quote:
Originally Posted by rglover View Post
If you use these options you'll only get the alignments:
-num_descriptions 0 -num_alignments <however-many-you-want>

You'll still get an output for the sequences where no matches have been found though. You could also try using BioPerl to process the Blast results.
I tried -num_descriptions 0 -num_alignments 1 -outfmt 0, but I am still getting all the matched and unmatched regions in the same file.
Old 07-28-2010, 11:55 AM   #29
rglover
rg
 
Location: uk

Join Date: Dec 2008
Posts: 51
Default

Just to clarify - you're getting the one alignment that you want, but you're also getting the "No hits found" ones too?
If that's the case, you could use BioPerl to go through the file and then choose to only print out the ones that have hits to a new file.
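
The same filtering idea can be sketched without BioPerl. The Python below is a minimal, assumption-laden version: it treats every line starting with "Query= " as the start of a per-query block (as in the default -outfmt 0 report) and drops blocks containing the "No hits found" banner. The function name `keep_queries_with_hits` is made up for this example.

```python
NO_HITS = "***** No hits found *****"


def keep_queries_with_hits(lines):
    """Return report lines with the 'No hits found' query blocks removed.

    Hypothetical helper: everything before the first 'Query= ' line is
    kept as the header; each later block runs until the next 'Query= '.
    """
    header, blocks, current = [], [], None
    for line in lines:
        if line.startswith("Query= "):
            current = [line]          # start a new per-query block
            blocks.append(current)
        elif current is None:
            header.append(line)       # still in the report header
        else:
            current.append(line)
    kept = [b for b in blocks if not any(NO_HITS in text for text in b)]
    return header + [text for block in kept for text in block]


# Tiny sample modelled on the report layout of the default text format.
sample = [
    "BLASTN 2.2.23+\n",
    "Query= 1-72342\n", "Length=20\n", "***** No hits found *****\n",
    "Query= 3-46574\n", "Length=21\n",
    ">lcl|zma-miR159k MIMAT0013980 Zea mays miR159k\n",
]
filtered = keep_queries_with_hits(sample)
print("".join(filtered), end="")
```

One caveat of this simple approach: any summary text that follows the last query is treated as part of that query's block, so it is dropped if the last query had no hits.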
Old 07-30-2010, 10:44 AM   #30
BioTalk
Member
 
Location: Kansas

Join Date: Feb 2010
Posts: 43
Default

Quote:
Originally Posted by rglover View Post
Just to clarify - you're getting the one alignment that you want, but you're also getting the "No hits found" ones too?
If that's the case, you could use BioPerl to go through the file and then choose to only print out the ones that have hits to a new file.
Yes, that's correct. But the output file has an irregular layout, which makes it difficult for me to extract only the aligned regions. Below is an example of the file.

Please let me know if anyone knows how to deal with this. Thank you!

BLASTN 2.2.23+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.



Database: Desktop/RNA.fa
15,632 sequences; 339,921 total letters



Query= 1-72342
Length=20


***** No hits found *****



Lambda K H
0.634 0.408 0.912

Gapped
Lambda K H
0.625 0.410 0.780

Effective search space used: 956935


Query= 2-55421
Length=19


***** No hits found *****



Lambda K H
0.634 0.408 0.912

Gapped
Lambda K H
0.625 0.410 0.780

Effective search space used: 1066359


Query= 3-46574
Length=21
Score E
Sequences producing significant alignments: (Bits) Value



>lcl|zma-miR159k MIMAT0013980 Zea mays miR159k
Length=21

Score = 39.2 bits (42), Expect = 1e-06
Identities = 21/21 (100%), Gaps = 0/21 (0%)
Strand=Plus/Plus

Query  1  TTTGGATTGAAGGGAGCTCTG  21
          |||||||||||||||||||||
Sbjct  1  TTTGGATTGAAGGGAGCTCTG  21


>lcl|
MIMAT0013979 Zea mays miR159j
Length=21

Score = 39.2 bits (42), Expect = 1e-06
Identities = 21/21 (100%), Gaps = 0/21 (0%)
Strand=Plus/Plus

Query  1  TTTGGATTGAAGGGAGCTCTG  21
          |||||||||||||||||||||
Sbjct  1  TTTGGATTGAAGGGAGCTCTG  21


>lcl|zma-miR159f MIMAT0013975 Zea mays miR159f
Length=21

Score = 39.2 bits (42), Expect = 1e-06
Identities = 21/21 (100%), Gaps = 0/21 (0%)
Strand=Plus/Plus

Query  1  TTTGGATTGAAGGGAGCTCTG  21
          |||||||||||||||||||||
Sbjct  1  TTTGGATTGAAGGGAGCTCTG  21


>lcl|tae-miR159b MIMAT0005344 Triticum aestivum miR159b
Length=21

Score = 39.2 bits (42), Expect = 1e-06
Identities = 21/21 (100%), Gaps = 0/21 (0%)
Strand=Plus/Plus

Query  1  TTTGGATTGAAGGGAGCTCTG  21
          |||||||||||||||||||||
Sbjct  1  TTTGGATTGAAGGGAGCTCTG  21
Old 07-30-2010, 01:39 PM   #31
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,542
Default

Quote:
Originally Posted by BioTalk View Post
Yes, that's correct. But the output file has an irregular layout, which makes it difficult for me to extract only the aligned regions. Below is an example of the file.

Please let me know if anyone knows how to deal with this. Thank you!
BioPerl has a parser for this (the default plain-text output from BLAST). You can also tell BLAST+ to write simple tabular data (-outfmt 6) or XML (-outfmt 5). Lots of options.
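
For the tabular route, BLAST+ writes one tab-separated line per hit when run with -outfmt 6, so queries without hits simply do not appear. A minimal Python sketch for reading that format follows; the column list is the standard default for -outfmt 6, while `read_tabular_hits` is a hypothetical name and the sample line is reconstructed by hand from the alignment shown in the thread, not real BLAST output.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def read_tabular_hits(handle):
    """Yield one dict per hit line, skipping blanks and '#' comment lines."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # -outfmt 7 adds '#' comment lines; ignore them
        yield dict(zip(OUTFMT6_COLUMNS, row))


# Illustrative line only, hand-built from the miR159k alignment above.
sample = "3-46574\tlcl|zma-miR159k\t100.00\t21\t0\t0\t1\t21\t1\t21\t1e-06\t39.2\n"
hits = list(read_tabular_hits(io.StringIO(sample)))
for hit in hits:
    print(hit["qseqid"], hit["sseqid"], hit["pident"], hit["evalue"])
```

With a real file, `open("hits.tsv")` (a placeholder name) can be passed in place of the `io.StringIO` object.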
Old 12-09-2011, 12:32 PM   #32
horvathdp
Member
 
Location: Fargo

Join Date: Dec 2011
Posts: 65
Default problems with stand alone blast to nr

Hi all. I downloaded BLAST and can make it work fine when I blast one data set against another (after formatting them), but I can't seem to get the program to run a search against the nr database that I downloaded and unpacked. My command line is: blastn -query C:\Users\horvathd\Desktop\DREB.fa -db nr -out C:\Users\horvathd\Desktop\Dreb_vs_nr.txt -evalue 1e-5 -num_alignments 20 -outfmt 0 -num_descriptions 20

I get the following error

BLAST Database error: No alias or index file found for nucleotide database [nr]
in search path [C:\Program Files\NCBI\blast-2.2.24+\bin;;]

I unpacked the nr files in the "bin" folder. I can blast the DREB.fa file against other databases I have created with no problem. Any idea what I am doing wrong?
Old 12-09-2011, 02:10 PM   #33
robs
Senior Member
 
Location: San Diego, CA

Join Date: May 2010
Posts: 116
Default

If you take a look at the error message, it tells you why: "No alias or index file found for nucleotide database [nr]". The NR database is a protein database. What you probably want is NT.
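
One way to check this before running a search: a formatted nucleotide database has an alias (.nal) or index (.nin) file, while a protein database has .pal or .pin, which is what the "No alias or index file found" message refers to. The Python sketch below looks for those files in a directory; `database_types` is a hypothetical helper written for this example, and the directory you pass should be wherever the database was unpacked (or wherever the BLASTDB environment variable points).

```python
from pathlib import Path

# File extensions BLAST looks for, per database type.
DB_SUFFIXES = {
    "nucleotide": (".nal", ".nin"),  # alias / index files for nt-style databases
    "protein": (".pal", ".pin"),     # alias / index files for nr-style databases
}


def database_types(directory, name):
    """Return the set of database types for which alias/index files exist.

    Hypothetical helper: an empty set means BLAST would not find `name`
    in `directory` at all.
    """
    folder = Path(directory)
    return {
        kind
        for kind, suffixes in DB_SUFFIXES.items()
        if any((folder / (name + suffix)).exists() for suffix in suffixes)
    }


# Example call; "." is a placeholder for the folder holding the database files.
print(database_types(".", "nr"))
```

If the result for "nr" contains only "protein", blastn cannot use it: either search nt with blastn, or search nr with a protein-database program such as blastx or blastp.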