Old 07-30-2013, 03:08 AM   #7
rhinoceros

Quote:
Originally Posted by dariober
Hi, I'm not sure how/if it can be done by Blast. However, you can extract the top hit from the blast output (blastout.txt) with the code below:

Code:
sort -k1,1 -k12,12nr -k11,11n  blastout.txt | sort -u -k1,1 --merge
The first sort orders the blast output by query name then by the 12th column in descending order (bit score - I think), then by 11th column ascending (evalue I think).
The second sort picks the first line from each query. Obviously you can skip the first sort if the output is already sorted in the 'correct' order.

Hope it helps (make sure it does what you want)...

Dario
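I've lost count of how many times I've used this sort. However, I just realized that it doesn't work as intended. sort -n does not understand scientific notation: it only reads the leading plain number, so 1e-10 is compared as 1 and 2e-100 as 2, and 1e-10 ends up sorted as the smaller e-value. The -g option (general numeric sort) parses the exponent as well, so I believe the command below is 'correct':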

Code:
sort -k1,1 -k12,12gr -k11,11g -k3,3gr blastout.txt | sort -u -k1,1 --merge > bestHits
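The extra -k3,3gr key breaks any remaining ties by the 3rd column in descending order (percent identity in the default tabular format). The bit score key probably works fine with -k12,12nr too, as long as the bit scores are not written in scientific notation, and that might be somewhat faster, since -n is quicker than -g.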
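
To see the difference between the two versions, here is a small sketch on a made-up table. The rows, query/subject names and the temporary file are invented purely for illustration (the two q1 hits are given the same bit score on purpose so that the e-value key has to decide), and it assumes GNU sort and the default 12-column tabular layout.

```shell
# Use the C locale so "." is the decimal separator and both sorts order keys identically.
export LC_ALL=C

# Hypothetical 12-column tabular BLAST output, written to a temporary file.
blast=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  q1 sA 90.0 100 10 0 1 100 1 100 1e-10  200 \
  q1 sB 95.0 100  5 0 1 100 1 100 2e-100 200 \
  q2 sC 80.0  50 10 0 1  50 1  50 1e-5   50.1 \
  q2 sD 99.0  50  0 0 1  50 1  50 1e-20  90.5 > "$blast"

# Original command: -n reads 1e-10 as 1 and 2e-100 as 2, so sA wins for q1.
best_n=$(sort -k1,1 -k12,12nr -k11,11n "$blast" \
  | sort -u -k1,1 --merge | cut -f2 | paste -sd, -)

# Corrected command: -g parses the exponent, so sB (2e-100) wins for q1.
best_g=$(sort -k1,1 -k12,12gr -k11,11g -k3,3gr "$blast" \
  | sort -u -k1,1 --merge | cut -f2 | paste -sd, -)

echo "top hits with -n: $best_n"
echo "top hits with -g: $best_g"

rm -f "$blast"
```

On real data the two commands only disagree when hits for the same query tie on bit score, so it is worth checking a few queries by hand either way.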

Note, make sure that your locale settings recognize "." as a decimal separator, e.g.

Code:
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
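
If en_US.UTF-8 is not installed on a machine, one alternative (my assumption, not something tested on real blast output here) is to set the C locale just for the sort call, since the C locale always uses "." as the decimal separator. A quick sanity check with GNU sort, using made-up numbers:

```shell
# Sort a few decimal and scientific-notation values under the C locale.
# LC_ALL=C applies only to this one sort invocation.
sorted=$(printf '0.5\n1e-3\n0.25\n' | LC_ALL=C sort -g | paste -sd' ' -)
echo "$sorted"
```

If the values come out in ascending numeric order, the decimal separator is being read correctly. In a pipeline, both sort calls should run under the same locale so that the query names are ordered the same way for --merge.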

Last edited by rhinoceros; 05-17-2014 at 03:08 AM.