04-26-2012, 02:39 PM #1 renesh Junior Member   Location: Louisiana Join Date: Sep 2011 Posts: 9
reciprocal blast
How do I do a reciprocal BLAST for two datasets?
04-26-2012, 10:27 PM #2 tomc Member   Location: Oregon Join Date: Feb 2011 Posts: 29
For two datasets A and B, where records a are in dataset A and records b are in dataset B:
1. BLAST query a against target dataset B to obtain the best hit b'.
2. BLAST query b' against target dataset A to obtain the best hit a'.
3. If a == a', then b' is the reciprocal best hit (RBH) of a; otherwise a and b' are not reciprocal best hits.
This is done for all a in A (and possibly for any b in B that is not an RBH of something in A). Sorry if this is too abstract, but the question is not very specific.
Last edited by tomc; 04-27-2012 at 02:23 PM. Reason: typo
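The three steps above can be sketched in Python once the best hits in each direction are known. This is a minimal illustration, not a full pipeline: the function name `reciprocal_best_hits` and the two dictionaries are hypothetical, and the toy IDs are made up rather than taken from real BLAST output.

```python
def reciprocal_best_hits(best_in_b, best_in_a):
    """Return {a: b'} for every a whose best hit b' has a as its own best hit.

    best_in_b maps each query a in A to its best hit b' in B.
    best_in_a maps each query b in B to its best hit a' in A.
    """
    rbh = {}
    for a, b_prime in best_in_b.items():
        a_prime = best_in_a.get(b_prime)  # None if b' had no hit in A
        if a_prime == a:
            rbh[a] = b_prime
    return rbh


# Toy data (assumed for illustration only).
best_in_b = {"a1": "b1", "a2": "b2", "a3": "b1"}
best_in_a = {"b1": "a1", "b2": "a9"}
print(reciprocal_best_hits(best_in_b, best_in_a))  # prints {'a1': 'b1'}
```

Here a3 also hits b1, but b1's best hit is a1, so only the pair (a1, b1) is reciprocal.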
01-02-2013, 06:49 PM #3 kenietz Member   Location: Singapore Join Date: Nov 2011 Posts: 85
Hi, first, I'm sorry for reviving the thread, but I have a more specific question on the subject. Below I use the notation from tomc's previous post. I BLAST query a against set B and get the best hit b'. But then should I BLAST with just the hit b' or with the full sequence that b' belongs to? I ask because I found cases where a == a' if I BLAST with b', but a != a' if I BLAST with the full sequence of b'. Logically one should BLAST with the full sequence of b', at least in my opinion, but I'm not sure. Any help is appreciated. Thank you.
11-06-2014, 01:04 PM #4 tomc Member   Location: Oregon Join Date: Feb 2011 Posts: 29
Hi kenietz,
Since each dataset can be one or many sequences of any length, the question remains less than fully specified. But under the assumption that both datasets are composed of many shorter sequences, each of which represents some logical unit (a transcript, an Illumina read), then yes: you BLAST the entire representation of b identified by the hit b' against dataset A. Under different assumptions this could make no sense, and you may need to decide which logical unit containing b' to BLAST against A.
02-08-2017, 10:57 AM #5 clarissaboschi Member   Location: US Join Date: Apr 2010 Posts: 63
And how can I do this comparison? I already have the output results from BLAST (a x b and b x a). How can I compare them and get the matches between the two files? Should I use Unix commands or Python scripts?
02-09-2017, 12:47 PM #7 clarissaboschi Member   Location: US Join Date: Apr 2010 Posts: 63
Thanks, tomc, for your detailed explanation and for taking the time to answer. This is exactly what I did. I wrote a shell script that works in several steps. I was not able to write a Python script because I am still learning... I found a few Python scripts for this, but they did not work for me. My shell script is working well, though. In the last step I got the duplicates of my combined file instead of the common ones.
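For readers who would rather do the comparison of the two BLAST result files (a x b and b x a) in Python, here is a rough sketch. It assumes both files are in the 12-column tabular format that BLAST+ writes with `-outfmt 6` (query ID in column 1, subject ID in column 2, bit score in column 12) and that "best hit" means the highest bit score. The function names and the toy rows are hypothetical, and ties are simply resolved in favour of the first hit seen.

```python
import csv
import io


def best_hits(handle):
    """Map each query ID to (subject ID, bit score) of its highest-scoring hit."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def common_pairs(a_vs_b, b_vs_a):
    """Return the sorted (a, b) pairs that are best hits in both directions."""
    forward = {(a, b) for a, (b, _) in best_hits(a_vs_b).items()}
    # Flip the B-vs-A pairs so that both sets are written as (a, b).
    backward = {(a, b) for b, (a, _) in best_hits(b_vs_a).items()}
    return sorted(forward & backward)


def make_row(query, subject, bitscore):
    """Build one toy 12-column line; every value except the IDs and score is filler."""
    return "\t".join([query, subject, "99.0", "100", "1", "0",
                      "1", "100", "1", "100", "1e-50", str(bitscore)])


# Toy input standing in for the two result files (assumed data).
a_vs_b_text = "\n".join([make_row("a1", "b1", 200), make_row("a1", "b2", 50),
                         make_row("a2", "b2", 180)])
b_vs_a_text = "\n".join([make_row("b1", "a1", 210), make_row("b2", "a1", 60),
                         make_row("b2", "a3", 190)])

print(common_pairs(io.StringIO(a_vs_b_text), io.StringIO(b_vs_a_text)))
# prints [('a1', 'b1')]
```

With real files, the two `io.StringIO(...)` objects would be replaced by file handles opened on the two result files. Writing every pair as (a, b) before intersecting is what separates the common pairs from mere duplicates of a combined file.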
02-10-2017, 08:33 AM #8 maubp Peter (Biopython etc)   Location: Dundee, Scotland, UK Join Date: Jul 2009 Posts: 1,543
Here's a BLAST RBH tool I wrote earlier in Python, which does consider duplicates and gives warnings about them: https://github.com/peterjc/galaxy_bl...h/blast_rbh.py
It has a Galaxy wrapper, but you can ignore that, other than perhaps reading the help text included in it, which suggests thinking about setting a minimum identity and a minimum alignment length, and reading this paper:
Punta and Ofran (2008) The Rough Guide to In Silico Function Prediction, or How To Use Sequence and Structure Information To Predict Protein Function. PLoS Comput Biol 4(10): e1000160. http://dx.doi.org/10.1371/journal.pcbi.1000160
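To illustrate the idea of minimum identity, minimum alignment length and warnings about tied hits, here is a small sketch. It is not taken from blast_rbh.py and makes no claim about how that tool works. The function name, the tuple layout of `hits` and the default thresholds (70% identity, 50 aligned positions) are all assumptions chosen for the example.

```python
import warnings


def filtered_best_hits(hits, min_identity=70.0, min_length=50):
    """Pick the best hit per query after filtering, and report tied queries.

    hits is an iterable of (query, subject, identity, length, bitscore) tuples.
    Returns ({query: subject}, set of queries whose top score is shared).
    """
    best = {}     # query -> (subject, bitscore)
    tied = set()  # queries with more than one subject at the top score
    for query, subject, identity, length, bitscore in hits:
        if identity < min_identity or length < min_length:
            continue  # hit is too dissimilar or too short to trust
        current = best.get(query)
        if current is None or bitscore > current[1]:
            best[query] = (subject, bitscore)
            tied.discard(query)  # a strictly better hit clears an earlier tie
        elif bitscore == current[1] and subject != current[0]:
            tied.add(query)
    for query in sorted(tied):
        warnings.warn(f"{query} has tied best hits; keeping {best[query][0]}")
    return {query: subject for query, (subject, _) in best.items()}, tied


# Toy hits (assumed data): a1 has a tie, a2 fails identity, a3's first hit is too short.
toy_hits = [
    ("a1", "b1", 95.0, 120, 200.0),
    ("a1", "b2", 95.0, 120, 200.0),
    ("a2", "b3", 60.0, 300, 400.0),
    ("a3", "b4", 90.0, 30, 80.0),
    ("a3", "b5", 80.0, 90, 70.0),
]
best, tied = filtered_best_hits(toy_hits)
```

Sensible thresholds depend on the data, so the defaults here should be treated as placeholders rather than recommendations.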