View Single Post
Old 01-05-2018, 03:47 AM   #1
Izal
Junior Member
 
Location: Spain

Join Date: Jan 2018
Posts: 3
Unhappy Reciprocal Blast & Parse and Collect best hits

Hi everyone,

I'm novice doing reciprocal blast and I have spent a lot of time trying it. I don't know how to do a reciprocal blast and how to parse the best hits and compare & find matches. So, an advice would be a big help to me.

I need to do a reciprocal Blast in Python (Jupyter) to identify same/ortholougs proteins bewteen two strains.

- For that, first I make a db with the respective FASTA with the coding sequences.
Code:
!makeblastdb -in $fastap_1 -dbtype 'prot'
!makeblastdb -in $fastap_2 -dbtype 'prot'
- Then, I run blastp for both files:
Code:
blastp_ab = !blastp -query $fastap_1 -db $fastap_2 -out ab.txt
blastp_ba = !blastp -query $fastap_2 -db $fastap_1 -out ba.txt
Up to here everything works. **Now, I have to parse the blast results to collect best hits. I have tried it,** **but not even print "hello??"**

If I pass as argument the file "ab.txt", the function "collect_best_hits" returns me an empty dictionary. And if I pass the handle "blastp_ab" as argument, "collect_best_hits" functions return me and advice of "blastall" and the empty dictionary:

> no BLAST output. Check that blastall is in your PATH
>
> { }


(Options)
Code:
#filename = "ab.txt"
#filename = blastp_ab
filename = open("ab.txt", "r")

collect_best_hits(filename)

The function contains:

Code:
import os
import sys
import csv
import blastparser


def collect_best_hits(filename):
       d = {}
       try:
            for n, record in enumerate(blastparser.parse_fp(filename)):
                print "hola"
                if n % 100 == 0:
                      print >>sys.stderr, 'loading 1 ...', n
               
                print "!-", n, record
    
                best_score = None
                for hit in record.hits:
                    print "!- ", hit, len(record.hits)
                    for match in hit.matches:
                        print "!- ", match, len(hit.matches)
                        query = record.query_name
                        if query.startswith('gi'):
                            query = query.split('|', 2)[2]
                        subject = hit.subject_name
    
                        score = match.score
                        print "!- ", score
    
                        # only keep the best set of scores for any query
                        if best_score and best_score > score:
                            continue
                        best_score = score
    
                        x = d.get(query, [])
                        x.append((subject, score))
                        d[query] = x
    
                    if best_score and best_score != score:
                        break
        except Exception as e:
            print(e)
        return d

What I'm doing wrong?

All help is welcome!
Izal is offline   Reply With Quote