Hi all,
I try to fix the following issue:
I have a BLAST output from the command line, using PacBio data as queries and a custom database. I have some filtering criteria for my hits, e.g. >db_1 should at least cover 80bp, while >db_2 should cover at least 1000bp.
In the end I'd like to have a graphical representation of my queries and the corresponding (filtered) hits.
A refinement would be, if the alignment of db_2 is in total longer than the alignment of db_3, db_3 should be discarded...
I tried to use Bio::Search::Result::ResultI first. I ended up with a very ugly script, which works and processes my custom blast tab, but I even failed to visualize my results.
Next thing I gave a try was Bio::SearchIO::Writer::HTMLResultWriter;
The problem is, the syntax doesn't work. Of cause, because I mixed up hits and hsp. Is it possible to combine hit and hsp filters?
I know, that this question is very specific. But maybe anybody struggles on similar issues parsing blast outputs, trying to filter them somehow and visualize the results. I hope I could make my problem clear and maybe anybody could give me a suggestion how to solve it.
Thanks in advance!
ps: it doesn't matter whether I parse xml, pure blast or tables, just want to get it to run
I try to fix the following issue:
I have a BLAST output from the command line, using PacBio data as queries and a custom database. I have some filtering criteria for my hits, e.g. >db_1 should at least cover 80bp, while >db_2 should cover at least 1000bp.
In the end I'd like to have a graphical representation of my queries and the corresponding (filtered) hits.
A refinement would be, if the alignment of db_2 is in total longer than the alignment of db_3, db_3 should be discarded...
I tried to use Bio::Search::Result::ResultI first. I ended up with a very ugly script, which works and processes my custom blast tab, but I even failed to visualize my results.
Code:
while( my $result = $in->next_result ) { ## $result is a Bio::Search::Result::ResultI compliant object while( my $hit = $result->next_hit ) { ## $hit is a Bio::Search::Hit::HitI compliant object while( my $hsp = $hit->next_hsp ) { ## $hsp is a Bio::Search::HSP::HSPI compliant object if( $hit->name =~ m/someexpression/ ) { if($hit->name =~ m/db_2/ && $hsp->length('hit')>=90){ my $aln = $hsp->get_aln; my $alnIO = Bio::AlignIO->new(-format =>"msf", -file => ">hsp.msf"); $alnIO->write_aln($aln); print $result->query_name, "\t", $hit->name, "\t", $hsp->length('total'), "\t", $hsp->percent_identity, "\t", $hsp->start('query'), "\t", $hsp->end('query'), "\t", $hsp->start('hit'), "\t", $hsp->end('hit'), "\t", $alnIO, "\n"; } elsif($hit->name =~ m/db_2/ && $hsp->length('hit')>=80){ my $aln = $hsp->get_aln; my $alnIO = Bio::AlignIO->new(-format =>"msf", -file => ">hsp.msf"); $alnIO->write_aln($aln); print $result->query_name, "\t", $hit->name, "\t", $hsp->length('total'), "\t", $hsp->percent_identity, "\t", ...
Code:
my $in = new Bio::SearchIO( -format => 'blast', -file => $ARGV[0]); sub hsp_filter { my $hsp = shift; return 1 if ($hit->name =~ m/db_1/ && $hsp->length('hit')>=70); return 1 if ($hit->name =~ m/db_2/ && $hsp->length('hit')>=65); ... } my $writer = Bio::SearchIO::Writer::HTMLResultWriter ->new (-filters => {'HSP' => \&hsp_filter}); my $out = Bio::SearchIO ->new (-writer =>$writer); $out ->write_result($in->next_result);
I know, that this question is very specific. But maybe anybody struggles on similar issues parsing blast outputs, trying to filter them somehow and visualize the results. I hope I could make my problem clear and maybe anybody could give me a suggestion how to solve it.
Thanks in advance!
ps: it doesn't matter whether I parse xml, pure blast or tables, just want to get it to run
Comment