SEQanswers

Old 05-11-2011, 05:17 PM   #1
nitinkumar
Member
 
Location: UK

Join Date: Feb 2011
Posts: 11
blast results

I want to edit a BLAST output file so that wherever the query sequence has a gap, the reference sequence gets a gap at the same position. For example, if the BLAST output shows:

query:     aa-gcaa
           || ||||
reference: aatgcaa

then I want to remove the t from the reference and put a gap in its place:

query:     aa-gcaa
           || ||||
reference: aa-gcaa
Old 05-12-2011, 04:19 AM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Why?

Do you know any programming languages such as Perl, Python, Ruby, or Java? If so, have a look at BioPerl, Biopython, BioRuby, BioJava, etc. for libraries that work with BLAST files.
Old 05-12-2011, 09:15 AM   #3
nitinkumar
Member
 
Location: UK

Join Date: Feb 2011
Posts: 11

Yes, I have tried BioPerl, but I was not able to do this with it. I can only extract the FASTA sequences from these files.
Old 05-15-2011, 08:38 PM   #4
A_Morozov
Member
 
Location: Russia, Irkutsk

Join Date: Feb 2011
Posts: 40

Why not use regexps to find, say, the gap-containing piece of the query sequence (20 bp or so) in the reference, and then remove whatever you want from it?
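
A minimal sketch of this regexp idea in Python, under some assumptions not stated in the thread: the query window matches the reference exactly apart from the gap, both strings use the same letter case, and the reference is a single ungapped string. The function name `delete_base_opposite_gap` is made up for illustration.

```python
import re


def delete_base_opposite_gap(query_window, reference, gap="-"):
    """Remove from reference the single base aligned to each gap in query_window.

    Returns a tuple (edited_reference, number_of_substitutions).
    """
    # Split the window at its gaps; each piece must match the reference literally.
    pieces = query_window.split(gap)
    # One capturing group per piece, with "." standing in for the extra base.
    pattern = ".".join("(" + re.escape(p) + ")" for p in pieces)
    # Rebuild the match from the captured pieces only, dropping the extra bases.
    replacement = "".join("\\g<%d>" % (i + 1) for i in range(len(pieces)))
    # count=1: only the first place where the window matches is edited.
    return re.subn(pattern, replacement, reference, count=1)


print(delete_base_opposite_gap("aa-gcaa", "ccaatgcaagg"))
```

A short window such as the 7-base one here can match in more than one place, which is why a window of 20 bp or so is the safer choice.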
Old 05-16-2011, 12:21 AM   #5
tomc
Member
 
Location: Oregon

Join Date: Feb 2011
Posts: 29

You want to copy your query over your reference...?

If so, why not just pull out your query sequences and use those?
They already have the gaps you seem to be looking for.

But it does sound odd; maybe a better explanation of why you want this would help.
Old 05-16-2011, 01:43 AM   #6
A_Morozov
Member
 
Location: Russia, Irkutsk

Join Date: Feb 2011
Posts: 40

Just taking the query won't work, because it can have different nucleotides (not only gaps) at some sites. After some thinking I see that you don't need any regexps; all you need is something like

{
    reference[i] = '-' if query[i] == '-';
}

for each position i in the aligned sequences. Hope you can pull the aligned sequences out of the BLAST report, dude.
But yes, I'd like to know why he would want to do something like this.
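
As a rough sketch of that per-position rule in Python: it assumes you have already extracted the two aligned strings (query and subject lines, same length) from the BLAST report, and the function name `mask_at_query_gaps` is made up for illustration.

```python
def mask_at_query_gaps(query, reference, gap="-"):
    """Return reference with a gap wherever the aligned query has a gap."""
    if len(query) != len(reference):
        raise ValueError("aligned strings must have the same length")
    # Walk the alignment column by column; keep the reference base
    # unless the query has a gap in that column.
    return "".join(gap if q == gap else r for q, r in zip(query, reference))


# The example from the first post.
print(mask_at_query_gaps("aa-gcaa", "aatgcaa"))  # aa-gcaa
```

Python strings are immutable, so the sketch builds a new string instead of assigning to `reference[i]` in place.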
Old 05-16-2011, 04:28 AM   #7
nitinkumar
Member
 
Location: UK

Join Date: Feb 2011
Posts: 11

Thanks for the reply. I am getting these results because of 454 sequencing errors. The query sequence is a gene from Rhizobium bacteria, and the subject is the sequence from the matching contigs. I want to remove these sequencing errors from the contigs so that I can build a phylogeny of the contigs.
Old 05-16-2011, 05:17 AM   #8
A_Morozov
Member
 
Location: Russia, Irkutsk

Join Date: Feb 2011
Posts: 40

But that way you lose any real indels that exist between these two species, don't you? I think you should first make sure that this particular nucleotide is indeed an error. Maybe its quality is much lower than that of the nearby nucleotides, or it sits in a long repeat like aaaaaaaaa, or something else.
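
One way to check the "long repeat" case is sketched below in Python. It lists every column where the query has a gap and the reference has a base, along with the length of the run of identical bases around that base. The function names and the `min_run=3` threshold are assumptions made for illustration, not values from this thread, and quality scores are not considered because the alignment strings do not carry them.

```python
def homopolymer_run_at(seq, pos):
    """Length of the run of identical characters in seq that covers index pos."""
    base = seq[pos]
    start = pos
    while start > 0 and seq[start - 1] == base:
        start -= 1
    end = pos
    while end + 1 < len(seq) and seq[end + 1] == base:
        end += 1
    return end - start + 1


def flag_query_gap_columns(query, reference, min_run=3, gap="-"):
    """List (column, base, run_length, in_long_run) for extra reference bases.

    A column is reported when the query has a gap and the reference has a base.
    Runs are measured on the aligned reference string, so a gap breaks a run.
    """
    flagged = []
    for col, (q, r) in enumerate(zip(query, reference)):
        if q == gap and r != gap:
            run = homopolymer_run_at(reference, col)
            flagged.append((col, r, run, run >= min_run))
    return flagged


print(flag_query_gap_columns("acgaaa-tcg", "acgaaaatcg"))  # [(6, 'a', 4, True)]
print(flag_query_gap_columns("aa-gcaa", "aatgcaa"))        # [(2, 't', 1, False)]
```

An extra base inside a run is a more plausible 454 artifact than an isolated one like the t in the original example, which would deserve a closer look before being removed.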
Old 05-16-2011, 08:17 AM   #9
nitinkumar
Member
 
Location: UK

Join Date: Feb 2011
Posts: 11

The contigs are from Rhizobium strains, and the BLAST report shows that at the positions where I get these results, the neighboring nucleotides are exactly the same as in the query. So I am pretty sure that these are sequencing errors.