SEQanswers

Old 06-19-2012, 07:05 PM   #1
drdna
Member
 
Location: Kentucky

Join Date: May 2012
Posts: 76
BLAST missing obvious alignments

I have been using BLAST to compare genome sequences, with the goal of identifying genomic regions that are unique to specific fungal strains. Unfortunately, BLAST misses a large number of long (>1 kb), near-perfect alignments, so too many regions are wrongly flagged as strain-specific (false positives) for my liking. Can anyone recommend another alignment tool that does not miss obvious matches? I know Smith-Waterman is more sensitive, but I'd like something a little more computationally efficient.
Old 06-20-2012, 01:35 AM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Try turning off the low-complexity filter. If part of your query triggers it, that part won't be used in the search.
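
For anyone who wants to try this, here is a minimal sketch that builds a command line with the filter switched off. It assumes the NCBI BLAST+ `blastn` program (the thread does not say which BLAST version is in use); the file names, database name, and the helper `build_blastn_command` are hypothetical placeholders.

```python
import shlex


def build_blastn_command(query, db, out_path, word_size=11):
    """Build a BLAST+ blastn argument list with low-complexity filtering off.

    Assumes BLAST+ blastn; query, db and out_path are placeholder names.
    """
    return [
        "blastn",
        "-task", "blastn",          # classic blastn rather than the default megablast
        "-query", query,
        "-db", db,
        "-dust", "no",              # disable the DUST low-complexity filter
        "-soft_masking", "false",   # do not mask the query during seeding
        "-word_size", str(word_size),
        "-outfmt", "6",             # tabular output
        "-out", out_path,
    ]


cmd = build_blastn_command("strainA_contigs.fasta", "strainB_db", "hits.tsv")
print(shlex.join(cmd))  # paste into a shell, or pass cmd to subprocess.run
```

The sketch only assembles the arguments; nothing is executed until the list is handed to a shell or to `subprocess.run`.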
Old 06-20-2012, 04:52 AM   #3
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,138

You can also try BLAT (http://genome.ucsc.edu/FAQ/FAQblat.html#blat3).
Old 06-20-2012, 07:15 AM   #4
drdna
Member
 
Location: Kentucky

Join Date: May 2012
Posts: 76

I routinely switch off the low-complexity filter, so that is not the issue. The problem lies in the initial BLAST step, which breaks the query into short seed "words." Presumably, for long queries such as genomic contigs, the distribution of the seeds is too sparse.
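
As a toy illustration of the seeding idea mentioned above, the sketch below finds exact shared words of a given length between two sequences. It is not BLAST's actual implementation, and it neither confirms nor refutes the sparse-seed explanation; the function name `shared_seeds` and the example sequences are made up for illustration. It only shows that whether an alignment gets any seed at all depends on the word size relative to the spacing of mismatches.

```python
def shared_seeds(query, subject, word_size):
    """Return (query_pos, subject_pos) pairs where an exact word matches."""
    # Index every overlapping word of the query by its start position.
    index = {}
    for i in range(len(query) - word_size + 1):
        index.setdefault(query[i:i + word_size], []).append(i)
    # Scan the subject and look each of its words up in the query index.
    hits = []
    for j in range(len(subject) - word_size + 1):
        for i in index.get(subject[j:j + word_size], []):
            hits.append((i, j))
    return hits


# Hypothetical 24-base query; the subject differs at positions 8 and 16 only.
query = "ACGTTGCAAGCTTAGGCTAACGTA"
subject = query[:8] + "T" + query[9:16] + "A" + query[17:]

for w in (7, 11):
    # Seeds on the main diagonal are the ones that support the true alignment.
    diagonal = [(i, j) for i, j in shared_seeds(query, subject, w) if i == j]
    print(f"word_size={w}: {len(diagonal)} seed(s) on the true alignment diagonal")
```

With two mismatches eight bases apart, every 11-base window on the true diagonal contains a mismatch, so no seed of that length can land there, while 7-base words still fit between the mismatches.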