SEQanswers

09-28-2010, 01:39 AM   #1
NicoBxl
not just another member
 
Location: Belgium

Join Date: Aug 2010
Posts: 264
blast hit problem

Hi,

I have a problem with a BLAST alignment (blast2 -p blastn, blastall -p blastn, or megablast).

When I align a FASTA file of about 3M sequences (~20 nt each) against a DB, I don't get any good alignments. But if I take a small sample of the sequences from the FASTA file (~100 sequences), I get a few very good alignments (identity ~100% and e-value < 1e-5).

I don't understand how this can happen.

Does anybody know how to solve this problem?

Thanks,

Nicolas
09-28-2010, 12:18 PM   #2
malachig
Senior Member
 
Location: WashU

Join Date: Aug 2010
Posts: 117

That is strange. In the past I have used BLAST to align blocks of 500,000 to 1 million reads to various databases (not the whole genome, but transcriptome or junction databases). In that scenario I used the '-m 8' option to produce tabular output and filtered it as a stream to remove alignments below a specified score. If you do not do this, the output files can be massive (because BLAST reports all multi-match hits). Is it possible that you are producing a massive file that consumes all your disk space? Does the job actually run to completion? Remember that the output is buffered, so it can take a while before anything gets written to disk. If you have a large number of reads but only a small number of alignments being found, this effect could be exaggerated.
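
For illustration, here is a minimal sketch of the kind of stream filter described above. It is not the exact script used at the time. It assumes the standard 12-column tabular layout produced by '-m 8', with the bit score in the last column. The function name filter_hits and the script name filter_hits.py are made up for this example, and the threshold is something you pick yourself.

```python
import sys

# BLAST tabular output (-m 8) has 12 tab-separated fields per hit;
# the bit score is the last one (index 11).
BIT_SCORE_COLUMN = 11


def filter_hits(lines, min_bit_score):
    """Yield only the tabular hit lines whose bit score is >= min_bit_score."""
    for line in lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a complete tabular hit line
        try:
            score = float(fields[BIT_SCORE_COLUMN])
        except ValueError:
            continue  # unparsable score, drop the line
        if score >= min_bit_score:
            yield line


# Act as a stdin-to-stdout filter only when a threshold is given
# on the command line, e.g. "python filter_hits.py 36".
if __name__ == "__main__" and len(sys.argv) > 1:
    threshold = float(sys.argv[1])
    for hit in filter_hits(sys.stdin, threshold):
        sys.stdout.write(hit)
```

It would be used in a pipe, something like `blastall -p blastn -d mydb -i reads.fa -m 8 | python filter_hits.py 36 > hits.tsv`, where mydb, reads.fa, hits.tsv and the cutoff of 36 are placeholders. Because low-scoring hits are dropped before they reach the disk, the output file stays small.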
10-02-2010, 06:55 PM   #3
Torst
Senior Member
 
Location: The University of Melbourne, AUSTRALIA

Join Date: Apr 2008
Posts: 275

Quote:
Originally Posted by NicoBxl View Post
When I align a fasta file with about 3M sequences ( ~ 20 nt each sequence ) with a DB, I don't have any good alignment. But if I take a little sample of the sequences in the fasta file ( ~ 100 sequences ) , I have a few very good alignment ( ID = ~100% and e-value < 1e-5 )
Older BLAST engines aligned each input sequence one by one and wrote each result to the output file. Newer BLAST versions iterate over large chunks of the input to avoid scanning the whole database for each query sequence. Sometimes it can take hours before any output is written.

Are you letting BLAST finish running first?
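
If waiting for one huge job is the problem, one possible workaround (a sketch, not something BLAST does for you) is to split the query FASTA into smaller batches and run each batch separately, so that each run finishes and writes its output sooner. The function names, the batch size, and the output file naming below are all made up for this example.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line, []
        elif header is not None and line:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def batches(records, batch_size):
    """Group records into lists of at most batch_size items."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def split_fasta(fasta_path, batch_size, prefix):
    """Write prefix.0.fa, prefix.1.fa, ... and return the file names."""
    names = []
    with open(fasta_path) as handle:
        for i, batch in enumerate(batches(read_fasta(handle), batch_size)):
            name = "%s.%d.fa" % (prefix, i)
            with open(name, "w") as out:
                for header, seq in batch:
                    out.write("%s\n%s\n" % (header, seq))
            names.append(name)
    return names
```

For example, `split_fasta("reads.fa", 100000, "reads_part")` would write files named reads_part.0.fa, reads_part.1.fa and so on (reads.fa is a placeholder), and each of those can be given to BLAST as its own query file.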