BLAST threads always drop to 1

I've seen this issue with every version of BLAST+ that I've used recently. When I run jobs on a multi-core machine, I specify -num_threads XX to speed things up. Invariably, no matter how many threads I specify, after a short time only 1 thread appears to be active. I've compiled the BLAST binaries myself with both gcc and the Intel icc compiler, and the behavior is the same. When I start the job with, say, 8 threads, top shows blastn/blastp/blastx at about 800% CPU. After a few minutes this drops to 100% and stays there. The job does finish faster than a single-threaded run, but not by much. Is this normal behavior?
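In case it helps anyone reproduce the drop with numbers rather than by watching top, here is a minimal sketch for logging a running job's CPU use over time. It is not part of BLAST, and it assumes Linux, where /proc/<pid>/stat reports process-wide utime and stime in clock ticks. The function names (parse_cpu_ticks, cpu_percent, watch) and the 5-second interval are just illustrative choices.

```python
import os
import sys
import time


def parse_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before tokenizing the remaining fields.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def cpu_percent(prev_ticks: int, cur_ticks: int,
                elapsed_s: float, ticks_per_s: int) -> float:
    """CPU use over the interval; 100.0 means one fully busy core."""
    return 100.0 * (cur_ticks - prev_ticks) / ticks_per_s / elapsed_s


def watch(pid: int, interval_s: float = 5.0) -> None:
    """Print the CPU percentage of `pid` every `interval_s` seconds until it exits."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    path = f"/proc/{pid}/stat"
    with open(path) as fh:
        prev_ticks = parse_cpu_ticks(fh.read())
    prev_time = time.monotonic()
    while True:
        time.sleep(interval_s)
        try:
            with open(path) as fh:
                cur_ticks = parse_cpu_ticks(fh.read())
        except FileNotFoundError:
            break  # the process has finished
        now = time.monotonic()
        pct = cpu_percent(prev_ticks, cur_ticks, now - prev_time, ticks_per_s)
        print(f"{time.strftime('%H:%M:%S')}  {pct:6.1f}% CPU")
        prev_ticks, prev_time = cur_ticks, now


if __name__ == "__main__":
    # Usage (hypothetical file name): python watch_cpu.py <pid of the blast process>
    if len(sys.argv) == 2 and sys.argv[1].isdigit():
        watch(int(sys.argv[1]))
```

With 8 threads all busy, the log should read close to 800%; a sustained reading near 100% would confirm that only one thread is doing work for that stretch of the run.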

