Hi all,
I have a huge dataset from a HiSeq run (15 million sequences after quality filtering).
I would like to use cd-hit or other clustering software to reduce the redundancy first, and then BLAST the result against the nr database.
However, the first step, cd-hit-est (v4.6.1) with 10 threads, has already been running for a week and still has not finished.
I am new to NGS analysis. Could anyone advise how to speed up the process?
Your kind suggestions and help are highly appreciated. Many thanks.