Hi there,
I performed a de novo RNA-seq assembly using Oases and Trinity and ended up with a list of contigs.
I now want to cluster the contigs by similarity to estimate how much redundancy they contain. My idea is that if I have, say, 50k contigs and get 1 cluster, the redundancy is 100%, since all detected transcripts are the same; conversely, if I get 50k clusters, the redundancy is 0% and all 50k contigs are different. What do you think?
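To make the idea concrete, here is a rough sketch of how such a redundancy score could be computed from the number of contigs and the number of clusters. The formula (N - C) / (N - 1) is only my assumption: I picked it because it gives exactly 100% for 1 cluster and 0% for N clusters. The function name `redundancy_percent` is made up for illustration and does not come from any tool.

```python
def redundancy_percent(n_contigs: int, n_clusters: int) -> float:
    """Return a redundancy score in percent.

    Assumed formula (not from any published method):
        redundancy = (N - C) / (N - 1) * 100
    where N is the number of contigs and C the number of clusters.
    """
    if n_contigs < 1:
        raise ValueError("need at least one contig")
    if not 1 <= n_clusters <= n_contigs:
        raise ValueError("clusters must be between 1 and the number of contigs")
    if n_contigs == 1:
        # A single contig cannot be redundant with anything else.
        return 0.0
    return (n_contigs - n_clusters) / (n_contigs - 1) * 100.0


if __name__ == "__main__":
    # The two extreme cases described above, plus a made-up middle case.
    for clusters in (1, 25_000, 50_000):
        print(clusters, redundancy_percent(50_000, clusters))
```

The score obviously depends on the similarity threshold used for clustering, so it would only be comparable between assemblies clustered with the same settings.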
I thought of using blastclust, but apparently it has been removed from the latest BLAST installations. From the NCBI BLAST manual: "Please note that the NCBI C Toolkit applications seedtop and blastclust are not available in this release."
Does anyone know where to get it or if there is another program I could use to achieve this?
Thanks for your help,
Dave