Dear all, I need some help with my research on metagenomic binning. Bioinformatics is still a new field in my country, which is why I chose this topic. My research title is "Feature Selection with Particle Swarm Optimization for Metagenome Fragment Classification". I use an SVM as the classifier.
The main goal of my research is to reduce the computation needed for classification based on k-mer frequencies by selecting only a small subset of the features. Longer k-mers (e.g. 20-mers) generally give higher accuracy, but the number of features becomes huge: with 20-mers there are 4^20 = 1,099,511,627,776 possible features. So I want to try PSO for feature selection.
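To make the feature count concrete, here is a minimal sketch of k-mer frequency extraction for a small k. The function names (kmer_index, kmer_frequencies) and the example fragment are made up for illustration, and the sketch assumes that only the bases A, C, G, T are counted. For k = 20 the full index could not be built this way, which is exactly the problem.

```python
from collections import Counter
from itertools import product


def kmer_index(k):
    """Map every possible k-mer over A, C, G, T to a fixed column index."""
    return {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}


def kmer_frequencies(fragment, k, index):
    """Return the relative k-mer frequencies of one DNA fragment."""
    # Slide a window of width k over the fragment and count each k-mer.
    counts = Counter(fragment[i:i + k] for i in range(len(fragment) - k + 1))
    # Only k-mers present in the index count (skips N or ambiguous bases).
    total = sum(c for kmer, c in counts.items() if kmer in index)
    vec = [0.0] * len(index)
    for kmer, c in counts.items():
        if kmer in index:
            vec[index[kmer]] = c / total
    return vec


index = kmer_index(3)                       # 4**3 = 64 features
features = kmer_frequencies("AAACAAA", 3, index)
print(len(features), features[index["AAA"]])
print(4 ** 20)                              # feature count for 20-mers
```

Each fragment becomes one vector with 4^k columns, so a feature selection method would have to choose a subset of these column indices.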
But I do not yet know how to:
1. Represent the features of the DNA dataset as a particle in PSO, given that the actual feature attributes are k-mers (e.g. AAA, AAC, AAT, AAG, GTG, ...).
2. Initialize the number of particles, and compute and update their positions and velocities.
Please share any ideas. Thank you very much.