Incorporating Genetic Algorithm into Rough Feature Selection for High Dimensional Biomedical Data
Faculty of Computing, Health and Science
School of Computer and Security Science / Artificial Intelligence and Optimisation Research Centre
In this paper, a hybrid approach that incorporates a genetic algorithm and rough set theory into feature selection is proposed for finding the best subset of features. The approach uses K-means clustering to partition attribute values, a rough set-based approach to reduce redundant data, and the genetic algorithm to search for the best subset of features. Applying the proposed algorithm to the colon cancer dataset yielded a set of six attributes as the best subset. Classification was carried out using this set of six attributes with 23 classifiers from the WEKA (Waikato Environment for Knowledge Analysis) software to examine how well they classify unseen test data. In addition, the six genes found by the proposed approach were examined for their relevance to known biomarkers in the colon cancer domain.
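The three stages named above (K-means partitioning of attribute values, a rough set measure of redundancy, and a genetic search over feature subsets) can be illustrated with a minimal sketch. It is not the implementation used in this paper: the function names, the toy data, the number of clusters, the size penalty in the fitness function, and the genetic operators (elitist selection, one-point crossover, bit-flip mutation) are all assumptions chosen for illustration. The rough set part uses the standard dependency degree, i.e. the fraction of objects whose discretized values on the chosen attributes determine the class unambiguously.

```python
import random
from collections import defaultdict

def kmeans_1d(values, k=2, iters=20):
    """Discretize one attribute with 1-D K-means; returns a cluster label per value."""
    lo, hi = min(values), max(values)
    # Assumption: centroids start evenly spaced across the value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels

def dependency(rows, decisions, subset):
    """Rough set dependency degree: share of objects in the positive region."""
    classes_seen = defaultdict(set)
    block_size = defaultdict(int)
    for row, d in zip(rows, decisions):
        key = tuple(row[j] for j in subset)  # equivalence class under `subset`
        classes_seen[key].add(d)
        block_size[key] += 1
    consistent = sum(block_size[key] for key, ds in classes_seen.items() if len(ds) == 1)
    return consistent / len(rows)

def fitness(mask, rows, decisions, penalty=0.01):
    """Assumed fitness: dependency degree minus a small penalty per selected attribute."""
    subset = [j for j, bit in enumerate(mask) if bit]
    return dependency(rows, decisions, subset) - penalty * len(subset)

def ga_select(rows, decisions, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Genetic search over bit masks; returns the indices of the best subset found."""
    rng = random.Random(seed)
    n = len(rows[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    score = lambda mask: fitness(mask, rows, decisions)
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]          # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
            children.append(child)
        pop = elite + children
    best = max(pop, key=score)
    return [j for j, bit in enumerate(best) if bit]

# Hypothetical toy data: 8 samples, 4 continuous attributes, binary class.
# Only attribute 0 separates the classes; attribute 3 duplicates attribute 1.
raw = [
    [0.10, 5.0, 1.0, 3.0], [0.20, 9.0, 1.1, 7.0],
    [0.15, 5.2, 2.0, 3.1], [0.25, 9.1, 2.1, 7.2],
    [0.90, 5.1, 1.0, 3.2], [0.80, 9.2, 1.2, 7.1],
    [0.95, 5.3, 2.2, 3.3], [0.85, 9.3, 2.0, 7.3],
]
decisions = [0, 0, 0, 0, 1, 1, 1, 1]

# Discretize each attribute (column) separately, then rebuild the rows.
discretized = [kmeans_1d(list(col)) for col in zip(*raw)]
rows = [list(r) for r in zip(*discretized)]

selected = ga_select(rows, decisions)
print("selected attributes:", selected)
print("dependency degree:", dependency(rows, decisions, selected))
```

On real expression data the attribute count is in the thousands, so the population size, number of generations, and penalty weight would need tuning, and the selected subset should be validated on held-out samples, as the classification experiments in this paper do.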