Incorporating Genetic Algorithm into Rough Feature Selection for High Dimensional Biomedical Data

Document Type

Conference Proceeding




Faculty of Computing, Health and Science


School of Computer and Security Science / Artificial Intelligence and Optimisation Research Centre




This article was originally published as: Dang, V. Q., Lam, C. P., & Lee, C. (2011). Incorporating genetic algorithm into rough feature selection for high dimensional biomedical data. Paper presented at the IEEE International Symposium on IT in Medicine and Education (ITME). IEEE. Guangzhou, China.


In this paper, a hybrid approach that incorporates a genetic algorithm and rough set theory into feature selection is proposed for finding the best subset of features. The approach uses K-means clustering to partition attribute values, a rough set-based method to remove redundant data, and a genetic algorithm to search for the best feature subset. Applied to the colon cancer dataset, the proposed algorithm selected a set of six attributes as the best subset. Classification was then carried out using these six attributes with 23 classifiers from the WEKA (Waikato Environment for Knowledge Analysis) software to assess how well they classify unseen test data. In addition, the six genes found by the proposed approach were examined for their relevance to known biomarkers in the colon cancer domain.
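
The general idea of combining a rough set measure with a genetic search can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the attribute values have already been discretized (the paper uses K-means for that step, which is omitted here), it uses the standard rough set dependency degree as the fitness signal, and the function names, the size penalty, the GA operators, and the toy data are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def dependency_degree(rows, labels, subset):
    """Rough set dependency: fraction of rows whose equivalence class
    (rows sharing the same values on `subset`) carries a single label."""
    seen_labels = defaultdict(set)
    counts = defaultdict(int)
    for row, y in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        seen_labels[key].add(y)
        counts[key] += 1
    positive = sum(counts[k] for k, ys in seen_labels.items() if len(ys) == 1)
    return positive / len(rows)


def select_features(rows, labels, pop_size=20, generations=30,
                    mutation_rate=0.1, penalty=0.1, seed=0):
    """Toy GA over bit masks (assumes at least 2 attributes, pop_size >= 3).
    Fitness rewards high dependency and (by assumption) penalizes size."""
    rng = random.Random(seed)
    n = len(rows[0])

    def fitness(mask):
        subset = [i for i, bit in enumerate(mask) if bit]
        return dependency_degree(rows, labels, subset) - penalty * len(subset) / n

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = [population[0][:]]  # elitism: keep the current best mask
        while len(next_gen) < pop_size:
            # tournament selection of two parents
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # bit-flip mutation
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return [i for i, bit in enumerate(best) if bit]


# Hypothetical toy data: 4 binary attributes, label = attr0 XOR attr2.
rows = [[a, b, c, d] for a in (0, 1) for b in (0, 1)
        for c in (0, 1) for d in (0, 1)]
labels = [r[0] ^ r[2] for r in rows]

print(dependency_degree(rows, labels, [0, 2]))  # 1.0: both attributes together determine the label
print(dependency_degree(rows, labels, [0]))     # 0.0: attribute 0 alone determines nothing
print(select_features(rows, labels))            # indices of the selected attributes
```

On real gene expression data the bit mask would have thousands of positions, so the population size, number of generations, and penalty weight would need tuning; the values above are placeholders.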