NSC-NSGA2: Optimal search for finding multiple thresholds for nearest shrunken centroid

Document Type

Conference Proceeding




Faculty of Health, Engineering and Science


School of Computer and Security Science / Artificial Intelligence and Optimisation Research Group




Dang, V.Q., & Lam, C. (2013). NSC-NSGA2: Optimal search for finding multiple thresholds for nearest shrunken centroid. Proceedings of the 2013 Bioinformatics and Biomedicine. (pp. 367-372). Shanghai, China. IEEE. © 2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.


The Nearest Shrunken Centroid (NSC) method, with Prediction Analysis for Microarrays being its best-known implementation, has been widely used as a classification method for high-dimensional biomedical data. The method requires a shrinkage threshold as input, and this value is normally selected manually by trial and error: the NSC method is executed many times over a set of predetermined threshold values, and the value that minimizes the cross-validated error on the training data is taken as optimal. This process can be time-consuming, and the resulting threshold is limited by the granularity of the predetermined values. In this paper, an approach combining the NSC method with a multi-objective evolutionary algorithm, the Non-dominated Sorting Genetic Algorithm 2 (NSGA2), is proposed for obtaining the optimal shrinkage threshold automatically. The NSC method acts as the fitness evaluator in the evolutionary process. Multiple objectives can be incorporated when determining the threshold values, and a number of optimal solutions are obtained, each representing a different tradeoff between the objectives. Providing multiple candidate solutions allows biomedical experts to better explore the joint behavior of features in their data. The proposed approach also overcomes two limitations normally associated with single-objective approaches: the restriction to a single optimum, and the need to choose weights for the individual objectives in an aggregated objective function. The proposed approach was evaluated using the Alzheimer's disease, colon cancer and leukemia datasets.
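
To make the idea of several "optimal" thresholds concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the per-class centroid differences have already been standardized as in the NSC method, applies soft thresholding for each candidate threshold, and keeps only the thresholds that are non-dominated with respect to two objectives to be minimized (error and number of retained features). The function names, the toy differences, and the error values are all hypothetical placeholders; in practice the error would come from cross-validation, and NSGA2 would evolve the candidate thresholds rather than take them from a fixed list.

```python
def soft_threshold(d, delta):
    """Shrink a standardized centroid difference d toward zero by delta."""
    magnitude = abs(d) - delta
    if magnitude <= 0:
        return 0.0
    return magnitude if d > 0 else -magnitude


def active_features(diffs, delta):
    """Count features whose shrunken difference is nonzero for at least one class.

    diffs[k][i] is the (assumed pre-standardized) difference for class k, feature i.
    """
    n_features = len(diffs[0])
    return sum(
        1
        for i in range(n_features)
        if any(soft_threshold(row[i], delta) != 0.0 for row in diffs)
    )


def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep (threshold, objectives) pairs that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other[1], c[1]) for other in candidates)
    ]


# Toy data: two classes, four features (illustrative values only).
diffs = [[2.5, -0.4, 1.1, 0.05], [-2.5, 0.4, -1.1, -0.05]]

# Hypothetical cross-validated error per threshold (placeholder numbers, not results).
toy_error = {0.0: 0.10, 0.5: 0.08, 1.0: 0.12, 2.0: 0.20}

candidates = [
    (delta, (err, active_features(diffs, delta)))
    for delta, err in toy_error.items()
]
front = pareto_front(candidates)
print(front)
```

With these toy numbers, two thresholds survive the filter: one with lower error and more features, and one with higher error and fewer features. This is the kind of tradeoff set that a multi-objective search returns in place of a single optimum.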