Zernike Moments and Genetic Algorithm : Tutorial and Application

Aims/objectives: To demonstrate the effectiveness of Zernike Moments for image classification. The Zernike moment (ZM) is an excellent region-based moment that has attracted the attention of many image-processing researchers since its first application to image analysis. Many papers have been published on ZM, but no single paper gives a detailed account of how the computation of ZM is carried out, from the time the image is captured to the final computation. This work shows how to apply ZM effectively to RGB images. We demonstrate the effectiveness of the Zernike moment in an image classification system. A neuro-genetic intelligent system has been built with a PNN classifier. The extracted features, viz. ZM and geometric features, were further subjected to a GA to find the best combination of features for optimal accuracy. The algebraic structure of our novel fitness function enabled the GA to select the best results. The 10-fold CV kept the whole system unbiased, giving a classification accuracy of 90.05%. The affine properties of ZM are comprehensively stated and explained. In summary, ZM enabled the classifier to reach an improved accuracy of 91%, compared with 89% for geometric features.


Introduction
In object/image recognition, a region can be described using a scalar or a set of scalars based on the geometric properties of the object. Such scalars are called descriptors, since they describe the objects being recognised by an artificial vision system. This work presents a novel and detailed framework for efficient computation of the Zernike Moment (ZM), a region-based moment.
Using ZM and geometric features, we extracted 20 features, which were later subjected to a Genetic Algorithm (GA) for dimensionality reduction to obtain the best feature set, which we then used to build our classification system.

Image Moments
A moment describes the layout (arrangement) of image pixels. Moments are global region-based descriptors for shape and are a bit like a combination of area, compactness, irregularity, and higher-order descriptors together [1]. An image moment is defined as the integration of an image function with a region-defined polynomial basis ([2,3]). The region here is defined as the area where the image is valid. From [4], the general moment Mpq of any image f(x, y) of order p + q, where p ≥ 0, q ≥ 0, is defined as

Mpq = ∫∫_D polpq(x, y) f(x, y) dx dy,

where polpq(x, y) are polynomial basis functions defined on the domain D.

Zernike Moment
The Zernike moment (ZM) can be defined over a set of complete, complex, orthogonal basis functions that are square-integrable and defined on the unit disk. ZM were first applied to image analysis in [5]. ZM are orthogonal moments based on the Zernike polynomials (see Table 1). Orthogonality here means that there is no redundancy or overlap of information between the moments; thus the moments are uniquely quantified by their orders ([6,7]). The distinguishing feature of ZM is the invariance of its magnitude with respect to rotation ([8,9,10,11]). Given the ordered pair (m, n), which represents the order of the Zernike polynomial and the multiplicity of its phase angle, the ZM can be defined as

Zmn = ((m + 1)/π) ∫∫_{x^2+y^2≤1} f(x, y) Vmn*(ρ, θ) dx dy, with Vmn(ρ, θ) = Rmn(ρ) e^(inθ),

where ρ and θ are the radial length of the image pixel vector and the angle between that vector and the x-axis, respectively. Rmn is the Zernike radial polynomial, some of which are listed in Table 1. The following conditions must be satisfied: m − |n| is even and |n| ≤ m. For example, the order-9 radial polynomials in Table 1 are:

R9,1(r) = 126r^9 − 280r^7 + 210r^5 − 60r^3 + 5r
R9,3(r) = 84r^9 − 168r^7 + 105r^5 − 20r^3
R9,5(r) = 36r^9 − 56r^7 + 21r^5
R9,7(r) = 9r^9 − 8r^7
R9,9(r) = r^9

Computation of Zernike Moments for Images of Plant leaves
Given the definitions in Equations 3.1-3.4, the Zernike moment Zmn for an image {f(xi, yj) : 1 ≤ i ≤ M, 1 ≤ j ≤ N} can be calculated as in Equations (4.1) or (4.2), where x^2 + y^2 ≤ 1 and m = 0, 1, 2, 3, .... Here m defines the order of the Zernike polynomial, while n, which may be negative or positive, represents the multiplicity of the phase angle in the ZM.
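The discrete computation above can be sketched in pure Python. This is a minimal illustration under our own conventions (function names, the centre-of-grid mapping, and the area normalisation are ours, not the paper's MATLAB implementation): the radial polynomial is evaluated from its factorial formula, and pixels falling outside the unit disk are skipped, as described later in the normalisation section.

```python
import math

def radial_poly(m, n, r):
    """Zernike radial polynomial R_{m,n}(r); m = order, n = repetition,
    with |n| <= m and m - |n| even."""
    n = abs(n)
    total = 0.0
    for s in range((m - n) // 2 + 1):
        coeff = ((-1) ** s) * math.factorial(m - s) / (
            math.factorial(s)
            * math.factorial((m + n) // 2 - s)
            * math.factorial((m - n) // 2 - s))
        total += coeff * r ** (m - 2 * s)
    return total

def zernike_moment(img, m, n):
    """Discrete Zernike moment Z_{mn} of a square image whose pixel grid is
    mapped onto the unit disk; pixels outside the disk are ignored."""
    N = len(img)
    c = (N - 1) / 2.0          # image centre becomes the disk origin
    scale = N / 2.0            # half-width of the grid maps to unit radius
    re = im = 0.0
    for i in range(N):
        for j in range(N):
            x, y = (j - c) / scale, (i - c) / scale
            rho = math.hypot(x, y)
            if rho > 1.0:
                continue
            theta = math.atan2(y, x)
            v = img[i][j] * radial_poly(m, n, rho)
            re += v * math.cos(n * theta)
            im -= v * math.sin(n * theta)
    area = (1.0 / scale) ** 2  # area of one pixel in unit-disk units
    return (m + 1) / math.pi * area * complex(re, im)
```

A quick check of the rotation-invariance property: rotating the image by an exact 90 degrees leaves |Zmn| unchanged, while the phase shifts by nπ/2.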

Image Pre-Processing Steps
Input: The original RGB image of a plant leaf (Figure 1) is given and can be represented as {f(xi, yj) : 1 ≤ i ≤ M, 1 ≤ j ≤ N}. The dimension of the RGB leaf image in Figure 1 is 1200 × 1600 × 3.

STEP1: Conversion to grayscale
The RGB image is converted to grayscale using an appropriate formula. A good example (used here) is the rgb2gray() function from MATLAB, expressible as Equation (4.3):

I = 0.2989 R + 0.5870 G + 0.1140 B, (4.3)

where R, G, and B correspond to the red, green, and blue colour of the pixel, respectively, while their coefficients represent human perception of the colours. According to [12], the retina of the human eye has three types of cone: the L-cone (most sensitive to red light), the M-cone (most sensitive to green light), and the S-cone (most sensitive to blue light). The formula in Equation 4.3 was designed to match human brightness perception, which is dominated by the M-cone response. The simplest colour-to-grayscale algorithm, the mean of the RGB channels, is given in Equation 4.4:

I = (R + G + B)/3. (4.4)
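The two conversions can be written per pixel as follows. This is a sketch with our own function names; the weighted version uses the same luma coefficients as MATLAB's rgb2gray().

```python
def rgb2gray_weighted(r, g, b):
    """Luma-weighted grayscale (Equation 4.3), matching rgb2gray()'s weights."""
    return 0.2989 * r + 0.5870 * g + 0.1140 * b

def rgb2gray_mean(r, g, b):
    """Naive channel mean (Equation 4.4)."""
    return (r + g + b) / 3.0
```

Note that the weights sum to 0.9999, so a pure white pixel (255, 255, 255) maps to 254.97, which rounds back to 255.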
The size of the grayscale image in Figure 1 is 1200 × 1600; thus, the total number of pixels is 1200 × 1600 = 1,920,000. This implies the pixel values (intensity information) are spread across a rectangular or square region. This is a monochrome image. The numerical output for this image is the set {Zi}.
The dimension of SET B equals the dimension of SET C, but the contents of SET C are bits, which are easier for feature extractors, and even computers, to process.

Computational Algorithms and Image Normalization

PROBLEM STATEMENT:
The problem here is to map the pixel information of the binary image from a rectangular or square grid onto the unit disk, as shown in Figure 2.

Options for Computation of ZM
The ZM can be computed using (a) the Cartesian coordinate system or (b) the polar coordinate system, where the double summation is performed over the (i, j) pixel grid.

• Polar Coordinate System
The ROI (Figure 1(c)) is mapped to the unit disk through polar coordinates, with the center of the ROI placed at the origin of the unit disk. The conversion from rectangular to polar coordinates is done through Eq. (3.2). Each coordinate is then described by the length ρ of the vector from the origin of the disk to the coordinate point (the polar radius) and the angle θ between that vector and the x-axis (the polar angle). Pixels falling outside the unit disk are not used in the calculation. Translation invariance is achieved by moving the centroid of the ROI to the origin of the disk, which makes m01 = m10 = 0. The centroid of the ROI is given by the coordinates (x̄, ȳ), where

x̄ = m10/m00, ȳ = m01/m00. (5.1)

Scale invariance for ZM is achieved by normalizing the image so that the total area of the foreground pixels takes a predetermined value, say β.
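The centroid formula (5.1) and the claim that moving the centroid to the origin zeroes the first-order moments can be checked with a few lines of Python (a sketch; function names are ours):

```python
def centroid(img):
    """Centroid (x_bar, y_bar) = (m10/m00, m01/m00), as in Equation (5.1)."""
    m00 = m10 = m01 = 0
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            m00 += v
            m10 += x * v
            m01 += y * v
    return m10 / m00, m01 / m00

def central_moment(img, p, q):
    """Moment about the centroid; mu10 and mu01 vanish once the centroid is the origin."""
    x_bar, y_bar = centroid(img)
    return sum((x - x_bar) ** p * (y - y_bar) ** q * v
               for y, row in enumerate(img) for x, v in enumerate(row))
```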

Geometric Features
We extracted the following geometric features from the Flavia dataset:
• Diameter: The longest distance between any two points on the margin of a leaf.
• Physiological Length: The distance between the two terminals (apex and stalk point) [14].
• Physiological Width: The perpendicular distance across the physiological length of a leaf [13].
• Leaf Area: The total number of pixels that constitute the leaf region, a = ∫x ∫y I(x, y) dy dx.
• Aspect Ratio: Also called eccentricity; the ratio w/l between the length of the leaf minor axis and the length of the leaf major axis [15].
• Circularity: A measure of the similarity between a 2D shape and a circle; the ratio a/p^2 between the area a of the leaf and the square of its perimeter p [14].
• Irregularity: The ratio between the radius of the maximum circle encompassing the region and that of the minimum circle that can be contained in the region ([16,17]).
• Solidity: The ratio between the area of the leaf and the area of its convex hull [14], given as a/AreaOfConvexHull.
• Form Factor: Describes the difference between a leaf and a circle; given as 4πa/p^2, where a is the leaf area and p is the perimeter of the leaf.
• Rectangularity: Describes the similarity between a leaf and a rectangle; given as lw/a, where l is the physiological length, w is the physiological width, and a is the leaf's area.
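A few of these descriptors can be computed directly from a binary mask. The sketch below is a toy version under our own simplifications: the bounding-box height and width stand in for the physiological length and width, and the diameter is the brute-force maximum pairwise distance (O(n^2), fine for small masks).

```python
import math

def leaf_features(mask):
    """Toy geometric descriptors of a binary mask (list of 0/1 rows):
    area, bounding-box length/width, diameter, and rectangularity l*w/a."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    area = len(pts)                         # pixel count = leaf area
    length = max(ys) - min(ys) + 1          # stand-in for physiological length
    width = max(xs) - min(xs) + 1           # stand-in for physiological width
    diameter = max(math.dist(p, q) for p in pts for q in pts)
    return {"area": area, "length": length, "width": width,
            "diameter": diameter, "rectangularity": length * width / area}
```

For a solid rectangle the rectangularity is exactly 1, which is a handy sanity check.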

Feature Space and Feature Selection
We extracted 20 features from the Flavia dataset. The Flavia dataset comprises 1907 coloured images of 32 species of plants [13]. These features are (a) 10 Zernike moments and (b) 10 geometric features. Thus, the dimensionality of the dataset is 1907 × 20. A high-dimensional feature set can pose a great threat to pattern or image recognition systems. In other words, too many features sometimes reduce the classification accuracy of the recognition system, since some of the features may be redundant and non-informative [18]. Different combinations of features should be evaluated in order to keep the combination that achieves optimal accuracy. As such, a GA-based feature selection (a subspace or manifold projection technique) is used to reduce the number of features needed by the PNN classifier in this work. Feature Subset Selection (FSS) is an operator Fs, a map from an m-dimensional feature space (input space) to an n-dimensional feature space (output space):

Fs : R^(r×m) → R^(r×n), (7.1)

where m ≥ n and m, n ∈ Z+, R^(r×m) is any database or matrix containing the original feature set with r instances (observations), and R^(r×n) is the reduced feature set containing the same r observations.
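Concretely, the map Fs in Equation (7.1) just keeps the columns flagged by a binary mask, which is exactly how the GA chromosome is interpreted in the next section. A minimal sketch (the function name is ours):

```python
def select_features(X, mask):
    """Feature-subset map Fs of Eq. (7.1): keep column j of the r-by-m
    matrix X exactly when mask[j] == 1, yielding an r-by-n matrix."""
    keep = [j for j, bit in enumerate(mask) if bit]
    return [[row[j] for j in keep] for row in X]
```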

Feature Selection Using Genetic Algorithm
Genetic Algorithms (GA) are population-based, algorithmic search heuristics that mimic the process of natural evolution ([19,20]). A GA iteratively uses one population of chromosomes (candidate solutions) to produce a new population, using a method of natural selection combined with genetic operators such as crossover and mutation (in the spirit of Charles Darwin's evolutionary principles of reproduction, genetic recombination, and survival of the fittest).
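One generation of such a GA can be sketched in a few lines. This is a generic toy (our own names and parameter values, not the configuration in Table 3): rank the population, carry an elite forward, and fill the remainder by one-point crossover of parents drawn from the fitter half, followed by bit-flip mutation.

```python
import random

def ga_step(pop, fitness, elite=2, p_mut=0.05):
    """One GA generation over binary chromosomes (lists of 0/1).
    Lower fitness is better; the elite survive unchanged."""
    ranked = sorted(pop, key=fitness)
    nxt = ranked[:elite]                                  # elitism
    while len(nxt) < len(pop):
        a, b = random.sample(ranked[: len(pop) // 2], 2)  # parents from fitter half
        cut = random.randrange(1, len(a))                 # one-point crossover
        child = a[:cut] + b[cut:]
        child = [g ^ (random.random() < p_mut) for g in child]  # bit-flip mutation
        nxt.append(child)
    return nxt
```

Because of elitism, the best fitness in the population can never get worse from one generation to the next.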
In terminology comparative to human genetics, chromosomes are the bit strings, a gene is a feature, an allele is a feature value, a locus is a bit position, the genotype is the encoded string, and the phenotype is the decoded genotype [21]. The fitnesses of the chromosomes are evaluated using a function commonly referred to as the objective function or fitness function. In other words, the fitness function reports numerical values that are used to rank the chromosomes in the population. Thus, the five important issues in a GA are chromosome encoding, population initialization, fitness evaluation, selection (followed by the genetic operators), and the criteria to stop the GA (see Figure 9). The GA operates on a binary search space, as the chromosomes are bit strings. The GA manipulates the finite binary population in the manner of natural evolution. First, an initial population is created randomly and evaluated using the fitness function. For the binary chromosomes used in this work, a gene value of '1' indicates that the feature indexed by the position of the '1' is selected; if it is '0', the feature is not selected when evaluating that chromosome. The chromosomes are then ranked and, based on the rankings, the top n fittest individuals (an elite of size n) survive to the next generation. The fitness evaluation is done through Algorithm 7.1; the fitness function used there is shown in Equation 7.2. After the elite individuals are moved to the next generation, the remaining individuals in the current population produce the rest of the next generation through crossover and mutation. Crossover is, basically, the combination of two individuals to form a crossover child. The mutation operator, on the other hand, is a genetic perturbation of the genes in each chromosome through bit flips governed by the mutation probability. The configuration of our GA is shown in Table 3. The surviving chromosome of the GA is the string BestChromosome = {0 0 1 0 0 1 0 1 0 0 1 1 0 1 0 0 1 0 0 0}. The positional indices of the '1's in this string are {3, 6, 8, 11, 12, 14, 17}, and the corresponding features are those at these positions: the Zernike moments with orders and repetitions {(2,0), (4,2), (5,3)}, eccentricity, form factor, Euler number, and leaf minor axis. These features were used to train the PNN classifier shown in Figure 11.

Our classification system, shown in Figure 11, is based on the Probabilistic Neural Network (PNN), a feed-forward neural network that uses kernel methods for density estimation in a multi-category problem, introduced by D. F. Specht [22]. A PNN can be seen as a mathematical interpolation scheme [23] or a parallel implementation of a Parzen-type classifier. The algorithmic description of our classifier is shown in Algorithm 8.1. The whole system was built in MATLAB version 2013. The PNN spread was chosen to be in the neighbourhood of 1/n, where n is the number of classes (here 32); this choice was itself obtained by a GA-based optimization that is outside the scope of the current paper, so overfitting of the PNN classifier was properly checked. In the pattern unit, the unconditional probability p(Xtest) and the conditional probability p(Xtest|ci) are computed, and the classification of each pattern vector is made according to Bayes' rule.
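The pattern-unit computation can be sketched as a minimal Parzen-window classifier. This is not Algorithm 8.1 verbatim: the Gaussian kernel, function names, and the within-class averaging are our own choices, with `spread` defaulting to the paper's 1/n for n = 32 classes.

```python
import math

def pnn_classify(train, labels, x, spread=1.0 / 32):
    """Minimal Parzen-window PNN: accumulate a Gaussian kernel response per
    class, average within each class to estimate p(x|ci), pick the argmax."""
    scores = {}
    counts = {}
    for xi, ci in zip(train, labels):
        d2 = sum((a - b) ** 2 for a, b in zip(xi, x))
        scores[ci] = scores.get(ci, 0.0) + math.exp(-d2 / (2 * spread ** 2))
        counts[ci] = counts.get(ci, 0) + 1
    return max(scores, key=lambda c: scores[c] / counts[c])
```

With equal class priors, the class with the largest averaged kernel response is also the Bayes-rule winner.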

Experimental Validation
We validated our experiment using 10-fold cross-validation (CV). The steps taken in the 10-fold CV are shown in Figure 10. The feature space (dataset) X is partitioned into k subsets of roughly the same size. This partitioning may be written as X = ∪_{i=1}^{k} Xi, where each subset is called a fold; thus there are k folds derived from partitioning the original set X. The PNN is trained on k − 1 folds while the k-th fold is used for testing. The procedure is repeated so that each fold is used exactly once for testing (see Figure 10). The generally recommended value for k is 5 or 10; the choice for this study is k = 10. The appealing merit of k-fold CV is that all the observations in the original dataset (feature space) are eventually used for both training and testing, which makes CV a much more accurate assessment of the classifier. The accuracy of the PNN classifier was computed as in Equation (9.1), where trace(.) is the sum of the elements on the main diagonal and sum(.) is the sum of all the entries of ConfuseMatrix. The confusion matrix is a tabular tool or matrix display of the instances that were correctly and incorrectly predicted by the (PNN) classifier. It can be represented as ConfuseMatrix ∈ R^(c×c), a square matrix whose diagonal elements depict the correct classifications, where c is the number of classes in the dataset.
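The fold construction itself is simple to sketch (our own function name; a round-robin assignment rather than any particular library's splitter): each index lands in exactly one test fold, and the folds differ in size by at most one element.

```python
def kfold_indices(n, k=10):
    """Partition indices 0..n-1 into k roughly equal folds and return the
    k (train, test) splits; each index appears in exactly one test fold."""
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for t in range(k):
        test = folds[t]
        train = [i for f in range(k) if f != t for i in folds[f]]
        splits.append((train, test))
    return splits
```

For the 1907-image Flavia dataset with k = 10, seven folds hold 191 images and three hold 190.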

Results
The results of this experiment show that Zernike moments are more effective than geometric features. Figures 5, 1, 6, 7 and Table 2 show that the ZM remained constant despite rotation, scaling, and translation. Using ZM alone gave an accuracy of 91%, while the geometric features gave 89%. The GA-selected features improved the accuracy of the classifier from 91% to 92.1%, since the GA combined both feature families efficiently. The 10-fold CV was also useful in eradicating bias in our classification system; the 10-fold CV result for our system was 90.05%.

Conclusion
We have demonstrated the effectiveness of the Zernike moment in an image classification system. A neuro-genetic intelligent system has been built with a PNN classifier. The extracted features, viz. ZM and geometric features, were further subjected to a GA to obtain the best features for optimal accuracy. The 10-fold CV kept the whole system unbiased. The affine properties of ZM have been comprehensively stated and explained.

Figure 1: RGB, Grayscale, and Binary version of a plant's leaf

Figure 2: Conversion from rectangular to polar coordinates

Figure 3 shows the results of the ZM computation over the original ROI, the ROI rotated by 30°, the ROI scaled by 0.75, and the ROI translated by (1, 2). Similarly, Figures 6(a) and (b) show the ZM amplitude plotted against the angle of rotation and the scaling value, respectively. These two plots being parallel to the x-axis also means that the ZM of a given order and repetition remains constant across the different angles, scalings, and translations of the original ROI. Reconstructed versions of the original binary image of a leaf are shown (see Figure 7) for ZM orders 5 to 60, demonstrating the correctness of the ZM computation in this work; the figure shows that order 60 is sufficient to reconstruct the original image.

Figure 3: Leaves of three species of plant taken from the Flavia dataset

Figure 7: Original and Reconstructed Image

Figure 8: Convergence of the GA

In Equation (7.2), α is the kNN-based classification error and Nf is the cardinality of the selected feature set. The algebraic structure of Equation 7.2 ensures the learning of the GA, error minimization, and a reduced number of selected features.

PNNaccuracy = trace(ConfuseMatrix) / sum(ConfuseMatrix) (9.1)

Figures showing the TRS-invariant properties of the ZM are presented. The ZM proved more efficient than the geometric features. The number of training samples available for each of the 32 plant species is shown in Table 4, together with the number of incorrect classifications.
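Equation (9.1) is a one-liner over the confusion matrix (a sketch with our own function name):

```python
def pnn_accuracy(confuse):
    """Equation (9.1): trace(ConfuseMatrix) / sum(ConfuseMatrix).
    `confuse` is a c-by-c matrix as a list of rows."""
    tr = sum(confuse[i][i] for i in range(len(confuse)))
    total = sum(sum(row) for row in confuse)
    return tr / total
```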

Figure 10: Visual representation of the 10-fold cross-validation experiments. The 10-fold CV runs for 10 iterations, computing the classification accuracy for each fold, storing the accuracies, and finally computing the average of these accuracies.

Table 2: Invariant ZM under Translation, Rotation, and Scaling