A neuronal classification system for plant leaves using genetic image segmentation

This paper demonstrates the use of radial basis networks (RBF), cellular neural networks (CNN) and a genetic algorithm (GA) for automatic classification of plant leaves. The genetic neuronal system described here attempts to address some of the inherent challenges facing current software used for plant leaf classification. The image segmentation module was genetically optimized to bring out salient features in the images of the plant leaves. The combination of GA-based CNN with RBF proved more efficient than existing systems that use conventional edge operators such as the Canny, LoG, Prewitt, and Sobel operators. The results show that the GA-based CNN edge detector outperforms the other edge detectors in terms of speed and classification accuracy.


Introduction
Plants are traditionally recognised by manual matching of plant features such as leaves, flowers, and bark [1]. In view of the large number of plant species available [2,3], the manual approach to plant classification is slow and prone to human error. Several techniques are currently employed to build computer-based vision systems that use features extracted from plant images as input parameters to various classifier systems [4], [5], [6]. In this paper, a technique to augment existing plant leaf identification systems is described. The main contribution of this paper is to increase the classification speed and accuracy of the existing systems by incorporating Genetic Cellular Neural Networks for image segmentation.

Related Works
Previously reported authors and their associated techniques are summarized in Table 1. For example, [19] used Inner Distance Shape Context (IDSC), K-NN, and color image segmentation.

Shapes of plants leaves
In the existing works summarized in Table 1, there is a need to improve the accuracy of the classification model. One way to do this is to employ efficient image segmentation techniques. Once the images are efficiently segmented, the features subsequently generated may help improve the classification accuracy of the system. In this paper, a new classification model involving RBF, a genetic algorithm (GA) and cellular neural networks (CNN) was employed to develop a computer-based vision system for automatic identification of plant species.

Radial Basis Networks (RBF)
A radial basis function network is a variant of artificial neural network (ANN) that employs radial basis functions as activation functions [20], [21], [22]. RBF networks are used extensively in research because they are universal approximators with a compact topology. The output of the network is a linear combination of radial basis functions of the inputs and neuron parameters. The input to an RBF network could be a feature set or a vector of real numbers x ∈ R^n. All the elements of the input vector(s) are mapped to a scalar output, ϕ : R^n → R, using the functional in Equation 3.1:
$$\phi(x) = \sum_{i=1}^{K} w_i \, \rho(\lVert x - c_i \rVert) \qquad (3.1)$$
where K is the number of neurons in the hidden layer, c_i is the center vector for neuron i, w_i is the weight of neuron i in the linear output neuron and ρ is a radius-based metric. A radial basis function is a real-valued function whose value depends only on the distance from the origin, so that ϕ(x) = ϕ(||x||). Functions that depend only on the distance from a center vector are radially symmetric about that vector, hence the name radial basis function (see appendices A1, A2 and A3). In the basic form, all inputs are connected to each hidden neuron. An RBF network can be trained without back propagation since it has a closed-form solution. The neurons in the hidden layer contain basis functions. A common basis function for RBF networks is a kind of Gaussian function without a scaling factor. More details on RBF can be found in [23], [24], [25], [20], [26], [27].

Cellular Neural Networks
Cellular Neural Networks (CNN) are variants of Artificial Neural Networks (ANN). They are also dynamical systems in which all the elements are locally connected through neighborhood communication [28]. The dynamical model representing a CNN (see Equation 4.1) was derived from an electrical circuit (see Figure 1) in 1988 by Chua and his graduate student [29], [30]. A standard CNN topology is an M × N rectangular array of cells C(i, j) (or dynamic components) with Cartesian coordinates (i, j), i = 1, 2, ..., M, j = 1, 2, ..., N, and M, N ∈ Z^+. A CNN is a hybrid model, since it shares features with both Cellular Automata and Artificial Neural Networks [31], [32]. The circuit structure and element values of all cells of a CNN are homogeneous.
The variables appearing in Equations 4.1 and 4.2 are defined in the reference papers [28] and [32].
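As an illustration of how such a locally connected dynamical system can be simulated, the following Python sketch integrates a CNN on an image. It is a minimal sketch under stated assumptions, not the implementation used in this work: it assumes the normalised Chua-Yang state equation dx/dt = -x + A*y + B*u + z with the piecewise-linear output y = 0.5(|x + 1| - |x - 1|), it uses forward Euler integration for brevity, and the templates `A_EDGE`, `B_EDGE` and `Z_EDGE` are a commonly quoted edge-detection template rather than the GA-optimized templates of this paper. All function and variable names are hypothetical.

```python
import numpy as np

# Assumed illustrative edge-detection templates (NOT the paper's GA-optimized ones).
A_EDGE = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
B_EDGE = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=float)
Z_EDGE = -1.0

def apply_template(img, template):
    """Weighted sum over each cell's 3x3 neighbourhood, with zero padding."""
    padded = np.pad(img, 1, mode="constant")
    rows, cols = img.shape
    out = np.zeros((rows, cols))
    for di in range(3):
        for dj in range(3):
            out += template[di, dj] * padded[di:di + rows, dj:dj + cols]
    return out

def cnn_output(x):
    """Piecewise-linear output y = 0.5 * (|x + 1| - |x - 1|)."""
    return 0.5 * (np.abs(x + 1.0) - np.abs(x - 1.0))

def run_cnn(u, A, B, z, dt=0.1, steps=200):
    """Integrate dx/dt = -x + A*y + B*u + z with forward Euler (assumed solver)."""
    x = np.zeros_like(u, dtype=float)      # assumed zero initial state
    bu = apply_template(u, B) + z          # the input term does not change in time
    for _ in range(steps):
        x = x + dt * (-x + apply_template(cnn_output(x), A) + bu)
    return cnn_output(x)
```

In this convention an input pixel of +1 is black and -1 is white; calling `run_cnn(u, A_EDGE, B_EDGE, Z_EDGE)` on a black object over a white background leaves only the object's boundary pixels positive.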

Experimental Methodology
The steps in the experimental research are provided in figure 7.

The Flavia Dataset
The leaf images used in this study come from the publicly available Flavia dataset [9]. The Flavia dataset is a constrained set of leaf images taken against a white background and without any stem present. The number of instances per species varies, as shown in [33]. The dataset has 1907 images of 32 species of plants. For this study, the dataset was divided into two disjoint sets: a training set of 1587 images and a test set of 320 images. The shapes of one sample for each species of plant in the Flavia dataset are shown in Figure 9.

Image Pre-processing
The first step after image acquisition is to pre-process the images found in [9]. The original images, which are colored (i.e., RGB) images, are first converted to grayscale images using the formula in Equation 5.1. The R, G, and B in the equation represent the red, green, and blue components of the colored image, respectively [33], [31].
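The conversion step can be sketched in Python as a weighted sum of the three colour channels. The weights below are an assumption: they are the widely used luminance coefficients (0.2989, 0.5870, 0.1140), which may differ from those in Equation 5.1, and the function name is hypothetical.

```python
import numpy as np

# Assumed luminance weights for R, G and B; Equation 5.1 may use other values.
LUMA_WEIGHTS = np.array([0.2989, 0.5870, 0.1140])

def rgb_to_gray(rgb):
    """Convert an H x W x 3 RGB array to an H x W grayscale array."""
    rgb = np.asarray(rgb, dtype=float)
    return rgb[..., :3] @ LUMA_WEIGHTS
```

For instance, a pure red pixel (255, 0, 0) maps to roughly 76.2 under these weights, while equal R, G and B values are left almost unchanged.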

Image Segmentation Using Genetic CNN
The goal of image segmentation is to find regions of interest (ROI) in the images. The division of images into ROIs is often necessary before any processing can be done at a higher level than that of the pixel [34]. Thus, image segmentation is the division of an image into disjoint regions such that the intersection of differently indexed regions is null and the union of all regions is the image itself. The three main classes of image segmentation are (i) region growing, (ii) clustering methods, and (iii) boundary detection. The CNN used here falls into the third category. After image pre-processing, a GA was employed to optimize the CNN edge detection templates in Equation 5.4. Detailed information on GA can be found in [35], [36], [37], [38], [39]. These templates are the coefficient matrices for the system of equations in Section 4 (see Figure 3), and they were obtained using the GA parameters in Table 2. The fitness function used by the GA-CNN process is given in Equation 5.2, where I_ij and O_ij are the evaluated and input image pixels, respectively. The best chromosome was obtained at generation 101, as shown in Figure 6.
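The general shape of such a template search can be sketched as a small real-coded GA in Python. This is only an illustrative sketch: the operators chosen here (tournament selection, uniform crossover, Gaussian mutation, elitism), the population size and the mutation scale are assumptions and are not the settings of Table 2, and the `fitness` argument is a placeholder for Equation 5.2. For a 3 × 3 CNN, a chromosome could hold the entries of both templates plus the bias (19 genes), which is also an assumption.

```python
import numpy as np

def genetic_search(fitness, n_genes, pop_size=30, generations=100,
                   mutation_scale=0.1, seed=0):
    """Minimise `fitness` over real-valued chromosomes with a simple GA."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, n_genes))
    for _ in range(generations):
        scores = np.array([fitness(c) for c in pop])
        elite = pop[np.argmin(scores)].copy()      # elitism: keep the best as is
        # Tournament selection: the better of two random individuals becomes a parent.
        a, b = rng.integers(0, pop_size, size=(2, pop_size))
        parents = np.where((scores[a] < scores[b])[:, None], pop[a], pop[b])
        # Uniform crossover between each parent and a randomly chosen partner.
        partners = parents[rng.permutation(pop_size)]
        mask = rng.random((pop_size, n_genes)) < 0.5
        children = np.where(mask, parents, partners)
        # Gaussian mutation of every gene.
        children = children + rng.normal(0.0, mutation_scale, size=children.shape)
        children[0] = elite
        pop = children
    scores = np.array([fitness(c) for c in pop])
    best = int(np.argmin(scores))
    return pop[best], float(scores[best])
```

In a template search, `fitness` would decode the chromosome into templates, run the CNN on a training image and return the pixel-wise error against a reference edge map.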
The main idea behind the CNN is to input an image and discretize the ODE using an appropriate numerical method. Conventional edge operators, such as the Prewitt and Laplacian operators (see Equations 5.6 to 5.8), were used for comparison. The conventional edge operators required more time than the GA-based CNN templates. Note that the GA was not involved in the image segmentation process itself; rather, it was used separately to obtain the optimal edge templates shown in Equation 5.5. Figure 2 shows the MATLAB-based GUI developed to extract optimal edges from the images of the plants in the Flavia dataset.
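For comparison, a conventional edge operator can be sketched in Python as a pair of 3 × 3 filters followed by a threshold. The kernels below are the textbook Prewitt and 4-neighbour Laplacian kernels, assumed here because the matrices of Equations 5.6 to 5.8 are not reproduced in this text; the function names and the threshold argument are hypothetical.

```python
import numpy as np

# Textbook kernels (assumed; the paper's Equations 5.6-5.8 may differ).
PREWITT_X = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=float)
PREWITT_Y = PREWITT_X.T
LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)

def filter3x3(img, kernel):
    """Correlate a 2-D image with a 3x3 kernel, replicating border pixels."""
    img = np.asarray(img, dtype=float)
    padded = np.pad(img, 1, mode="edge")
    rows, cols = img.shape
    out = np.zeros((rows, cols))
    for di in range(3):
        for dj in range(3):
            out += kernel[di, dj] * padded[di:di + rows, dj:dj + cols]
    return out

def prewitt_edges(img, threshold):
    """Boolean edge map from the Prewitt gradient magnitude."""
    gx = filter3x3(img, PREWITT_X)
    gy = filter3x3(img, PREWITT_Y)
    return np.hypot(gx, gy) > threshold
```

On an image whose left half is 0 and right half is 1, `prewitt_edges(img, 1.0)` marks only the two columns on either side of the step.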

Features Extraction
The outputs of the genetic segmentation process are used as parameters for Fourier Descriptors (FD), while binary versions of the images are used by Zernike Moments (ZM). Details on ZM and FD can be found in [33] and [43] respectively. Twenty features were extracted from the Flavia dataset discussed in Section 5.1. These features are represented in Table 3.
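One common way to compute Fourier descriptors from an edge contour is sketched below in Python. It is a minimal sketch, not necessarily the formulation of [43]: it assumes the boundary is given as an ordered list of (x, y) points traversed counter-clockwise, and the number of coefficients kept (`n_coeffs`) is a hypothetical choice.

```python
import numpy as np

def fourier_descriptors(boundary, n_coeffs=10):
    """Descriptors invariant to translation, rotation, scale and start point.

    `boundary` is an ordered sequence of (x, y) contour points, assumed to be
    traversed counter-clockwise so that the first harmonic is non-zero.
    """
    pts = np.asarray(boundary, dtype=float)
    z = pts[:, 0] + 1j * pts[:, 1]          # contour as a complex signal
    coeffs = np.fft.fft(z)
    # Skipping coeffs[0] removes translation; taking magnitudes removes
    # the effect of rotation and of the starting point.
    mags = np.abs(coeffs[1:n_coeffs + 1])
    return mags / mags[0]                   # dividing by the first harmonic removes scale
```

Two contours of the same leaf shape at different positions, orientations and sizes should then yield nearly identical descriptor vectors.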

Learning System Based on Radial Basis Networks
The steps involved in the design of the classification system shown in Figure 8 are described in Figure 7. These steps are similar to those of most image classification systems. The distinguishing feature of the model lies in the application of genetic CNN for image segmentation. Each node in the input layer of the RBF network corresponds to one feature of the feature vector from Table 3. The second layer is the only hidden layer in the RBF network. It applies a non-linear mapping from the input vector space into the hidden layer space through an appropriate non-linear function, such as the Gaussian kernel (see Equation 5.9) used in this work. The x in Equation 5.9 is the training sample, c_i is the center of the ith hidden neuron and σ is the width of the basis function, which is a multiple of the average distance between the centers in the RBF network. The σ determines the receptive width of the RBF. The output layer is made up of neurons that are directly connected to the hidden layer neurons [44]. The output value for the training set or any input to the RBF is expressed as Equation 5.10. The w in Equation 5.10 is the weight factor, normally computed as w = (h^T h)^{-1} h^T C, where h is the matrix of hidden layer outputs, C is the target class matrix and h^T denotes the transpose of h. The number of neurons in the output layer is the same as the number of classes in the dataset, while the number of neurons in the input layer is equal to the number of features in the training set. For this work, the number of features is 20. GA was used to determine the σ value, while K-means clustering was used to form the centers in the hidden layer. The entire implementation (including the GUI) was done using MATLAB 2013a.
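The training procedure described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the MATLAB implementation used in this work: the Gaussian kernel is assumed to have the form exp(-||x - c_i||^2 / (2σ^2)), the K-means routine is a bare-bones version, σ is passed in as a fixed value instead of being tuned by a GA, and the weights are obtained with a least-squares solver, which gives the same result as (h^T h)^{-1} h^T C when h has full column rank. All names are hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Bare-bones K-means returning k cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        for j in range(k):
            if np.any(nearest == j):            # leave empty clusters where they are
                centers[j] = X[nearest == j].mean(axis=0)
    return centers

def hidden_layer(X, centers, sigma):
    """Gaussian activations, assumed form exp(-||x - c||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def train_rbf(X, labels, n_classes, k, sigma):
    """Form centers with K-means, then solve for the output weights in closed form."""
    centers = kmeans(X, k)
    H = hidden_layer(X, centers, sigma)
    C = np.eye(n_classes)[labels]               # one-hot target class matrix
    W, *_ = np.linalg.lstsq(H, C, rcond=None)   # least-squares solution of H W = C
    return centers, W

def predict_rbf(X, centers, W, sigma):
    """Assign each sample to the class with the largest output neuron."""
    return (hidden_layer(X, centers, sigma) @ W).argmax(axis=1)
```

With 20 features and 32 species, `X` would have 20 columns and `n_classes` would be 32; the number of hidden neurons `k` is a free choice in this sketch.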

Experimental Validation
The approach used in validating the RBF is k-Fold Cross Validation (k-Fold CV) with k = 10. In general, cross validation (CV) is a method of partitioning the feature space into training and testing sets, as shown in [33]. Here, the naive RBF was fitted using the training set, and the fitted model was validated on the testing set by measuring the prediction error. The training and testing sets were disjoint, to ensure that the data used for evaluating the RBF were not used in fitting the model. Without loss of generality, and supposing the information in Table 3 holds, the dataset (feature space) X is partitioned into two sets, X = X1 ∪ X2, such that k elements are in X1 and D − k elements are in X2 (D being the size of X). The RBF was then trained or fitted using the set X2. The historical pattern of X2 was used to produce classification (prediction) results for observations X1* ∈ X1 given X2. The procedure for the 10-fold CV is given in Algorithm 6.1.
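The partitioning logic of k-fold cross validation can be sketched in Python as below. This is a generic sketch and not a transcription of Algorithm 6.1; the function names and the `fit` and `score` callbacks are hypothetical.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]      # k nearly equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validate(n_samples, fit, score, k=10):
    """Average the score of models fitted on k-1 folds and tested on the held-out fold."""
    scores = []
    for train, test in k_fold_indices(n_samples, k):
        model = fit(train)                 # fit only on the training indices
        scores.append(score(model, test))  # evaluate only on the held-out indices
    return sum(scores) / len(scores)
```

Each sample appears in exactly one test fold, so every observation is used for testing once and for training k − 1 times.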

Results and Discussion
The results of this work are shown in Table 4. Several image segmentation techniques, including Canny, Prewitt, and LoG, were compared with the genetic CNN used in this work. The matrices describing the other edge operators are shown in Equations 5.6, 5.7 and 5.8. In the diagram shown in Figure 4, the edge outputs were the only regions of interest (ROI) sent to the FD feature extraction module of the system. The ZM module accepted only monochrome versions of the sample images. The modus operandi of the GA used to optimize both the RBF and the CNN is shown in Figure 10. The average width of the RBF neurons is 0.45, with the best accuracy occurring at σ = 0.9. The GA enabled the CNN to bring out salient features from the images used. The CNN was solved using the finite element method (FEM). FEM has been known to reduce discretization errors in dynamic systems more often than the Runge-Kutta method. The whole system was evaluated without bias using CV with k = 10 (see Algorithm 6.1). Accuracy and computational time (in seconds) were used as performance metrics for the developed system. Table 4 shows that the system performed better than the conventional image segmentation techniques (edge detectors). The CNN may also serve other purposes, such as the solution of partial differential equations (PDE) and ordinary differential equations (ODE), and maximum likelihood estimation in signal processing. When integrated into the classification system, the CNN was found to improve both the accuracy and the speed of the developed system, as shown in Table 4.

Conclusion and Future Works
This paper has demonstrated a novel neuro-genetic intelligent system. The system was found to perform better than conventional image segmentation techniques (edge detectors), as evidenced by the reduction in computational time and the increase in classification accuracy (see Table 4). This work is adaptive and may be extended further by using optimization techniques other than GA to determine its weights and centres, followed by analysis and comparison of the results.

Peer-review history:
The peer review history for this paper can be accessed here (Please copy paste the total link in your browser address bar) www.sciencedomain.org/review-history.php?iid=1144&id=6&aid=9437