Abstract
Missing values are common in real-life databases in domains such as industry, medicine, and the life sciences. They are usually imputed with the mean or mode of the known values (for a quantitative or qualitative attribute, respectively) or with values taken from the nearest neighbors. Mean-based imputation considerably underestimates the population variance and tends to weaken the relationships between attributes. Similarly, the nearest neighbor approach uses only the information in the nearest neighbors and leaves all other observations aside. To overcome these shortcomings, we introduce medoid-based imputation of missing values. Further, to achieve better performance, we devise a novel classifier for imputed datasets: a radial basis function neural network (RBFN) optimized by differential evolution (DE) with self-adaptive control parameters and an equilibrium between exploration and exploitation. We maintain this equilibrium by associating a new weight parameter with the target vector during mutation. The self-adaptive equilibrium DE (SAEDE) searches for suitable kernel parameters of the RBFN, along with the bias, and the resulting network is then used to classify unknown samples. The proposed classifier, named SAEDE-RBFN, has been extensively evaluated on seven datasets retrieved from the University of California, Irvine (UCI) and KEEL machine learning repositories after imputation by the mean, nearest neighbors, and the proposed method. The average performance of the classifiers is reported for imputation by K-nearest neighbor (Knn = 1, Knn = 3, Knn = 5, and Knn = 7), mean, and medoid. The experimental study shows that SAEDE-RBFN performs relatively better than DE-RBFN on the medoid-imputed datasets.
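The idea of medoid-based imputation can be illustrated with a minimal sketch. The abstract does not say how the medoid is computed, so everything here is an assumption made for illustration: the function name `medoid_impute` is hypothetical, attributes are numeric, missing entries are marked with `None`, distances are Euclidean, and a single medoid is taken over the rows that have no missing values. The paper's actual procedure may differ (for example, it may use one medoid per class).

```python
import math


def medoid_impute(rows):
    """Fill None entries with the matching attribute of the medoid.

    Assumption (not from the paper): the medoid is the complete row whose
    summed Euclidean distance to all other complete rows is smallest.
    """
    complete = [r for r in rows if all(v is not None for v in r)]
    if not complete:
        raise ValueError("need at least one complete row to compute a medoid")

    def total_distance(candidate):
        # Sum of distances from this candidate to every complete row.
        return sum(math.dist(candidate, other) for other in complete)

    # Unlike the mean, the medoid is always an actual observation.
    medoid = min(complete, key=total_distance)

    filled = [
        [medoid[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]
    return filled, medoid


# Toy data: the third row is an outlier that would drag a mean upward.
data = [
    [1.0, 2.0],
    [2.0, 3.0],
    [10.0, 12.0],
    [None, 2.5],
    [1.5, None],
]
filled, medoid = medoid_impute(data)
print(medoid)  # the complete row chosen as medoid
print(filled)  # the data with missing entries replaced
```

In this toy example the column means of the complete rows are pulled toward the outlier, whereas the medoid stays at an observed, central row. That is one plausible reading of why a medoid might be preferred over the mean.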
Original language | English |
---|---|
Pages (from-to) | 76-83 |
Number of pages | 8 |
Journal | Pattern Recognition Letters |
Volume | 80 |
DOIs | 10.1016/j.patrec.2016.05.002 |
Publication status | Published - 2016 Sep 1 |
All Science Journal Classification (ASJC) codes
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence
Cite this
Design of self-adaptive and equilibrium differential evolution optimized radial basis function neural network classifier for imputed database. / Dash, Ch Sanjeev Kumar; Saran, Amitav; Sahoo, Pulak; Dehuri, Satchidananda; Cho, Sung-Bae.
In: Pattern Recognition Letters, Vol. 80, 01.09.2016, p. 76-83.
Research output: Contribution to journal › Article
TY - JOUR
T1 - Design of self-adaptive and equilibrium differential evolution optimized radial basis function neural network classifier for imputed database
AU - Dash, Ch Sanjeev Kumar
AU - Saran, Amitav
AU - Sahoo, Pulak
AU - Dehuri, Satchidananda
AU - Cho, Sung-Bae
PY - 2016/9/1
Y1 - 2016/9/1
UR - http://www.scopus.com/inward/record.url?scp=84975469264&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84975469264&partnerID=8YFLogxK
U2 - 10.1016/j.patrec.2016.05.002
DO - 10.1016/j.patrec.2016.05.002
M3 - Article
AN - SCOPUS:84975469264
VL - 80
SP - 76
EP - 83
JO - Pattern Recognition Letters
JF - Pattern Recognition Letters
SN - 0167-8655
ER -