TY - JOUR
T1 - KAEA
T2 - A novel three-stage ensemble model for software defect prediction
AU - Zhang, Nana
AU - Zhu, Kun
AU - Ying, Shi
AU - Wang, Xu
PY - 2020/5/20
Y1 - 2020/5/20
N2 - Software defect prediction is an active research topic in software engineering. However, because of the limitations of current machine learning algorithms, using them alone does not yield good defect prediction performance. Previous studies have used the extreme learning machine (ELM) for defect prediction, but the initial weights and biases of an ELM are set randomly, which reduces its prediction performance. Motivated by search-based software engineering, we propose a novel software defect prediction model named KAEA, based on kernel principal component analysis (KPCA), an adaptive genetic algorithm, ELM and the Adaboost algorithm. KAEA has three main advantages: (1) KPCA can extract optimal representative features by leveraging a nonlinear mapping function; (2) the adaptive genetic algorithm optimizes the initial weights and biases of the ELM, improving its generalization ability and prediction capacity; (3) the Adaboost algorithm integrates multiple ELM base predictors, each optimized by the adaptive genetic algorithm, into a strong predictor, which further improves defect prediction. To evaluate KAEA, we use eleven datasets from large open source projects and compare KAEA with four basic machine learning classifiers, ELM and its three variants. The experimental results show that KAEA is superior to these baseline models in most cases.
AB - Software defect prediction is an active research topic in software engineering. However, because of the limitations of current machine learning algorithms, using them alone does not yield good defect prediction performance. Previous studies have used the extreme learning machine (ELM) for defect prediction, but the initial weights and biases of an ELM are set randomly, which reduces its prediction performance. Motivated by search-based software engineering, we propose a novel software defect prediction model named KAEA, based on kernel principal component analysis (KPCA), an adaptive genetic algorithm, ELM and the Adaboost algorithm. KAEA has three main advantages: (1) KPCA can extract optimal representative features by leveraging a nonlinear mapping function; (2) the adaptive genetic algorithm optimizes the initial weights and biases of the ELM, improving its generalization ability and prediction capacity; (3) the Adaboost algorithm integrates multiple ELM base predictors, each optimized by the adaptive genetic algorithm, into a strong predictor, which further improves defect prediction. To evaluate KAEA, we use eleven datasets from large open source projects and compare KAEA with four basic machine learning classifiers, ELM and its three variants. The experimental results show that KAEA is superior to these baseline models in most cases.
KW - Adaboost
KW - Adaptive genetic algorithm
KW - Extreme learning machine
KW - KPCA
KW - Software defect prediction
UR - http://www.scopus.com/inward/record.url?scp=85090904170&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090904170&partnerID=8YFLogxK
U2 - 10.32604/CMC.2020.010117
DO - 10.32604/CMC.2020.010117
M3 - Article
AN - SCOPUS:85090904170
SN - 1546-2218
VL - 64
SP - 471
EP - 499
JO - Computers, Materials and Continua
JF - Computers, Materials and Continua
IS - 1
ER -