This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
A Hybrid Tabu Search and Genetic Algorithm Imputation Approach for Incomplete Data
Corresponding Author(s): Bain Khusnul Khotimah
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control,
Vol. 6, No. 4, November 2021
Abstract
A common problem in data collection is that missing values arise during collection and processing, which degrades the quality of the data used for testing. Genetic Algorithm Imputation (GAI) is a computation-based technique for dealing with missing values, and it is used here to estimate the missing values of a dataset. GAI generates the optimal set of missing values, using information gain as the fitness function that measures the performance of individual solutions. GAI searches continuously until values for the missing entries are found according to the best fitness, so it can become trapped in a local optimum. The improvement of GAI with tabu search, known as TS-GAI, combines the strengths of the two metaheuristic techniques: the mutation stage is modified to steer the search away from local optima. For imputing missing values, this technique works better when many possible values are available than for mixed attributes with missing values, because each new generation of chromosome values offers many candidates for filling in the missing values. The experimental results show that TS-GAI performs better on data with 30% missing values (MV), with a fitness value of 0.212 and convergence at 159 iterations. In general, TS-GAI converges in fewer iterations than simple GAI and has a lower RMSE than other imputation techniques.
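The idea of guarding the mutation stage with a tabu list can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`rmse`, `tabu_mutate`, `ts_gai`), the parameter values, the truncation selection, and the one-point crossover are all assumptions made for illustration. The abstract describes an information-based fitness, while this sketch swaps in RMSE against known values as a stand-in cost, which is only possible in a demo where the true values are held out.

```python
import math
import random
from collections import deque


def rmse(true_vals, imputed_vals):
    """Root mean squared error between true and imputed values."""
    sq = [(t - p) ** 2 for t, p in zip(true_vals, imputed_vals)]
    return math.sqrt(sum(sq) / len(sq))


def tabu_mutate(chrom, candidates, tabu, rng, rate=0.3, retries=10):
    """Mutate a chromosome, skipping results that are on the tabu list."""
    for _ in range(retries):
        child = [rng.choice(candidates) if rng.random() < rate else g
                 for g in chrom]
        key = tuple(child)
        if key not in tabu:
            tabu.append(key)  # deque(maxlen=...) drops the oldest entry
            return child
    return list(chrom)  # every retry was tabu: keep the parent unchanged


def ts_gai(cost, n_missing, candidates, pop_size=20, generations=50,
           tabu_size=30, seed=0):
    """Minimise `cost` over candidate fills for the missing cells.

    Hypothetical sketch: `cost` is a stand-in for the paper's fitness.
    """
    rng = random.Random(seed)
    tabu = deque(maxlen=tabu_size)
    pop = [[rng.choice(candidates) for _ in range(n_missing)]
           for _ in range(pop_size)]
    history = []  # best cost seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=cost)
        history.append(cost(pop[0]))
        parents = pop[:pop_size // 2]   # truncation selection (assumed)
        children = [list(pop[0])]       # elitism: carry the best forward
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_missing) if n_missing > 1 else 0
            children.append(
                tabu_mutate(a[:cut] + b[cut:], candidates, tabu, rng))
        pop = children
    best = min(pop, key=cost)
    history.append(cost(best))
    return best, history


# Demo with made-up held-out values for four missing cells.
true_vals = [3, 7, 1, 9]
best, history = ts_gai(lambda c: rmse(true_vals, c), 4, list(range(10)))
print(best, history[-1])
```

The tabu list here stores recently generated chromosomes, so mutation cannot keep reproducing the same candidates around one local optimum. Because the best individual is carried forward each generation, the recorded best cost never increases.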