
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
A Gradient Boosting–Based Platform with Fuzzy Linguistic Representation for Cardiovascular Disease Risk Prediction
Corresponding Author(s): Amir Saleh
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control,
Vol. 11, No. 3, August 2026 (Article in Progress)
Abstract
Cardiovascular disease (CVD) is one of the most common causes of death worldwide. Early detection and risk prediction are essential for preventing and managing CVD effectively. This research introduces a healthcare platform for CVD risk prediction built on machine learning (ML) methods. The platform is designed to provide accurate risk assessment using a gradient boosting (GB) classifier, with other ML models serving as comparison algorithms. First, preprocessing techniques, namely data normalization and data cleaning, are applied to handle outliers in the dataset. Recursive feature elimination (RFE) is then used to select the features that most affect prediction performance, which reduces the dimensionality of the data and improves model performance. Each model is evaluated using accuracy, precision, recall, and F1-score. The resulting model is then used to build a digital health platform that predicts risk from new user input. In addition, fuzzy logic is applied to transform data into linguistic variables so that users receive information in a simpler form. With the proposed GB model and preprocessing method, the platform makes more accurate CVD risk predictions during data validation than the other ML methods. The evaluation results show that the GB method achieves the highest prediction accuracy among the compared approaches, at 94.30%.
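The fuzzy linguistic step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the classifier outputs a risk probability between 0 and 1 and maps it to the terms "low", "medium", and "high" using shoulder and triangular membership functions. The function names, the three terms, and the breakpoints (0.2, 0.5, 0.8) are hypothetical choices made for illustration; the abstract does not specify them.

```python
def falling_shoulder(x, a, b):
    """Membership is 1 up to a, then falls linearly to 0 at b."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def rising_shoulder(x, a, b):
    """Membership is 0 up to a, then rises linearly to 1 at b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def triangular(x, a, b, c):
    """Membership is 0 at a and c, and peaks at 1 when x equals b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify_risk(probability):
    """Map a risk probability in [0, 1] to a linguistic term.

    Breakpoints are illustrative assumptions, not values from the paper.
    Returns the term with the highest membership and all membership degrees.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    memberships = {
        "low": falling_shoulder(probability, 0.2, 0.5),
        "medium": triangular(probability, 0.2, 0.5, 0.8),
        "high": rising_shoulder(probability, 0.5, 0.8),
    }
    label = max(memberships, key=memberships.get)
    return label, memberships


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        label, degrees = fuzzify_risk(p)
        print(p, label, degrees)
```

In this sketch a user would see a term such as "high" rather than a raw probability, while the membership degrees remain available if a graded explanation is wanted.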