XGB-Hybrid Fingerprint Classification Model for Virtual Screening of Meningitis Drug Compounds Candidate

Mohammad Hamim Zajuli Al Faroby; Helisyah Nur Fadhilah; Siti Amiroch; Rahmat Sigit Hidayat

doi:10.22219/kinetik.v7i2.1424

Issue

Vol. 7, No. 2, May 2022

Issue Published : May 31, 2022

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Mohammad Hamim Zajuli Al Faroby

Institut Teknologi Telkom Surabaya

Helisyah Nur Fadhilah

Institut Teknologi Telkom Surabaya

Siti Amiroch

Universitas Islam Darul 'ulum Lamongan

Rahmat Sigit Hidayat

Institut Teknologi Telkom Surabaya

Corresponding Author(s) : Mohammad Hamim Zajuli Al Faroby

alfaroby@ittelkom-sby.ac.id

Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control, Vol. 7, No. 2, May 2022
Article Published : May 31, 2022

Abstract

Meningitis is an infection of the lining of the brain that produces diffuse inflammation and is caused by viruses or bacteria. Prevention currently relies on vaccination to strengthen the antibody response, and no compound has yet proved significantly effective in relieving or treating meningitis. Previous studies identified seven proteins vital to meningitis. We extend that work by investigating the compounds associated with these seven proteins. We take an in-silico approach, collecting data from several open databases. Bonding and chemical-element features are then extracted from the compound data using molecular fingerprints. We use two fingerprint methods and combine them in three ways, yielding three datasets with different matrix sizes. We train an Extreme Gradient Boosting (XGB) classification model on each of the three datasets, select the best model, and compare it with other classification algorithms. The XGB model performs better than the models built with the other algorithms. We use this model to predict and score compounds that bind strongly to the seven vital meningitis proteins. The highest-scoring compounds (predicted score above 0.99) are proposed as drug candidates to inhibit or neutralize meningitis.
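The abstract does not say which two fingerprints are used or how the three combinations are formed. As a minimal sketch of one possible reading, the Python below assumes the three datasets are fingerprint A alone, fingerprint B alone, and A concatenated with B, which would explain the differing matrix sizes. The function name, dataset labels, and toy bit vectors are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Bits = List[int]  # one fingerprint: a fixed-length vector of 0/1 values


def combine_fingerprints(fp_a: List[Bits], fp_b: List[Bits]) -> Dict[str, List[Bits]]:
    """Build three candidate feature matrices from two per-compound fingerprints.

    ASSUMPTION: the three combinations are A only, B only, and A followed by B.
    The paper's abstract does not specify the actual scheme.
    """
    if len(fp_a) != len(fp_b):
        raise ValueError("both fingerprint sets must describe the same compounds")
    return {
        "a_only": [list(a) for a in fp_a],
        "b_only": [list(b) for b in fp_b],
        # Row-wise concatenation: each compound keeps one row, columns are appended.
        "a_plus_b": [list(a) + list(b) for a, b in zip(fp_a, fp_b)],
    }


def shape(matrix: List[Bits]) -> Tuple[int, int]:
    """Return (rows, columns) of a rectangular list-of-lists matrix."""
    return (len(matrix), len(matrix[0]) if matrix else 0)


# Toy data (hypothetical): three compounds, a 4-bit and a 6-bit fingerprint each.
fp_a = [[1, 0, 1, 0], [0, 1, 1, 0], [1, 1, 0, 0]]
fp_b = [[0, 1, 1, 0, 0, 1], [1, 0, 0, 1, 0, 0], [0, 0, 1, 1, 1, 0]]

datasets = combine_fingerprints(fp_a, fp_b)
shapes = {name: shape(matrix) for name, matrix in datasets.items()}
```

Each of the three matrices, paired with binding labels, could then be passed to a gradient boosting classifier (for example `xgboost.XGBClassifier`) so that the resulting models can be compared, as the abstract describes.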

Keywords

Drug Screening; Molecular Fingerprint; Extreme Gradient Boosting; Machine Learning; Meningitis

Al Faroby, M. H. Z., Fadhilah, H. N., Amiroch, S., & Hidayat, R. S. (2022). XGB-Hybrid Fingerprint Classification Model for Virtual Screening of Meningitis Drug Compounds Candidate. Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control, 7(2). https://doi.org/10.22219/kinetik.v7i2.1424


