This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Feature Selection Based on Multi-Filters for Classification of Mammogram Images to Look for Signs of Breast Cancer
Corresponding Author(s): Shofwatul ‘Uyun
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control, Vol. 7, No. 3, August 2022
Abstract
Classification accuracy on mammogram images plays a significant role in breast cancer diagnosis. Building a model that is both highly accurate and computationally light involves several stages, one of which is choosing the best features. This stage deserves priority because mammogram feature extraction produces many features. Our research has four stages: feature extraction, multi-filter feature selection, classification, and performance evaluation. We propose an algorithm, feature selection based on multi-filters (FSbMF), that selects features for mammogram images by applying multiple filter-model selectors simultaneously. Six filter-based feature selection algorithms are used: information gain, rule, relief, correlation, Gini index, and chi-square. In testing with 10-fold cross-validation, the features selected by FSbMF gave the best performance, raising accuracy, recall, and precision from 72.63%, 70.38%, and 75.01%, respectively, to 100%. Furthermore, the resulting feature set is the smallest, because it is the intersection of the feature subsets produced by the individual filters.
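The intersection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how each filter chooses its own subset, so the top-k cutoff used here is an assumption, and the function names, feature names, and scores are all hypothetical.

```python
from typing import Dict, List, Set


def top_k_features(scores: Dict[str, float], k: int) -> Set[str]:
    """Return the k feature names with the highest scores for one filter.

    Assumption: each filter keeps its k best-scoring features.
    """
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return set(ranked[:k])


def intersect_filter_selections(
    filter_scores: Dict[str, Dict[str, float]], k: int
) -> List[str]:
    """Keep only the features that every filter places in its top k."""
    subsets = [top_k_features(scores, k) for scores in filter_scores.values()]
    if not subsets:
        return []
    return sorted(set.intersection(*subsets))


# Hypothetical scores from three of the six filters named in the abstract.
filter_scores = {
    "information_gain": {"contrast": 0.9, "entropy": 0.8, "mean": 0.2, "energy": 0.5},
    "relief": {"contrast": 0.7, "entropy": 0.1, "mean": 0.3, "energy": 0.6},
    "chi_square": {"contrast": 12.0, "entropy": 9.0, "mean": 1.0, "energy": 10.0},
}

selected = intersect_filter_selections(filter_scores, k=2)
print(selected)  # ['contrast']
```

Because a feature survives only if every filter selects it, the result can never be larger than the smallest per-filter subset, which is consistent with the abstract's remark that the final feature set is minimal.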
References
D. Carvalho, P. R. Pinheiro, and M. C. D. Pinheiro, “A Hybrid Model to Support the Early Diagnosis of Breast Cancer,” in Procedia Computer Science, 2016, vol. 91, pp. 927–934. https://doi.org/10.1016/j.procs.2016.07.112
F. F. Ting, Y. J. Tan, and K. S. Sim, “Convolution Neural Network Improvement for Breast Cancer Classification,” Expert Systems With Applications, vol. 120, pp. 103–115, 2018. https://doi.org/10.1016/j.eswa.2018.11.008
V. Bolón-Canedo and A. Alonso-Betanzos, “Feature selection,” Intelligent Systems Reference Library, vol. 147, pp. 13–37, 2018. https://doi.org/10.1007/978-3-319-90080-3_2
S. Dash, M. R. Senapati, and U. R. Jena, “K-NN based automated reasoning using bilateral filter based texture descriptor for computing texture classification,” Egyptian Informatics Journal, vol. 19, no. 2, pp. 133–144, Jul. 2018. https://doi.org/10.1016/j.eij.2018.01.003
O. Russakovsky et al., “ImageNet Large Scale Visual Recognition Challenge,” International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015. https://doi.org/10.1007/s11263-015-0816-y
X. S. Zhou, I. Cohen, Q. Tian, and T. S. Huang, “Feature Extraction and Selection for Image Retrieval,” ACM Multimedia, pp. 1–7, 2000.
M. N. Injadat, A. Moubayed, A. B. Nassif, and A. Shami, “Systematic ensemble model selection approach for educational data mining,” Knowledge-Based Systems, vol. 200, pp. 1–16, Jul. 2020. https://doi.org/10.1016/j.knosys.2020.105992
S. Uyun and L. Choridah, “Feature selection mammogram based on breast cancer mining,” International Journal of Electrical and Computer Engineering, vol. 8, no. 1, 2018. http://doi.org/10.11591/ijece.v8i1.pp60-69
M. S. Abirami, U. Yash, and S. Singh, “Building an Ensemble Learning Based Algorithm for Improving Intrusion Detection System,” in Advances in Intelligent Systems and Computing, 2020, pp. 635–649. https://doi.org/10.1007/978-981-15-0199-9_55
X. Gao, C. Hu, C. Shan, B. Liu, Z. Niu, and H. Xie, “Malware classification for the cloud via semi-supervised transfer learning,” Journal of Information Security and Applications, vol. 55, Dec. 2020. https://doi.org/10.1016/j.jisa.2020.102661
V. de A. Campos and D. C. G. Pedronette, “A framework for speaker retrieval and identification through unsupervised learning,” Computer Speech and Language, vol. 58, pp. 153–174, Nov. 2019. https://doi.org/10.1016/j.csl.2019.04.004
Y. Jiao and P. Du, “Performance measures in evaluating machine learning based bioinformatics predictors for classifications,” Quantitative Biology, vol. 4, no. 4, pp. 320–330, 2016. https://doi.org/10.1007/s40484-016-0081-2
F. Fabris, J. P. de Magalhães, and A. A. Freitas, “A review of supervised machine learning applied to ageing research,” Biogerontology, vol. 18, no. 2, pp. 171–188, 2017. https://doi.org/10.1007/s10522-017-9683-y
H. Li, L. Zhang, M. Jiang, and Y. Li, “Multi-focus image fusion algorithm based on supervised learning for fully convolutional neural network,” Pattern Recognition Letters, vol. 141, pp. 45–53, Dec. 2020. https://doi.org/10.1016/j.patrec.2020.11.014
T. Jiang, J. L. Gradus, and A. J. Rosellini, “Supervised Machine Learning: A Brief Primer,” 2020. [Online]. Available: www.elsevier.com/locate/bt
L. Vivona, D. Cascio, F. Fauci, and G. Raso, “Fuzzy technique for microcalcifications clustering in digital mammograms,” BMC Medical Imaging, vol. 14, no. 1, pp. 1–18, 2014. https://doi.org/10.1186/1471-2342-14-10
X. Chang, Z. Ma, X. Wei, X. Hong, and Y. Gong, “Transductive semi-supervised metric learning for person re-identification,” Pattern Recognition, vol. 108, pp. 1–12, Dec. 2020. https://doi.org/10.1016/j.patcog.2020.107569
L. H. N. Lorena, A. C. P. L. F. Carvalho, and A. C. Lorena, “Filter Feature Selection for One-Class Classification,” Journal of Intelligent and Robotic Systems: Theory and Applications, vol. 80, no. Icmc, pp. 227–243, 2015. https://doi.org/10.1007/s10846-014-0101-2
R. Panthong and A. Srivihok, “Wrapper Feature Subset Selection for Dimension Reduction Based on Ensemble Learning Algorithm,” in Procedia Computer Science, 2015, vol. 72, pp. 162–169. https://doi.org/10.1016/j.procs.2015.12.117
Z. Hu, Y. Bao, T. Xiong, and R. Chiong, “Hybrid filter-wrapper feature selection for short-term load forecasting,” Engineering Applications of Artificial Intelligence, vol. 40, pp. 17–27, Apr. 2015. https://doi.org/10.1016/j.engappai.2014.12.014
J. Apolloni, G. Leguizamón, and E. Alba, “Two hybrid wrapper-filter feature selection algorithms applied to high-dimensional microarray experiments,” Applied Soft Computing Journal, vol. 38, pp. 922–932, Jan. 2016. https://doi.org/10.1016/j.asoc.2015.10.037
T. A. Alhaj, M. M. Siraj, A. Zainal, H. T. Elshoush, and F. Elhaj, “Feature selection using information gain for improved structural-based alert correlation,” PLoS ONE, vol. 11, no. 11, pp. 1–18, 2016. https://doi.org/10.1371/journal.pone.0166017
I. Koprinska, M. Rana, and V. G. Agelidis, “Correlation and instance based feature selection for electricity load forecasting,” Knowledge-Based Systems, vol. 82, pp. 29–40, 2015. https://doi.org/10.1016/j.knosys.2015.02.017
R. J. Urbanowicz, M. Meeker, W. la Cava, R. S. Olson, and J. H. Moore, “Relief-based feature selection: Introduction and review,” Journal of Biomedical Informatics, vol. 85, pp. 189–203, Sep. 2018. https://doi.org/10.1016/j.jbi.2018.07.014
M. A. Berbar, “Hybrid methods for feature extraction for breast masses classification,” Egyptian Informatics Journal, vol. 19, no. 1, pp. 63–73, Mar. 2018. https://doi.org/10.1016/j.eij.2017.08.001
J. Singh, K. Singh, and J. Singh, “Reengineering framework for open source software using decision tree approach,” International Journal of Electrical and Computer Engineering, vol. 9, no. 3, pp. 2041–2048, 2019. http://doi.org/10.11591/ijece.v9i3.pp2041-2048
K. S. Reddy and E. S. Reddy, “Integrated approach to detect spam in social media networks using hybrid features,” International Journal of Electrical and Computer Engineering, vol. 9, no. 1, pp. 562–569, 2019. http://doi.org/10.11591/ijece.v9i1.pp562-569
M. Ouzzani, I. Ilyas, H. Hammady, A. Elmagarmid, and M. Khabsa, “Learning to identify relevant studies for systematic reviews using random forest and external information,” Machine Learning, vol. 102, no. 3, pp. 465–482, 2015. https://doi.org/10.1007/s10994-015-5535-7
L. Ma and S. Fan, “CURE-SMOTE algorithm and hybrid algorithm for feature selection and parameter optimization based on random forests,” BMC Bioinformatics, vol. 18, no. 1, pp. 1–18, 2017. https://doi.org/10.1186/s12859-017-1578-z
M. Dong, Z. Wang, C. Dong, X. Mu, and Y. Ma, “Classification of Region of Interest in Mammograms Using Dual Contourlet Transform and Improved KNN,” vol. 2017, 2017.
K. Huang, S. Li, X. Kang, and L. Fang, “Spectral–Spatial Hyperspectral Image Classification Based on KNN,” Sensing and Imaging, vol. 17, no. 1, pp. 1–13, 2016. https://doi.org/10.1007/s11220-015-0126-z