This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Aspect-Level Sentiment Analysis on GoPay App Reviews Using Multilayer Perceptron and Word Embeddings
Corresponding Author(s): Henzi Juandri
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control,
Vol. 9, No. 4, November 2024 (Article in Progress)
Abstract
Rising smartphone usage in Indonesia has encouraged the development of digital wallet applications, one of which is GoPay, which has gained significant popularity among the Indonesian public. This research therefore conducts aspect-level sentiment analysis to examine user reviews of the GoPay application in greater detail. The sentiment analysis uses a Multilayer Perceptron (MLP) with fastText and word2vec as word embeddings. The dataset consists of 15,000 GoPay application reviews collected from the Google Play Store, categorized into three main aspects: Feature and Functionality, App Interface, and User Satisfaction. The research stages are data preparation, data preprocessing, word embedding, model training, and model testing and evaluation. The research explores the effect of fastText and word2vec embeddings on model performance and examines the application of oversampling techniques such as SMOTE and Random Oversampling. In the experiments, the MLP with fastText embeddings on a balanced dataset gave the best performance for aspect category classification, with an F1-score of 97%, Recall of 96%, and Precision of 95%. For sentiment classification, the MLP with fastText on a balanced dataset reached 98% for each of the F1-score, Recall, and Precision metrics. These results indicate that MLP is effective for aspect-level sentiment analysis.
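The balancing and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `random_oversample` and `precision_recall_f1`, the toy review IDs, and the labels are all assumptions made for illustration, and only plain random oversampling is shown (SMOTE, which synthesizes new points in feature space, is not).

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    is as large as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw with replacement from the existing samples of this class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy, made-up data: four positive reviews and two negative ones.
reviews = ["r1", "r2", "r3", "r4", "r5", "r6"]
sentiments = ["pos", "pos", "pos", "pos", "neg", "neg"]
bal_x, bal_y = random_oversample(reviews, sentiments)
class_counts = Counter(bal_y)

# Toy predictions to show the metric computation.
y_true = ["pos", "pos", "neg", "neg"]
y_pred = ["pos", "neg", "neg", "neg"]
p, r, f1 = precision_recall_f1(y_true, y_pred, "pos")
```

In practice, oversampling is usually applied to the training split only, so that duplicated reviews do not leak into the test set. For a single class, F1 is the harmonic mean of precision and recall.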