
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Analysis of Public Opinion on The Governor Candidate Debate Using LDA and IndoBERT
Corresponding Author(s) : Ahmad Abdul Chamid
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control,
Vol. 10, No. 3, August 2025
Abstract
The gubernatorial candidate debate was streamed live on several YouTube channels and attracted considerable public attention. Extensive discussion took place in the comment section of each channel that broadcast the debate. These comments are worth analyzing for what the public discussed, as well as for the expectations and input they express. However, a large volume of free-text comments is difficult to analyze with conventional methods. This study therefore analyzes public opinion using topic identification and sentiment classification. Topic identification reveals what the public is talking about, while sentiment classification determines whether each comment is positive or negative. The novelty of this research lies in its use of data collected from several major media YouTube channels and in its qualitative analysis of the findings. Public comments were collected from the KPU, NarasiTV, and KompasTV YouTube channels, yielding 4,147 comments. The workflow consists of data preprocessing, topic identification using Latent Dirichlet Allocation (LDA), evaluation of the LDA model, sentiment classification using IndoBERT, and visualization of the results of the public opinion analysis. The LDA model produced five topics, with a perplexity value of -7.7909 and a coherence score of 0.5109. In addition, Topic 4 is the most dominant topic, and 1,146 comments were classified as positive and 504 as negative. Topic 4 reflects how religion, culture, and frequently mentioned figures are perceived and discussed by the public, especially in relation to the gubernatorial election (pilgub) and the gubernatorial candidate debates.
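The topic-identification part of this workflow can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which LDA library or inference method was used, so the sketch substitutes a small collapsed Gibbs sampler written with the Python standard library only. The comments, the stop-word list, the number of topics, and the hyperparameters `alpha` and `beta` are all hypothetical values chosen for illustration. The IndoBERT sentiment step is left out because it requires a pretrained model.

```python
import random
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOPWORDS = {"yang", "dan", "di", "ini", "itu"}


def preprocess(comment):
    """Lowercase, keep alphabetic tokens, drop stop words and short tokens."""
    tokens = re.findall(r"[a-z]+", comment.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) > 2]


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return the count tables."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []
    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1
    return doc_topic, topic_word, vocab


def dominant_topics(doc_topic):
    """Return the index of the most frequent topic in each document."""
    return [max(range(len(row)), key=row.__getitem__) for row in doc_topic]


# Hypothetical comments standing in for the collected YouTube data.
comments = [
    "Debat gubernur malam ini sangat menarik",
    "Program pendidikan calon gubernur bagus",
    "Budaya dan agama dibahas dalam debat",
    "Harga pangan harus turun",
]
docs = [d for d in (preprocess(c) for c in comments) if d]
doc_topic, topic_word, vocab = fit_lda(docs, num_topics=2)
dominant = dominant_topics(doc_topic)
topic_sizes = Counter(dominant)  # number of comments per dominant topic
```

Counting comments by their dominant topic, as `topic_sizes` does, is one simple way to decide which topic is "most dominant"; the study may have used a different criterion. On a real corpus, a library implementation with built-in perplexity and coherence evaluation would normally be preferable to this hand-written sampler.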