This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Public Opinion Analysis of Presidential Candidate Using Naïve Bayes Method
Corresponding Author(s): Asno Azzawagama Firdaus
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control,
Vol. 8, No. 2, May 2023
Abstract
Elections for president and vice president will take place in 2024. Ahead of the election, prospective candidates are vying for public support. The figures most often discussed as presidential candidates are Anies Baswedan, Ganjar Pranowo, and Prabowo Subianto. A method is therefore needed to predict potential candidates and voter demographics from public opinion on Twitter using sentiment analysis. One method commonly used for sentiment classification is Naïve Bayes. This study used the Naïve Bayes classifier with TF-IDF feature extraction to weight the text, and the scikit-learn Python library to help determine the polarity (negative or positive sentiment class) of the data. The data were collected from Twitter between October and December 2022, 15,000 tweets in total. The best test scenario, obtained by splitting the data into test and training sets, was 70% test data and 30% training data, with the highest accuracy, 95%, achieved on the Ganjar dataset. On the Anies, Ganjar, and Prabowo test data, the positive sentiment counts were 833, 77, and 524, respectively, while the negative sentiment counts were 637, 1423, and 976, respectively. The models were evaluated using a confusion matrix and k-fold cross-validation, and the best results were obtained on the Ganjar dataset: 94.93% accuracy from the confusion matrix and 94.46% from k-fold cross-validation. The lowest F1-scores were 67% for the positive class on the Anies dataset and 27% for the negative class on the Ganjar dataset.
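The pipeline summarized in the abstract (TF-IDF weighting, a Naïve Bayes classifier, and per-class evaluation) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' scikit-learn code: the function names, the smoothed IDF formula, the Laplace smoothing constant `alpha`, and the four-document English toy corpus are all assumptions made for illustration, and k-fold cross-validation is left out for brevity.

```python
import math
from collections import Counter, defaultdict


def fit_idf(docs):
    """Learn smoothed inverse document frequencies from tokenized documents."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf(doc, idf):
    """Weight each known term by term frequency times IDF; unseen terms are dropped."""
    return {t: tf * idf[t] for t, tf in Counter(doc).items() if t in idf}


def train_nb(vectors, labels, alpha=1.0):
    """Fit multinomial Naive Bayes on TF-IDF weights with Laplace smoothing."""
    vocab = {t for vec in vectors for t in vec}
    class_docs = Counter(labels)
    weights = defaultdict(lambda: defaultdict(float))  # class -> term -> summed weight
    for vec, label in zip(vectors, labels):
        for term, w in vec.items():
            weights[label][term] += w
    model = {}
    for label, n in class_docs.items():
        total = sum(weights[label].values()) + alpha * len(vocab)
        log_prior = math.log(n / len(labels))
        log_lik = {t: math.log((weights[label].get(t, 0.0) + alpha) / total) for t in vocab}
        model[label] = (log_prior, log_lik)
    return model


def predict(model, vec):
    """Return the class with the highest log-posterior for one TF-IDF vector."""
    def log_posterior(label):
        log_prior, log_lik = model[label]
        return log_prior + sum(w * log_lik[t] for t, w in vec.items() if t in log_lik)
    return max(model, key=log_posterior)


def f1_score(y_true, y_pred, target):
    """F1 for one class, computed from its confusion-matrix cells."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == target and p == target for t, p in pairs)
    fp = sum(t != target and p == target for t, p in pairs)
    fn = sum(t == target and p != target for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy corpus (not data from the study).
train_docs = [
    "good leader great program".split(),
    "great vision good policy".split(),
    "bad policy poor leader".split(),
    "poor vision bad program".split(),
]
train_labels = ["positive", "positive", "negative", "negative"]
test_docs = ["good great leader".split(), "bad poor policy".split()]

idf = fit_idf(train_docs)
model = train_nb([tfidf(d, idf) for d in train_docs], train_labels)
predictions = [predict(model, tfidf(d, idf)) for d in test_docs]
print(predictions)  # ['positive', 'negative']
```

Reporting F1 per class alongside accuracy matters when the classes are imbalanced, as in the Ganjar test data (77 positive versus 1423 negative): a classifier can reach high overall accuracy while scoring poorly on one class.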