
Issue

Vol. 6, No. 4, November 2021

Issue Published : Nov 30, 2021
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Opinion Spam Classification on Steam Review using Support Vector Machine with Lexicon-Based Features

https://doi.org/10.22219/kinetik.v6i4.1323
Rafif Taqiuddin
Universitas Brawijaya
Fitra A. Bachtiar
Universitas Brawijaya
Welly Purnomo
Universitas Brawijaya

Corresponding Author(s) : Rafif Taqiuddin

rafiftq@gmail.com

Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control, Vol. 6, No. 4, November 2021
Article Published : Nov 30, 2021

Abstract

Steam is a digital distribution platform for video games developed by Valve Software. It provides a user review feature in which users can write criticism of or comments on games, expressing positive or negative sentiment. According to a questionnaire the authors distributed to Steam users across Indonesia, respondents found this review feature insufficient. The reason is fake reviews, which introduce biased opinions from particular parties and give rise to review bombing, where users post reviews only to lower or raise a product's image rather than to review it sincerely. A solution is therefore needed that can classify fake reviews on Steam. The Support Vector Machine (SVM) classification method was chosen as the model, combined with lexicon-based feature extraction and Term Frequency-Inverse Document Frequency (TF-IDF) weighting. Of the 236 test reviews, the SVM classified 105 as Valid Review and 131 as Opinion Spam. The model achieved an accuracy of 81% with a 70% training and 30% test split and a random state of 109. A web-application dashboard containing the classification model was also built to serve as a purchasing reference for Steam users.
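The pipeline described in the abstract (TF-IDF weighting, lexicon-based features, an SVM classifier, and a 70/30 split seeded with 109) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not name the lexicon, the tokenizer, the IDF variant, the SVM kernel, or the library used. The mini-lexicon and the reviews are made up, the IDF uses one common smoothed formula, and the classifier is a linear SVM trained from scratch with Pegasos-style sub-gradient descent so that the example needs only the Python standard library. It is not the authors' implementation and will not reproduce their split or their 81% accuracy.

```python
import math
import random
from collections import Counter

# Hypothetical mini-lexicon for illustration; the abstract does not name the lexicon used.
POSITIVE = {"good", "great", "fun", "love", "best"}
NEGATIVE = {"bad", "boring", "worst", "hate", "refund"}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def fit_idf(token_docs):
    """Smoothed IDF per term: ln((1 + N) / (1 + df)) + 1 (one common variant)."""
    n = len(token_docs)
    df = Counter(term for doc in token_docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def featurize(tokens, idf, vocab):
    """L2-normalised TF-IDF vector followed by positive and negative word ratios."""
    tf = Counter(tokens)
    vec = [tf[term] * idf[term] for term in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    n = len(tokens) or 1
    pos_ratio = sum(t in POSITIVE for t in tokens) / n
    neg_ratio = sum(t in NEGATIVE for t in tokens) / n
    return [v / norm for v in vec] + [pos_ratio, neg_ratio]


def train_linear_svm(X, y, lam=0.01, epochs=100, seed=109):
    """Pegasos-style stochastic sub-gradient descent on the regularised hinge loss.

    Labels are +1 (opinion spam) or -1 (valid review). Returns (weights, bias).
    """
    rng = random.Random(seed)
    w, b, step = [0.0] * len(X[0]), 0.0, 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            step += 1
            eta = 1.0 / (lam * step)
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1.0 - eta * lam) for wj in w]  # shrink step from the L2 term
            if margin < 1.0:  # hinge loss is active for this example
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def predict(w, b, x):
    """Sign of the linear decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def split_70_30(items, seed=109):
    """Shuffle with a fixed seed, then take 70% for training and 30% for testing."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = len(items) * 7 // 10
    return items[:cut], items[cut:]


# Made-up reviews and labels, purely to exercise the pipeline.
REVIEWS = [
    ("worst game ever refund now", 1),
    ("bad bad bad do not buy", 1),
    ("best game ever buy buy buy", 1),
    ("great great great love love", 1),
    ("hate the publisher boring worst", 1),
    ("the combat is fun but the story drags in act two", -1),
    ("good port with stable frame rate on my laptop", -1),
    ("boring early hours then the crafting system opens up", -1),
    ("controls feel good and the level design is clever", -1),
    ("bad netcode at launch but patches fixed most issues", -1),
]

train_set, test_set = split_70_30(REVIEWS)
train_tokens = [tokenize(text) for text, _ in train_set]
idf = fit_idf(train_tokens)  # IDF is fitted on the training reviews only
vocab = sorted(idf)
X_train = [featurize(tokens, idf, vocab) for tokens in train_tokens]
y_train = [label for _, label in train_set]
w, b = train_linear_svm(X_train, y_train)

X_test = [featurize(tokenize(text), idf, vocab) for text, _ in test_set]
hits = sum(predict(w, b, x) == label for x, (_, label) in zip(X_test, test_set))
accuracy = hits / len(test_set)
print(f"train={len(train_set)} test={len(test_set)} accuracy={accuracy:.2f}")
```

The IDF table and vocabulary are fitted on the training portion only, so no information from the test reviews leaks into the features. In practice a library implementation of TF-IDF and SVM would replace the hand-written parts.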

Keywords

Opinion Spam; Text Mining; Support Vector Machine; Lexicon-Based; Steam Review
Taqiuddin, R., Bachtiar, F. A., & Purnomo, W. (2021). Opinion Spam Classification on Steam Review using Support Vector Machine with Lexicon-Based Features. Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control, 6(4). https://doi.org/10.22219/kinetik.v6i4.1323




KINETIK: Game Technology, Information System, Computer Network, Computing, Electronics, and Control
eISSN : 2503-2267
pISSN : 2503-2259

