Document Preprocessing with TF-IDF to Improve the Polarity Classification Performance of Unstructured Sentiment Analysis
Corresponding Author: Farrikh Alzami
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control,
Vol. 5, No. 3, August 2020
References
Agarwal, B., Mittal, N., Bansal, P., & Garg, S. (2015). Sentiment Analysis Using Common-Sense and Context Information. Computational Intelligence and Neuroscience, 2015, 1–9. https://doi.org/10.1155/2015/715730
Cambria, E., Hussain, A., Durrani, T., Havasi, C., Eckl, C., & Munro, J. (2010). Sentic Computing for patient centered applications. IEEE 10th International Conference On Signal Processing Proceedings, 1279–1282. https://doi.org/10.1109/ICOSP.2010.5657072
Ebrahimi, M., Yazdavar, A. H., & Sheth, A. (2017). Challenges of Sentiment Analysis for Dynamic Events. IEEE Intelligent Systems, 32(5), 70–75. https://doi.org/10.1109/MIS.2017.3711649
Xing, F. Z., Cambria, E., & Welsch, R. E. (2018). Natural language based financial forecasting: a survey. Artificial Intelligence Review, 50(1), 49–73. https://doi.org/10.1007/s10462-017-9588-9
Van de Kauter, M., Breesch, D., & Hoste, V. (2015). Fine-grained analysis of explicit and implicit sentiment in financial news articles. Expert Systems with Applications, 42(11), 4999–5010. https://doi.org/10.1016/j.eswa.2015.02.007
Valdivia, A., Luzon, M. V., & Herrera, F. (2017). Sentiment Analysis in TripAdvisor. IEEE Intelligent Systems, 32(4), 72–77. https://doi.org/10.1109/MIS.2017.3121555
Vázquez, S., Muñoz-García, Ó., Campanella, I., Poch, M., Fisas, B., Bel, N., & Andreu, G. (2014). A classification of user-generated content into consumer decision journey stages. Neural Networks, 58, 68–81. https://doi.org/10.1016/j.neunet.2014.05.026
Thompson, J. J., Leung, B. H., Blair, M. R., & Taboada, M. (2017). Sentiment analysis of player chat messaging in the video game StarCraft 2: Extending a lexicon-based model. Knowledge-Based Systems, 137, 149–162. https://doi.org/10.1016/j.knosys.2017.09.022
Wang, K., Liu, X., & Han, Y. (2019). Exploring Goodreads reviews for book impact assessment. Journal of Informetrics, 13(3), 874–886. https://doi.org/10.1016/j.joi.2019.07.003
Bello-Orgaz, G., Jung, J. J., & Camacho, D. (2016). Social big data: Recent achievements and new challenges. Information Fusion, 28, 45–59. https://doi.org/10.1016/j.inffus.2015.08.005
Elghannam, F. (2019). Text representation and classification based on bi-gram alphabet. Journal of King Saud University - Computer and Information Sciences. https://doi.org/10.1016/j.jksuci.2019.01.005
Chalothorn, T., & Ellman, J. (2015). Simple approaches of sentiment analysis via ensemble learning. In Lecture Notes in Electrical Engineering (Vol. 339, pp. 631–639). https://doi.org/10.1007/978-3-662-46578-3_74
Yang, L., Li, Y., Wang, J., & Sherratt, R. S. (2020). Sentiment Analysis for E-Commerce Product Reviews in Chinese Based on Sentiment Lexicon and Deep Learning. IEEE Access, 8, 23522–23530. https://doi.org/10.1109/ACCESS.2020.2969854
Zeng, D., Dai, Y., Li, F., Wang, J., & Sangaiah, A. K. (2019). Aspect based sentiment analysis by a linguistically regularized CNN with gated mechanism. Journal of Intelligent & Fuzzy Systems, 36(5), 3971–3980. https://doi.org/10.3233/JIFS-169958
Khan, K., Baharudin, B., Khan, A., & Ullah, A. (2014). Mining opinion components from unstructured reviews: A review. Journal of King Saud University - Computer and Information Sciences, 26(3), 258–275. https://doi.org/10.1016/j.jksuci.2014.03.009
Hussein, D. M. E.-D. M. (2018). A survey on sentiment analysis challenges. Journal of King Saud University - Engineering Sciences, 30(4), 330–338. https://doi.org/10.1016/j.jksues.2016.04.002
Moraes, R., Valiati, J. F., & Gavião Neto, W. P. (2013). Document-level sentiment classification: An empirical comparison between SVM and ANN. Expert Systems with Applications, 40(2), 621–633. https://doi.org/10.1016/j.eswa.2012.07.059
McAuley, J., & Leskovec, J. (2013). From amateurs to connoisseurs: Modeling the evolution of user expertise through online reviews. WWW 2013 - Proceedings of the 22nd International Conference on World Wide Web, 897–907. Retrieved from https://arxiv.org/abs/1303.4402
Willett, P. (2006). The Porter stemming algorithm: then and now. Program, 40(3), 219–223. https://doi.org/10.1108/00330330610681295
Xie, F., Wu, X., & Zhu, X. (2017). Efficient sequential pattern mining with wildcards for keyphrase extraction. Knowledge-Based Systems, 115, 27–39. https://doi.org/10.1016/j.knosys.2016.10.011
Gencosman, B. C., Ozmutlu, H. C., & Ozmutlu, S. (2014). Character n-gram application for automatic new topic identification. Information Processing & Management, 50(6), 821–856. https://doi.org/10.1016/j.ipm.2014.06.005
Schmidt, C. W. (2019). Improving a tf-idf weighted document vector embedding. Retrieved from http://arxiv.org/abs/1902.09875
Ren, J. (2012). ANN vs. SVM: Which one performs better in classification of MCCs in mammogram imaging. Knowledge-Based Systems, 26, 144–153. https://doi.org/10.1016/j.knosys.2011.07.016
Appel, O., Chiclana, F., Carter, J., & Fujita, H. (2016). A hybrid approach to the sentiment analysis problem at the sentence level. Knowledge-Based Systems, 108, 110–124. https://doi.org/10.1016/j.knosys.2016.05.040
Tripathy, A., Agrawal, A., & Rath, S. K. (2016). Classification of sentiment reviews using n-gram machine learning approach. Expert Systems with Applications, 57, 117–126. https://doi.org/10.1016/j.eswa.2016.03.028