Tweet Summarization Based on Twitter Trending Topics Using TF-IDF Weighting and Single Linkage Agglomerative Hierarchical Clustering

Annisa Annisa, Yuda Munarko, Yufis Azhar

Abstract

Trending topics is a Twitter feature that shows what users are discussing widely at a particular time. Each trending topic appears as a hashtag that can be selected by clicking on it. However, the number of tweets for a trending topic can be very large, which makes it difficult to read all of the content. To make a topic easier to follow, a small number of tweets can be selected to represent its main ideas. In this study, we applied single linkage agglomerative hierarchical clustering after first calculating the TF-IDF weight of each word. We used 100 trending topics, each consisting of 50 tweets in Indonesian. For testing, we provided 30 trending topics, each consisting of 2 to 9 sub-topics. As a result, each trending topic could be summarized into a shorter text containing 2 to 9 tweets. For 1 trending topic, the summary exactly matched the one produced by a human expert; for the remaining topics, the summaries corresponded only partially with the expert's.
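The pipeline described in the abstract (TF-IDF weighting followed by single linkage agglomerative clustering, with a few tweets kept as the summary) can be illustrated with a minimal sketch. Several details here are assumptions that the abstract does not specify: the tokenizer, the particular TF-IDF variant, the use of cosine distance, stopping at a fixed number of clusters `k`, and choosing each cluster's medoid as its representative tweet. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep word-like tokens (assumed, simplified tokenizer)."""
    return re.findall(r"[a-z0-9#@_]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document.

    Uses tf = count / doc_length and idf = log(N / df), one common variant.
    """
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        vectors.append(
            {t: (c / total) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return vectors


def cosine_distance(a, b):
    """1 - cosine similarity; empty or all-zero vectors get distance 1."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)


def distance_matrix(vectors):
    return [[cosine_distance(a, b) for b in vectors] for a in vectors]


def single_linkage(dist, k):
    """Merge the two closest clusters until k remain.

    Single linkage: the distance between two clusters is the smallest
    distance between any pair of their members.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > max(k, 1):
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[x] for j in clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def summarize(tweets, k):
    """Return one representative tweet per cluster (the medoid, an assumption)."""
    dist = distance_matrix(tfidf_vectors(tweets))
    clusters = single_linkage(dist, k)
    return [
        tweets[min(c, key=lambda i: sum(dist[i][j] for j in c))]
        for c in clusters
    ]


if __name__ == "__main__":
    sample = [
        "banjir jakarta hari ini parah",
        "banjir jakarta makin parah hari ini",
        "jakarta banjir lagi",
        "timnas menang lawan malaysia",
        "timnas indonesia menang telak lawan malaysia",
        "selamat timnas menang",
    ]
    for tweet in summarize(sample, k=2):
        print(tweet)
```

In this sketch the number of clusters `k` plays the role of the number of sub-topics, so a topic with 2 to 9 sub-topics would yield a summary of 2 to 9 tweets. The pairwise search is quadratic in the number of clusters per merge, which is acceptable for topics of around 50 tweets.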

Keywords

Text Summarization, TF-IDF, Single Linkage Agglomerative Hierarchical Clustering

Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control by http://kinetik.umm.ac.id is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

ISSN: 2503-2267