Paragraph Selection Methods Using Feature-Based On Segment-Based Clustering Process Using Paragraphs For Identifying Topics On Indications Detection of Plagiarism System

Denar Regata Akbi, Arini Rahmawati Rosyadi

Abstract

In segment-based clustering, the selection of paragraphs as the data set for the clustering process plays a very important role, because the paragraphs used as the data set can affect the clustering result. This research uses a feature-based paragraph selection method, which aims to optimize the clustering process carried out in previous research. The clustering was evaluated with the Silhouette Coefficient and the Sum of Squared Errors, both used to find a proper value of k. The evaluation showed that the feature-based method gave better results than those reported in the previous research.
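The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes toy 2-D points and hand-assigned cluster labels in place of the paper's paragraph vectors and clustering output, which are not given here. The function names `sse` and `silhouette` are hypothetical, and the silhouette follows the standard definition by Rousseeuw (1987), not necessarily the authors' exact procedure.

```python
import math


def sse(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for p, c in zip(points, labels):
        clusters.setdefault(c, []).append(p)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum(math.dist(m, centroid) ** 2 for m in members)
    return total


def silhouette(points, labels):
    """Mean silhouette coefficient; assumes at least two clusters."""
    clusters = {}
    for i, c in enumerate(labels):
        clusters.setdefault(c, []).append(i)
    scores = []
    for i, c in enumerate(labels):
        own = [j for j in clusters[c] if j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in idx) / len(idx)
            for other, idx in clusters.items()
            if other != c
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (an assumption for illustration): two tight, well-separated groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = [0, 0, 1, 1]  # labels that match the two groups
bad = [0, 1, 0, 1]   # labels that split each group

print(sse(points, good), sse(points, bad))  # 1.0 100.0
print(silhouette(points, good) > silhouette(points, bad))  # True
```

A lower Sum of Squared Errors and a silhouette value closer to 1 both indicate tighter, better separated clusters. To choose k, the same two measures would be computed on the clustering produced for each candidate k and then compared.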

Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control by http://kinetik.umm.ac.id is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.