Paragraph Selection Methods Using Feature-Based on Segment-Based Clustering Process Using Paragraphs for Identifying Topics on Indication Detection of Plagiarism System

Denar Regata Akbi; Arini Rahmawati Rosyadi

doi:10.22219/kinetik.v3i2.593

Issue

Vol 3, No 2, May-2018

Issue Published : Apr 16, 2018

Denar Regata Akbi

Universitas Muhammadiyah Malang

Arini Rahmawati Rosyadi

Universitas Muhammadiyah Malang

Corresponding Author(s) : Denar Regata Akbi

dregata.akbi@gmail.com

Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control, Vol 3, No 2, May-2018
Article Published : Apr 16, 2018

Abstract

In segment-based clustering, the selection of paragraphs for the dataset plays a very important role, because the paragraphs used as the dataset affect the clustering result. This research applies a feature-based method to paragraph selection, with the aim of optimizing the clustering process carried out in previous research. Evaluation with the Silhouette Coefficient and the Sum of Squared Errors, both used to find the proper value of k, shows that the feature-based method yields better results than those obtained in the previous research.
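The evaluation idea in the abstract, choosing k by comparing the Silhouette Coefficient and Sum Square Errors (sum of squared errors, SSE) across candidate values, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the toy 2-D vectors stand in for paragraph feature vectors, and the function names (`kmeans`, `farthest_first`, `evaluate_k`) and the deterministic farthest-first initialization are assumptions made here so that the example is reproducible.

```python
import math


def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from all seeds."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def sse(points, labels, centroids):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(dist2(p, centroids[l]) for p, l in zip(points, labels))


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (singletons score 0)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def evaluate_k(points, k_values):
    """Return {k: (mean silhouette, SSE)} for each candidate k."""
    results = {}
    for k in k_values:
        labels, centroids = kmeans(points, k)
        results[k] = (silhouette(points, labels), sse(points, labels, centroids))
    return results


# Hypothetical paragraph vectors: three well-separated groups of three.
paragraph_vectors = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
    (5.0, 5.0), (5.1, 5.2), (5.2, 5.1),
    (0.0, 5.0), (0.1, 5.2), (0.2, 5.1),
]

results = evaluate_k(paragraph_vectors, range(2, 6))
best_k = max(results, key=lambda k: results[k][0])  # highest mean silhouette
for k, (sil, err) in results.items():
    print(f"k={k}  silhouette={sil:.3f}  SSE={err:.3f}")
print("best k by silhouette:", best_k)
```

SSE tends to keep falling as k grows, so it is usually read for an "elbow" rather than a minimum, whereas the silhouette score peaks at the k whose clusters are best separated. Using both together is one common way to settle on k.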

Keywords

Feature-Based; Paragraphs Selection; Segment-Based; Silhouette Coefficient; Sum Square Errors

Akbi, D. R., & Rosyadi, A. R. (2018). Paragraph Selection Methods Using Feature-Based on Segment-Based Clustering Process Using Paragraphs for Identifying Topics on Indication Detection of Plagiarism System. Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control, 3(2), 91-102. https://doi.org/10.22219/kinetik.v3i2.593
