Paragraph Selection Methods Using Feature-Based on Segment-Based Clustering Process Using Paragraphs for Identifying Topics on Indication Detection of Plagiarism System
Corresponding Author(s) : Denar Regata Akbi
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control,
Vol 3, No 2, May-2018
Abstract
In segment-based clustering, the selection of paragraphs as the dataset plays a very important role, because the paragraphs used can affect the clustering result. This research applies a feature-based paragraph selection method, which aims to optimize the clustering process conducted in the previous research. Based on evaluation with the Silhouette Coefficient and Sum of Squared Errors (SSE) methods to find the proper value of k, the feature-based method is found to give better results than those reported in the previous research.
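As an illustration of how the Silhouette Coefficient and the Sum of Squared Errors can be used together to choose k, the following is a minimal sketch and not the implementation used in this research. It assumes a toy set of 2-D vectors standing in for paragraph feature vectors, a simple k-means with deterministic initialization (the first k points), and Euclidean distance. The function names `kmeans`, `sse`, and `silhouette` are hypothetical and introduced only for this example.

```python
import math

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=50):
    """Plain k-means; the first k points seed the centroids (an assumption
    made here so the sketch is deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: labels match the current centroids
        centroids = new_centroids
    return labels, centroids

def sse(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))

def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for q, l in zip(points, labels) if l == c)
            / sum(1 for l in labels if l == c)
            for c in clusters if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy stand-in for paragraph feature vectors: three well-separated groups,
# interleaved so that the first k points fall in different groups.
points = [(0, 0), (10, 10), (20, 0), (0, 1), (10, 11),
          (20, 1), (1, 0), (11, 10), (21, 0)]

results = {}
for k in range(2, 6):
    labels, centroids = kmeans(points, k)
    results[k] = (sse(points, labels, centroids), silhouette(points, labels))

# Pick the k with the highest mean silhouette; SSE is kept for an elbow check.
best_k = max(results, key=lambda k: results[k][1])
for k, (err, sil) in results.items():
    print(f"k={k}  SSE={err:.3f}  silhouette={sil:.3f}")
print("best k by silhouette:", best_k)
```

SSE always tends to shrink as k grows, so it is usually read as an elbow curve, while the silhouette score peaks where clusters are both compact and well separated. Using both together is one common way to settle on a value of k.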
References
J. Brooke and G. Hirst, “Paragraph Clustering for Intrinsic Plagiarism Detection using a Stylistic Vector-Space Model with Extrinsic Features,” Notebook for PAN at CLEF, 2012.
J. Brooke, A. Hammond, and G. Hirst, “Unsupervised Stylistic Segmentation of Poetry with Change Curves and Extrinsic Features,” in CLfL@NAACL-HLT, pp. 26-35, June 2012.
P. Shrestha and T. Solorio, “Using a Variety of n-Grams for the Detection of Different Kinds of Plagiarism,” CLEF, 2013.
M. Jiffriya, M.A. Jahan, R.G. Ragel, and S. Deegalla, “AntiPlag: Plagiarism Detection on Electronic Submissions of Text Based Assignments,” Industrial and Information Systems (ICIIS) 8th IEEE International Conference, pp. 376-380, Peradeniya: IEEE, 2013.
A. Rosyadi, A.Z. Arifin, and D. Purwitasari, “Clusterization Based on Segment Using Paragraph to Identify Topic on Plagiarism Indication Detection,” Inspiration Journal, 6(2), 2016.
S. Ladda, N. Salim, and M.S. Binwahlan, “Automatic Text Summarization Using Feature Based Fuzzy Extraction,” Journal of Information Declaration, pp. 105-115, December 2008.
H.P. Luhn, “The Automatic Creation of Literature Abstracts: Advances in Automatic Text Summarization,” pp. 15, 1999.
A. Tagarelli and G. Karypis, “A Segment-Based Approach to Clustering Multi-Topic Documents,” Knowledge and Information Systems, pp. 563-595, 2013.
P.J. Rousseeuw, “Silhouettes: A Graphical Aid to the Interpretation and Validation of Cluster Analysis,” Journal of Computational and Applied Mathematics, 20: 53-65, 1987.
N.P.E. Merliana and A.J. Santoso, “Analysis of Best Cluster Number Determination on K-Means Clustering Method,” Proceeding Sendi_U, 2015.