Comparison of Modified Nazief&Adriani and Modified Enhanced Confix Stripping algorithms for Madurese Language Stemming

Keywords: Madurese, Morphology, Stemming, Nazief & Adriani, ECS


The Madurese language has a distinctive morphology that can be exploited to recover the root word of an inflected form, a process known as stemming. A Madurese stemmer can serve as a building block for translating Madurese into Indonesian or other languages, and for detecting plagiarism in Madurese text. Research on Madurese stemming is still rare. This study therefore aims to find Madurese root words using a modified Nazief & Adriani algorithm and a modified Enhanced Confix Stripping (ECS) algorithm. The study used 1000 Madurese words, consisting of 630 prefixed words, 74 suffixed words, and 296 words with confixes. The modified Nazief & Adriani algorithm performed better, with an accuracy of 88.8%, overstemming of 0.7%, and understemming of 10.5%. The modified ECS algorithm achieved an accuracy of 74.0%, with 0.4% overstemming and 25.6% understemming. On the same task, the modified Nazief & Adriani algorithm was also faster: it took 13.31 seconds, whereas the modified ECS algorithm took 210.88 seconds.
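The three reported rates for each algorithm sum to 100%, which suggests every test word falls into exactly one of three outcomes. The Python sketch below illustrates how such rates could be computed. It is not the authors' evaluation code: the length-based rule in classify is an assumption made for illustration, and the word pairs are placeholder strings (not real Madurese words) arranged to reproduce the counts implied by the reported percentages for the modified Nazief & Adriani algorithm on 1000 words.

```python
from collections import Counter

def classify(predicted: str, gold: str) -> str:
    """Label one stemmer output against its expected root word."""
    if predicted == gold:
        return "correct"
    # Assumption: an output shorter than the expected root means too much
    # was stripped (overstemming); anything else counts as understemming.
    if len(predicted) < len(gold):
        return "overstemmed"
    return "understemmed"

def rates(pairs):
    """Return the percentage of each outcome over (predicted, gold) pairs."""
    counts = Counter(classify(predicted, gold) for predicted, gold in pairs)
    total = len(pairs)
    return {label: 100.0 * counts[label] / total
            for label in ("correct", "overstemmed", "understemmed")}

# Placeholder data only: 888 correct, 7 overstemmed, 105 understemmed,
# the counts implied by 88.8%, 0.7% and 10.5% of 1000 words.
pairs = (
    [("root", "root")] * 888
    + [("roo", "root")] * 7
    + [("prefixroot", "root")] * 105
)
result = rates(pairs)
print(result)

# Ratio of the reported running times (ECS time / Nazief & Adriani time).
speedup = 210.88 / 13.31
print(round(speedup, 2))
```

Under these assumptions the accuracy is simply the share of "correct" labels. The running-time ratio works out to a little under 16, so the modified Nazief & Adriani algorithm was roughly 16 times faster in the reported run.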


Author Biographies

Enni Lindrawati, Universitas Amikom Yogyakarta

Magister of Informatics Engineering, Universitas Amikom Yogyakarta

Ema Utami, Universitas Amikom Yogyakarta

Magister of Informatics Engineering, Universitas Amikom Yogyakarta

Ainul Yaqin, Universitas Amikom Yogyakarta

Faculty of Computer Science, Universitas Amikom Yogyakarta


How to Cite
E. Lindrawati, E. Utami, and A. Yaqin, “Comparison of Modified Nazief&Adriani and Modified Enhanced Confix Stripping algorithms for Madurese Language Stemming”, intensif, vol. 7, no. 2, pp. 276-289, Aug. 2023.