Comparison of Modified Nazief & Adriani and Modified Enhanced Confix Stripping Algorithms for Madurese Language Stemming
Abstract
Madurese has a distinctive morphology that can be exploited to recover the root (base) form of a word, a process known as stemming. A Madurese stemmer can serve as a building block for translating Madurese into Indonesian or other languages, and for detecting plagiarism in Madurese text. Research on Madurese stemming is still rare. This study therefore stems Madurese words using a modified Nazief & Adriani algorithm and a modified Enhanced Confix Stripping (ECS) algorithm. The test set contained 1000 Madurese words: 630 prefixed words, 74 suffixed words, and 296 confixed words. The modified Nazief & Adriani algorithm performed better, with an accuracy of 88.8%, overstemming of 0.7%, and understemming of 10.5%. The modified ECS algorithm reached an accuracy of 74.0%, with 0.4% overstemming and 25.6% understemming. On the same workload, the modified Nazief & Adriani algorithm was also faster, taking 13.31 seconds compared with 210.88 seconds for the modified ECS algorithm.
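The reported figures can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' evaluation code: the `StemmingResult` class and its field names are hypothetical, and the per-category word counts (888, 7, 105 and 740, 4, 256) are not stated in the abstract but are inferred from the reported percentages over the 1000-word test set.

```python
from dataclasses import dataclass


@dataclass
class StemmingResult:
    """Hypothetical container for one algorithm's evaluation outcome."""
    name: str
    correct: int       # words stemmed to the expected root
    overstemmed: int   # words counted as overstemming errors
    understemmed: int  # words counted as understemming errors
    seconds: float     # reported processing time

    @property
    def total(self) -> int:
        return self.correct + self.overstemmed + self.understemmed

    def rate(self, count: int) -> float:
        """Share of the test set as a percentage, rounded to one decimal."""
        return round(100.0 * count / self.total, 1)


# Test-set composition as stated in the abstract.
prefix_words, suffix_words, confix_words = 630, 74, 296
dataset_size = prefix_words + suffix_words + confix_words

# Counts inferred from the reported percentages (assumption, see lead-in).
na = StemmingResult("Modified Nazief & Adriani", 888, 7, 105, 13.31)
ecs = StemmingResult("Modified ECS", 740, 4, 256, 210.88)

for r in (na, ecs):
    print(
        f"{r.name}: accuracy {r.rate(r.correct)}%, "
        f"overstemming {r.rate(r.overstemmed)}%, "
        f"understemming {r.rate(r.understemmed)}%, "
        f"time {r.seconds} s"
    )

# Ratio of the two reported running times.
speedup = ecs.seconds / na.seconds
print(f"Time ratio (ECS / Nazief & Adriani): {speedup:.1f}x")
```

Under these assumptions the three rates for each algorithm sum to 100%, and the two running times differ by a factor of roughly 16.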
Copyright (c) 2023 Enni Lindrawati, Ema Utami, Ainul Yaqin
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.