Comparison of Modified Nazief&Adriani and Modified Enhanced Confix Stripping algorithms for Madurese Language Stemming

Keywords: Madurese, Morphology, Stemming, Nazief & Adriani, ECS

Abstract

The Madurese language has a distinctive morphology, which can be exploited to recover the root (base) form of a word. This process is called stemming. A Madurese stemmer can serve as a building block for translating Madurese into Indonesian and other languages, and it can support plagiarism detection for Madurese text. Research on Madurese stemming is still rare. This study therefore stems Madurese words using two approaches: a modified Nazief & Adriani algorithm and a modified Enhanced Confix Stripping (ECS) algorithm. The test set comprised 1000 Madurese words: 630 prefixed words, 74 suffixed words, and 296 confixed words. The modified Nazief & Adriani algorithm performed better, with an accuracy of 88.8%, overstemming of 0.7%, and understemming of 10.5%. The modified ECS algorithm obtained an accuracy of 74.0%, with 0.4% overstemming and 25.6% understemming. The modified Nazief & Adriani algorithm was also faster on the same task, taking 13.31 seconds compared with 210.88 seconds for the modified ECS algorithm.
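The three reported rates (accuracy, overstemming, understemming) partition the test words, so they can be computed by labelling each stemmer output against a reference root. The sketch below is only an illustration of that bookkeeping, not the authors' evaluation code. The names `classify` and `evaluate` are hypothetical, the length-based rule for telling overstemming from understemming is an assumption, and the sample words are Indonesian toy examples rather than data from the study.

```python
from collections import Counter

LABELS = ("correct", "overstemming", "understemming")

def classify(predicted: str, gold: str) -> str:
    """Label one stemmer output against the reference root."""
    if predicted == gold:
        return "correct"
    # Assumption: an output shorter than the reference root means too much
    # was stripped (overstemming); otherwise too little was stripped.
    return "overstemming" if len(predicted) < len(gold) else "understemming"

def evaluate(pairs):
    """Return the percentage of each label over (predicted, gold) pairs."""
    counts = Counter(classify(predicted, gold) for predicted, gold in pairs)
    total = len(pairs)
    return {label: 100.0 * counts[label] / total for label in LABELS}

# Toy (predicted, gold) pairs; illustrative only, not from the paper.
sample = [
    ("makan", "makan"),    # exact match
    ("mak", "makan"),      # too much removed
    ("makanan", "makan"),  # suffix left in place
    ("baca", "baca"),      # exact match
]
rates = evaluate(sample)
print(rates)
```

Under this scheme the three percentages always sum to 100, which matches the figures reported above for both algorithms.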

Author Biographies

Enni Lindrawati, Universitas Amikom Yogyakarta

Magister of Informatics Engineering, Universitas Amikom Yogyakarta

Ema Utami, Universitas Amikom Yogyakarta

Magister of Informatics Engineering, Universitas Amikom Yogyakarta

Ainul Yaqin, Universitas Amikom Yogyakarta

Faculty of Computer Science, Universitas Amikom Yogyakarta

Published
2023-08-05
How to Cite
[1] E. Lindrawati, E. Utami, and A. Yaqin, “Comparison of Modified Nazief&Adriani and Modified Enhanced Confix Stripping algorithms for Madurese Language Stemming”, intensif, vol. 7, no. 2, pp. 276-289, Aug. 2023.