Classification of Diabetes Mellitus (DM) Using the Naïve Bayes Method with Chi-Square Variable Selection

Authors

  • Farhan Arizal Ginanjar, Universitas Nahdlatul Ulama Purwokerto
  • Ambar Winarni, Universitas Nahdlatul Ulama Purwokerto
  • Nur'aini Muhassanah, Universitas Nahdlatul Ulama Purwokerto

DOI:

https://doi.org/10.29407/gj.v10i2.27878

Keywords:

Diabetes Mellitus, Classification, Naïve Bayes, Chi-Square

Abstract

Diabetes mellitus (DM) is a chronic disease that can cause serious complications, making early detection essential. Technological advances enable the use of data mining techniques, particularly the Naïve Bayes classification method, to support early diabetes detection. Although Chi-Square variable selection is known to improve Naïve Bayes accuracy, studies examining the impact of different significance levels remain limited. Therefore, this study applies the Naïve Bayes method with and without Chi-Square variable selection at three significance levels (α = 0.05, α = 0.01, and α = 0.001) to evaluate the effect of each level on classification performance and to identify the optimal significance level. The results show that Naïve Bayes without variable selection achieved an accuracy of 87.50%, precision of 93.01%, and recall of 86.21%. After applying Chi-Square selection, accuracy, precision, and recall all exceeded this baseline at every significance level. At α = 0.05, accuracy reached 87.88%, with precision of 93.06% and recall of 86.85%. At α = 0.01, accuracy increased to 88.46% and precision to 94.25%, with recall of 86.53%. The highest accuracy and recall were obtained at α = 0.001 (accuracy 88.65%, precision 94.19%, recall 86.86%), although precision was marginally higher at α = 0.01. These findings indicate that Chi-Square variable selection effectively enhances the performance of the Naïve Bayes algorithm for diabetes classification.
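The selection procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline or data: the four binary features, their names, the 40-row toy dataset, and the helper functions are all hypothetical, and the p-value formula assumes a 2x2 table (one degree of freedom). The sketch keeps a feature when its Chi-Square p-value falls below α, then fits a small Bernoulli Naïve Bayes model with Laplace smoothing on the retained features.

```python
import math
from collections import Counter

def chi_square_2x2(feature, labels):
    """Pearson chi-square statistic and p-value for two binary variables (df = 1)."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    f_tot, l_tot = Counter(feature), Counter(labels)
    stat = 0.0
    for f in (0, 1):
        for c in (0, 1):
            expected = f_tot[f] * l_tot[c] / n
            if expected > 0:
                stat += (observed[(f, c)] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return stat, math.erfc(math.sqrt(stat / 2))

def select_features(columns, labels, alpha):
    """Keep the features whose chi-square p-value is below alpha."""
    return [name for name, col in columns.items()
            if chi_square_2x2(col, labels)[1] < alpha]

def train_bernoulli_nb(columns, labels, names):
    """Return {class: (prior, {feature: P(feature = 1 | class)})} with Laplace smoothing."""
    model = {}
    for c in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == c]
        probs = {name: (sum(columns[name][i] for i in idx) + 1) / (len(idx) + 2)
                 for name in names}
        model[c] = (len(idx) / len(labels), probs)
    return model

def predict(model, row):
    """Pick the class with the highest log-posterior for one row (dict of 0/1 values)."""
    best, best_score = None, -math.inf
    for c, (prior, probs) in model.items():
        score = math.log(prior)
        for name, p in probs.items():
            score += math.log(p if row[name] == 1 else 1 - p)
        if score > best_score:
            best, best_score = c, score
    return best

def training_accuracy(columns, labels, names):
    """Accuracy on the training rows themselves (illustration only, no cross-validation)."""
    model = train_bernoulli_nb(columns, labels, names)
    hits = sum(predict(model, {name: columns[name][i] for name in names}) == labels[i]
               for i in range(len(labels)))
    return hits / len(labels)

# Hypothetical toy data: 20 positive rows followed by 20 negative rows.
labels = [1] * 20 + [0] * 20
columns = {
    "strong":   [1] * 20 + [0] * 20,                     # identical to the label
    "moderate": [1] * 15 + [0] * 5 + [1] * 6 + [0] * 14,
    "weak":     [1] * 13 + [0] * 7 + [1] * 6 + [0] * 14,
    "noise":    [1] * 10 + [0] * 10 + [1] * 10 + [0] * 10,
}

for alpha in (0.05, 0.01, 0.001):
    selected = select_features(columns, labels, alpha)
    print(alpha, selected, training_accuracy(columns, labels, selected))
```

A stricter α retains fewer features, which is the trade-off the study evaluates. The accuracy printed here is measured on the training rows only, so it says nothing about the figures reported in the abstract.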

 

Author Biographies

  • Ambar Winarni, Universitas Nahdlatul Ulama Purwokerto

    Lecturer

  • Nur'aini Muhassanah, Universitas Nahdlatul Ulama Purwokerto

    Lecturer

Published

2026-06-04

How to Cite

Classification of Diabetes Mellitus (DM) Using the Naïve Bayes Method with Chi-Square Variable Selection. (2026). Generation Journal, 10(2), 63-75. https://doi.org/10.29407/gj.v10i2.27878
