Comparing Data Mining Classification for Online Fraud Victim Profile in Indonesia
Abstract
Classification is one of the most widely used data mining techniques. It builds a model or function, known as a classifier, that predicts the class of objects whose class label is unknown. Applications of classification include pattern recognition, medical diagnosis, identifying weaknesses in organizational systems, and classifying changes in financial markets. This study has two objectives: to develop a profile of victims of online fraud, and to compare commonly used data mining classification algorithms in terms of Accuracy, Classification Error, Precision, and Recall. Data were collected through an online survey administered with Google Forms. Naïve Bayes, Decision Tree, and Random Forest, three popular classification algorithms, were used for classification and prediction based on the sociodemographic characteristics of Indonesia's online crime victims. The results show that Naïve Bayes and Decision Tree are slightly superior to Random Forest: Naïve Bayes and Decision Tree each achieve an accuracy of 77.3%, while Random Forest achieves 76.8%.
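The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes a binary victim / non-victim labelling and the standard confusion-matrix definitions; the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the study's data or tooling.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, classification error, precision, and recall
    for a binary task from true and predicted labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the chosen positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    return {
        "accuracy": accuracy,
        # Classification error is the complement of accuracy.
        "classification_error": 1.0 - accuracy,
        # Guard against division by zero when a class is never predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical toy labels (1 = victim, 0 = non-victim); NOT data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, accuracy and classification error always sum to 1, so two models with equal accuracy can still differ in precision and recall, which is why the comparison reports all four measures.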
Copyright (c) 2023 Sunardi Sunardi, Abdul Fadlil, Nur Makkie Perdana Kusuma
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.