Enhancing Vision Transformer Performance with Rotation Based Augmentation for Classifying Images of Colon Cancer Pathology
DOI: https://doi.org/10.29407/intensif.v9i2.24918

Keywords: Vision Transformer, Data Augmentation, Image Classification, Colon Cancer

Abstract
Background: Classifying colon cancer pathology images remains an essential challenge in medical imaging, particularly for enabling early diagnosis and effective intervention. Vision Transformer (ViT) models have recently shown great promise across a variety of computer vision tasks, including medical image classification. However, their performance is often limited by the scarcity of annotated medical datasets and the intrinsic variability of histopathology images. Objective: This study aims to enhance ViT performance on colon cancer pathology classification by introducing a targeted data augmentation strategy, with a particular focus on rotation-based augmentation. Methods: We propose a data augmentation pipeline that applies controlled transformations, chiefly rotations, flips, and related geometric operations, to increase the size and diversity of the training data and to replicate the real-world variations in tissue orientation commonly seen in colon pathology slides. The models are trained on 10,000 JPEG colon cancer pathology images, each with a resolution of 768 x 768 pixels. To assess the impact of augmentation, we compare the accuracy, sensitivity, and specificity of ViT models trained with and without the proposed pipeline. Results: Rotation-based augmentation improves ViT performance, achieving up to 99.30% accuracy and 99.50% sensitivity while preserving training times. These gains are especially relevant in real-world pathology settings, where slide orientation varies widely and can affect classification consistency. Conclusion: The proposed rotation-centric data augmentation technique enhances ViT performance in the classification of colon cancer pathology images.
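The rotation-and-flip augmentation described in the Methods section can be sketched as follows. This is a minimal illustrative example, not the authors' actual pipeline: it assumes a square image patch stored as a NumPy array and generates the eight orientation variants (four 90-degree rotations, each with a horizontal flip) that are label-preserving for histopathology tiles, since tissue has no canonical orientation on a slide.

```python
import numpy as np

def rotation_flip_augment(image: np.ndarray) -> list:
    """Generate orientation variants of a pathology image patch.

    Rotations by multiples of 90 degrees and flips preserve both the
    class label and the pixel grid (no interpolation artifacts), which
    makes them well suited to orientation-invariant pathology slides.
    """
    variants = []
    for k in range(4):                       # 0, 90, 180, 270 degree rotations
        rotated = np.rot90(image, k)
        variants.append(rotated)
        variants.append(np.fliplr(rotated))  # horizontal flip of each rotation
    return variants

# Example: a dummy 768 x 768 RGB patch yields 8 orientation variants,
# each keeping the original 768 x 768 resolution.
patch = np.zeros((768, 768, 3), dtype=np.uint8)
augmented = rotation_flip_augment(patch)
print(len(augmented))  # 8
```

In a training setup, such variants would typically be produced on the fly by the data loader rather than materialized in advance, so the augmentation adds orientation diversity without inflating storage or, as the paper reports, training time.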
License
Copyright (c) 2025 Rudy Eko Prasetya, M. Arief Soeleman, Farrikh Al Zami, Affandy, Aris Marjuni, Mohammad Iqbal Saryuddin Assaqty

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Copyright on any article is retained by the author(s).
- Authors grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
- The article and any associated published material are distributed under the Creative Commons Attribution-ShareAlike 4.0 International License.