Breast Cancer Classification Using Support Vector Machine Method and RBF Kernel Function Based on Clinical Data, Cancer Stage, and Immunohistochemistry Results

Authors

  • Sherly Nur Ekawati, Medical Laboratory Technology, Faculty of Health Sciences, Universitas Muhammadiyah Kudus, Indonesia
  • mudy wati
  • Arya Iswara, Graduate Programme of Medical Technology Laboratory Science, Universitas Muhammadiyah Semarang, Indonesia
  • Ahmad Ilham, Department of Informatics, Universitas Muhammadiyah Semarang, Indonesia
  • Astri Aditya Wardhani, Regional General Hospital of SIMO Boyolali, Central Java

DOI:

https://doi.org/10.30742/jikw.v15i1.4908

Keywords:

Clinical Data, Immunohistochemistry Results, RBF Kernel, Breast cancer subtype classification, SVM.

Abstract

Background: Breast cancer is one of the leading causes of cancer-related death among Indonesian women. Early detection and accurate classification of molecular subtypes are crucial for selecting appropriate therapy. Objective: This study aims to build and evaluate a breast cancer subtype classification model using a support vector machine (SVM) with a radial basis function (RBF) kernel. Methods: The model classifies four subtypes, Luminal A, Luminal B, HER2+, and triple-negative breast cancer (TNBC), from a combination of patient clinical data (age, tumor size, and tumor location), cancer stage, and the expression status of the hormone receptors ER and PR. The workflow comprises data preprocessing, feature selection, model training with cross-validation, and performance evaluation using accuracy, precision, recall, F1-score, and ROC-AUC. Results: Most patients were aged 40–60 years, and most tumors measured between 1 and 3 cm. Luminal A and Luminal B subtypes were observed more frequently in patients aged ≥50 years and at early stages, whereas HER2+ and TNBC were observed mostly in patients under 50 years with advanced-stage disease. The baseline SVM-RBF model achieved high overall accuracy (91%) but performed poorly on minority subtypes: for HER2+, recall and F1-score were both 0, indicating bias toward the majority class. Conclusion: The SVM algorithm with the RBF kernel is effective overall for modeling breast cancer subtype classification from clinical data, cancer stage, and immunohistochemistry results, although the baseline model's bias toward the majority class limits its detection of minority subtypes such as HER2+.
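The pattern reported in the abstract, high accuracy alongside a recall of 0 for a minority subtype, can be illustrated with a minimal Python sketch. The label counts below are hypothetical and are not the study's data; they are chosen only so that a classifier that always predicts the majority class reaches 91% accuracy. Two classes are used instead of four to keep the arithmetic easy to follow, and the function name `per_class_report` is an illustrative assumption.

```python
def per_class_report(y_true, y_pred):
    """Return overall accuracy and per-class precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    report = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Define a metric as 0.0 when its denominator is zero.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, report


# Hypothetical labels (NOT the study's data): 91 majority-class cases
# and 9 minority-class cases.
y_true = ["Luminal A"] * 91 + ["HER2+"] * 9
# A degenerate classifier that always predicts the majority class.
y_pred = ["Luminal A"] * 100

accuracy, report = per_class_report(y_true, y_pred)
# Macro-averaged recall weights every class equally, so it exposes
# the failure that overall accuracy hides.
macro_recall = sum(m["recall"] for m in report.values()) / len(report)

print(f"accuracy     = {accuracy:.2f}")
print(f"macro recall = {macro_recall:.2f}")
for label, metrics in report.items():
    print(label, {name: round(value, 2) for name, value in metrics.items()})
```

In this toy setting the minority class is never predicted, so its recall and F1-score are both 0 even though 91 of 100 predictions are correct. This is why per-class metrics, rather than accuracy alone, are needed to judge a subtype classifier trained on imbalanced data.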

Published

2026-03-31

Section

Original Research Article