Qureshi, Asifa Mehmood and Kaushik, Abishek and Regan, Gilbert and McDaid, Kevin and McCaffery, Fergal (2024) Handling Class Imbalance via Counterfactual Generation in Medical Datasets. In: Proceedings of The 32nd Irish Conference on Artificial Intelligence and Cognitive Science, December 9-10, 2024, Dublin, Republic of Ireland.
Abstract
Real-world datasets often have uneven class distributions which, if not handled properly, result in biased Machine Learning (ML) models. Class balancing is therefore important to avoid overfitting, improve model generalisation, and ensure fairness. Most state-of-the-art balancing techniques do not take into account the majority-class samples, which carry more of the dataset's distributional information. In this article, we therefore propose a method that generates counterfactuals from majority-class samples. The method takes an imbalanced dataset as input, normalises it, and trains a Support Vector Machine (SVM) classifier on it. The majority-class samples that lie near the decision boundary are then extracted and perturbed until they are classified as minority-class samples. The method is evaluated on two benchmark datasets: the Diagnostic Wisconsin Breast Cancer dataset and the Eye State Classification Electroencephalogram (EEG) dataset. The results show that our approach produces reasonable accuracy, Area Under Curve (AUC), and Geometric Mean (Gmean) scores. The F1-score for the minority classes also improved when they were oversampled using counterfactuals. Moreover, the model achieved promising results when compared with state-of-the-art techniques.
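
The pipeline described in the abstract (normalise, train an SVM, select majority-class samples near the decision boundary, perturb them until they are classified as minority) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything the abstract does not state is an assumption made for illustration: min-max normalisation, a linear SVM fitted by hinge-loss subgradient descent, a "near the boundary" band of 1.0 on the decision score, perturbation in fixed steps along the SVM weight vector, and the hypothetical helper names `normalise`, `train_linear_svm`, and `generate_counterfactuals`. The toy data is synthetic and unrelated to the benchmark datasets.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalise(X):
    """Min-max scale every feature to [0, 1] (the scaling choice is an assumption)."""
    cols = list(zip(*X))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in X]


def train_linear_svm(X, y, epochs=200, lr=0.05, lam=0.01, seed=0):
    """Fit a linear SVM by hinge-loss subgradient descent; labels are -1 or +1."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * (dot(w, X[i]) + b) < 1
            w = [wj * (1 - lr * lam) for wj in w]  # L2 shrinkage
            if violated:  # hinge-loss subgradient step
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def generate_counterfactuals(X, y, w, b, majority=-1, band=1.0,
                             step=0.05, max_steps=200):
    """Push near-boundary majority samples across the boundary (assumed scheme)."""
    norm = math.sqrt(dot(w, w))
    if norm == 0:
        return []
    # Unit vector pointing from the majority side towards the minority side.
    direction = [-majority * wj / norm for wj in w]
    out = []
    for x, label in zip(X, y):
        side = majority * (dot(w, x) + b)
        # Keep only majority samples on their own side, within the band.
        if label != majority or not (0 <= side <= band):
            continue
        cf = list(x)
        for _ in range(max_steps):
            if majority * (dot(w, cf) + b) < 0:  # now classified as minority
                out.append(cf)
                break
            cf = [c + step * d for c, d in zip(cf, direction)]
    return out


# Synthetic, imbalanced toy data: 40 majority (-1) and 8 minority (+1) points.
rng = random.Random(1)
majority_pts = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
minority_pts = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(8)]
X = normalise(majority_pts + minority_pts)
y = [-1] * 40 + [1] * 8

w, b = train_linear_svm(X, y)
counterfactuals = generate_counterfactuals(X, y, w, b)
print(len(counterfactuals), "synthetic minority samples generated")
```

In this sketch the generated points would be appended to the training set with the minority label before retraining. Perturbed points can leave the normalised [0, 1] range, so a practical version would likely need clipping or a plausibility check, which the abstract does not describe.
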
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Uncontrolled Keywords: | Boundary enhancement; Over-sampling; SVM; Decision boundary; Classification; Counterfactuals. |
| Subjects: | Computer Science |
| Research Centres: | Regulated Software Research Centre |
| Depositing User: | Sean McGreal |
| Date Deposited: | 17 Dec 2025 10:13 |
| Last Modified: | 17 Dec 2025 10:13 |
| License: | Creative Commons: Attribution-Noncommercial-Share Alike 4.0 |
| URI: | https://eprints.dkit.ie/id/eprint/987 |