STÓR

Handling Class Imbalance via Counterfactual Generation in Medical Datasets

Qureshi, Asifa Mehmood and Kaushik, Abishek and Regan, Gilbert and McDaid, Kevin and McCaffery, Fergal (2024) Handling Class Imbalance via Counterfactual Generation in Medical Datasets. In: Proceedings of The 32nd Irish Conference on Artificial Intelligence and Cognitive Science, December 9-10, 2024, Dublin, Republic of Ireland.

Abstract

Real-world datasets often contain uneven class distributions which, if not handled properly, result in biased Machine Learning (ML) models. Class balancing is therefore important to avoid overfitting, improve model generalisation, and ensure fairness. Most state-of-the-art balancing techniques do not take into account the majority-class samples, which carry more of the dataset's distributional information. In this article, we therefore propose a method that generates counterfactuals from majority-class samples. The method takes an imbalanced dataset as input, normalises it, and trains a Support Vector Machine (SVM) classifier on it. The majority-class samples that lie near the decision boundary are then extracted and perturbed until they are classified as minority-class samples. The method is evaluated on two benchmark datasets: the Diagnostic Wisconsin Breast Cancer dataset and the Eye State Classification Electroencephalogram (EEG) dataset. The results show that our approach produces reasonable accuracy, Area Under Curve (AUC), and Geometric Mean (Gmean) scores. The F1-score for minority classes also improved when they were oversampled using counterfactuals. Moreover, the model achieved promising results when compared with state-of-the-art techniques.
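
The pipeline described in the abstract (normalise, train an SVM, select majority-class samples near the decision boundary, perturb them until they are classified as minority) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: a hand-rolled linear SVM trained by subgradient descent stands in for the paper's SVM, min-max scaling is used for normalisation, the perturbation moves each sample in small steps along the SVM weight vector, and the toy 2-D data and all function names (`normalise`, `train_linear_svm`, `generate_counterfactuals`) are hypothetical.

```python
import math

def normalise(X):
    """Min-max scale every feature to [0, 1] (assumed normalisation scheme)."""
    lo = [min(col) for col in zip(*X)]
    hi = [max(col) for col in zip(*X)]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in X]

def decision(w, b, x):
    """Signed score of the linear classifier; positive means minority (+1)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=1000):
    """Batch subgradient descent on the L2-regularised hinge loss.
    Assumed stand-in for the paper's SVM; labels must be -1 or +1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wj for wj in w], 0.0
        for x, yi in zip(X, y):
            if yi * decision(w, b, x) < 1:  # sample violates the margin
                gw = [g - yi * xj / n for g, xj in zip(gw, x)]
                gb -= yi / n
        w = [wj - lr * g for wj, g in zip(w, gw)]
        b -= lr * gb
    return w, b

def generate_counterfactuals(X, y, w, b, step=0.01, max_steps=100_000):
    """Perturb boundary-near majority samples (-1) until the SVM labels them +1.
    Moving along the weight vector is an assumption, not the paper's stated rule."""
    majority = [x for x, yi in zip(X, y) if yi == -1]
    needed = max(len(majority) - (len(X) - len(majority)), 0)
    # Keep correctly classified majority samples, nearest the boundary first.
    candidates = sorted((x for x in majority if decision(w, b, x) <= 0),
                        key=lambda x: abs(decision(w, b, x)))[:needed]
    norm = math.sqrt(sum(wj * wj for wj in w))
    if norm == 0:
        return []  # degenerate classifier: no direction to move in
    unit = [wj / norm for wj in w]
    counterfactuals = []
    for x in candidates:
        cf = list(x)
        for _ in range(max_steps):
            if decision(w, b, cf) > 0:  # now classified as minority
                counterfactuals.append(cf)
                break
            cf = [v + step * u for v, u in zip(cf, unit)]
    return counterfactuals

# Hypothetical toy data: 36 majority points on a grid, 6 minority points.
majority = [[float(i), float(j)] for i in range(6) for j in range(6)]
minority = [[8.0, 8.0], [8.0, 9.0], [9.0, 8.0],
            [9.0, 9.0], [8.5, 8.5], [7.5, 9.0]]
X = normalise(majority + minority)
y = [-1] * len(majority) + [1] * len(minority)

w, b = train_linear_svm(X, y)
cfs = generate_counterfactuals(X, y, w, b)
X_bal, y_bal = X + cfs, y + [1] * len(cfs)
print("counterfactuals:", len(cfs),
      "| majority:", y_bal.count(-1), "| minority:", y_bal.count(1))
```

In this sketch each counterfactual stops just past the decision boundary, so the synthetic minority samples sit close to real majority samples. The perturbed points are not clipped back into the [0, 1] range, which a real pipeline might want to enforce.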

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Boundary enhancement; Over-sampling; SVM; Decision boundary; Classification; Counterfactuals.
Subjects: Computer Science
Research Centres: Regulated Software Research Centre
Depositing User: Sean McGreal
Date Deposited: 17 Dec 2025 10:13
Last Modified: 17 Dec 2025 10:13
License: Creative Commons: Attribution-Noncommercial-Share Alike 4.0
URI: https://eprints.dkit.ie/id/eprint/987
