STÓR

Investigating the Utility of Synthetic Data in the Detection of Lung Cancer

Atike, Israa and Qureshi, Asifa Mehmood and Kaushik, Abishek (2025) Investigating the Utility of Synthetic Data in the Detection of Lung Cancer. In: 33rd International Conference on Artificial Intelligence and Cognitive Science (AICS 2025), December 1st & 2nd, 2025, Hyatt Centric Liberties Hotel, Dublin, Ireland.

Abstract

Artificial Intelligence (AI) is widely used in healthcare to automate clinical tasks. However, AI models require large medical datasets for training, and such datasets are not readily available because of privacy concerns. Synthetic data generation can address this challenge by increasing dataset volume and improving diversity without compromising patient privacy. This paper therefore analyses the utility of synthetic data in the detection of lung cancer, one of the most commonly diagnosed cancers. To this end, we trained and tested Machine Learning (ML) classifiers on a real dataset, first substituting different proportions of the real data with synthetic data and then replacing the real data entirely with synthetic data. We compared classifier performance using accuracy, precision, recall, F1-score, and Area Under the Curve (AUC). The results show that synthetic data can be used in conjunction with real data; however, replacing real data completely with synthetic data requires further experiments across a wider range of generative models before useful conclusions can be drawn.
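
The abstract does not spell out how the real and synthetic data were mixed or how the metrics were computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes binary labels (1 = cancer, 0 = no cancer), a training set of fixed size in which a chosen fraction of real rows is swapped for synthetic rows, and standard metric definitions. The names `substitute_synthetic`, `binary_metrics`, `roc_auc`, and the `Row` layout are hypothetical and introduced here for illustration only.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Hypothetical row layout: (feature vector, binary label).
Row = Tuple[Sequence[float], int]


def substitute_synthetic(real: List[Row], synthetic: List[Row],
                         fraction: float, seed: int = 0) -> List[Row]:
    """Swap `fraction` of the real rows for synthetic rows.

    The training-set size stays constant; fraction=1.0 means the real
    data is replaced entirely (an assumed reading of the abstract).
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    n_synth = round(len(real) * fraction)
    if n_synth > len(synthetic):
        raise ValueError("not enough synthetic rows")
    rng = random.Random(seed)  # seeded for reproducible splits
    mixed = rng.sample(real, len(real) - n_synth) + rng.sample(synthetic, n_synth)
    rng.shuffle(mixed)
    return mixed


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the sample size, which is
    acceptable for a small illustrative dataset.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Under these assumptions, a classifier would be trained once per substitution level, for example on `substitute_synthetic(real_train, synthetic_rows, f)` for several values of `f` up to 1.0, and always evaluated on held-out real data so that the metrics remain comparable across levels.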

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: GANs; cGAN; Lung cancer detection; ML; AI; Synthetic data.
Subjects: Computer Science
Research Centres: Regulated Software Research Centre
Depositing User: Sean McGreal
Date Deposited: 17 Dec 2025 09:32
Last Modified: 17 Dec 2025 09:50
License: Creative Commons: Attribution-Noncommercial-Share Alike 4.0
URI: https://eprints.dkit.ie/id/eprint/985
