Author Identifier (ORCID)

Ahmad Mohsin: https://orcid.org/0000-0001-9023-0851

Iqbal H. Sarker: https://orcid.org/0000-0003-1740-5517

Abstract

Artificial Intelligence, particularly machine learning (ML) algorithms, plays a crucial role in detecting cyberattacks, including anomalies and intrusions. However, ML models trained on imbalanced cybersecurity datasets often struggle to accurately detect minority data instances and potential threats, thereby weakening overall system security. Despite extensive research, a persistent challenge is the inadequate explanation of model predictions for minority data classes. This study addresses these limitations by developing a generative AI-based approach to handling minority classes in anomaly detection, incorporating concept drift handling and explainability analysis. We introduce an over-sampling technique, CGGReaT, designed to enhance the presence of minority classes in the anomaly detection domain. Leveraging Large Language Models (LLMs) in a hybrid approach, we use the pre-trained transformer-based LLM DistilGPT-2 to generate synthetic tabular data. Extensive experiments on two publicly available benchmark datasets, UNSW-NB15 and CIC-IDS2017, underscore the efficacy of our proposed approach. We employ concept drift detection and adaptation techniques to maintain reliable and sustainable ML performance. To enhance interpretability, eXplainable Artificial Intelligence (XAI) methods, including SHAP and LIME, are employed to quantify feature contributions to model outputs. The experiments reveal that testing ML algorithms on datasets balanced with synthetic samples generated by CGGReaT boosts prediction accuracy on the UNSW-NB15 and CIC-IDS2017 datasets, compared to classifiers tested on imbalanced datasets.

Keywords

Anomaly detection, concept drift, cybersecurity, data imbalance, explainable AI, LLM

Document Type

Journal Article

Date of Publication

5-1-2026

Volume

44

Issue

2

Publication Title

New Generation Computing

Publisher

Springer

School

Centre for Securing Digital Futures

Creative Commons License

Creative Commons Attribution 4.0 License
This work is licensed under a Creative Commons Attribution 4.0 License.

Comments

Mwiga, K. J., Dida, M. A., Mohsin, A., & Sarker, I. H. (2026). A generative AI method for minority class handling in anomaly detection with drift and explainability analysis. New Generation Computing, 44. https://doi.org/10.1007/s00354-026-00318-8


Link to publisher version (DOI)

10.1007/s00354-026-00318-8