Author Identifier (ORCID)
Helge Janicke: https://orcid.org/0000-0002-1345-2829
Iqbal H. Sarker: https://orcid.org/0000-0003-1740-5517
Abstract
Short Message Service (SMS) is a widely used and cost-effective communication medium that has unfortunately become a frequent target for unsolicited messages, commonly known as SMS spam. With the rapid adoption of smartphones and increased Internet connectivity, SMS spam has emerged as a prevalent threat. Spammers have recognized the critical role SMS plays in modern communication, making it a prime target for abuse. As cybersecurity threats continue to evolve, the volume of SMS spam has increased substantially in recent years. Moreover, the unstructured format of SMS data creates significant challenges for SMS spam detection, making it more difficult to combat spam attacks. In this paper, we present an optimized and fine-tuned transformer-based language model to address the problem of SMS spam detection. We use a benchmark SMS spam dataset to analyze this spam detection model. Additionally, we apply pre-processing techniques to obtain clean, noise-free data and address the class imbalance problem by leveraging text augmentation techniques. The experiments showed that our optimized, fine-tuned RoBERTa model, a variant of BERT (Bidirectional Encoder Representations from Transformers), achieved a high accuracy of 99.84%. To further enhance model transparency, we incorporate Explainable Artificial Intelligence (XAI) techniques that compute positive and negative coefficient scores, offering insight into the model's decision-making process. We also evaluate the performance of traditional machine learning models as a baseline for comparison. This comprehensive analysis demonstrates the significant impact language models can have on addressing complex text-based challenges within the cybersecurity landscape.
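The idea of positive and negative coefficient scores mentioned in the abstract can be illustrated with a small sketch. This is not the paper's RoBERTa model or its XAI method: it is a toy multinomial Naive Bayes baseline, standing in for the kind of traditional machine learning model used for comparison, in which each token receives a signed log-odds weight. The training messages, the function names (`tokenize`, `fit_log_odds`, `explain`), and the smoothing parameter `alpha` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy messages invented for illustration; not drawn from the paper's dataset.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free cash claim your prize", "spam"),
    ("are we still meeting for lunch", "ham"),
    ("call me when you are home", "ham"),
    ("see you at lunch tomorrow", "ham"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real pre-processing)."""
    return text.lower().split()


def fit_log_odds(samples, alpha=1.0):
    """Return per-token log-odds (spam vs. ham) and the class-prior log-odds."""
    counts = {"spam": Counter(), "ham": Counter()}
    n_docs = Counter()
    for text, label in samples:
        counts[label].update(tokenize(text))
        n_docs[label] += 1
    vocab = set(counts["spam"]) | set(counts["ham"])
    # Laplace-smoothed token totals per class.
    totals = {c: sum(counts[c].values()) + alpha * len(vocab) for c in counts}
    weights = {
        tok: math.log((counts["spam"][tok] + alpha) / totals["spam"])
        - math.log((counts["ham"][tok] + alpha) / totals["ham"])
        for tok in vocab
    }
    prior = math.log(n_docs["spam"] / n_docs["ham"])
    return weights, prior


def explain(text, weights, prior):
    """Classify a message and return each token's signed contribution.

    Positive contributions push toward spam, negative ones toward ham;
    tokens unseen in training contribute zero.
    """
    contributions = [(tok, weights.get(tok, 0.0)) for tok in tokenize(text)]
    score = prior + sum(w for _, w in contributions)
    return ("spam" if score > 0 else "ham"), contributions


weights, prior = fit_log_odds(TRAIN)
label, contributions = explain("claim your free prize", weights, prior)
for token, weight in contributions:
    print(f"{token:>6}: {weight:+.3f}")
print("predicted:", label)
```

In this sketch the sign of each weight plays the role of a positive or negative coefficient: tokens seen mostly in spam carry positive scores, and tokens seen mostly in legitimate messages carry negative ones. Explanation methods for transformer models typically approximate such per-token scores rather than reading them directly from the model.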
Document Type
Journal Article
Date of Publication
10-1-2025
Volume
11
Issue
5
Publication Title
Digital Communications and Networks
Publisher
Elsevier
School
Centre for Securing Digital Futures
Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
First Page
1504
Last Page
1518
Comments
Uddin, M. A., Islam, M. N., Maglaras, L., Janicke, H., & Sarker, I. H. (2025). ExplainableDetector: Exploring transformer-based language modeling approach for SMS spam detection with explainability analysis. Digital Communications and Networks, 11(5), 1504–1518. https://doi.org/10.1016/j.dcan.2025.07.008