An explainable transformer-based model for phishing email detection: A large language model approach
Author Identifier (ORCID)
Iqbal H. Sarker: https://orcid.org/0000-0003-1740-5517
Abstract
Phishing emails are a serious cyber threat: attackers send deceptive messages intended to steal confidential information or cause financial harm. Attackers, often posing as trustworthy entities, exploit technological advances and increasingly sophisticated techniques, making phishing harder to detect and prevent. Despite extensive academic research, phishing detection remains a formidable, ongoing challenge in cybersecurity. In this paper, we present a fine-tuned transformer-based masked language model, RoBERTa (Robustly Optimized BERT Pretraining Approach), for phishing email detection. We use a phishing email dataset and apply preprocessing techniques to clean the data and address class imbalance, thereby improving model performance. Experimental results show that our fine-tuned model outperforms traditional machine learning models, achieving an accuracy of 98.45%. To support model transparency and user trust, we propose a hybrid explanation approach, LITA (LIME-Transformer Attribution), which integrates Local Interpretable Model-Agnostic Explanations (LIME) and Transformers Interpret. The proposed method provides more consistent and user-friendly insights, mitigating local attribution inconsistencies between the two explanation approaches. Moreover, the study shows how the model arrives at its predictions by presenting positive and negative contribution scores from LIME, Transformers Interpret, and LITA.
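To make the idea of merging two sets of token-level contribution scores concrete, here is a minimal sketch in Python. The abstract does not state how LITA combines the LIME and Transformers Interpret outputs, so the rule used here (scale each map by its largest absolute score, then average per token) is purely an assumption for illustration and not the authors' method. The function names and the toy scores are hypothetical.

```python
from typing import Dict


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores into [-1, 1] by dividing by the largest absolute value."""
    peak = max((abs(v) for v in scores.values()), default=0.0)
    if peak == 0.0:
        return {tok: 0.0 for tok in scores}
    return {tok: v / peak for tok, v in scores.items()}


def combine_attributions(
    lime_scores: Dict[str, float], ti_scores: Dict[str, float]
) -> Dict[str, float]:
    """Average two normalized attribution maps token by token.

    Assumption (not from the paper): a token missing from one map
    contributes 0.0 from that map.
    """
    a = normalize(lime_scores)
    b = normalize(ti_scores)
    tokens = sorted(set(a) | set(b))
    return {tok: (a.get(tok, 0.0) + b.get(tok, 0.0)) / 2.0 for tok in tokens}


# Hypothetical toy scores: positive values push toward "phishing",
# negative values push toward "legitimate".
lime_scores = {"verify": 0.4, "account": 0.2, "thanks": -0.1}
ti_scores = {"verify": 0.8, "account": -0.2, "thanks": -0.4}

combined = combine_attributions(lime_scores, ti_scores)
for token, score in combined.items():
    print(f"{token}: {score:+.3f}")
```

In this toy input the two explainers disagree on the sign of "account"; averaging the normalized scores leaves it with a small positive contribution, which is one simple way a combined view can dampen a local inconsistency between two attribution methods.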
Document Type
Journal Article
Date of Publication
3-1-2026
Volume
277
Publication Title
Computer Networks
Publisher
Elsevier
School
Centre for Securing Digital Futures / School of Science
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Comments
Uddin, M. A., Mahiuddin, M., & Sarker, I. H. (2026). An explainable transformer-based model for phishing email detection: A large language model approach. Computer Networks, 277, 112061. https://doi.org/10.1016/j.comnet.2026.112061