Author Identifier (ORCID)

Liang Wang: https://orcid.org/0000-0001-5339-7484

Abstract

Objectives: Rapid discrimination of infections caused by Mycobacterium tuberculosis (MTB) and non-tuberculous mycobacteria (NTM) is crucial in clinical settings. Despite overlapping clinical and radiological features, the two require markedly different therapeutic approaches and public health responses. Current laboratory methods are time-consuming and complex, underscoring the urgent need for a simple and efficient diagnostic tool to inform public health decision-making.

Methods: Demographic, haematological and biochemical data were collected from two hospitals in Jiangsu province, China, between December 2018 and October 2024. A total of 400 patients were included in the training cohort, with 66 patients used for external validation. Six machine learning models were developed using routine laboratory features, and their performance was evaluated using multiple metrics.

Results: The random forest (RF) model outperformed others using 49 routine lab features, achieving 82.71% accuracy in the internal cohort and 87.69% in external validation. SHapley Additive exPlanations (SHAP) model identified the top 10 critical features influencing model decisions, namely, chloride, sodium, gender, prealbumin, high-density lipoprotein, procalcitonin, albumin, globulin, total protein and creatine. Based on these indicators, an interactive web-based tool was developed (https://mtb-ntm.streamlit.app).

Discussion: The features identified by the model align with established clinical parameters and existing studies. Certain previously underestimated variables, such as Cl and Na, exhibited substantial importance in distinguishing between MTB and NTM, offering valuable insights for the development of decision-support tools.

Conclusion: Routine laboratory indicators coupled with the RF model demonstrated potential capacity as an auxiliary diagnostic tool for discriminating MTB and NTM disease, offering effective medical support in resource-limited and remote settings.

Document Type

Journal Article

Date of Publication

10-1-2025

Volume

32

Issue

1

PubMed ID

41106844

Publication Title

BMJ Health and Care Informatics

Publisher

BMJ Publishing Group

School

School of Medical and Health Sciences

RAS ID

87991

Funders

Research Foundation for Advanced Talents of Guangdong Provincial People’s Hospital (KY012023293) / Research Training Program Australian Commonwealth Government

Creative Commons License

Creative Commons Attribution-Noncommercial 4.0 License

Comments

Tang, J., Xiong, X., Huang, T., Zhang, Y., Yao, L., Zhang, W., Xie, Y., Liang, Q., Tan, Z., Jiang, K., Liu, X., & Wang, L. (2025). Rapid discrimination of Mycobacterium tuberculosis and non-tuberculous mycobacteria disease via interpretive machine learning analysis of routine laboratory tests. BMJ Health & Care Informatics, 32(1), e101575. https://doi.org/10.1136/bmjhci-2025-101575

Included in

Diagnosis Commons

Link to publisher version (DOI)

10.1136/bmjhci-2025-101575