Author Identifier
Zheng Guo
https://orcid.org/0000-0003-2105-4537
Xingang Li
https://orcid.org/0000-0003-0252-154X
Wei Wang
Document Type
Journal Article
Publication Title
EPMA Journal
Publisher
Springer
School
School of Medical and Health Sciences / Centre for Precision Health
RAS ID
44424
Funders
Funding information: https://doi.org/10.1007/s13167-022-00283-4
Grant Number
NHMRC Number: 1112767
Abstract
Background
Recognising the early signs of ischemic stroke (IS) in emergency settings has been challenging. Machine learning (ML), a robust tool for predictive, preventive and personalised medicine (PPPM/3PM), offers a possible solution and can produce accurate predictions from data processed in real time.
Methods
This investigation evaluated 4999 IS patients among a total of 10,476 adults included in the initial dataset, and 1076 IS subjects among 3935 participants in the external validation dataset. Six ML-based models for the prediction of IS were trained on the initial dataset of 10,476 participants, which was split into a training set (80%) and an internal validation set (20%). Selected clinical laboratory features routinely assessed at admission were used to inform the models. Model performance was mainly evaluated by the area under the receiver operating characteristic curve (AUC). Three additional techniques, permutation feature importance (PFI), local interpretable model-agnostic explanations (LIME), and SHapley Additive exPlanations (SHAP), were applied to explain the black-box ML models.
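The split and the evaluation metric described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `train_validation_split` and `roc_auc`, the random seed, and the toy labels and scores are all assumptions made for illustration. Only the 80%/20% proportion and the dataset size of 10,476 come from the abstract.

```python
import random

def train_validation_split(records, train_fraction=0.8, seed=0):
    """Shuffle records and split them into training and internal validation sets.

    Hypothetical helper; the abstract states only the 80%/20% proportions.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    labels are 0/1, scores are predicted risks; tied scores share an average rank.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Placeholder participant IDs standing in for the 10,476 records.
train_ids, validation_ids = train_validation_split(range(10476))

# Toy labels and scores, not study data.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

With these inputs the split yields 8381 training and 2095 validation records. Whether the study rounded the same way, or stratified the split by outcome, is not stated in the abstract.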
Results
Fifteen routine haematological and biochemical features were selected to establish ML-based models for the prediction of IS. The XGBoost-based model achieved the highest predictive performance, reaching AUCs of 0.91 (0.90–0.92) and 0.92 (0.91–0.93) in the internal and external datasets, respectively. At the global level, PFI revealed that the demographic feature age, the routine haematological parameters haemoglobin and neutrophil count, and the biochemical analytes total protein and high-density lipoprotein cholesterol were more influential on the model's prediction than the remaining features. LIME and SHAP showed similar local feature attribution explanations.
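Permutation feature importance, as used for the global explanation above, can be sketched as the mean drop in AUC when one feature column is shuffled. The sketch below is illustrative only and assumes a generic scoring function rather than the study's XGBoost model. The names `auc`, `permutation_importance` and `toy_predict`, the number of repeats, and the synthetic two-feature data are all hypothetical.

```python
import random

def auc(labels, scores):
    """Pairwise AUC: share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean drop in AUC after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = auc(labels, [predict(r) for r in rows])
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[col] for r in rows]
            rng.shuffle(column)  # break the link between this feature and the outcome
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - auc(labels, [predict(r) for r in permuted]))
        importances.append(sum(drops) / n_repeats)
    return importances

# Synthetic data: the outcome depends only on feature 0; feature 1 is noise.
data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [1 if r[0] > 0.5 else 0 for r in rows]

def toy_predict(row):
    """Stand-in for a fitted model: the risk score is simply feature 0."""
    return row[0]

importances = permutation_importance(toy_predict, rows, labels)
```

In this toy setting, shuffling feature 0 reduces the AUC substantially, while shuffling feature 1 leaves it unchanged because the stand-in model never reads that column. A feature the model relies on, such as age in the reported results, would behave like feature 0.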
Conclusion
In the context of PPPM/3PM, we used selected predictors obtained from the results of common blood tests to develop and validate ML-based models for the diagnosis of IS. The XGBoost-based model offered the most accurate prediction. Because it draws on the individualised patient profile, this prediction tool is simple and quick to administer. It shows promise for supporting subjective decision making in resource-limited settings or primary care, thereby shortening the time to treatment and improving outcomes after IS.
DOI
10.1007/s13167-022-00283-4
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Comments
Zheng, Y., Guo, Z., Zhang, Y., Shang, J., Yu, L., Fu, P., . . . Wang, W. (2022). Rapid triage for ischemic stroke: A machine learning-driven approach in the context of predictive, preventive and personalised medicine. EPMA Journal, 13(2), 285-298. https://doi.org/10.1007/s13167-022-00283-4