Document Type

Journal Article

Publication Title

Scientific Reports

Volume

11

Issue

1

Publisher

Nature

School

School of Medical and Health Sciences / Centre for Precision Health

RAS ID

42758

Funders

National Institute on Aging

Commonwealth Scientific and Industrial Research Organisation

Comments

Shishegar, R., Cox, T., Rolls, D., Bourgeat, P., Doré, V., Lamb, F., . . . Burnham, S. C. (2021). Using imputation to provide harmonized longitudinal measures of cognition across AIBL and ADNI. Scientific Reports, 11, article 23788.

https://doi.org/10.1038/s41598-021-02827-6

Abstract

To improve understanding of Alzheimer's disease, large observational studies are needed to increase power for more nuanced analyses. Combining data across existing observational studies represents one solution. However, the disparity of such datasets makes this a non-trivial task. Here, a machine learning approach was applied to impute longitudinal neuropsychological test scores across two observational studies, namely the Australian Imaging, Biomarkers and Lifestyle Study (AIBL) and the Alzheimer's Disease Neuroimaging Initiative (ADNI), providing an overall harmonised dataset. MissForest, a machine learning algorithm, capitalises on the underlying structure and relationships of the data to impute test scores not measured in one study, aligning it to the other study. Results demonstrated that simulated missing values from one dataset could be accurately imputed, and that imputation of actual missing data in one dataset showed comparable discrimination (p < 0.001) for clinical classification to measured data in the other dataset. Further, the increased power of the overall harmonised dataset was demonstrated by observing a significant association between CVLT-II test scores (imputed for ADNI) and PET amyloid-β in MCI APOE-ε4 homozygotes in the imputed data (N = 65) but not in the original AIBL dataset (N = 11). These results suggest that MissForest can provide a practical solution for data harmonisation using imputation across studies to improve power for more nuanced analyses.

DOI

10.1038/s41598-021-02827-6

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Research Themes

Health

Priority Areas

Neuroscience and neurorehabilitation

Included in

Neurosciences Commons
