Authors
Rosita Shishegar
Timothy Cox
David Rolls
Pierrick Bourgeat
Vincent Doré
Fiona Lamb
Joanne Robertson
Simon M. Laws, Edith Cowan University
Tenielle Porter, Edith Cowan University
Jurgen Fripp
Duygu Tosun
Paul Maruff
Greg Savage
Christopher C. Rowe
Colin L. Masters
Michael W. Weiner
Victor L. Villemagne
Samantha C. Burnham
Document Type
Journal Article
Publication Title
Scientific Reports
Volume
11
Issue
1
Publisher
Nature
School
School of Medical and Health Sciences / Centre for Precision Health
RAS ID
42758
Funders
National Institute on Aging
Commonwealth Scientific and Industrial Research Organisation
Abstract
To improve understanding of Alzheimer’s disease, large observational studies are needed to increase power for more nuanced analyses. Combining data across existing observational studies is one solution, but the disparity between such datasets makes this a non-trivial task. Here, a machine learning approach was applied to impute longitudinal neuropsychological test scores across two observational studies, the Australian Imaging, Biomarkers and Lifestyle Study (AIBL) and the Alzheimer’s Disease Neuroimaging Initiative (ADNI), providing an overall harmonised dataset. MissForest, a machine learning algorithm, capitalises on the underlying structure and relationships of the data to impute test scores not measured in one study, aligning it with the other study. Results demonstrated that simulated missing values from one dataset could be accurately imputed, and that imputation of actual missing data in one dataset showed comparable discrimination (p < 0.001) for clinical classification to measured data in the other dataset. Further, the increased power of the overall harmonised dataset was demonstrated by observing a significant association between CVLT-II test scores (imputed for ADNI) and PET Amyloid-β in MCI APOE-ε4 homozygotes in the imputed data (N = 65) but not in the original AIBL dataset (N = 11). These results suggest that MissForest can provide a practical solution for data harmonisation using imputation across studies to improve power for more nuanced analyses.
DOI
10.1038/s41598-021-02827-6
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Comments
Shishegar, R., Cox, T., Rolls, D., Bourgeat, P., Doré, V., Lamb, F., . . . Burnham, S. C. (2021). Using imputation to provide harmonized longitudinal measures of cognition across AIBL and ADNI. Scientific Reports, 11, article 23788.
https://doi.org/10.1038/s41598-021-02827-6