From pixels to prognosis: A multi-modal attention-based framework for visceral adipose tissue estimation
Abstract
Obesity is a chronic disease that increases the risk of multi-organ damage as well as cardiovascular disease, diabetes, and certain cancers. It is strongly related to Visceral Adipose Tissue (VAT), the fat stored around the internal organs. New approaches to assessing VAT in large populations are essential to understand how obesity contributes to chronic disease progression. Various direct and indirect measures have been developed to quantify VAT. However, many of these techniques either fail to distinguish between types of body fat (e.g., subcutaneous versus visceral) or rely on imaging that is high-radiation, costly, or both (e.g., Computed Tomography). Each year, millions of people worldwide undergo hip or spine Dual-energy X-ray Absorptiometry (DXA) scans to screen for osteoporosis, as well as lateral spine (LS) scans to detect vertebral fractures. In this paper, we develop a multi-modal attention-based framework for VAT estimation from LS DXA scans and patient demographic information. We compare our results with baseline methods on two LS DXA datasets and perform a clinical analysis to demonstrate the framework's effectiveness. The proposed approach has the potential to enable cost-effective, non-invasive, and efficient quantification of VAT in people undergoing bone density assessment with LS scans. To the best of our knowledge, this is the first paper to predict VAT from LS DXA scans.
Document Type
Conference Proceeding
Date of Publication
1-1-2026
Volume
15974 LNCS
Publication Title
Lecture Notes in Computer Science
Publisher
Springer
School
Centre for Artificial Intelligence and Machine Learning (CAIML) / School of Science / Nutrition and Health Innovation Research Institute
RAS ID
84393
Funders
National Health and Medical Research Council / Raine Medical Research Foundation / Western Australian Future Health Research and Innovation Fund
Grant Number
NHMRC Number: APP1183570
Copyright
Subscription content
First Page
249
Last Page
259
Comments
Maqsood, A., Saleem, A., Sim, M., Suter, D., Radavelli-Bagatini, S., Hodgson, J. M., Prince, R. L., Zhu, K., Leslie, W. D., Schousboe, J. T., Lewis, J. R., & Gilani, S. Z. (2025). From pixels to prognosis: A multi-modal attention-based framework for visceral adipose tissue estimation. In Lecture Notes in Computer Science (pp. 249–259). https://doi.org/10.1007/978-3-032-05182-0_25