MGSANet: A multiscale graph spatial alignment network for weakly aligned RGB-thermal object detection
Abstract
Most current research on color-thermal (i.e., RGB-T) object detection assumes that the RGB and thermal images are strictly aligned. In practice, however, insufficient spatiotemporal synchronization, stereo disparity arising from the cameras' mounting positions, and errors in the image-pair registration process mean that the same object does not fully overlap in the RGB and thermal images. This position shift can cause distortion, trailing, and blurring during image fusion, which reduces detection accuracy. To address this challenge, we propose a novel multiscale graph spatial alignment network (MGSANet) that alleviates the negative effects of cross-modal image misalignment. Specifically, we represent the feature maps extracted from the RGB and thermal images by the backbone network as graph structures and use a graph attention network (GAT) to model the spatial position deviation relationship. Furthermore, to account for the multiscale characteristics of objects, we represent the feature maps with multiscale graphs. We then align the RGB and thermal feature maps in a latent feature space according to the learned deviation relationship for object detection. In addition, given the scarcity of RGB-T datasets captured from an unmanned aerial vehicle (UAV) perspective, and to verify object detection performance on different platforms, we construct an RGB-T object detection dataset collected by a UAV platform, named KUSTDrone. We conduct experiments on datasets collected by vehicle and UAV platforms, respectively. Experimental results demonstrate that MGSANet outperforms competitive methods for weakly aligned RGB-T object detection.
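The graph-based alignment idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes each feature-map cell is a graph node, connects each thermal node to the RGB nodes within a small spatial radius, and applies one GAT-style attention pass so that each thermal location gathers the RGB features that best match it. The function names, the radius, the 2x2 average pooling used to form a coarser scale, and the hand-set `W` and `a` (which would be learned in a real model) are all hypothetical choices made for illustration.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def grid_neighbors(r, c, rows, cols, radius=1):
    """Grid cells within `radius` of (r, c), including (r, c) itself."""
    return [(i, j)
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))]

def cross_modal_gat(thermal, rgb, W, a, radius=1):
    """One GAT-style pass: every thermal node attends to nearby RGB nodes.

    thermal, rgb: rows x cols grids of feature vectors (lists of floats).
    W: shared projection matrix; a: attention vector of length 2 * d_out.
    Returns the attention-weighted RGB features and the attention weights.
    """
    rows, cols = len(thermal), len(thermal[0])
    aligned = [[None] * cols for _ in range(rows)]
    weights = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            q = matvec(W, thermal[r][c])
            nbrs = grid_neighbors(r, c, rows, cols, radius)
            keys = [matvec(W, rgb[i][j]) for i, j in nbrs]
            # GAT score: LeakyReLU(a . [q || k]); `q + k` concatenates the lists.
            scores = [leaky_relu(sum(x * y for x, y in zip(a, q + k)))
                      for k in keys]
            # Numerically stable softmax over the neighbourhood.
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            alpha = [e / total for e in exps]
            aligned[r][c] = [sum(al * k[t] for al, k in zip(alpha, keys))
                             for t in range(len(q))]
            weights[r][c] = dict(zip(nbrs, alpha))
    return aligned, weights

def avg_pool2(grid):
    """2x2 average pooling, used here to build a coarser-scale graph."""
    rows, cols, d = len(grid) // 2, len(grid[0]) // 2, len(grid[0][0])
    return [[[sum(grid[2 * r + dr][2 * c + dc][t]
                  for dr in (0, 1) for dc in (0, 1)) / 4.0
              for t in range(d)]
             for c in range(cols)]
            for r in range(rows)]

def make_grid(rows, cols, hot, d=2):
    """Toy feature map: zeros everywhere except a unit response at `hot`."""
    grid = [[[0.0] * d for _ in range(cols)] for _ in range(rows)]
    grid[hot[0]][hot[1]][0] = 1.0
    return grid

# Toy misalignment: the same object sits one column apart in the two modalities.
thermal = make_grid(4, 4, hot=(1, 1))
rgb = make_grid(4, 4, hot=(1, 2))
W = [[1.0, 0.0], [0.0, 1.0]]   # identity projection (hand-set, not learned)
a = [0.0, 0.0, 4.0, 0.0]       # scores depend on the RGB response (hand-set)

# Run the same attention pass on the full-resolution and the pooled graphs.
fine_aligned, fine_weights = cross_modal_gat(thermal, rgb, W, a)
coarse_aligned, coarse_weights = cross_modal_gat(avg_pool2(thermal),
                                                 avg_pool2(rgb), W, a)
best_rgb_cell = max(fine_weights[1][1], key=fine_weights[1][1].get)
```

In this toy setup, the thermal node at the object's location places its largest attention weight on the shifted RGB cell, which is the sense in which attention over a graph neighbourhood can encode a spatial offset. How MGSANet actually builds its graphs, combines scales, and performs the alignment is described in the paper itself, not in this sketch.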
Document Type
Journal Article
Date of Publication
1-1-2025
Publication Title
IEEE Transactions on Geoscience and Remote Sensing
Publisher
IEEE
School
School of Engineering
Funders
Yunnan Fundamental Research Projects (202401AW070019, 202301AV070003) / Youth Project of the National Natural Science Foundation of China (62201237) / Major Science and Technology Projects in Yunnan Province (202302AG050009) / Deanship of Scientific Research at King Khalid University (RGP)
Copyright
subscription content
Comments
Wang, Q., Sun, Y., Shen, T., Al-Antary, M., Alasmary, H., & Waqas, M. (2025). MGSANet: A multiscale graph spatial alignment network for weakly aligned RGB-thermal object detection. IEEE Transactions on Geoscience and Remote Sensing, 64. https://doi.org/10.1109/TGRS.2025.3647051