MGSANet: A multiscale graph spatial alignment network for weakly aligned RGB-thermal object detection

Abstract

Most current research on color-thermal (i.e., RGB-T) object detection assumes that the RGB and thermal images are strictly aligned. In practice, however, insufficient spatiotemporal synchronization, stereo disparity arising from the cameras' mounting positions, and errors in image-pair registration mean that the positions of the same object in the RGB and thermal images do not completely overlap. This position shift can cause distortion, trailing, and blurring during image fusion, which reduces detection accuracy. To address this challenge, we propose a novel multiscale graph spatial alignment network (MGSANet), which can effectively alleviate the negative effects of cross-modal image misalignment. Specifically, we represent the feature maps extracted from the RGB and thermal images by the backbone network as a graph structure, and use a graph attention network (GAT) to model the spatial position deviation relationship. Furthermore, considering the multiscale characteristics of the objects, we represent the feature maps with multiscale graphs. We then align the RGB and thermal feature maps in a latent feature space according to the learned deviation relationship for object detection. In addition, because RGB-T datasets captured from the unmanned aerial vehicle (UAV) perspective are scarce, and to verify object detection performance on different platforms, we construct an RGB-T object detection dataset collected by a UAV platform, named KUSTDrone. We conducted experiments on datasets collected by vehicle and UAV platforms, respectively. Experimental results demonstrate that MGSANet outperforms competitive methods for weakly aligned RGB-T object detection.
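The idea of treating feature-map cells as graph nodes and using attention to estimate a cross-modal position shift can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`gat_offsets`, `grid_neighbors`), the local-window edge rule, the single attention head, and the use of an attention-weighted mean displacement as the "deviation" are all assumptions made for illustration. Only the scoring form (a LeakyReLU over a learned vector applied to concatenated projected features, followed by a softmax over neighbors) follows the standard GAT formulation.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, h):
    # Project a node feature vector h with weight matrix W (list of rows).
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def grid_neighbors(r, c, rows, cols, radius=1):
    # Hypothetical edge rule: connect a cell to every cell within `radius`
    # (itself included), clipped to the grid borders.
    return [(rr, cc)
            for rr in range(max(0, r - radius), min(rows, r + radius + 1))
            for cc in range(max(0, c - radius), min(cols, c + radius + 1))]

def gat_offsets(rgb, thermal, W, a, radius=1):
    """For each thermal cell, attend over nearby RGB cells with a GAT-style
    score and return the attention-weighted displacement (dy, dx).

    rgb, thermal: rows x cols grids of feature vectors (lists of floats).
    W: shared projection matrix; a: attention vector of length 2 * len(W).
    """
    rows, cols = len(rgb), len(rgb[0])
    offsets = []
    for r in range(rows):
        row_out = []
        for c in range(cols):
            hi = matvec(W, thermal[r][c])
            nbrs = grid_neighbors(r, c, rows, cols, radius)
            # e_ij = LeakyReLU(a^T [W h_i || W h_j])
            scores = [
                leaky_relu(sum(x * y for x, y in
                               zip(a, hi + matvec(W, rgb[rr][cc]))))
                for rr, cc in nbrs
            ]
            # Numerically stable softmax over the neighborhood.
            m = max(scores)
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            alpha = [e / z for e in exps]
            # Assumed read-out: expected displacement under the attention.
            dy = sum(w * (rr - r) for w, (rr, _) in zip(alpha, nbrs))
            dx = sum(w * (cc - c) for w, (_, cc) in zip(alpha, nbrs))
            row_out.append((dy, dx))
        offsets.append(row_out)
    return offsets
```

A multiscale variant, in the spirit of the abstract, could apply the same routine to feature maps taken at several backbone resolutions; how the per-scale results would be combined is not specified here and is left open in this sketch.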

Document Type

Journal Article

Date of Publication

1-1-2025

Publication Title

IEEE Transactions on Geoscience and Remote Sensing

Publisher

IEEE

School

School of Engineering

Funders

Yunnan Fundamental Research Projects (202401AW070019, 202301AV070003) / Youth Project of the National Natural Science Foundation of China (62201237) / Major Science and Technology Projects in Yunnan Province (202302AG050009) / Deanship of Scientific Research at King Khalid University (RGP)

Comments

Wang, Q., Sun, Y., Shen, T., Al-Antary, M., Alasmary, H., & Waqas, M. (2025). MGSANet: A multiscale graph spatial alignment network for weakly aligned RGB-thermal object detection. IEEE Transactions on Geoscience and Remote Sensing, 64. https://doi.org/10.1109/TGRS.2025.3647051

Copyright

subscription content

Link to publisher version (DOI)

10.1109/TGRS.2025.3647051