Author Identifier

Kun Hu: https://orcid.org/0000-0002-6891-8059

Document Type

Journal Article

Publication Title

Neurocomputing

Volume

637

Publisher

Elsevier

School

School of Science

Publication Unique Identifier

10.1016/j.neucom.2025.130077

RAS ID

78501

Comments

Hu, K., He, F., Schembri, A., & Wang, Z. (2025). Graph traverse reference network for sign language corpus retrieval in the wild. Neurocomputing, 637, 130077. https://doi.org/10.1016/j.neucom.2025.130077

Abstract

Sign languages, which use the visual-manual modality to convey meaning, are the primary languages of deaf communities as well as of hearing individuals who are unable to speak. In recent years, there has been explosive growth in the number of sign language videos available on video streaming and social media platforms. Given the size of these corpora, sign language users often face significant challenges in effectively acquiring the information they need. We therefore propose a novel deep learning architecture, the Graph Traverse Reference Network (GTRN), which allows visual signing queries to retrieve relevant sign language videos (documents) from a large corpus. GTRN introduces a traverse graph, which provides coarse-to-fine reference information hierarchically, from frame-level to body-part-level observations. A reference-based attention mechanism is devised to obtain the embedding of the visual input at each level, allowing computation to be allocated and processed at different locations, such as local devices and central servers. A contrastive learning strategy optimizes GTRN toward a joint latent space in which queries and documents are aligned by their meanings. Moreover, GTRN is compatible with existing general-purpose visual representation foundation models, whose embeddings can be used as its frame-level reference. To the best of our knowledge, this is one of the first studies to use visual signing queries to retrieve sign language videos in a real-world setting, and comprehensive experiments demonstrate the effectiveness of the proposed method.
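
To illustrate the kind of contrastive alignment the abstract describes, the sketch below shows one common form such an objective can take: a symmetric InfoNCE-style loss that pulls matching query and document embeddings together in a shared latent space. This is a minimal sketch under assumptions; the function name, embedding dimension, and temperature are illustrative and do not reflect the authors' implementation.

```python
# Hypothetical sketch (not the paper's code): symmetric contrastive loss
# aligning visual signing query embeddings with sign language video
# (document) embeddings in a joint latent space.
import torch
import torch.nn.functional as F

def contrastive_retrieval_loss(query_emb: torch.Tensor,
                               doc_emb: torch.Tensor,
                               temperature: float = 0.07) -> torch.Tensor:
    """query_emb, doc_emb: (batch, dim); row i of each forms a matching pair."""
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    logits = q @ d.t() / temperature                 # (batch, batch) similarities
    targets = torch.arange(q.size(0), device=q.device)
    # Symmetric cross-entropy: query-to-document and document-to-query.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Example usage with placeholder embeddings (e.g. produced from
# frame-level reference features of a visual foundation model).
queries = torch.randn(8, 512)
documents = torch.randn(8, 512)
loss = contrastive_retrieval_loss(queries, documents)
```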

DOI

10.1016/j.neucom.2025.130077

Creative Commons License

Creative Commons Attribution 4.0 License
This work is licensed under a Creative Commons Attribution 4.0 License.
