Document Type

Journal Article

Publication Title

Computer Vision and Image Understanding

Publisher

Elsevier

School

School of Science

RAS ID

61938

Funders

Australian Research Council

Grant Number

ARC Number: LP160100662

Grant Link

http://purl.org/au-research/grants/arc/LP160100662

Comments

Muthu, S., Tennakoon, R., Rathnayake, T., Hoseinnezhad, R., Suter, D., & Bab-Hadiashar, A. (2023). Generalized framework for image and video object segmentation using affinity learning and message passing GNNs. Computer Vision and Image Understanding, 236, article 103812. https://doi.org/10.1016/j.cviu.2023.103812

Abstract

Despite the significant amount of work reported in the computer vision literature, segmenting images or videos based on multiple cues, such as objectness, texture and motion, remains a challenge. This is particularly true when the number of objects to be segmented is not known, or when there are objects that are not classified in the training data (unknown objects). A possible remedy to this problem is to use graph-based clustering techniques such as Correlation Clustering. It is known that using long-range affinities (the Lifted Multicut) makes correlation clustering more accurate than using only adjacent affinities (the Multicut). However, the former is computationally expensive and hard to use. In this paper, we introduce a new framework that performs image/motion segmentation using an affinity learning module and a Message Passing Graph Neural Network (MPGNN). The affinity learning module uses a permutation-invariant affinity representation to overcome the multi-object problem. The paper shows, both theoretically and empirically, that the proposed MPGNN aggregates higher-order information and thereby converts the Lifted Multicut Problem (LMP) into a Multicut Problem (MP), which is easier and faster to solve. Importantly, the proposed method generalizes to different clustering problems with the same MPGNN architecture. For instance, our method produces competitive results for both single-image segmentation (on the BSDS dataset) and unsupervised video object segmentation (on the DAVIS17 dataset) by changing only the feature extraction part. In addition, using an ablation study on the proposed MPGNN architecture, we show that the way we update the parameterized affinities directly contributes to the accuracy of the results. © 2023 The Authors
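The abstract's central claim is that message passing lets affinities on adjacent edges absorb longer-range (lifted) structure, so only a Multicut over adjacent edges remains to be solved. The toy sketch below illustrates that idea with a plain mean-aggregation scheme on a four-node chain; it is not the paper's learned MPGNN, and the mixing weights, cosine-similarity affinity head, and example graph are all assumptions made for illustration.

```python
import numpy as np

def message_passing(H, A, rounds=2):
    """Toy mean-aggregation message passing (no learned weights).

    H: (n, d) node embeddings; A: (n, n) binary adjacency matrix.
    After each round a node's embedding mixes in its neighbours', so an
    affinity computed on an adjacent edge reflects `rounds`-hop context.
    """
    for _ in range(rounds):
        deg = A.sum(axis=1, keepdims=True)          # node degrees
        H = 0.5 * H + 0.5 * (A @ H) / np.maximum(deg, 1)
    return H

def edge_affinity(H, i, j):
    """Cosine similarity as a stand-in for a learned affinity head."""
    hi, hj = H[i], H[j]
    return float(hi @ hj / (np.linalg.norm(hi) * np.linalg.norm(hj) + 1e-9))

# Chain graph 0-1-2-3: nodes 0,1 belong to one segment; nodes 2,3 to another.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.array([[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]])

H2 = message_passing(H, A, rounds=2)
# Within-segment edge (0,1) ends up with a higher affinity than the
# cross-segment edge (1,2), even though only adjacent edges are scored.
```

After aggregation, thresholding or solving a Multicut over just the adjacent-edge affinities separates the chain at edge (1,2), the cut a Lifted Multicut with explicit long-range edges would also make on this graph.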

DOI

10.1016/j.cviu.2023.103812

Creative Commons License

Creative Commons Attribution 4.0 License
This work is licensed under a Creative Commons Attribution 4.0 License.
