Date of Award
Master of Science (Computer Science) Research
School of Science
Affordance Learning is linked to the study of interactions between robots and objects, including how robots perceive objects by scene understanding. This area has been popular in the Psychology, which has recently come to influence Computer Vision. In this way, Computer Vision has borrowed the concept of affordance from Psychology in order to develop Visual-Semantic recognition systems, and to develop the capabilities of robots to interact with objects, in particular. However, existing systems of Affordance Learning are still limited to detecting and segmenting object affordances, which is called Affordance Segmentation. Further, these systems are not designed to develop specific abilities to reason about affordances. For example, a Visual-Semantic system, for captioning a scene, can extract information from an image, such as “a person holds a chocolate bar and eats it”, but does not highlight the affordances: “hold” and “eat”. Indeed, these affordances and others commonly appear within all aspects of life, since affordances usually connect to actions (from a linguistic view, affordances are generally known as verbs in sentences). Due to the above mentioned limitations, this thesis aims to develop systems of Affordance Learning for Visual-Semantic Perception. These systems can be built using Deep Learning, which has been empirically shown to be efficient for performing Computer Vision tasks.
There are two goals of the thesis: (1) study what are the key factors that contribute to the performance of Affordance Segmentation and (2) reason about affordances (Affordance Reasoning) based on parts of objects for Visual-Semantic Perception. In terms of the first goal, the thesis mainly investigates the feature extraction module as this is one of the earliest steps in learning to segment affordances.
The thesis finds that the quality of feature extraction from images plays a vital role in improved performance of Affordance Segmentation. With regard to the second goal, the thesis infers affordances from object parts to reason about part-affordance relationships. Based on this approach, the thesis devises an Object Affordance Reasoning Network that can learn to construct relationships between affordances and object parts. As a result, reasoning about affordance becomes achievable in the generation of scene graphs of affordances and object parts. Empirical results, obtained from extensive experiments, show the potential of the system (that the thesis developed) towards Affordance Reasoning from Scene Graph Generation.
Access to this thesis is embargoed until 2nd February 2025.
Nguyen Duc Minh, C. (2021). Affordance learning for visual-semantic perception. https://ro.ecu.edu.au/theses/2443
Available for download on Sunday, February 02, 2025