Date of Award

2026

Keywords

deep reinforcement learning, robotics, unstructured terrain navigation, offroad navigation, supervised learning

Document Type

Thesis

Publisher

Edith Cowan University

Degree Name

Master of Engineering Science

School

School of Engineering

First Supervisor

Alexander Rassau

Second Supervisor

Douglas Chai

Abstract

Advancements in the field of Deep Learning have ushered in a boom in autonomous navigation research. Most of the work being conducted in this space, however, has focused on on-road urban navigation scenarios, with unstructured outdoor terrain navigation receiving far less attention. Given the wide range of applications that exist for legged and wheeled ground robots in off-road environments in areas such as agriculture, mining and disaster recovery, there is a growing need for research to improve the navigational capabilities of mobile robots deployed in these challenging environments.

A promising candidate for these navigation challenges in the unstructured off-road domain is Deep Reinforcement Learning (DRL), which has shown significant success in achieving human-level expertise in a wide range of control problems. For higher-dimensional problems, however, DRL suffers from sample inefficiency and requires large amounts of compute resources for convergence. Modern supervised learning methods have been used to achieve state-of-the-art results in urban navigation benchmarks but struggle to generalise to out-of-distribution data. While expert policies for urban on-road navigation are readily available in both synthetic and real-world forms, unstructured outdoor environments pose significant challenges because the large amounts of annotated navigational data needed for agent learning are not available.

Based on an in-depth study of published literature on DRL, Imitation Learning (IL) and synthetic and real-world sensor data generation, this research investigates a novel method for combining DRL and IL techniques to achieve robust autonomous navigation in unstructured outdoor terrain. The proposed approach has been validated in simulated environments that include both prototype-phase designs with simplified terrain and obstacle features and more realistic designs containing high-fidelity photogrammetry assets representing Western Australian environments. The 3D environment models are implemented using the NVIDIA Isaac Sim simulation platform, which supports realistic physics-based autonomous robot training workflows. The developed 3D environment models are used for generating navigational expert policies, which are subsequently used to improve the sample efficiency of off-policy Reinforcement Learning (RL) algorithms. The expert trajectories are incorporated into the early learning phase of the RL algorithms using the proposed framework. The performance of the proposed approach is tested over four different 3D environment models and two different RL algorithms. The results of the experiments indicate that using expert trajectories in the replay buffer of off-policy RL algorithms can make learning up to four times more sample-efficient.

The work presented in this thesis can be used to develop more robust and sample-efficient end-to-end navigational policies for off-road and unstructured terrain. Application of this approach will lead to the advancement of autonomous navigation capabilities in the agriculture, mining, disaster-recovery and transportation industries.

Comments

Author also known as Hewage Dulitha Nisal Dabare

Link to publisher version (DOI)

10.25958/6ykw-3e22