Research outputs 2022 to 2026

Long future frame prediction using optical flow informed deep neural networks for enhancement of robotic teleoperation in high latency environments

Md Moniruzzaman, Edith Cowan University
Alexander Rassau, Edith Cowan University
Douglas Chai, Edith Cowan University
Syed M. S. Islam, Edith Cowan University

Document Type

Journal Article

Publication Title

Journal of Field Robotics

Publisher

Wiley

School

School of Engineering / School of Science

RAS ID

53136

Funders

DSTG

Comments

Moniruzzaman, M. D., Rassau, A., Chai, D., & Islam, S. M. S. (2023). Long future frame prediction using optical flow informed deep neural networks for enhancement of robotic teleoperation in high latency environments. Journal of Field Robotics, 40(2), 393-425. https://doi.org/10.1002/rob.22135

Abstract

High latency in teleoperation has a significant negative impact on operator performance. While deep learning has revolutionized many domains recently, it has not previously been applied to teleoperation enhancement. We propose a novel approach to predict video frames deep into the future using neural networks informed by synthetically generated optical flow information. This can be employed in teleoperated robotic systems that rely on video feeds for operator situational awareness. We have used the image-to-image translation technique as a basis for the prediction of future frames. The Pix2Pix conditional generative adversarial network (cGAN) has been selected as a base network. Optical flow components reflecting real-time control inputs are added to the standard RGB channels of the input image. We have experimented with three data sets of 20,000 input images each that were generated using our custom-designed teleoperation simulator with a 500-ms delay added between the input and target frames. Structural Similarity Index Measures (SSIMs) of 0.60 and Multi-SSIMs of 0.68 were achieved when training the cGAN with three-channel RGB image data. With the five-channel input data (incorporating optical flow) these values improved to 0.67 and 0.74, respectively. Applying Fleiss' κ gave a score of 0.40 for three-channel RGB data, and 0.55 for five-channel optical flow-added data. We are confident the predicted synthetic frames are of sufficient quality and reliability to be presented to teleoperators as a video feed that will enhance teleoperation. To the best of our knowledge, we are the first to attempt to reduce the impacts of latency through future frame prediction using deep neural networks.
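The five-channel input described in the abstract can be illustrated with a minimal sketch, assuming the two extra channels are the horizontal and vertical components of a dense flow field. The function names `make_five_channel_input` and `uniform_flow`, the scaling choices, and the idea of a spatially uniform flow derived from a control input are all hypothetical illustrations; the abstract does not specify how the authors generate or normalize the synthetic flow.

```python
import numpy as np


def uniform_flow(height, width, dx, dy):
    """Hypothetical stand-in for synthetic optical flow.

    Returns an (H, W, 2) field in which every pixel moves by (dx, dy).
    A real system would derive a spatially varying field from the
    operator's control inputs; this is only a placeholder.
    """
    flow = np.empty((height, width, 2), dtype=np.float32)
    flow[..., 0] = dx  # horizontal component
    flow[..., 1] = dy  # vertical component
    return flow


def make_five_channel_input(rgb, flow):
    """Stack an RGB frame (H, W, 3) with a flow field (H, W, 2).

    Assumptions (not taken from the paper):
    - RGB is uint8 and is rescaled to [-1, 1], a common convention
      for Pix2Pix-style generators.
    - Flow is divided by its largest absolute value so it also lies
      in [-1, 1]; an all-zero flow is left unchanged.
    """
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("rgb must have shape (H, W, 3)")
    if flow.ndim != 3 or flow.shape[2] != 2:
        raise ValueError("flow must have shape (H, W, 2)")
    if rgb.shape[:2] != flow.shape[:2]:
        raise ValueError("rgb and flow must share height and width")

    rgb_scaled = rgb.astype(np.float32) / 127.5 - 1.0
    flow_f = flow.astype(np.float32)
    max_mag = np.abs(flow_f).max()
    flow_scaled = flow_f / max_mag if max_mag > 0 else flow_f

    # Channel order: R, G, B, flow_x, flow_y
    return np.concatenate([rgb_scaled, flow_scaled], axis=-1)


if __name__ == "__main__":
    frame = np.zeros((4, 6, 3), dtype=np.uint8)
    flow = uniform_flow(4, 6, dx=2.0, dy=-1.0)
    x = make_five_channel_input(frame, flow)
    print(x.shape)
```

A generator that accepts this tensor would need its first convolution configured for five input channels instead of three; the rest of an image-to-image network can stay unchanged under this assumption.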

DOI

10.1002/rob.22135

Related Publications

Moniruzzaman, M. (2023). Teleoperation enhancement for improved control of unmanned ground vehicles through video transformation and deep learning based future frame prediction. https://ro.ecu.edu.au/theses/2644

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Included in

Engineering Commons, Physical Sciences and Mathematics Commons
