Abstract
Predicting future frames through image-to-image translation and using these synthetically generated frames for high-speed ground vehicle teleoperation is a new concept to address latency and enhance operational performance. In our immediately preceding work, the image quality of the predicted frames was low and much of the scene detail was lost. To preserve the structural details of objects and improve overall image quality in the predicted frames, several novel ideas are proposed herein. A filter has been designed to remove noise from dense optical flow components resulting from frame rate inconsistencies. The Pix2Pix base network has been modified and a structure-aware SSIM-based perceptual loss function has been implemented. A new dataset of 20 000 training input images and 2000 test input images with a 500 ms delay between the target and input frames has been created. Without any additional video transformation steps, the proposed improved model achieved a PSNR of 23.1, an SSIM of 0.65, and an MS-SSIM of 0.80, a substantial improvement over our previous work. A Fleiss’ kappa score of > 0.40 (0.48 for the modified network and 0.46 for the perceptual loss function) supports the reliability of the model.
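For readers unfamiliar with the agreement statistic quoted above, the following is a minimal sketch of how Fleiss' kappa is computed from a table of rater counts. It is not the authors' evaluation code: the function name `fleiss_kappa`, the table layout, and the example ratings are assumptions made purely for illustration, and the rating protocol used in the article is not described in this record. By the commonly cited Landis and Koch convention, values between 0.41 and 0.60 are read as moderate agreement, which is the band the reported 0.48 and 0.46 fall into.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rater counts.

    counts[i][j] is the number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Proportion of all ratings that fell into each category.
    total = n_items * n_raters
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # Observed agreement per item, then averaged over items.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items

    # Agreement expected by chance.
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        return float("nan")  # undefined when only one category is ever used

    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical example: 4 predicted frames, 5 raters, 2 categories
# ("acceptable", "not acceptable"). These numbers are made up.
example = [
    [5, 0],
    [4, 1],
    [1, 4],
    [0, 5],
]
kappa = fleiss_kappa(example)
```

A value of 1 means the raters agree perfectly, 0 means agreement is no better than chance, and negative values mean agreement is worse than chance.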
Keywords
conditional generative adversarial networks, future frame predictions, perceptual losses, robotic vehicles, teleoperation
Document Type
Journal Article
Date of Publication
10-1-2023
Volume
5
Issue
10
Publication Title
Advanced Intelligent Systems
Publisher
Wiley
School
School of Engineering / School of Science
RAS ID
58282
Funders
ECU-DSTG PhD scholarship
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Comments
Moniruzzaman, M., Rassau, A., Chai, D., & Islam, S. M. S. (2023). Structure-aware image translation-based long future prediction for enhancement of ground robotic vehicle teleoperation. Advanced Intelligent Systems, 5(10), article 2200439. https://doi.org/10.1002/aisy.202200439