Document Type

Journal Article

Publication Title

Advanced Intelligent Systems

Volume

5

Issue

10

Publisher

Wiley

School

School of Engineering / School of Science

RAS ID

58282

Funders

ECU-DSTG PhD scholarship

Comments

Moniruzzaman, M., Rassau, A., Chai, D., & Islam, S. M. S. (2023). Structure-aware image translation-based long future prediction for enhancement of ground robotic vehicle teleoperation. Advanced Intelligent Systems, 5(10), article 2200439. https://doi.org/10.1002/aisy.202200439

Abstract

Predicting future frames through image-to-image translation and using these synthetically generated frames for high-speed ground vehicle teleoperation is a new concept for addressing latency and enhancing operational performance. In our immediately preceding work, the image quality of the predicted frames was low and much of the scene detail was lost. To preserve the structural details of objects and improve overall image quality in the predicted frames, several novel ideas are proposed herein. A filter has been designed to remove noise from dense optical flow components resulting from frame rate inconsistencies. The Pix2Pix base network has been modified and a structure-aware SSIM-based perceptual loss function has been implemented. A new dataset of 20 000 training input images and 2000 test input images, with a 500 ms delay between the target and input frames, has been created. Without any additional video transformation steps, the proposed improved model achieved a PSNR of 23.1, an SSIM of 0.65, and an MS-SSIM of 0.80, a substantial improvement over our previous work. A Fleiss’ kappa score above 0.40 (0.48 for the modified network and 0.46 for the perceptual loss function) supports the reliability of the model.
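For readers unfamiliar with the agreement statistic quoted in the abstract, the following is a minimal sketch of how Fleiss' kappa is computed from a table of rater counts. The record does not describe the rating protocol used in the study, so the function name `fleiss_kappa`, the table layout, and the example ratings are illustrative assumptions and not the authors' evaluation code. Under the commonly used Landis and Koch scale, values between 0.41 and 0.60 are read as moderate agreement.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j (hypothetical layout)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_category = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]

    # Observed agreement per item: fraction of rater pairs that agree.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_category)  # agreement expected by chance
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Illustrative data only: 4 images, 5 raters, 2 categories
# (for example "prediction preferred" vs. "baseline preferred").
example_ratings = [
    [5, 0],
    [4, 1],
    [1, 4],
    [3, 2],
]
print(round(fleiss_kappa(example_ratings), 3))
```

The statistic corrects raw agreement for the agreement expected by chance, which is why it can be modest even when raters often choose the same category.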

DOI

10.1002/aisy.202200439

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
