Publication


Check out our paper (IEEE RA-Letters), arXiv paper, supplementary material, and BibTeX.

Feel free to contact us for more information.



Overview

VUNet predicts future images by considering both the static transformation caused by the robot's pose change and the motion of dynamic objects. It is difficult for a single network to handle both simultaneously, so we train two separate networks. The following figures show an overview of VUNet. In the first step, SNet generates two pre-predicted images at the next robot pose S_t+1, one from the image at pose S_t-1 and one from the image at pose S_t. Since both generated images correspond to the same robot pose S_t+1, any pixel differences between them are caused only by the motion of dynamic objects. In the second step, DNet takes these pre-predicted images as input and predicts the next image, accounting for the motion of the dynamic objects.
On this page we only roughly explain our method; the details are given in the manuscript.
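The two-step procedure above can be sketched as a small wrapper that calls the two trained networks in sequence. This is only an illustrative sketch: the function and argument names (`vunet_step`, `snet`, `dnet`, `v_t`) are our own, the networks are passed in as callables, and the velocity handling is simplified (in the real method SNet receives the robot velocities needed to reach pose S_t+1 from each source pose).

```python
def vunet_step(img_tm1, img_t, v_t, snet, dnet):
    """One VUNet prediction step (illustrative sketch).

    snet(img, v) -> image warped to the next robot pose S_{t+1}
    dnet(a, b)   -> prediction merged from two pre-predicted images
    """
    # Pre-predicted images, both at pose S_{t+1}: any pixel
    # difference between them is due to dynamic-object motion.
    pre_from_tm1 = snet(img_tm1, v_t)  # from the image at S_{t-1}
    pre_from_t = snet(img_t, v_t)      # from the image at S_t
    # DNet resolves the dynamic motion from the two pre-predictions.
    return dnet(pre_from_tm1, pre_from_t)
```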


The following two network structures are SNet and DNet. SNet obtains information about the next robot pose from the robot velocities, which are concatenated with the image features. The decoder then generates the optical flow corresponding to the next robot pose. Finally, a bilinear sampler predicts the image at the next robot pose.
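The final bilinear-sampling step can be written compactly in numpy. This is a generic sketch of differentiable image warping by a flow field, not the paper's exact implementation; the function name and the (dx, dy) flow convention are assumptions.

```python
import numpy as np

def bilinear_sample(img, flow):
    """Warp img (H, W, C) by a per-pixel flow field (H, W, 2).

    flow[y, x] = (dx, dy): the output at (y, x) is img sampled at
    (y + dy, x + dx) with bilinear interpolation, clamped at borders.
    """
    H, W, _ = img.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    sx = np.clip(xs + flow[..., 0], 0, W - 1)  # source x coordinates
    sy = np.clip(ys + flow[..., 1], 0, H - 1)  # source y coordinates
    x0 = np.floor(sx).astype(int); x1 = np.clip(x0 + 1, 0, W - 1)
    y0 = np.floor(sy).astype(int); y1 = np.clip(y0 + 1, 0, H - 1)
    wx = (sx - x0)[..., None]; wy = (sy - y0)[..., None]
    # Interpolate horizontally on the top and bottom rows, then vertically.
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bot = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bot * wy
```

With zero flow the image is returned unchanged; an integer flow simply shifts pixels, and fractional flows blend the four neighboring pixels, which is what makes the sampler usable inside a trainable network.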


DNet, on the other hand, uses two consecutive images at the same robot pose. There are two reasons for this: 1) at least two images are required to understand the motion of the dynamic objects, and 2) pixel information occluded by the dynamic objects can be recovered from the consecutive images. DNet realizes both aspects through an alpha-blend merge using the weighting masks Wc and Wp (with Wc + Wp = 1).
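The alpha-blend merge can be sketched as follows. The function name and the use of a two-way softmax to enforce Wc + Wp = 1 are our assumptions; the key property is that each output pixel is a convex combination of the two candidate predictions.

```python
import numpy as np

def alpha_blend(pred_c, pred_p, logits_c, logits_p):
    """Merge two candidate images (H, W, C) with per-pixel masks.

    logits_c / logits_p are unnormalized (H, W) weight maps; a
    two-way softmax makes the masks satisfy Wc + Wp = 1 per pixel.
    """
    ec, ep = np.exp(logits_c), np.exp(logits_p)
    wc = ec / (ec + ep)          # Wc in [0, 1]
    wp = 1.0 - wc                # Wp = 1 - Wc by construction
    return wc[..., None] * pred_c + wp[..., None] * pred_p
```

Where one candidate is occluded by a dynamic object, the network can push its mask toward zero and take the pixel from the other candidate instead.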


By applying our method recursively, as in the following figure, we can predict images farther into the future and evaluate traversability with GONet.
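The recursive rollout amounts to feeding each prediction back in as the newest observation. A minimal sketch, assuming a one-step predictor is available as a callable (`one_step` and `velocities` are our names; `velocities` stands for the sequence of planned robot velocity commands):

```python
def predict_future(img_prev, img_curr, velocities, one_step):
    """Roll one-step predictions forward over a velocity plan.

    one_step(img_prev, img_curr, v) -> predicted next image.
    Returns the list of predicted images, one per command in
    velocities.
    """
    preds = []
    for v in velocities:
        img_next = one_step(img_prev, img_curr, v)
        preds.append(img_next)
        # The newest prediction becomes the "current" observation.
        img_prev, img_curr = img_curr, img_next
    return preds
```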

