Deep Visual MPC-Policy Learning for Navigation



Humans can routinely follow a trajectory defined by a list of images/landmarks. However, traditional robot navigation methods require accurate mapping of the environment, localization, and planning. Moreover, these methods are sensitive to subtle changes in the environment. In this paper, we propose a Deep Visual MPC-policy learning method that can perform visual navigation while avoiding collisions with unseen objects on the navigation path. Our model, PoliNet, takes as input a visual trajectory and the image of the robot's current view, and outputs velocity commands for a planning horizon that optimally balance trajectory following and obstacle avoidance. PoliNet is trained using a strong image predictive model, VUNet-360, and a traversability estimation model, GONet, in an MPC setup with minimal human supervision. Unlike prior work, PoliNet can be applied to new scenes without retraining. We show experimentally that the robot can follow a visual trajectory from varying start positions and in the presence of previously unseen obstacles. We validated our algorithm with tests both in a realistic simulation environment and in the real world. We also show that we can generate visual trajectories in simulation and execute the corresponding path in the real environment. Our approach outperforms classical approaches as well as previous learning-based baselines in success rate of goal reaching, sub-goal coverage rate, and computational load.
VUNet-360 is a modified version of our previous work, VUNet.


Our method, DVMPC, can follow a visual trajectory, which is a sequence of time-consecutive images, to arrive at the final goal. Our control policy, PoliNet, is trained using the same objectives as visual model predictive control (MPC) for navigation. To calculate the objectives, we also propose VUNet-360, which can predict future images conditioned on virtual velocities for 8 steps.
Here, the current image from the onboard camera, the subgoal image from the visual trajectory, and the predicted image at the 8th step (the farthest future) are shown to visualize our method. To predict the future images, we feed the virtual velocities generated by PoliNet into VUNet-360. The predicted image is almost always similar to the subgoal image, which means that a robot using our method can successfully move toward the location of the subgoal image.
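The idea of scoring a sequence of virtual velocities by the images they are predicted to produce can be illustrated with a minimal sketch. It is not the authors' implementation. The names `Predictor`, `Traversability`, `toy_predict`, `mpc_objective`, and `trav_weight` are hypothetical, and the exact form and weighting of the two loss terms are assumptions. The toy predictor stands in for VUNet-360, and the traversability callable stands in for GONet.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[float]             # flattened grayscale image, pixel values in [0, 1]
Velocity = Tuple[float, float]  # (linear, angular) virtual velocity

HORIZON = 8  # number of predicted steps, as stated in the text

# Hypothetical stand-ins for the learned models (not the authors' APIs):
# a predictor maps (current image, velocities) to one image per step,
# a traversability estimator maps an image to a probability in [0, 1].
Predictor = Callable[[Image, Sequence[Velocity]], List[Image]]
Traversability = Callable[[Image], float]


def image_loss(predicted: Sequence[Image], subgoal: Image) -> float:
    """Mean absolute pixel difference to the subgoal, averaged over steps."""
    per_step = [
        sum(abs(p - s) for p, s in zip(img, subgoal)) / len(subgoal)
        for img in predicted
    ]
    return sum(per_step) / len(per_step)


def traversability_loss(predicted: Sequence[Image],
                        estimate: Traversability) -> float:
    """Mean squared shortfall from fully traversable (probability 1.0)."""
    return sum((1.0 - estimate(img)) ** 2 for img in predicted) / len(predicted)


def mpc_objective(current: Image, subgoal: Image,
                  velocities: Sequence[Velocity], predict: Predictor,
                  estimate: Traversability, trav_weight: float = 1.0) -> float:
    """Weighted sum of the image and traversability terms over the horizon."""
    predicted = predict(current, velocities)
    return (image_loss(predicted, subgoal)
            + trav_weight * traversability_loss(predicted, estimate))


def toy_predict(current: Image, velocities: Sequence[Velocity]) -> List[Image]:
    """Toy predictor (assumption): forward motion brightens every pixel."""
    images, offset = [], 0.0
    for linear, _angular in velocities:
        offset += linear
        images.append([min(1.0, p + offset) for p in current])
    return images


current = [0.0, 0.0]
subgoal = [0.8, 0.8]
always_safe = lambda img: 1.0  # toy traversability estimate
moving = mpc_objective(current, subgoal, [(0.1, 0.0)] * HORIZON,
                       toy_predict, always_safe)
still = mpc_objective(current, subgoal, [(0.0, 0.0)] * HORIZON,
                      toy_predict, always_safe)
print(moving < still)
```

In this toy setting, velocities that bring the predicted images closer to the subgoal image receive a lower objective than standing still, and a low traversability estimate raises the objective. A policy trained against such an objective is pushed toward the subgoal while being penalized for predicted collisions.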


Our method is evaluated in three kinds of navigation experiments: Sim-to-Sim, Real-to-Real, and Sim-to-Real.


In Sim-to-Sim, we evaluate our method in a simulator, the Gibson Environment. Before navigation, we collect the visual trajectory (subgoal images) by teleoperation in the simulator. All environments and all obstacles are unseen during training. The left video shows navigation without an obstacle, and the right video shows navigation with an obstacle.


In Real-to-Real, we collect the visual trajectory by teleoperation before navigation. Then, we run our method to navigate. Despite environment changes and/or obstacles, our method can arrive at the goal, which demonstrates its robustness. The 8th predicted image is always similar to the subgoal image.

Long experiment of Real-to-Real


In Sim-to-Real, we collect the visual trajectory in the simulator, the Gibson Environment. Hence, we do not need to collect the visual trajectory using the real robot.