In recent years, learning algorithms have achieved impressive results in vision-based indoor autonomous navigation. Researchers in this field have successfully trained embodied agents to navigate to a location defined by a target on-board camera image, a designated coordinate, or an object category of interest, or by following navigation instructions given directly in language. These problems are usually studied in static environments and have limited applicability to the dynamic scenarios inherent to real human environments such as homes and offices. In these spaces, we find a large variety of objects with which a mobile agent can interact, such as furniture, shoes, and children's toys. We also find humans (ourselves!) living in these environments and performing everyday activities.

How to train embodied agents that cope with dynamic environments remains a challenging research question. In fact, during the 2020 challenge we observed a significant performance gap between the navigation performance achieved in static environments and in dynamic environments with interactable objects and dynamic agents. In these dynamic environments, the learning-based policies submitted by the participants outperformed planning-based methods such as the planner in the ROS navigation stack. This trend indicates a promising avenue for future research on robot learning for mobility around humans and movable objects in the built environment.

This year we present a challenge focused on the Interactive Navigation and Social Navigation problems, and invite researchers around the world to push the frontiers of vision-based indoor autonomous navigation by innovating with models and algorithms applied to these benchmarks.
We present the iGibson Challenge 2022, implemented in collaboration with Robotics at Google and building on our challenge from last year.
iGibson Challenge 2022 launches at the 2022 Embodied AI Workshop at the Conference on Computer Vision and Pattern Recognition (CVPR), in coordination with eight other embodied AI challenges supported by 15 academic and research organizations. The joint launch of these challenges this year offers the embodied AI research community an unprecedented opportunity to move toward a common framework for the field, converging around a unified set of tasks, simulation platforms, and 3D assets. The organizers will collectively share results across all these challenges at CVPR in June, providing a unique viewpoint on the state of embodied AI research and new directions for the subfield.
We provide 8 scenes reconstructed from real-world apartments for training in iGibson. All objects in the scenes are assigned realistic weights and are fully interactable. For Interactive Navigation, we also provide 20 additional small objects (e.g., shoes and toys) from the Google Scanned Objects dataset. For fairness, please use only these scenes and objects for training.
For evaluation, we use 2 unseen scenes in our dev split and 5 unseen scenes in our test split, as well as 10 unseen small objects (they share the same object categories as the 20 training small objects but are different object instances).
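As a rough illustration of how these splits are organized, the snippet below sketches their structure in Python. All identifiers are placeholders, not the actual iGibson scene or object names used in the challenge.

```python
# Illustrative sketch of the dataset splits described above.
# All identifiers below are placeholders, not the real asset names.
TRAIN_SCENES = [f"train_scene_{i}" for i in range(8)]     # 8 interactive apartment scenes
TRAIN_OBJECTS = [f"small_object_{i}" for i in range(20)]  # 20 Google Scanned Objects models

DEV_SCENES = [f"dev_scene_{i}" for i in range(2)]         # 2 unseen scenes (dev split)
TEST_SCENES = [f"test_scene_{i}" for i in range(5)]       # 5 unseen scenes (test split)
# Evaluation additionally uses 10 unseen object instances drawn from the
# same categories as the 20 training objects.
```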
We adopt the following task setup:
The technical specifications of the robot and the camera sensor can be found in our starter code (TBA).
For Interactive Navigation, we place N additional small objects (e.g., toys, shoes) near the robot's shortest path to the goal, where N is proportional to the path length. These objects are generally lighter than the objects originally in the scenes (e.g., tables, chairs).
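To make the scaling rule concrete, here is a minimal sketch of how clutter could be scattered along a 2D shortest path, with the object count proportional to the path length. The density constant, offset range, and placement logic are assumptions for illustration only; the official episode generation in the starter code may differ.

```python
import numpy as np

def place_clutter_along_path(path_xy, objects_per_meter=0.5, max_offset=0.5, rng=None):
    """Sample 2D positions for N small objects near a shortest path.

    N is proportional to the path length; objects_per_meter is an assumed
    constant, not the official challenge value.
    """
    rng = np.random.default_rng() if rng is None else rng
    path_xy = np.asarray(path_xy, dtype=float)
    seg = np.diff(path_xy, axis=0)                 # path segments
    seg_len = np.linalg.norm(seg, axis=1)
    path_len = seg_len.sum()
    n_objects = int(np.ceil(objects_per_meter * path_len))

    # Sample arc-length positions uniformly along the path, then offset each
    # one sideways by a small random amount so objects sit near the path.
    cum = np.concatenate([[0.0], np.cumsum(seg_len)])
    positions = []
    for s in rng.uniform(0.0, path_len, size=n_objects):
        i = min(np.searchsorted(cum, s, side="right") - 1, len(seg) - 1)
        t = (s - cum[i]) / max(seg_len[i], 1e-9)
        point = path_xy[i] + t * seg[i]
        normal = np.array([-seg[i][1], seg[i][0]]) / max(seg_len[i], 1e-9)
        positions.append(point + rng.uniform(-max_offset, max_offset) * normal)
    return np.array(positions)
```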
For Social Navigation, we place M pedestrians randomly in the scene, where M is proportional to the physical size of the scene; they pursue their own random goals during the episode while respecting each other's personal space. The pedestrians have the same maximum speed as the robot. They are aware of the robot, so they will not walk straight into it. However, they will not yield to the robot either: if the robot drives straight into the pedestrians, it will hit them and the episode will fail.
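The benchmark simulates pedestrians with a reciprocal collision avoidance model (the RVO2 library); the sketch below is not that implementation, only a simplified illustration of the high-level behavior described above: pursue a random goal, resample the goal when it is reached, move no faster than the robot, and sidestep nearby agents rather than stopping or yielding. The class name and all parameter values are assumptions.

```python
import numpy as np

class Pedestrian:
    """Illustrative pedestrian: pursues random goals, never stops or yields."""

    def __init__(self, position, scene_bounds, max_speed=0.5, goal_tolerance=0.3, rng=None):
        self.rng = np.random.default_rng() if rng is None else rng
        self.position = np.asarray(position, dtype=float)
        self.scene_bounds = scene_bounds      # ((x_min, y_min), (x_max, y_max))
        self.max_speed = max_speed            # same speed cap as the robot (assumed value)
        self.goal_tolerance = goal_tolerance
        self.goal = self._sample_goal()

    def _sample_goal(self):
        low, high = self.scene_bounds
        return self.rng.uniform(low, high)    # random goal inside the scene bounds

    def step(self, other_positions, dt=0.1, personal_space=0.5):
        # Resample a new random goal once the current one is reached.
        if np.linalg.norm(self.goal - self.position) < self.goal_tolerance:
            self.goal = self._sample_goal()

        direction = self.goal - self.position
        dist = np.linalg.norm(direction)
        velocity = self.max_speed * direction / max(dist, 1e-9)

        # Steer away from nearby agents (other pedestrians or the robot) while
        # still moving toward the goal: the pedestrian sidesteps rather than
        # stopping or yielding, so a robot driving straight at it can still collide.
        for other in other_positions:
            gap = self.position - np.asarray(other, dtype=float)
            gap_dist = np.linalg.norm(gap)
            if 0.0 < gap_dist < personal_space:
                velocity += self.max_speed * gap / gap_dist

        speed = np.linalg.norm(velocity)
        if speed > self.max_speed:
            velocity *= self.max_speed / speed   # cap at the robot's maximum speed
        self.position = self.position + velocity * dt
        return self.position
```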
| Event | Date |
| --- | --- |
| Challenge Announced | February 15, 2022 |
| EvalAI Leaderboard Opens, Minival and Dev Phases Start | February 28, 2022 |
| Test Phase Starts | May 15, 2022 |
| Challenge Dev and Test Phases End | June 5, 2022 |
| Winner Demo | June 19, 2022 |