Description: Matterport3D Simulator and Room-to-Room (R2R) data for Vision-and-Language Navigation
NEW! Our Vision-and-Language Navigation test server and leaderboard are up on EvalAI.
R2R is the first benchmark dataset for visually-grounded natural language navigation in real buildings. The dataset requires autonomous agents to follow human-generated navigation instructions in previously unseen buildings, as illustrated in the demo above. For training, each instruction is associated with a Matterport3D Simulator trajectory. 22k instructions are available, with an average length of 29 words. A test evaluation server for this dataset is available on EvalAI.
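To give a sense of how instructions are paired with simulator trajectories, here is a minimal sketch of reading an R2R-style annotation and computing the average instruction length in words. The field names follow the format of the released R2R JSON files (`scan`, `path_id`, `path`, `heading`, `instructions`), but the sample values below are made up for illustration:

```python
import json

# Illustrative R2R-style annotation entry. Field names follow the
# released R2R JSON format; the values here are invented examples.
sample = json.loads("""
{
  "scan": "17DRP5sb8fy",
  "path_id": 1,
  "path": ["viewpoint_a", "viewpoint_b", "viewpoint_c"],
  "heading": 4.71,
  "instructions": [
    "Walk past the couch and stop at the doorway.",
    "Go straight through the living room and wait by the door.",
    "Head toward the door and stop just before it."
  ]
}
""")

def avg_instruction_length(entries):
    """Average instruction length in words across all entries."""
    words = [len(instr.split())
             for entry in entries
             for instr in entry["instructions"]]
    return sum(words) / len(words)

print(round(avg_instruction_length([sample]), 2))
```

Each trajectory (`path`, a list of viewpoint IDs in a given `scan`) carries several independently written instructions, which is why instruction counts exceed trajectory counts.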