robot-colosseum.github.io - Colosseum

Description: THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation

manipulation (214) generalization (5)

Example domain paragraphs

To realize effective large-scale, real-world robotic applications, we must evaluate how well our robot policies adapt to changes in environmental conditions. Unfortunately, a majority of studies evaluate robot performance in environments closely resembling or even identical to the training setup.

We present Colosseum, a novel simulation benchmark with 20 diverse manipulation tasks that enables systematic evaluation of models across 12 axes of environmental perturbations. These perturbations include changes in the color, texture, and size of objects, table-tops, and backgrounds; we also vary lighting, distractors, and camera pose. Using Colosseum, we compare 4 state-of-the-art manipulation models and find that their success rates degrade by 30-50% across these perturbation factors.
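
To make the degradation figure concrete, the following is a minimal sketch of how a per-axis drop in success rate could be computed. It assumes the drop is measured relative to the unperturbed success rate, which the text above does not specify. The function names, the axis labels, and all numbers are hypothetical placeholders, not part of the benchmark's API or its reported results.

```python
from typing import Dict


def relative_degradation(baseline: float, perturbed: float) -> float:
    """Fractional drop in success rate relative to the unperturbed baseline.

    Assumption: degradation is relative, i.e. (baseline - perturbed) / baseline.
    """
    if not 0.0 < baseline <= 1.0:
        raise ValueError("baseline success rate must be in (0, 1]")
    if not 0.0 <= perturbed <= 1.0:
        raise ValueError("perturbed success rate must be in [0, 1]")
    return (baseline - perturbed) / baseline


def degradation_by_axis(
    baseline: float, perturbed_by_axis: Dict[str, float]
) -> Dict[str, float]:
    """Apply relative_degradation to each perturbation axis."""
    return {
        axis: relative_degradation(baseline, rate)
        for axis, rate in perturbed_by_axis.items()
    }


# Placeholder numbers for illustration only; these are not benchmark results.
example = degradation_by_axis(0.8, {"lighting": 0.6, "camera_pose": 0.4})
```

With these placeholder inputs, a policy that succeeds 80% of the time without perturbation and 40% of the time under a changed camera pose shows a relative degradation of one half on that axis.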
