concept-fusion.github.io - ConceptFusion: Open-set Multimodal 3D Mapping




Building 3D maps of the environment is central to robot navigation, planning, and interaction with objects in a scene. Most existing approaches that integrate semantic concepts with 3D maps remain confined to the closed-set setting: they can only reason about a finite set of concepts pre-defined at training time. Further, these maps can only be queried using class labels or, in recent work, text prompts.

We address both of these issues with ConceptFusion, a scene representation that is: (i) fundamentally open-set, enabling reasoning beyond a closed set of concepts; and (ii) inherently multimodal, enabling a diverse range of possible queries to the 3D map, from language, to images, to audio, to 3D geometry, all working in concert. ConceptFusion leverages the open-set capabilities of today's foundation models, pre-trained on internet-scale data, to reason about concepts across modalities such as natural language, images, and audio.
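Because every query modality is reduced to an embedding in the same feature space as the map, retrieval itself amounts to a similarity search. The sketch below illustrates that step under a few assumptions: the map stores one fused embedding per 3D point as an (N, D) array, and `query_embedding` comes from some modality-specific encoder (e.g. a CLIP-style text or image encoder, or an audio encoder). The function names are placeholders for illustration, not the project's actual API.

```python
# Minimal sketch: querying a multimodal 3D map by embedding similarity.
import numpy as np

def query_map(point_features, query_embedding, top_k=100):
    """Rank 3D map points by cosine similarity to a query from any modality.

    point_features: (N, D) array, one fused embedding per map point.
    query_embedding: (D,) embedding of a text / image / audio / 3D query.
    """
    pts = point_features / np.linalg.norm(point_features, axis=1, keepdims=True)
    q = query_embedding / np.linalg.norm(query_embedding)
    scores = pts @ q                        # cosine similarity per point
    order = np.argsort(-scores)[:top_k]     # indices of the best-matching points
    return order, scores[order]
```

The same function serves all modalities: whether the query started as a sentence, an image crop, or an audio clip, it is first mapped into the shared embedding space and then compared against the per-point features.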

ConceptFusion constructs pixel-aligned features from off-the-shelf foundation models that otherwise produce only a global (image-level) embedding vector. This is achieved by processing input images to generate generic (class-agnostic) object masks, extracting a local feature for each mask, computing a global feature for the input image as a whole, and fusing the region-specific features with the global feature using our proposed zero-shot pixel-alignment technique.
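The sketch below illustrates this fusion step, with `embed_image` standing in for a foundation-model encoder (e.g. a CLIP-style visual encoder) and `generate_masks` for a class-agnostic mask generator; both are assumed placeholders. The similarity-based mixing weight shown here is a simplified stand-in for the paper's zero-shot pixel-alignment rule, not an exact reimplementation.

```python
# Minimal sketch: building a pixel-aligned feature map from a global image
# encoder plus class-agnostic region masks.
import numpy as np

def pixel_aligned_features(image, embed_image, generate_masks):
    """Return an (H, W, D) map holding one fused embedding per pixel."""
    h, w = image.shape[:2]

    # Global (image-level) feature for the whole frame.
    f_global = embed_image(image)
    f_global = f_global / np.linalg.norm(f_global)

    masks = generate_masks(image)  # list of (H, W) boolean masks
    # Pixels outside any mask keep the global feature.
    feat_map = np.tile(f_global, (h, w, 1))

    for m in masks:
        # Local feature: embed the crop around this class-agnostic region.
        ys, xs = np.where(m)
        crop = image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
        f_local = embed_image(crop)
        f_local = f_local / np.linalg.norm(f_local)

        # One simple choice of mixing weight: regions that look like the whole
        # image lean on the global feature, distinctive regions stay local.
        sim = float(f_global @ f_local)   # cosine similarity in [-1, 1]
        w_mix = 0.5 * (sim + 1.0)         # rescaled to [0, 1]

        fused = w_mix * f_global + (1.0 - w_mix) * f_local
        feat_map[m] = fused / np.linalg.norm(fused)

    return feat_map
```

Each pixel thus carries an embedding in the same space as the global encoder, which is what allows the fused 3D map to be queried with any modality that encoder family supports.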
