lerf.io - LERF: Language Embedded Radiance Fields

Description: Grounding CLIP vectors volumetrically inside a NeRF allows flexible natural language queries in 3D

Example domain paragraphs

LERF optimizes a dense, multi-scale 3D language field by volume rendering CLIP embeddings along training rays, supervising these embeddings with multi-scale CLIP features across multi-view training images. After optimization, LERF can extract 3D relevancy maps for language queries interactively in real time. LERF enables pixel-aligned queries of the distilled 3D CLIP embeddings without relying on region proposals, masks, or fine-tuning, supporting long-tail open-vocabulary queries hierarchically across the
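To make the idea of volume rendering embeddings along a ray concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not LERF's actual implementation: the function names (render_weights, render_embedding, relevancy) are hypothetical, the weights follow the standard NeRF quadrature rule, the scale dimension of the field is omitted, and plain cosine similarity stands in for whatever relevancy score LERF computes.

```python
import numpy as np

def render_weights(sigmas, deltas):
    """Standard NeRF quadrature weights for the samples on one ray.

    sigmas: (N,) densities at each sample; deltas: (N,) segment lengths.
    """
    optical_depth = sigmas * deltas
    alphas = 1.0 - np.exp(-optical_depth)
    # Transmittance: fraction of light that reaches sample i unoccluded.
    accumulated = np.concatenate([[0.0], np.cumsum(optical_depth)[:-1]])
    return np.exp(-accumulated) * alphas

def render_embedding(sigmas, deltas, embeddings):
    """Weighted sum of per-sample embeddings (N, D), rescaled to unit length."""
    weights = render_weights(sigmas, deltas)
    rendered = (weights[:, None] * embeddings).sum(axis=0)
    norm = np.linalg.norm(rendered)
    return rendered / norm if norm > 0 else rendered

def relevancy(rendered, query):
    """Assumption: cosine similarity as a stand-in relevancy score."""
    return float(rendered @ (query / np.linalg.norm(query)))

# Toy ray: empty space first, then a dense surface that occludes the last sample.
sigmas = np.array([0.0, 1e3, 1e3])
deltas = np.array([0.1, 0.1, 0.1])
embeddings = np.eye(3)  # hypothetical 3-D "language" embeddings, one per sample
pixel_embedding = render_embedding(sigmas, deltas, embeddings)
score = relevancy(pixel_embedding, np.array([0.0, 2.0, 0.0]))
```

In this toy ray the rendered embedding is dominated by the first dense sample, so a query aligned with that sample's embedding scores highest. Repeating this for every pixel's ray would give a per-pixel relevancy map for the query.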


With multi-view supervision, 3D CLIP embeddings are more robust to occlusion and viewpoint changes than 2D CLIP embeddings. 3D CLIP embeddings also conform better to the 3D scene structure, giving them a crisper appearance.
