Description: Inpaint3D uses 2D generative priors to inpaint new 3D-consistent content in static scenes.
This paper presents a novel approach to inpainting 3D regions of a scene, given masked multi-view images, by distilling a 2D diffusion model into a learned 3D scene representation (e.g., a NeRF). Unlike 3D generative methods that explicitly condition the diffusion model on camera pose or multi-view information, our diffusion model is conditioned only on a single masked 2D image. Nevertheless, we show that this 2D diffusion model can still serve as a generative prior in a 3D multi-view reconstruction problem.
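To make the distillation idea concrete, here is a minimal toy sketch of a score-distillation-style update in which a 2D denoiser's noise prediction on a masked render drives gradient updates to the underlying scene parameters. Everything here is an illustrative assumption, not the paper's actual method: the "renderer" is the identity (the parameters are the image), and the "diffusion model" is a hand-written stand-in that simply believes masked pixels should equal a fill value.

```python
import numpy as np

rng = np.random.default_rng(0)


def render(params):
    """Toy stand-in for a differentiable NeRF render (identity here)."""
    return params


def toy_denoiser(noisy, mask, t, fill_value=1.0):
    """Toy stand-in for a mask-conditioned 2D diffusion model.

    It "believes" masked pixels should equal `fill_value`; a real model
    would be a trained network taking (noisy image, mask, timestep).
    Returns a noise prediction eps_pred.
    """
    clean_guess = np.where(mask, fill_value, noisy / np.sqrt(1.0 - t))
    return (noisy - np.sqrt(1.0 - t) * clean_guess) / np.sqrt(t)


def sds_step(params, mask, lr=0.1):
    """One score-distillation-style update of the scene parameters."""
    t = rng.uniform(0.02, 0.98)                 # random noise level
    x = render(params)                          # current rendered view
    eps = rng.standard_normal(x.shape)          # injected noise
    noisy = np.sqrt(1.0 - t) * x + np.sqrt(t) * eps
    eps_pred = toy_denoiser(noisy, mask, t)
    # SDS-style gradient (eps_pred - eps), applied only inside the mask
    grad = np.where(mask, eps_pred - eps, 0.0)
    return params - lr * grad


# Demo: an 8x8 "scene" with a square hole to inpaint.
params = np.zeros((8, 8))
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True
for _ in range(400):
    params = sds_step(params, mask)
```

After a few hundred updates the masked region converges toward the denoiser's preferred content while the unmasked pixels are left untouched, which is the basic shape of using a 2D prior to optimize a 3D representation.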
Excellent related work appeared around the same time as ours.
NeRFiller introduces an approach to completing missing regions of a NeRF.