nerfies.github.io - Nerfies: Deformable Neural Radiance Fields

Description: Deformable Neural Radiance Fields creates free-viewpoint portraits (nerfies) from casually captured videos.

nerf (138) d-nerf (48) nerfies (47)

Example domain paragraphs

We present the first method capable of photorealistically reconstructing a non-rigidly deforming scene using photos/videos captured casually from mobile phones.

Our approach augments neural radiance fields (NeRF) by optimizing an additional continuous volumetric deformation field that warps each observed point into a canonical 5D NeRF. We observe that these NeRF-like deformation fields are prone to local minima, and propose a coarse-to-fine optimization method for coordinate-based models that allows for more robust optimization. By adapting principles from geometry processing and physical simulation to NeRF-like models, we propose an elastic regularization of the d
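
The coarse-to-fine idea in the paragraph above can be illustrated with a windowed positional encoding, in which high-frequency bands are faded in as training progresses so that early optimization only sees smooth, low-frequency deformations. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`band_weight`, `windowed_encoding`, `alpha_schedule`), the cosine easing window, the linear schedule, and the scalar input are all illustrative choices.

```python
import math


def band_weight(alpha: float, band: int) -> float:
    """Smooth 0-to-1 weight for frequency band `band` at schedule value `alpha`.

    Assumed cosine easing: a band is off while alpha <= band, fully on once
    alpha >= band + 1, and blended smoothly in between.
    """
    t = min(max(alpha - band, 0.0), 1.0)
    return 0.5 * (1.0 - math.cos(math.pi * t))


def windowed_encoding(x: float, num_bands: int, alpha: float) -> list[float]:
    """Sin/cos positional encoding of a scalar, with each band scaled by its weight."""
    features = []
    for band in range(num_bands):
        w = band_weight(alpha, band)
        freq = (2.0 ** band) * math.pi
        features.append(w * math.sin(freq * x))
        features.append(w * math.cos(freq * x))
    return features


def alpha_schedule(step: int, ramp_steps: int, num_bands: int) -> float:
    """Hypothetical linear ramp of alpha from 0 to num_bands over ramp_steps steps."""
    return num_bands * min(step / ramp_steps, 1.0)


if __name__ == "__main__":
    # Early in training only the lowest band contributes; later all bands do.
    for step in (0, 25, 50, 100):
        alpha = alpha_schedule(step, ramp_steps=100, num_bands=4)
        weights = [round(band_weight(alpha, b), 3) for b in range(4)]
        print(step, alpha, weights)
```

In this sketch the deformation network would receive `windowed_encoding(...)` of each coordinate instead of the full encoding, so the warp starts out smooth and gains detail only as `alpha` grows.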

We show that Nerfies can turn casually captured selfie photos/videos into deformable NeRF models that allow for photorealistic renderings of the subject from arbitrary viewpoints, which we dub "nerfies". We evaluate our method by collecting data using a rig with two mobile phones that take time-synchronized photos, yielding train/validation images of the same pose at different viewpoints. We show that our method faithfully reconstructs non-rigidly deforming scenes and reproduces unseen views with high fide

Links to nerfies.github.io (222)

jonbarron.info Jon Barron
keunhong.com Keunhong Park
hypernerf.github.io HyperNeRF: A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields
photoshape.github.io PhotoShape: Photorealistic Materials for Large-Scale Shape Collections — Keunhong Park
ricardomartinbrualla.com Ricardo Martin-Brualla
smseitz.com Home
mmmu-benchmark.github.io MMMU
climatenerf.github.io ClimateNeRF
robust-dynrf.github.io RoDynRF: Robust Dynamic Radiance Fields
textual-inversion.github.io An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
mathvista.github.io MathVista: Evaluating Math Reasoning in Visual Contexts
robot-parkour.github.io Robot Parkour Learning
energy-locomotion.github.io Minimizing Energy Consumption Leads to the Emergence of Gaits in Legged Robots
navigation-locomotion.github.io Coupling Vision and Proprioception for Navigation of Legged Robots
isaac-orbit.github.io ORBIT: A Unified Simulation Framework for Interactive Robot Learning Environments
motion-x-dataset.github.io Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset
mobile-aloha.github.io Mobile ALOHA
manipulation-locomotion.github.io Deep Whole-Body Control: Learning a Unified Policy for Manipulation and Locomotion
mystyle-personalized-prior.github.io MyStyle
diffusion-classifier.github.io Diffusion Classifier
human2humanoid.com Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation
autogpart.github.io AutoGPart: Intermediate Supervision Search for Generalizable 3D Part Segmentation
internet-explorer-ssl.github.io Internet Explorer: Targeted Representation Learning on the Open Web
texturedreamer.github.io TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
robotics-fm-survey.github.io Towards General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis
diffusion-es.github.io Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following
video-edit-gan.github.io Temporally Consistent Semantic Video Editing
is-count.github.io IS-Count: Large-scale Object Counting from Satellite Images with Covariate-based Importance Sampling
mcnerf.github.io MCNeRF: Monte Carlo Rendering and Denoising for Real-Time NeRFs
confaide.github.io Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
sat2density.github.io Sat2Density: Faithful Density Learning from Satellite-Ground Image Pairs
agile-but-safe.github.io Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion
llm-tuning-safety.github.io LLM Finetuning Risks
in-n-out-3d.github.io In-N-Out
realfill.github.io RealFill
scenefun3d.github.io SceneFun3D: Fine-Grained Functionality and Affordance Understanding in 3D Scenes
human-sgd.github.io Human-SGD
scanents3d.github.io ScanEnts3D: Exploiting Phrase-to-3D-Object Correspondences for Improved Visio-Linguistic Models in 3D Scenes
stitch-time.github.io Stitch it in Time: GAN-Based Facial Editing of Real Videos
powers-of-10.github.io Generative Powers of Ten
catdrive.github.io CaT: Coaching a Teachable Student
rh20t.github.io RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot
airexo.github.io AirExo: Low-Cost Exoskeletons for Learning Whole-Arm Manipulation in the Wild
world-from-eyes.github.io Seeing the World through Your Eyes
meshdiffusion.github.io MeshDiffusion
md-splatting.github.io MD-Splatting: Learning Metric Deformation from 4D Gaussians in Highly Deformable Scenes
3d-motion-magnification.github.io 3D Motion Magnification: Visualizing Subtle Motions with Time-Varying Neural Fields
simpson-cvpr23.github.io SimpSON: Simplifying Photo Cleanup With Single-Click Distracting Object Segmentation Network
geff-b1.github.io Learning Generalizable Feature Fields for Mobile Manipulation
lfvoid-rl.github.io LfVoid: Can Pre-Trained Text-to-Image Models Generate Visual Goals for Reinforcement Learning?
camp-nerf.github.io CamP: Camera Preconditioning for Neural Radiance Fields
dynosaur-it.github.io Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation
feature-3dgs.github.io Feature 3DGS: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields
ovmm.github.io HomeRobot: Open Vocabulary Mobile Manipulation
urbaninverserendering.github.io UrbanIR
odin-seg.github.io ODIN: A Single Model For 2D and 3D Perception
glitchbench.github.io GlitchBench
mesh-aware-rf.github.io Dynamic Mesh-Aware Radiance Fields
osx-ubody.github.io One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer
mv-nrsfm.github.io MV-NRSfM
scg-rule-guided-music.github.io Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion
objnerf.github.io Obj-NeRF: Extracting Object NeRFs from Multi-view Images
pixart-alpha.github.io PIXART-α
lfbo-ml.github.io A General Recipe for Likelihood-free Bayesian Optimization
cyber-demo.github.io CyberDemo: Augmenting Simulated Human Demonstration for Real-World Dexterous Manipulation
ref-npr.github.io Ref-NPR: Reference-Based Non-Photorealistic Radiance Fields
worldsheet.github.io Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image
lifelongmemory.github.io LifelongMemory
lightgaussian.github.io LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS
video-lavit.github.io Video-LaVIT
3dlfm.github.io 3D-LFM: Lifting Foundation Model
auto-rt.github.io AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents
fig-nerf.github.io FiG-NeRF
wavesbench.github.io WAVES: Benchmarking the Robustness of Image Watermarks
equi-articulated-pose.github.io Self-Supervised Category-Level Articulated Object Pose Estimation with Part-Level SE(3) Equivariance
general-navigation-models.github.io General Navigation Models
spright-t2i.github.io SPRIGHT
vr-nerf.github.io VR-NeRF: High-Fidelity Virtualized Walkable Spaces
reasonwithpal.com PAL: Program-aided Language Models
rgbmanip.github.io RGBManip: Monocular Image-based Robotic Manipulation through Active Object Pose Estimation
dp-mix.github.io DP-Mix: Mixup-based Data Augmentation for Differentially Private Learning
fairy-video2video.github.io Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis
multiview-bootstrapping-in-wild.github.io MBW: Multiview-bootstrapping in the Wild
mmstar-benchmark.github.io MMStar
ranni-t2i.github.io Ranni
snnet.github.io Stitchable Neural Networks
unlearn-canvas.netlify.app UnlearnCanvas: A Stylized Image Dataset to Benchmark Machine Unlearning for Diffusion Models
rise-policy.github.io RISE: 3D Perception Makes Real-World Robot Imitation Simple and Effective
gshell3d.github.io G-Shell
chat-with-nerf.github.io Chat with NeRF
texvocab.github.io TexVocab: Texture Vocabulary-conditioned Human Avatars
segment3d.github.io Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels
diffmimic.github.io DiffMimic: Efficient Motion Mimicking with Differentiable Physics
video-dex.github.io VideoDex: Learning Dexterity from Internet Videos
brics-project.github.io BRICS: Bi-level Feature Representation of Image CollectionS
adv-icl.github.io Jailbreak and Guard Aligned Language Models with Only Few In-Context Demonstrations
icgraspnet.github.io ICG-Net: A Unified Approach for Instance-Centric Grasping
minoring.github.io Minho Heo
rival-diff.github.io RIVAL: Real-World Image Variation by Aligning Diffusion Inversion Chain
ecoflap.github.io ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models
stylegan-nada.github.io StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
structdiffusion.github.io StructDiffusion: Language-Guided Creation of Physically-Valid Structures using Unseen Objects
video-p2p.github.io Video-P2P: Video Editing with Cross-attention Control
llm-access-control.github.io LLM Access Control
video2game.github.io Video2Game
gaussianeditor.github.io GaussianEditor: Editing 3D Gaussians Delicately with Text Instructions
selfrefine.info Self-Refine: Iterative Refinement with Self-Feedback
flex-nerf.github.io FlexNeRF: Photorealistic Free-viewpoint Rendering of Moving Humans from Sparse Views
becotta-ctta.github.io BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation
tdmpc2.com TD-MPC2
diffusion-tta.github.io Diffusion-TTA
twist-sim2real.github.io TWIST: Teacher-Student World Model Distillation for Efficient Sim-to-Real Transfer
mementos-bench.github.io Mementos
pie4perf.com Learning Performance-Improving Code Edits
instantsplat.github.io Instantsplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds
controllable-cinemagraphs.github.io Controllable Animation of Fluid Elements in Still Images
lmql.ai LMQL is a programming language for LLM interaction. | LMQL
intrinsic-lora.github.io Generative Models: What do they know?
promptstyler.github.io PromptStyler
3d-diffuser-actor.github.io 3D Diffuser Actor: Policy Diffusion with 3D Scene Representations
image-hijacks.github.io Image Hijacks: Adversarial Images can Control Generative Models at Runtime
d2c-model.github.io D2C: Diffusion-Denoising Models for Few-shot Conditional Generation
maqingyang.github.io Ze (Edward) Ma
thinshelllab.github.io Thin-shell Object Manipulations with Differentiable Physics Simulations
robo-affordances.github.io VRB: Affordances from Human Videos as a Versatile Representation for Robotics
codet-ovd.github.io CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection
simm-t2i.github.io SimM
robotic-telekinesis.github.io Robotic Telekinesis: Learning a Robotic Hand Imitator by Watching Humans on Youtube
dash-through-interaction.github.io A Framework for Designing Anthropomorphic Soft Hands through Interaction
3dnbf.github.io 3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose Estimation
arkittrack.github.io ARKitTrack: A New Diverse Dataset for Tracking Using Mobile RGB-D Data
gradient-scaling.github.io Floaters No More: Radiance Field Gradient Scaling for Improved Near-Camera Training
geography-aware-ssl.github.io Geography-Aware Self-Supervised Learning
differential-diffusion.github.io Differential Diffusion: Giving Each Pixel Its Strength
avalonbench.github.io AvalonBench: Evaluating LLMs Playing the Game of Avalon
mathverse-cuhk.github.io MATHVERSE: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
realgen.github.io RealGen
sear-rl.github.io Efficient RL via Disentangled Environment and Agent Representations
scsfnet.github.io Semantic Complete Scene Forecasting from a 4D Dynamic Point Cloud Sequence
diff-video-ae.github.io Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
mnms-project.github.io m&ms
ntu-aiot-lab.github.io NTU-AIoT-Lab | NTU-AIoT-Lab.github.io
leaphand.com LEAP Hand: Low-Cost, Efficient, and Anthropomorphic Hand for Robot Learning
cascadezero123.github.io Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views
zerotprune.github.io Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers
arsalsyed.github.io Arsal Syed
mathgenie.github.io MathGenie
mimicgen.github.io MimicGen
readout-guidance.github.io Readout Guidance: Learning Control from Diffusion Features
clip-actor.github.io CLIP-Actor
diffuse-to-choose.github.io Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All
vision-robotics-bridge.github.io Vision-Robotics Bridge
masked-spacetime-hashing.github.io Masked Space-Time Hash Encoding
orthoplanes.github.io OP3D: Orthoplanes Representation for Better 3D-Awareness of GANs
illusionvqa.github.io IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models
vim-bench.github.io VIM: Probing Multimodal Large Language Models for Visual Embedded Instruction Following
robot-mirage.github.io Mirage: Cross-Embodiment Zero-Shot Policy Transfer with Cross-Painting
dbaman.com Scraper Spider
tuning-encoder.github.io Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models
alpha-rgs.github.io Score-based Source Separation with Applications to Digital Communication Signals
opennerf.github.io OpenNerf: Open Set 3D Neural Scene Segmentation with Pixel-Wise Features and Rendered Novel Views
llm4mr.github.io LLMR: Real-time Prompting of Interactive Worlds using Large Language Models
sungnyun.github.io Academic Project Page Template | sungnyun.github.io
riemann-web.github.io RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation
controlvideo.github.io ControlVideo: Adding Conditional Control for One Shot Text-to-Video Editing
lifelonger.github.io LifeLonger: A Benchmark for Continual Disease Classification
reveal-cvpr.github.io REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory
spellburstllm.github.io spellburstllm.github.io
afford-motion.github.io Afford-Motion
explore-eqa.github.io Explore until Confident: Efficient Exploration for Embodied Question Answering
a-scan2bim.github.io A-Scan2BIM: Assistive Scan to Building Information Modeling
satellite-pixel-synthesis.github.io Spatial-Temporal Super-Resolution of Satellite Imagery via Conditional Pixel Synthesis
hiss-csp.github.io Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling
locomanip-duet.github.io RoboDuet: A Framework Affording Mobile-Manipulation and Cross-Embodiment
atedm.github.io AT-EDM: Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models
robo-explorer.github.io ALAN: Autonomously Exploring Robotic Agents in the Real World
gpteval3d.github.io GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation
visualnav-transformer.github.io ViNT: A Foundation Model for Visual Navigation
groma-mllm.github.io Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
irisldr.github.io IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images
opencodeinterpreter.github.io OpenCodeInterpreter
instancewarp.github.io Addressing Source Scale Bias via Image Warping for Domain Adaptation
portraitbooth.github.io PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization
morphcut.github.io Jump Cut Smoothing for Talking Heads
genh2r.github.io GenH2R: Learning Generalizable Human-to-Robot Handover via Scalable Simulation, Demonstration, and Imitation
isotropic3d.github.io Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding
hcplayercvpr2024.github.io Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation
maggie-matt.github.io CVPR'24 - MaGGIe
os-copilot.github.io OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
llms-believe-the-earth-is-flat.github.io LLMs Believe the Earth is Flat
spot-rl-manip.github.io Continuously Improving Mobile Manipulation with Autonomous Real-World RL
vader-eccv.github.io VADER: Video Diffusion Alignment via Reward Gradient
magic-fixup.github.io Magic Fixup
map-tracker.github.io MapTracker: Tracking with Strided Memory Fusion for Consistent Vector HD Mapping
coformer.github.io CoFormer
mathvision-cuhk.github.io Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset
aacl2023-sea-nlp.github.io Current Status of NLP in South East Asia with Insights from Multilingualism and Language Diversity
llm-rl.github.io Large Language Models as Generalizable Policies for Embodied Tasks
nerf-course.github.io Home - nerf-course.github.io
human-world-model.github.io Structured World Models from Human Videos
code-eval.github.io Execution-Based Evaluation for Open-Domain Code Generation
factormatte.github.io FactorMatte: Redefining Video Matting for Re-Composition Tasks
cross-lingual-watermark.github.io Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models
scimult.github.io SciMult
quilt-llava.github.io Quilt-LLaVA: Visual Instruction Tuning by Extracting Localized Narratives from Open-Source Histopathology Videos
onechartt.github.io OneChart: Purify the Chart Structural Extraction via One Auxiliary Token
diffuse2choose.github.io Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All
r2-play.github.io Read to Play
anyskill.github.io AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents
diffusiongpt.github.io DiffusionGPT: LLM-Driven Text-to-Image Generation System
lazydiffusion.github.io Lazy Diffusion Transformer for Interactive Image Editing
cpmcorl2023.github.io Composable Part-Based Manipulation
meshnca.github.io Mesh Neural Cellular Automata
midgardsim.org MIDGARD
explorllm.github.io ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models
parl2024.github.io Learning Planning Abstractions From Language
deepanimation.net DeepAnimation
videoshop-editing.github.io Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion
come-robot.github.io COME robot
rank2reward.github.io Rank2Reward: Learning Shaped Reward Functions from Passive Video
diffusekrona.github.io DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Models
glee-vision.github.io GLEE: General Object Foundation Model for Images and Videos at Scale