We introduce ViCA-NeRF, a view-consistency-aware method for 3D editing with text instructions. In addition to the implicit NeRF modeling, our key insight is to exploit two sources of regularization that explicitly propagate the editing information across different views, thus ensuring multi-view consistency. As geometric regularization, we leverage the depth information derived from the NeRF model to establish image correspondences between different views. As learned regularization, we align the latent codes in the 2D diffusion model between edited and unedited images, enabling us to edit key views and propagate the update to the whole scene. Incorporating these two regularizations, our ViCA-NeRF framework consists of two stages. In the first stage, we blend edits from different views to create a preliminary 3D edit. A second stage of NeRF training then further refines the scene's appearance. Experiments demonstrate that ViCA-NeRF provides more flexible, efficient (3× faster) editing with higher levels of consistency and detail compared with the state of the art.
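The geometric regularization above rests on a standard operation: using NeRF-rendered depth to warp pixels from an edited key view into another view, so the same surface point receives the same edit everywhere. The sketch below illustrates that depth-based reprojection with a simplified pinhole camera model; the function name, shapes, and camera-to-world pose convention are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def warp_edit_to_view(edited_key, depth_key, K, pose_key, pose_tgt, H, W):
    """Propagate colors from an edited key view into a target view using the
    key view's NeRF-rendered depth (simplified pinhole model, nearest-pixel
    scatter). Hypothetical sketch, not the paper's exact API."""
    # Homogeneous pixel grid of the key view, row-major to match the depth map.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)

    # Back-project each pixel to 3D in the key camera frame using its depth.
    cam_pts = (np.linalg.inv(K) @ pix.T) * depth_key.reshape(1, -1)

    # Key camera -> world -> target camera (poses are 4x4 camera-to-world).
    cam_h = np.vstack([cam_pts, np.ones((1, cam_pts.shape[1]))])
    world = pose_key @ cam_h
    tgt_cam = np.linalg.inv(pose_tgt) @ world

    # Project into the target image plane.
    proj = K @ tgt_cam[:3]
    z = proj[2]
    uv = (proj[:2] / np.clip(z, 1e-6, None)).T  # (H*W, 2) target pixel coords

    # Scatter edited colors where projections land in front of the camera
    # and inside the target image bounds.
    warped = np.zeros((H, W, 3))
    mask = np.zeros((H, W), dtype=bool)
    ui = np.round(uv[:, 0]).astype(int)
    vi = np.round(uv[:, 1]).astype(int)
    valid = (z > 0) & (ui >= 0) & (ui < W) & (vi >= 0) & (vi < H)
    src = edited_key.reshape(-1, 3)
    warped[vi[valid], ui[valid]] = src[valid]
    mask[vi[valid], ui[valid]] = True
    return warped, mask
```

The returned `mask` marks which target pixels received an edit; in a blending stage like the paper's first stage, unmasked pixels would keep their original (or otherwise-edited) appearance.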
ViCA-NeRF is an efficient, controllable NeRF editing pipeline that edits 3D scenes from text instructions and generalizes well across a wide range of instructions.
ViCA-NeRF leverages two sources of regularization to propagate editing information across views.
Figure: Edited key views.
@inproceedings{vicanerf2023,
author = {Dong, Jiahua and Wang, Yu-Xiong},
title = {ViCA-NeRF: View-Consistency-Aware 3D Editing of Neural Radiance Fields},
booktitle = {NeurIPS},
year = {2023},
}