Earlier this year NVIDIA advanced Neural Radiance Fields (NeRF) research notably with InstantNeRF, apparently capable of generating explorable neural scenes in mere seconds – from a technique that, when it emerged in 2020, frequently took hours or even days to train.
Though this kind of interpolation produces a static scene, NeRF is also capable of depicting movement, and of basic ‘copy-and-paste’ editing, where individual NeRFs can either be collated into composite scenes or inserted into existing scenes.
However, if you want to intervene in a trained NeRF and actually change something that's going on inside it (in the same way you can change elements in a traditional CGI scene), the rapidly growing research interest in the sector has produced very few solutions to date, and none that even begin to match the capabilities of CGI workflows.
Though geometry estimation is essential to creating a NeRF scene, the final result is composed of fairly ‘locked’ values. While there is some progress being made towards changing texture values in NeRF, the actual objects in a NeRF scene are not parametric meshes that can be edited and played about with, but more akin to brittle and frozen point clouds.
In this scenario, a rendered person in a NeRF is essentially a statue (or a series of statues, in video NeRFs); the shadows they cast on themselves and other objects are textures, rather than flexible calculations based on light sources; and the editability of NeRF content is limited to the choices made by the photographer who takes the sparse source photos from which the NeRF is generated. Parameters such as shadows and pose remain non-editable, in any creative sense.
A new academic research collaboration between China and the UK addresses this challenge with NeRF-Editing, where proxy CGI-style meshes are extracted from a NeRF, deformed at will by the user, and the deformations passed back through to the NeRF's neural calculations.
The method adapts the NeuS 2021 US/China reconstructive technique, which extracts a Signed Distance Function (SDF, a much older method of volumetric reconstruction) that's able to learn the geometry represented inside the NeRF.
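To make the idea of a signed distance function concrete, here is a minimal sketch in plain Python. It is not the NeuS implementation: in NeuS the function is learned by a neural network, whereas this example assumes a hand-written analytic sphere, and the names sphere_sdf and march_to_surface are purely illustrative. It shows the sign convention (negative inside, zero on the surface, positive outside) and how the zero level of the function locates the surface along a ray.

```python
import math

def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside."""
    return math.dist(point, center) - radius

def march_to_surface(origin, direction, sdf, max_steps=64, eps=1e-4, max_dist=100.0):
    """Walk along a unit-length ray, stepping by the SDF value each time.

    Returns the distance travelled when the surface (SDF ~ 0) is reached,
    or None if the ray never gets close enough.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:       # close enough to the zero level set: surface found
            return t
        t += dist            # safe step: nothing is nearer than `dist`
        if t > max_dist:
            return None
    return None

# Illustrative usage: a ray starting 3 units in front of a unit sphere.
inside = sphere_sdf((0.0, 0.0, 0.0))    # -1.0, the centre is inside
outside = sphere_sdf((0.0, 0.0, 3.0))   #  2.0, two units from the surface
hit = march_to_surface((0.0, 0.0, -3.0), (0.0, 0.0, 1.0), sphere_sdf)  # 2.0
```

Because the surface is simply the set of points where the function returns zero, a conventional triangle mesh can be extracted from it, which is what gives the user something editable to work on.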
This SDF object becomes the user's sculpting base, with warping and molding capabilities provided by the venerable As-Rigid-As-Possible (ARAP) technique.
With the deformations applied, this information must be translated from the mesh's vector geometry to the RGB/pixel level native to NeRF, which is a slightly longer journey.
The vertices of the triangle mesh that the user has deformed are first transferred to a tetrahedral mesh, which forms a skin around the user-mesh. A discrete spatial deformation field is extracted from this additional mesh, and finally a NeRF-friendly continuous deformation field is obtained, which can be passed back into the neural radiance environment, reflecting the user's changes and edits, and directly affecting the interpreted rays in the target NeRF.
The paper states:
‘After transferring the surface deformation to the tetrahedral mesh, we can obtain the discrete deformation field of the “effective space”. We now utilize these discrete transformations to bend the casting rays. To generate an image of the deformed radiance field, we cast rays to the space containing the deformed tetrahedral mesh.’
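One plausible way to picture this ray bending is sketched below in plain Python. It is an assumption-laden illustration rather than the authors' code: it supposes that each sample point on a ray is located inside a tetrahedron of the deformed mesh, expressed as barycentric weights of that tetrahedron's corners, and then mapped back to the undeformed scene by applying the same weights to the original corners. The helper names (barycentric, to_canonical) and the single-tetrahedron setup are hypothetical; a real system would also need to search many tetrahedra for the one containing each sample.

```python
def sub(a, b):
    """Component-wise difference of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))

def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def barycentric(p, tet):
    """Barycentric weights of point p with respect to a tetrahedron (4 vertices)."""
    v0, v1, v2, v3 = tet
    e1, e2, e3, q = sub(v1, v0), sub(v2, v0), sub(v3, v0), sub(p, v0)
    total = det3(e1, e2, e3)            # six times the signed volume
    w1 = det3(q, e2, e3) / total        # Cramer's rule, one weight per edge
    w2 = det3(e1, q, e3) / total
    w3 = det3(e1, e2, q) / total
    return (1.0 - w1 - w2 - w3, w1, w2, w3)

def to_canonical(p, deformed_tet, canonical_tet, tol=1e-9):
    """Map a sample point from deformed space back to the original scene.

    Returns None when p lies outside the deformed tetrahedron.
    """
    w = barycentric(p, deformed_tet)
    if min(w) < -tol:
        return None
    return tuple(sum(wi * v[k] for wi, v in zip(w, canonical_tet)) for k in range(3))

# Illustrative usage: the user has stretched one corner upwards along z.
canonical_tet = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
deformed_tet = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 2.0))

sample = (0.25, 0.25, 1.0)   # a point on a ray cast into the edited scene
source = to_canonical(sample, deformed_tet, canonical_tet)   # (0.25, 0.25, 0.5)
```

In this sketch the untouched NeRF would then be queried at source rather than at sample, so the network itself never needs retraining; only the path of the ray through it changes.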
The paper is titled NeRF-Editing: Geometry Editing of Neural Radiance Fields, and comes from researchers across three Chinese universities and institutions, together with a researcher from the School of Computer Science & Informatics at Cardiff University, and another two researchers from the Alibaba Group.
As mentioned earlier, transformed geometry will not ‘update’ any related aspects in the NeRF that have not been edited, nor reflect secondary consequences of the deformed element, such as shadows. The researchers provide an example, where under-shadows on a human figure in a NeRF remain unaltered, even though the deformation should alter the lighting.
The authors observe that there are currently no comparable methods for direct intervention into NeRF geometry. Therefore the experiments conducted for the research were more exploratory than comparative.
The researchers demonstrated NeRF-Editing on a number of public datasets, including characters from Mixamo, and the now-iconic Lego bulldozer and chair from the original NeRF implementation. They also experimented on a real captured horse statue from the FVS dataset, as well as their own original captures.
For future work, the authors intend to develop their system in the just-in-time (JIT) compiled machine learning framework Jittor.
First published 16th May 2022.