
Adobe: Relighting the Real World With Neural Rendering

Updated on December 9, 2022

Researchers from Adobe have created a neural rendering system for real-world indoor scenes that is capable of sophisticated relighting, offers a real-time interface, and handles glossy surfaces and reflections – a notable challenge for competing image synthesis methods such as Neural Radiance Fields (NeRF).

Here, a real-world scene has been reconstructed from a number of still images, making the scene navigable. Lighting can be added and changed in color and quality, while reflections remain accurate, and glossy surfaces correctly express the user's changes to lighting sources and/or styles. Source: https://www.youtube.com/watch?v=d3ma4opFpgM

The new system allows for Photoshop-style, GUI-driven control over lighting aspects of a real 3D scene that's been captured into a neural space, including shadows and reflections.

The GUI allows a user to add (and adjust) a lighting source to a real-world scene that has been reconstructed from a sparse number of photos, and to navigate freely through it as though it were a CGI-style mesh-based scenario.

The paper, submitted to ACM Transactions on Graphics and entitled Free-viewpoint Indoor Neural Relighting from Multi-view Stereo, is a collaboration between Adobe Research and researchers from the Université Côte d’Azur.

Source: https://arxiv.org/ftp/arxiv/papers/2106/2106.13299.pdf

As with Neural Radiance Fields (NeRF), the system uses photogrammetry (above left), wherein the understanding of a scene is inferred from a limited number of photographs, and the ‘missing’ viewpoints are filled in via machine learning until a complete and entirely abstracted model of the scene is available for ad hoc reinterpretation.

The system has been trained entirely on synthetic (CGI) data, but the 3D models used have been treated exactly as they would be if a person were taking a limited number of photographs of a real scene for neural interpretation. The image above shows a synthetic scene being relit, but the ‘bedroom’ view in the top-most (animated) image above is derived from actual photos taken in a real room.

The implicit representation of the scene is obtained from the source material via a Convolutional Neural Network (CNN), and divided into several layers, including reflectance, source irradiance (radiosity/global illumination) and albedo.

The architecture of the Adobe relighting system. The multi-view dataset is preprocessed, and 3D mesh geometry generated from the input data. When a new light must be added, the irradiance is computed in real time, and the relit view synthesized.

The algorithm combines facets of traditional ray tracing (Monte Carlo) and Image-Based Rendering (IBR, neural rendering).
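To make the Monte Carlo side of that combination concrete, the Python sketch below estimates the irradiance arriving at a surface point by averaging randomly sampled incoming directions. It is a generic textbook estimator, not the paper's implementation: the function names, the uniform hemisphere sampling, and the constant-radiance test environment are all assumptions chosen for illustration.

```python
import math
import random

def sample_hemisphere(rng):
    """Uniformly sample a unit direction on the hemisphere around +z (hypothetical helper)."""
    z = rng.random()                      # cos(theta) is uniform in [0, 1)
    phi = 2.0 * math.pi * rng.random()
    r = math.sqrt(max(0.0, 1.0 - z * z))
    return (r * math.cos(phi), r * math.sin(phi), z)

def estimate_irradiance(radiance_fn, num_samples, rng):
    """Monte Carlo estimate of E = integral of L(w) * cos(theta) over the hemisphere."""
    pdf = 1.0 / (2.0 * math.pi)           # density of uniform hemisphere sampling
    total = 0.0
    for _ in range(num_samples):
        direction = sample_hemisphere(rng)
        cos_theta = direction[2]          # surface normal is assumed to be +z
        total += radiance_fn(direction) * cos_theta / pdf
    return total / num_samples

# Assumed test environment: the same radiance arrives from every direction.
def constant_sky(direction):
    return 1.0

estimate = estimate_irradiance(constant_sky, 20000, random.Random(0))
print(estimate)  # should be close to pi, the analytic answer for constant radiance
```

For a constant environment the integral has the closed form π times the radiance, which makes the estimator easy to sanity-check before swapping in a more interesting radiance function.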

Though a notable amount of recent research into Neural Radiance Fields has been concerned with the extraction of 3D geometry from flat images, Adobe's offering is the first time that highly sophisticated relighting has been demonstrated via this method.

The algorithm also addresses another traditional limitation of NeRF and similar approaches, by calculating a complete reflection map, where every single part of the image is assigned a 100% reflective material.

Mirrored textures map out lighting paths.

With this integral reflectivity map in place, it's possible to ‘dial down' the reflectivity to accommodate various levels of reflection in different types of material such as wood, metal and stone. The reflectivity map (above) also provides a complete template for ray mapping, which can be re-used for purposes of diffuse lighting adjustment.
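The idea of ‘dialing down' a fully mirrored layer can be illustrated with a simple per-pixel blend. The Python sketch below is only an illustration: the linear blend, the name blend_reflection, and the per-material reflectivity values are assumptions, and the paper's own system synthesizes the final image with a neural network rather than a fixed formula.

```python
def blend_reflection(diffuse_rgb, mirror_rgb, reflectivity):
    """Mix a diffuse color with a 100%-mirror color using a reflectivity in [0, 1]."""
    if not 0.0 <= reflectivity <= 1.0:
        raise ValueError("reflectivity must lie in [0, 1]")
    return tuple(
        (1.0 - reflectivity) * d + reflectivity * m
        for d, m in zip(diffuse_rgb, mirror_rgb)
    )

# Illustrative reflectivity values only; they are not taken from the paper.
MATERIAL_REFLECTIVITY = {"stone": 0.05, "wood": 0.1, "metal": 0.8}

diffuse = (0.5, 0.25, 0.0)   # surface color under diffuse lighting
mirror = (1.0, 1.0, 1.0)     # what a perfect mirror would show at this pixel

for material, k in MATERIAL_REFLECTIVITY.items():
    print(material, blend_reflection(diffuse, mirror, k))
```

With a reflectivity of 0 the pixel keeps its diffuse color, and with 1 it shows the mirror layer unchanged, so the single mirrored map can serve every material in the scene.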

Other layers in the Adobe neural rendering system.

Initial capture of the scene uses 250-350 RAW photos from which a mesh is computed via Multi-View Stereo. The data is summarized into 2D input feature maps which are then re-projected into the novel view. Changes in lighting are calculated by averaging diffuse and glossy layers of the captured scene.

The mirror-image layer is generated through a fast single-ray mirror calculation (one bounce), which estimates original source values and then the target values. Maps that contain information about the scene's original lighting are stored in the neural data, similar to the way radiosity maps are often stored with traditional CGI scene data.
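A single mirror bounce rests on the standard reflection formula r = d - 2(d·n)n, where d is the incoming ray direction and n is the unit surface normal. The Python sketch below shows that formula in isolation; the helper names and the example floor geometry are assumptions, and it does not reproduce how the paper looks up source and target lighting values along the reflected ray.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def reflect(direction, normal):
    """Mirror `direction` about a surface with unit `normal`: r = d - 2 (d . n) n."""
    d_dot_n = sum(d * n for d, n in zip(direction, normal))
    return tuple(d - 2.0 * d_dot_n * n for d, n in zip(direction, normal))

# Assumed example: a ray heading down and to the right hits a floor whose normal points up.
incoming = normalize((1.0, -1.0, 0.0))
floor_normal = (0.0, 1.0, 0.0)
bounced = reflect(incoming, floor_normal)

print(incoming)
print(bounced)  # same horizontal component, vertical component flipped
```

Because only one bounce is traced per pixel, the cost is a single extra ray lookup for each pixel of the view.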

Solving Neural Rendering Reflections

Perhaps the primary achievement of the work is the decoupling of reflectance information from diffuse and other layers in the data. Calculation time is kept down by ensuring that live ‘reflectance'-enabled views, such as mirrors, are calculated only for the active user view, rather than for the entire scene.

The researchers claim that this work represents the first time that relighting capabilities have been matched to free-view navigation capabilities in a single framework for scenes that must reproduce reflective surfaces realistically.

Some sacrifices have been made to achieve this functionality, and the researchers concede that prior methods that use more complex per-view meshes demonstrate improved geometry for small objects. Future directions for the Adobe approach will include the use of per-view geometry in order to improve this aspect.
