Projecting Climate Change Into Photos With Generative Adversarial Networks
A team of researchers from Canada and the US has developed a machine learning method that uses Generative Adversarial Networks (GANs) to superimpose the catastrophic effects of climate change onto real photos, with the aim of reducing ‘distancing’: our inability to relate to hypothetical or abstract scenarios regarding climate change.
The project, titled ClimateGAN, is part of a wider research effort to develop interactive environments where users can explore projected worlds that have been affected by floods, extreme heat, and other serious consequences of climate change.
Discussing the motivation behind the initiative, the researchers state:
‘Climate change is a major threat to humanity, and the actions required to prevent its catastrophic consequences include changes in both policy-making and individual behaviour. However, taking action requires understanding the effects of climate change, even though they may seem abstract and distant.
‘Projecting the potential consequences of extreme climate events such as flooding in familiar places can help make the abstract impacts of climate change more concrete and encourage action.’
A core aim of the initiative is to build a system in which a user can enter their address (or any address) and see a climate-change-affected version of the corresponding image from Google Street View. However, the transformation algorithms behind ClimateGAN need an estimate of the height of objects in the photo, which is not included in the metadata Google provides for Street View, so obtaining such an estimate algorithmically remains an ongoing challenge.
Data and Architecture
ClimateGAN uses an unsupervised image-to-image translation pipeline with two phases: a Masker module, which estimates where a level water surface would lie in the target image; and a Painter module, which realistically renders water within the boundaries of that mask, taking into account reflections of the remaining unobscured geometry above the waterline.
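The two-phase structure described above can be illustrated with a deliberately simple toy. This is a minimal sketch, not ClimateGAN's code: the function names `toy_masker`, `toy_painter` and `flood` are hypothetical, the 'Masker' simply marks every pixel at or below a fixed row, and the 'Painter' fills masked pixels with a flat grey value, whereas the real modules are neural networks. What it does show is the division of labour, with the second stage only ever touching pixels that the first stage has selected.

```python
from typing import List

Image = List[List[float]]  # grayscale image, values in [0, 1]
Mask = List[List[int]]     # 1 = flooded pixel, 0 = untouched pixel


def toy_masker(image: Image, waterline_row: int) -> Mask:
    """Stand-in for the Masker: mark every pixel at or below a fixed row.

    The real Masker predicts this region with a neural network; the fixed
    row is purely a placeholder (assumption for illustration).
    """
    return [
        [1 if r >= waterline_row else 0 for _ in row]
        for r, row in enumerate(image)
    ]


def toy_painter(image: Image, mask: Mask, water_value: float = 0.3) -> Image:
    """Stand-in for the Painter: fill masked pixels, keep the rest unchanged."""
    return [
        [water_value if m else px for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def flood(image: Image, waterline_row: int) -> Image:
    """Run the two stages in sequence: mask first, then paint inside the mask."""
    mask = toy_masker(image, waterline_row)
    return toy_painter(image, mask)
```

Calling `flood(image, waterline_row=1)` on a three-row image leaves the top row untouched and repaints the two rows below it.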
Most of the training data was drawn from the CityScapes and Mapillary datasets. However, since existing flood imagery is relatively scarce, the researchers combined the available datasets with a novel ‘virtual world’ developed with the Unity3D game engine.
The Unity3D world contains around 1.5 km of terrain, and includes urban, suburban and rural areas, which the researchers ‘flooded’. This enabled the generation of ‘before’ and ‘after’ images as additional ground truth for the ClimateGAN framework.
The Masker unit adapts the 2018 ADVENT code for training, adding further data in line with 2019 findings from the French research initiative DADA. The researchers also added a segmentation decoder to feed the Masker unit extra information about the semantics of the input image (i.e. labeled information that denotes a domain, such as ‘building’).
The Flood Mask Decoder calculates a feasible waterline, and is built on SPADE, NVIDIA's widely used framework for semantic image synthesis.
Though the researchers used NVIDIA's GauGAN, which is powered by SPADE, for the Painter module, they had to condition it on the output of the Masker rather than on a generalized semantic segmentation map, as occurs in normal use. This is because the images had to be transformed in line with the waterline delineations, rather than being subject to broad, general transformations.
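To make the idea of conditioning on a mask more concrete, the following sketch shows the general shape of SPADE-style modulation: a feature channel is normalized, then scaled and shifted per pixel according to a conditioning map. It is an illustration under loud assumptions, not the GauGAN or ClimateGAN implementation. In SPADE proper the per-pixel scale and shift are produced by learned convolutions over the conditioning map; here they are replaced by the hypothetical fixed parameters `gamma_scale` and `beta_shift`, and only a single channel is handled.

```python
import math
from typing import List

Grid = List[List[float]]


def spade_like_modulation(features: Grid, mask: Grid,
                          gamma_scale: float = 0.5,
                          beta_shift: float = 1.0) -> Grid:
    """Normalize one feature channel, then scale/shift it per pixel via the mask."""
    values = [v for row in features for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var + 1e-5)  # small epsilon avoids division by zero

    out: Grid = []
    for f_row, m_row in zip(features, mask):
        out_row = []
        for f, m in zip(f_row, m_row):
            # Hypothetical stand-ins for the learned convolutions in SPADE:
            gamma = 1.0 + gamma_scale * m  # per-pixel scale
            beta = beta_shift * m          # per-pixel shift
            out_row.append(gamma * (f - mean) / std + beta)
        out.append(out_row)
    return out
```

Pixels where the mask is 0 receive plain normalization, while pixels inside the mask are scaled and shifted differently, which is how the conditioning map steers what gets generated where.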
Evaluating Quality
To evaluate the quality of the resulting images, the researchers labeled a test set of 180 Google Street View images of varying types, including urban scenes and more rural images from diverse geographical locations. The images were manually labeled with the categories cannot-be-flooded, must-be-flooded, and may-be-flooded.
This allowed the formulation of three metrics: error rate (the erroneously predicted area as a proportion of the image size), F05 score (the F-measure with β = 0.5, which weights precision more heavily than recall), and edge coherence. For comparison, the researchers ran prior image-to-image translation (IIT) models, including InstaGAN, CycleGAN, and MUNIT, on the same data.
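The first two metrics can be sketched in a few lines. The version below is one plausible reading rather than the authors' evaluation code: it assumes labels are stored per pixel, that may-be-flooded pixels are ignored when counting errors, and that the error rate divides by the full image size. The integer codes `CANNOT`, `MUST` and `MAY` and all function names are hypothetical.

```python
from typing import List, Tuple

CANNOT, MUST, MAY = 0, 1, 2  # hypothetical integer codes for the three labels


def confusion_counts(pred: List[List[int]],
                     labels: List[List[int]]) -> Tuple[int, int, int]:
    """Count true positives, false positives and false negatives.

    Pixels labeled MAY are skipped (assumption): either prediction is
    acceptable there.
    """
    tp = fp = fn = 0
    for p_row, l_row in zip(pred, labels):
        for p, label in zip(p_row, l_row):
            if label == MUST:
                if p:
                    tp += 1
                else:
                    fn += 1
            elif label == CANNOT and p:
                fp += 1
    return tp, fp, fn


def error_rate(pred: List[List[int]], labels: List[List[int]]) -> float:
    """Wrongly predicted pixels as a fraction of all pixels in the image."""
    _, fp, fn = confusion_counts(pred, labels)
    total = sum(len(row) for row in labels)
    return (fp + fn) / total


def f_beta(pred: List[List[int]], labels: List[List[int]],
           beta: float = 0.5) -> float:
    """F-measure; beta < 1 weights precision more heavily than recall."""
    tp, fp, fn = confusion_counts(pred, labels)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With β = 0.5, a mask that floods areas which cannot be flooded is penalized more than one that misses part of the must-be-flooded area, which suits a task where implausible water is the more jarring failure.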
The researchers concede that the lack of height data in source imagery makes it difficult to impose arbitrary waterline heights in images, should the user wish to dial up the ‘Roland Emmerich factor’ a little. They also concede that the flood effects are overly limited to the flood area, and intend to investigate methods by which multiple levels of flooding (i.e. after recession of an initial deluge) could be added to the methodology.
ClimateGAN's code has been made available on GitHub, together with additional examples of rendered images.