Researchers at UCLA have developed a method to change the apparent race of faces in datasets used to train medical machine learning systems, in an attempt to redress the racial bias that affects many common datasets.
The new technique is capable of producing photorealistic and physiologically accurate synthetic video at an average rate of 0.005 seconds per frame (roughly 200 frames per second), and is hoped to aid the development of new diagnostic systems for remote healthcare and monitoring – a field that has expanded greatly under COVID restrictions. The system is intended to improve the applicability of remote photoplethysmography (rPPG), a computer vision technique that analyzes facial video to detect volumetric changes in blood supply non-invasively.
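To illustrate the general idea behind rPPG, the sketch below estimates a pulse rate from the average green-channel brightness of a face region in each video frame. It is a minimal, textbook-style illustration and not the UCLA method: the function names, the 42–240 bpm search band and the synthetic test trace are all assumptions made for the example.

```python
import math

def estimate_heart_rate_bpm(green_means, fps, lo_bpm=42, hi_bpm=240):
    """Return the pulse rate (in bpm) whose frequency carries the most power.

    green_means: per-frame mean green-channel value of the face region.
    fps: video frame rate. The search band is an assumed plausible range.
    """
    n = len(green_means)
    mean = sum(green_means) / n
    centered = [g - mean for g in green_means]  # remove the constant skin tone

    best_bpm, best_power = lo_bpm, -1.0
    for bpm in range(lo_bpm, hi_bpm + 1):  # scan candidates in 1 bpm steps
        f = bpm / 60.0  # candidate frequency in Hz
        re = sum(c * math.cos(2 * math.pi * f * i / fps)
                 for i, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * f * i / fps)
                 for i, c in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return float(best_bpm)

def synthetic_green_trace(bpm, fps, seconds, base=100.0, amplitude=0.5):
    """Hypothetical test signal: a faint sinusoidal pulse on a constant tone."""
    return [base + amplitude * math.sin(2 * math.pi * (bpm / 60.0) * i / fps)
            for i in range(int(fps * seconds))]

trace = synthetic_green_trace(bpm=72, fps=30, seconds=10)
estimated_bpm = estimate_heart_rate_bpm(trace, fps=30)
print(estimated_bpm)
```

The pulse component here is a tiny fraction of the overall pixel value, which is why a race transfer that only repaints skin color, without preserving these subtle variations, would destroy the signal that rPPG depends on.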
Though the work, which utilizes convolutional neural networks (CNNs), incorporates previous research code published by the UK's Durham University in 2020, the new application is intended to preserve pulsatile signals in the original test data, rather than just visually changing the apparent race of the data, as the 2020 research does.
CNNs For Racial Transformation
The first part of the encoder-decoder system uses the Durham race transfer model, pre-trained on VGGFace2, to generate proxy target frames with the prior Caucasian-to-African component of the Durham research. This produces a flat transfer of racial characteristics, but does not contain the variations in color and tone that represent visual physiological indicators of the patient's blood flow state.
A second network, called PhysResNet (PRN), provides the rPPG component. PhysResNet is trained to learn both the visual appearance and also the color variations that define the subcutaneous blood volume movements.
The architecture that the UCLA project proposes outperforms competing rPPG techniques even in the absence of skin color augmentation, representing a 31% improvement on similar techniques optimized with MAE and RMSE.
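For readers unfamiliar with the two error measures mentioned, mean absolute error (MAE) and root mean squared error (RMSE) can be computed as in the short sketch below. The heart-rate figures and the baseline error are invented purely for illustration and are not taken from the paper; `relative_improvement` is a hypothetical helper showing how a percentage improvement of this kind is typically derived.

```python
import math

def mae(predicted, actual):
    """Mean absolute error between predicted and reference heart rates."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean squared error; penalizes large misses more than MAE does."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )

def relative_improvement(baseline_error, new_error):
    """Fractional reduction in error relative to a baseline method."""
    return (baseline_error - new_error) / baseline_error

# Illustrative values only (bpm), not results from the study.
predicted_bpm = [70.0, 80.0]
reference_bpm = [72.0, 77.0]

print(mae(predicted_bpm, reference_bpm))
print(rmse(predicted_bpm, reference_bpm))
print(relative_improvement(baseline_error=10.0, new_error=6.9))
```

Lower values are better for both metrics, so an improvement is reported as a reduction in error relative to a competing method.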
The UCLA researchers hope that future work will undertake more extensive challenges to redress racial bias in this sector of medical imaging, and hope also that later schemes will output higher-resolution video, since the system in question is limited to an 80×80 pixel resolution – suited reasonably well to the limitations of telehealth, but not ideal.
Lack Of Ethnically Diverse Datasets
The economic and practical circumstances that impede the creation of racially diverse datasets have been an obstacle to medical research for some years. Data tends to be generated parochially, with many factors contributing to a frequent Caucasian-centric homogeneity of data subjects. These include the composition of minority demographics in cities where research occurs, and other socioeconomic factors that may influence the extent to which non-white subjects appear in western datasets that researchers would like to have a more global applicability.
In countries with a higher proportion of dark-skinned subjects, the requisite equipment and resources to gather the data are frequently lacking.
Currently, dark-skinned subjects are notably underrepresented in rPPG datasets, accounting for 0%, 5% and 10% of the content of the three primary databases in common use for this purpose.
Homogeneous Caucasian Data
In 2019, research published in Science found that an algorithm widely used in US hospital care was heavily biased in favor of Caucasian subjects. The study found that black people were less likely to be referred to specialized care in triage and deeper levels of hospital admission.
Further research that year from researchers in Malaysia and Australia established the general problem of ‘Own Race Bias’ for dataset generation across many regions of the world, including Asia.
Potential Limitations Of Scale And Architecture
Some of the limitations that have led to limited-ethnicity datasets are pragmatic rather than ethical in nature. The broader the plurality of the contributing data, the better it generalizes across the subjects featured in that data, but the less the training routine is likely to intuit patterns within any single characteristic of data, including race, because a smaller percentage of training time, attention and resources is available for each identifiable subset of the data.
This can lead to models that are widely applicable but obtain less specific results, due to the constraints of data size, the economics of batch size, and practical limitations of the latent space as a function of limited hardware resources.
At the other extreme, though effective and granular results can be obtained by constraining the input data to a more limited set of characteristics, including ethnicity, the results are likely to be ‘overfit’ to the limited data, and not broadly applicable, perhaps even across unseen subjects in the same geographic area from which the original dataset subjects were obtained.
Synthetic Avatars For PPG Simulation
The UCLA paper also notes prior work from Microsoft Research in 2020 into the use of racially pliable synthetic avatars, which leverages 3D image synthesis to create face videos rich in PPG information.