Making a Machine Learning Model Forget About You - Unite.AI

Making a Machine Learning Model Forget About You


Removing a particular piece of data that contributed to a machine learning model is like trying to remove the second spoonful of sugar from a cup of coffee. The data, by this time, has already become intrinsically linked to many other neurons inside the model. If a data point represents ‘defining' data that was involved in the earliest, high-dimensional part of the training, then removing it can radically redefine how the model functions, or even require that it be re-trained at some expenditure of time and money.

Nonetheless, in Europe at least, Article 17 of the General Data Protection Regulation (GDPR) requires that companies remove such user data on request. Since the regulation was formulated on the understanding that this erasure would be no more than a database ‘drop' query, the legislation destined to emerge from the draft EU Artificial Intelligence Act will effectively copy and paste the spirit of the GDPR into laws that apply to trained AI systems rather than tabular data.

Further legislation is being considered around the world that will entitle individuals to request deletion of their data from machine learning systems, while the California Consumer Privacy Act (CCPA) of 2018 already provides this right to state residents.

Why It Matters

When a dataset is trained into an actionable machine learning model, the characteristics of that data become generalized and abstract, because the model is designed to infer principles and broad trends from the data, eventually producing an algorithm that will be useful in analyzing specific and non-generalized data.

However, techniques such as model inversion have revealed the possibility of re-identifying the contributing data that underlies the final, abstracted algorithm, while membership inference attacks are also capable of exposing source data, including sensitive data that may only have been permitted to enter a dataset on the understanding of anonymity.
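To make the threat concrete, one of the simplest membership inference attacks works on a loss threshold: models tend to be more confident (lower loss) on samples they were trained on than on samples they have never seen. The sketch below is a minimal, hypothetical illustration of that idea, not any specific published attack; the function names and the threshold value are assumptions for demonstration.

```python
# Hypothetical sketch of a loss-threshold membership inference attack:
# samples the model saw during training tend to have lower loss than
# unseen samples, so a simple threshold can guess membership.
import numpy as np

def cross_entropy(probs, label):
    # per-sample cross-entropy loss from predicted class probabilities
    return -np.log(probs[label] + 1e-12)

def infer_membership(model_probs, labels, threshold=0.5):
    """Guess 'member' when the model's loss on a sample is below threshold."""
    losses = np.array([cross_entropy(p, y) for p, y in zip(model_probs, labels)])
    return losses < threshold

# Toy example: a confident (low-loss) prediction is flagged as a likely
# training-set member; an uncertain one is not.
probs = np.array([[0.95, 0.05],   # confident -> loss ~0.05
                  [0.55, 0.45]])  # uncertain -> loss ~0.6
labels = [0, 0]
print(infer_membership(probs, labels))  # [ True False]
```

Real attacks are more sophisticated (often training shadow models to learn the decision boundary), but the confidence gap exploited here is the underlying signal.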

Escalating interest in this pursuit does not need to rely on grass-roots privacy activism: as the machine learning sector commercializes over the next ten years, and nations come under pressure to end the current laissez faire culture over the use of screen scraping for dataset generation, there will be a growing commercial incentive for IP-enforcing organizations (and IP trolls) to decode and review the data that has contributed to proprietary and high-earning classification, inference and generative AI frameworks.

Inducing Amnesia in Machine Learning Models

Therefore we are left with the challenge of getting the sugar out of the coffee. It's a problem that has been vexing researchers in recent years: in 2021 the EU-supported paper A Comparative Study on the Privacy Risks of Face Recognition Libraries found that several popular face recognition algorithms were capable of enabling sex- or race-based discrimination in re-identification attacks; in 2015 research out of Columbia University proposed a ‘machine unlearning' method based on updating a number of summations within the data; and in 2019 Stanford researchers offered novel deletion algorithms for K-means clustering implementations.

Now a research consortium from China and the US has published new work that introduces a uniform metric for evaluating the success of data deletion approaches, together with a new ‘unlearning' method called Forsaken, which the researchers claim is capable of achieving a more than 90% forgetting rate, with only a 5% accuracy loss in the overall performance of the model.

The paper is called Learn to Forget: Machine Unlearning via Neuron Masking, and features researchers from institutions in China together with Berkeley.

Neuron masking, the principle behind Forsaken, uses a mask gradient generator as a filter for the removal of specific data from a model, effectively updating it rather than forcing it to be retrained either from scratch or from a snapshot that occurred prior to the inclusion of the data (in the case of streaming-based models that are continuously updated).
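As a rough intuition for how a masked update differs from retraining, consider the toy sketch below. It is an illustrative simplification, not the paper's actual algorithm: the mask generator here simply keeps the largest-magnitude gradient entries (a stand-in for "the neurons most tied to the target data"), and all names and the keep-fraction are assumptions.

```python
# Illustrative sketch (not the paper's exact method): a mask gates an
# unlearning gradient step, so only parameters associated with the
# to-be-forgotten data are nudged, and the rest of the model is untouched.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 3))            # toy model parameters

def forget_gradient(w):
    # stand-in for the gradient that pushes the model away from the
    # target (to-be-forgotten) samples
    return rng.normal(size=w.shape)

def mask_generator(grad, keep_fraction=0.25):
    # keep only the largest-magnitude entries: a crude proxy for the
    # neurons most responsible for the target data (an assumption here)
    k = max(1, int(grad.size * keep_fraction))
    threshold = np.sort(np.abs(grad), axis=None)[-k]
    return (np.abs(grad) >= threshold).astype(float)

lr = 0.1
grad = forget_gradient(weights)
mask = mask_generator(grad)
weights -= lr * mask * grad                  # masked unlearning step
print(int(mask.sum()), "of", mask.size, "parameters updated")
```

The point of the design is visible even in the toy: most of the parameter matrix is never touched, which is why the approach can behave like a targeted patch rather than a retraining pass.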

The architecture of the mask gradient generator. Source:


Biological Origins

The researchers state that this approach was inspired by the biological process of ‘active forgetting', in which the brain takes active steps to erase the engram cells encoding a particular memory through the manipulation of a special type of dopamine.

Forsaken continuously evokes a mask gradient that replicates this action, with safeguards to slow down or halt this process in order to avoid catastrophic forgetting of non-target data.

The advantages of the system are that it is applicable to many kinds of existing neural networks, whereas recent similar work has enjoyed success largely in computer vision networks; and that it does not interfere with model training procedures, but rather acts as an adjunct, without requiring that the core architecture be altered or the data retrained.

Restricting The Effect

Deletion of contributed data can have a potentially deleterious effect on the functionality of a machine learning algorithm. To avoid this, the researchers have exploited norm regularization, a feature of normal neural network training that is commonly used to prevent overfitting. The particular implementation chosen is designed to ensure that Forsaken does not fail to converge in training.
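The general idea of norm regularization is easy to see in miniature: add a penalty proportional to the size of the update, so the optimizer prefers the smallest change that still achieves the forgetting objective. The sketch below is a minimal illustration under that assumption; the function and the `lam` hyperparameter name are not taken from the paper.

```python
# Minimal sketch of norm regularization applied to an unlearning update:
# the objective combines a "forget" loss with an L2 penalty on the mask
# gradient, so aggressive updates (which risk catastrophic forgetting of
# non-target data) become costlier than small, targeted ones.
import numpy as np

def regularized_objective(forget_loss, mask_grad, lam=0.01):
    # lam is an assumed regularization weight, not a value from the paper
    return forget_loss + lam * np.sum(mask_grad ** 2)

small = regularized_objective(1.0, np.zeros(10))        # no update at all
large = regularized_objective(1.0, np.full(10, 5.0))    # aggressive update
print(small, large)  # the penalty makes the aggressive mask costlier
```

This is the same mechanism that discourages overfitting in ordinary training, repurposed here to keep the unlearning step from wandering far enough to damage the rest of the model.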

To establish a usable dispersal of data, the researchers used out-of-distribution (OOD) data (i.e., data not included in the actual dataset, mimicking ‘sensitive' data in the actual dataset) to calibrate the way that the algorithm should behave.

Testing On Datasets

The method was tested over eight standard datasets, and in general achieved forgetting rates close to, or higher than, those of full retraining, with very little impact on model accuracy.
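For readers wondering what a "90% forgetting rate with 5% accuracy loss" might mean operationally, a plausible reading is sketched below. These definitions are assumptions for illustration, not the paper's formal metrics: forgetting rate as the fraction of target samples a membership test no longer flags, and accuracy loss as the drop in test accuracy after unlearning.

```python
# Hedged sketch of the two headline metrics (definitions assumed, not
# taken from the paper): forgetting rate as the share of target samples
# no longer detected as training members, and accuracy loss as the drop
# in held-out accuracy after the unlearning step.
def forgetting_rate(flagged_before, flagged_after):
    # fraction of previously-detected members no longer detected
    before = sum(flagged_before)
    still = sum(a and b for a, b in zip(flagged_before, flagged_after))
    return (before - still) / before

def accuracy_loss(acc_before, acc_after):
    # rounded to avoid floating-point noise in the toy example
    return round(acc_before - acc_after, 4)

# Toy numbers: 10 target samples detected before unlearning, 1 after.
print(forgetting_rate([True] * 10, [True] + [False] * 9))  # 0.9
print(accuracy_loss(0.95, 0.90))  # 0.05
```

Framed this way, the trade-off the researchers report is between how thoroughly the membership signal is erased and how much general performance is sacrificed to erase it.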

It may seem impossible that full retraining on an edited dataset could ever do worse than any other method, since the target data is entirely absent. However, by this time the model has abstracted various features of the deleted data in a ‘holographic' fashion, in the way (by analogy) that a drop of ink redefines the utility of a glass of water.

In effect, the weights of the model have already been influenced by the excised data, and the only way to entirely remove its influence is to retrain the model from absolute zero, rather than the far speedier approach of retraining the weighted model on an edited dataset.