
Google Has Banned the Training of Deepfakes in Colab

Updated on December 9, 2022

Sometime in the last two weeks, Google quietly changed the terms of service for Colab, adding a stipulation that the service may no longer be used to train deepfakes.

The May update brings a deepfake ban to Colab. Source: https://research.google.com/colaboratory/faq.html#limitations-and-restrictions

The earliest Internet Archive capture of the Colab FAQ that features the deepfake ban dates from last Tuesday, the 24th May. The last capture that does not mention the ban dates from the 14th May.

Of the two popular deepfake-creation distributions, DeepFaceLab (DFL) and FaceSwap, both of which are forks of the controversial and anonymous code posted to Reddit in 2017, only the more notorious DFL appears to have been directly targeted by the ban. According to deepfake developer ‘chervonij’ at the DFL Discord, running the software in Google Colab now produces a warning:

‘You may be executing code that is disallowed, and this may restrict your ability to use Colab in the future. Please note the prohibited actions specified in our FAQ.’

However, the user is currently still allowed to continue executing the code.

The new warning that greeted DFL deepfakers attempting to run the code on Google Colab. Source: https://discord.com/channels/797172242697682985/797391052042010654/979823182624219136

According to a user in the Discord for rival distribution FaceSwap, that project's code apparently does not yet trigger the warning. This suggests that Colab has specifically targeted the code for DeepFaceLab, which is by far the dominant deepfakes method and also the feeding architecture for the real-time deepfake streaming implementation DeepFaceLive.

FaceSwap co-lead developer Matt Tora commented*:

‘I find it very unlikely that Google are doing this for any particular ethical reasons, more that Colab's raison d'être is for students/data scientists/researchers to be able to run computationally expensive GPU code in an easy and accessible manner, free of charge. However, I suspect that a not insignificant amount of users are exploiting this resource to create deepfake models, at scale, which is both computationally expensive and takes a not insignificant amount of training time to produce results.

‘You could say that Colab leans more to the educational, research side of AI. Executing scripts that require little user input, nor understanding, tends to go counter to this. At Faceswap we try to focus on educating the user in AI and the mechanisms involved, whilst lowering the barrier to entry. We very much encourage ethical use of the software and feel that making these kinds of tools available to a wider audience helps educate people in terms of what is achievable in today's world, rather than keeping it hidden away for a select few.

‘Unfortunately we cannot control how our tools are ultimately used, nor where they are run. It saddens me that an avenue has been closed for people to experiment with our code, however, in terms of protecting this particular resource to ensure its availability to the actual target audience, I find it understandable.’

There is no evidence that the new restriction is limited to the free tier of Google Colab. At the bottom of the list of prohibited activities, to which deepfakes have now been added, is the note ‘Additional restrictions exist for paid users’, indicating that these are baseline regulations. With regard to the deepfakes ban, this has confused some, since ‘cryptocurrency mining’ and ‘engaging in peer-to-peer file-sharing’ are included in both the free and pro ‘Restrictions’ sections, which could be read as implying that prohibitions listed only in the free section do not apply to paid users.

By that logic, everything banned in the free ‘Restrictions’ section, including ‘running denial-of-service attacks’ and ‘password cracking’, would be allowed in the Pro version so long as the Pro version does not explicitly prohibit it. Despite the confusing and selective duplicate prohibitions, the additional restrictions for the Pro tier are chiefly concerned with not ‘subletting’ pro Colab access.

Google Colab is a dedicated implementation of Jupyter notebook environments, which allow for remote training of machine learning projects on far more powerful GPUs than many users can afford.

Since deepfake training is a VRAM-hungry pursuit, and since the advent of the GPU famine, many deepfakers in recent years have eschewed home training in favor of remote training in Colab, where it's possible, depending on chance and tier, to train a deepfake model on powerful cards such as the Tesla T4 (16GB VRAM, currently around $2k USD), the V100 (32GB VRAM, around $4k USD), and the mighty A100 (80GB VRAM, MSRP of $32,097.00), among others.

The ban on Colab training seems likely to reduce the pool of deepfakers able to train higher-resolution models, where the input and output images are larger, more suited to high-resolution results, and capable of extracting and reproducing greater facial detail.

Some of the most committed deepfake hobbyists and enthusiasts, according to Discord and forum posts, have invested heavily in local hardware over the last couple of years, in spite of the high prices of GPUs.

However, given the high costs involved, sub-communities have emerged to deal with the challenges of training deepfakes on Colab, with random GPU allocation the most common complaint since Colab restricted free users' access to higher-end GPUs.
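For readers unfamiliar with the allocation complaint: a notebook user does not choose the card, but can check which one a session has received. The following is a minimal sketch of such a check in Python, assuming the session exposes NVIDIA's `nvidia-smi` command-line tool. The function names `parse_gpu_report` and `allocated_gpus` are hypothetical, chosen for illustration, and are not part of Colab or of any deepfake package.

```python
import shutil
import subprocess

# nvidia-smi query flags: one "name, total memory in MiB" line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_report(report: str) -> list[tuple[str, float]]:
    """Turn nvidia-smi CSV output into (card name, VRAM in GiB) pairs."""
    gpus = []
    for line in report.strip().splitlines():
        # Split on the last comma, so a comma inside a card name is harmless.
        name, memory_mib = (field.strip() for field in line.rsplit(",", 1))
        gpus.append((name, round(int(memory_mib) / 1024, 1)))
    return gpus


def allocated_gpus() -> list[tuple[str, float]]:
    """Return the GPUs visible to this session, or an empty list if none."""
    if shutil.which("nvidia-smi") is None:
        return []  # No NVIDIA tooling present, e.g. a CPU-only session.
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []  # Tool present but no usable GPU or driver.
    return parse_gpu_report(result.stdout)


if __name__ == "__main__":
    for name, vram_gib in allocated_gpus() or [("no GPU allocated", 0.0)]:
        print(f"{name}: {vram_gib} GiB VRAM")
```

Run at the start of a session, a check like this tells the user whether the allocated card has enough VRAM for the model they intend to train, before any training time is spent.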

* In private messages on Discord

First published 28th May 2022.
Revised 7:28 AM EST, correction of quote typo.
Revised 12:40 PM EST – added clarification regarding free and pro tier deepfake bans, as best can be understood from the ‘free’ and ‘pro’ lists of prohibitions.