Dr. Michael Capps is a well-known technologist and CEO of Diveplane Corporation. Before co-founding Diveplane, Mike had a legendary career in the videogame industry as president of Epic Games, makers of blockbusters Fortnite and Gears of War. His tenure included a hundred game-of-the-year awards, dozens of conference keynotes, a lifetime achievement award, and a successful free-speech defense of videogames in the U.S. Supreme Court.
Diveplane offers AI-powered business solutions across multiple industries. With six patents approved and multiple pending, Diveplane’s Understandable AI gives full understanding and decision transparency in support of ethical AI policies and data privacy strategies.
You retired from a successful career in the video game industry at Epic Games. What inspired you to come out of retirement to focus on AI?
Making games was a blast but – at least at the time – wasn’t an ideal career when having a new family. I kept busy with board seats and advisory roles, but it just wasn’t fulfilling. So, I made a list of three major problems facing the world that I could possibly impact – and that included the proliferation of black-box AI systems. My plan was to spend a year on each digging in, but a few weeks later, my brilliant friend Chris Hazard told me he’d been working secretly on a transparent, fully-explainable AI platform. And here we are.
Diveplane was started with a mission of bringing humanity to AI. Can you elaborate on what this means specifically?
Sure. Here we’re using humanity to mean “humaneness” or “compassion.” To make sure the best of humanity is in your AI model, you can’t just train, test a little, and hope it’s all okay.
We need to carefully review input data, the model itself, and the output of that model, and be sure that it reflects the best of our humanity. Most systems trained on historical or real-world data aren’t going to be correct the first time, and they’re not necessarily unbiased either. We believe the only way to root out bias in a model – meaning both statistical errors and prejudice – is the combination of transparency, auditability, and human-understandable explanation.
The core technology at Diveplane is called REACTOR. What makes this a novel approach to making machine learning explainable?
Machine learning typically entails using data to build a model which makes a particular type of decision. Decisions might include the angle to turn the wheels for a vehicle, whether to approve or deny a purchase or mark it as fraud, or which product to recommend to someone. If you want to learn how the model made the decision, you typically have to ask it for many similar decisions and then try to predict what the model itself might do. Machine learning techniques tend to be limited in one of three ways: in the types of insights they can offer, in whether those insights actually reflect what the model did to come up with the decision, or in accuracy.
Working with REACTOR is quite different. REACTOR characterizes your data’s uncertainty, and your data becomes the model. Instead of building one model per type of decision, you just ask REACTOR what you’d like it to decide (it can be anything related to the data), and REACTOR queries what data is needed for a given decision. REACTOR can always show you the data it used, how it relates to the answer, every aspect of uncertainty, counterfactual reasoning, and the answer to virtually any additional question you’d like to ask. Because the data is the model, you can edit the data and REACTOR will be instantly updated. It can show you if there was any data that looked anomalous that went into the decision, and trace every edit to the data and its source. REACTOR uses probability theory all the way down, meaning that we can tell you the units of measurement of every part of its operation. And finally, you can reproduce and validate any decision using just the data that led to the decision and the uncertainties, using relatively straightforward mathematics, without even needing REACTOR.
REACTOR is able to do all of this while maintaining highly competitive accuracy, especially for small and sparse data sets.
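For readers who want a concrete picture of "the data is the model," one familiar family of techniques with that property is instance-based learning, such as k-nearest neighbors. The sketch below is not REACTOR's algorithm, which the interview does not describe in detail. The function name `predict_with_evidence`, the inverse-distance weighting, and the toy data are all assumptions made for illustration. It only shows the general idea that a prediction can be returned together with the exact records that produced it, and that editing the data changes the answer without any retraining.

```python
import math


def predict_with_evidence(data, query, k=3):
    """Predict a numeric target for `query` from its k nearest records.

    `data` is a list of (features, target) pairs, where features is a tuple
    of numbers. Returns (prediction, neighbors) so the caller can inspect
    exactly which records drove the answer.
    """
    # Rank every stored record by Euclidean distance to the query.
    ranked = sorted(data, key=lambda row: math.dist(row[0], query))
    neighbors = ranked[:k]

    # Closer records get more say; the small constant avoids division by zero.
    weights = [1.0 / (math.dist(features, query) + 1e-9) for features, _ in neighbors]
    weighted_sum = sum(w * target for w, (_, target) in zip(weights, neighbors))
    prediction = weighted_sum / sum(weights)
    return prediction, neighbors


if __name__ == "__main__":
    # Hypothetical toy data: one feature, one numeric target.
    data = [((0.0,), 0.0), ((1.0,), 10.0), ((2.0,), 20.0), ((10.0,), 100.0)]
    prediction, evidence = predict_with_evidence(data, (1.0,), k=3)
    print("prediction:", prediction)
    print("records used:", evidence)

    # "Editing the model" is just editing the data: drop a record and ask again.
    edited = [row for row in data if row[0] != (1.0,)]
    print("after edit:", predict_with_evidence(edited, (1.0,), k=3)[0])
```

Because the function returns the neighbors alongside the prediction, anyone can recompute the weighted average by hand from those few records, which is the kind of independent validation the answer above describes.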
GEMINAI is a product that builds a digital twin of a dataset. What does this mean specifically, and how does it ensure data privacy?
When you feed GEMINAI a dataset, it builds a deep knowledge of the statistical shape of that data. You can use it to create a synthetic twin that resembles the structure of the original data: all the records are newly created, but the statistical shape is the same. So for example, the average heart rate of patients in both sets would be nearly the same, as would all other statistics. Thus, any data analytics using the twin, including training ML models, would give the same answer as the original.
And if someone has a record in the original data, there’d be no record for them in the synthetic twin. We’re not just removing the name – we’re making sure that there’s no new record that’s anywhere “near” their record (and all the others) in the information space. I.e., there’s no record that’s recognizable in both the original and synthetic set.
And that means the synthetic data set can be shared much more freely with no risk of sharing confidential information improperly. Doesn’t matter if it’s personal financial transactions, patient health information, classified data – as long as the statistics of the data aren’t confidential, the synthetic twin isn’t confidential.
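The two properties described above (similar statistics, and no synthetic record "near" any original record) can be illustrated with a deliberately simple toy. This is not how GEMINAI works. It is a minimal sketch that assumes purely numeric columns, fits an independent Gaussian to each column (so it ignores correlations between columns), and rejects any candidate that falls within a hypothetical `min_distance` of a real record. The function name, parameters, and sample data are all invented for illustration.

```python
import math
import random
import statistics


def synthesize(records, n, min_distance, seed=None, max_tries=10_000):
    """Draw n synthetic records that keep a distance from every original.

    `records` is a list of equal-length numeric tuples. Each column is
    modeled as an independent Gaussian (a simplifying assumption).
    """
    rng = random.Random(seed)
    columns = list(zip(*records))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.stdev(col) for col in columns]

    synthetic = []
    tries = 0
    while len(synthetic) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place enough records; lower min_distance")
        candidate = tuple(rng.gauss(m, s) for m, s in zip(means, stdevs))
        # Reject anything that lands too close to a real person's record.
        if all(math.dist(candidate, original) >= min_distance for original in records):
            synthetic.append(candidate)
    return synthetic


if __name__ == "__main__":
    # Hypothetical patient data: (heart rate, age).
    originals = [(72, 34), (88, 51), (65, 29), (79, 62), (91, 45)]
    twin = synthesize(originals, n=1000, min_distance=1.0, seed=7)
    print("original mean heart rate:", statistics.fmean(r[0] for r in originals))
    print("synthetic mean heart rate:", statistics.fmean(r[0] for r in twin))
```

Note that rejecting candidates near the originals slightly distorts the fitted distribution, and a fixed Euclidean threshold treats every column's units as comparable. A production system would need to handle both issues, along with mixed data types.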
Why is GEMINAI a better solution than using differential privacy?
Differential privacy is a set of techniques that keep any one individual from influencing the statistics by more than a marginal amount, and is a fundamental piece in nearly any data privacy solution. However, when differential privacy is used alone, a privacy budget for the data needs to be managed, with sufficient noise added to each query. Once that budget is used up, the data cannot be used again without incurring privacy risks.
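To make the idea of a privacy budget concrete, here is a minimal sketch of the standard Laplace mechanism for count queries, with a running budget that is spent on each query. It is a textbook illustration, not Diveplane's implementation. The class name `PrivateCounter`, its parameters, and the sample data are assumptions. Each count query has sensitivity 1 (adding or removing one person changes the count by at most 1), so noise with scale `1 / epsilon` is added.

```python
import random


class PrivateCounter:
    """Answers count queries with Laplace noise under a fixed epsilon budget."""

    def __init__(self, records, total_epsilon, seed=None):
        self.records = list(records)
        self.remaining = total_epsilon
        self.rng = random.Random(seed)

    def _laplace(self, scale):
        # The difference of two independent exponentials with rate 1/scale
        # is Laplace-distributed with mean 0 and the given scale.
        rate = 1.0 / scale
        return self.rng.expovariate(rate) - self.rng.expovariate(rate)

    def noisy_count(self, predicate, epsilon):
        """Return a noisy count of matching records, spending `epsilon`."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exhausted")
        self.remaining -= epsilon
        true_count = sum(1 for record in self.records if predicate(record))
        # A count query has sensitivity 1, so the noise scale is 1 / epsilon.
        return true_count + self._laplace(1.0 / epsilon)


if __name__ == "__main__":
    ages = [34, 51, 29, 62, 45, 38, 70]  # hypothetical data
    counter = PrivateCounter(ages, total_epsilon=1.0, seed=0)
    print(counter.noisy_count(lambda age: age > 40, epsilon=0.5))
    print(counter.noisy_count(lambda age: age > 40, epsilon=0.5))
    try:
        counter.noisy_count(lambda age: age > 40, epsilon=0.5)
    except RuntimeError as err:
        print("third query refused:", err)
```

A smaller `epsilon` per query means more noise but more queries before the budget runs out. That trade-off is the management burden described above.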
One way to overcome this budget is to apply the full privacy budget at once to train a machine learning model to generate synthetic data. The idea is that this model, trained using differential privacy, can be used relatively safely. However, proper application of differential privacy can be tricky, especially if there are differing data volumes for different individuals and more complex relationships, such as people living in the same house. And synthetic data produced from this model is often likely to include, by chance, real data that an individual could claim is their own because it is too similar.
GEMINAI solves these problems and more by combining multiple privacy techniques when synthesizing the data. It uses an appropriate practical form of differential privacy that can accommodate a wide variety of data types. It is built upon our REACTOR engine, so it additionally knows the probability that any pieces of data might be confused with one another, and synthesizes data making sure that it is always sufficiently different from the most similar original data. Additionally, it treats every field, every piece of data as potentially sensitive or identifying, so it applies practical forms of differential privacy for fields that are not traditionally thought of as sensitive but could uniquely identify an individual, such as the only transaction in a 24-hour store between 2am and 3am. We often refer to this as privacy cross-shredding.
GEMINAI is able to produce synthetic data that looks like the original data and achieves high accuracy for nearly any purpose, but prevents anyone from finding any synthetic data too similar to the original data.
Diveplane was instrumental in co-founding the Data & Trust Alliance. What is this alliance?
It’s an absolutely fantastic group of technology CEOs, collaborating to develop and adopt responsible data and AI practices. It includes world-class organizations like IBM, Johnson & Johnson, Mastercard, UPS, Walmart, and Diveplane. We’re very proud to have been part of the early stages, and also proud of the work we’ve collectively accomplished on our initiatives.
Diveplane recently raised a successful Series A round. What will this mean for the future of the company?
We’ve been fortunate to be successful with our enterprise projects, but it’s difficult to change the world one enterprise at a time. We’ll use this support to build our team, share our message, and get Understandable AI in as many places as we can!
Is there anything else that you would like to share about Diveplane?
Diveplane is all about making sure AI is done properly as it proliferates. We’re about fair, transparent, and understandable AI, proactively showing what’s driving decisions, and moving away from the “black box mentality” in AI that has the potential to be unfair, unethical, and biased. We believe Explainability is the future of AI, and we’re excited to play a pivotal role in driving it forward!
Thank you for the great interview. Readers who wish to learn more should visit Diveplane.