Over the past year, growing attention has been paid to ensuring that AI is used in ethical ways. Google and Microsoft have both recently warned investors that misused or poorly designed AI algorithms present ethical and legal risks. Meanwhile, California has just passed a bill banning the use of facial recognition technology by the state's law enforcement agencies.
Recently, startups such as Arthur have been designing tools that help AI engineers quantify and qualify how their machine learning models perform. As reported by Wired, Arthur is trying to give AI developers a toolkit that makes it easier to discover problems in financial applications, such as uncovering bias in investment or lending decisions.
Arthur’s efforts are aimed at addressing the “black box” problem of AI. Unlike traditional code, which can be read and interpreted by anyone who knows the language, machine learning systems map input features to behaviors without revealing why those behaviors were selected or how the features were interpreted. In other words, in a black box system the exact logic the algorithm has learned is opaque.
Machine learning systems operate by extracting patterns from input data and reasoning about those patterns. This is accomplished by essentially having a computer write its own code by manipulating certain mathematical functions. To address this problem, researchers and engineers need tools that make it easier to observe and analyze the behavior of machine learning software. Startups like Arthur acknowledge the difficulty of the problem and don’t claim to have optimal solutions, but they hope to make progress and make cracking open the black box a little easier. It’s hoped that if AI systems can be analyzed more easily, it will also become easier to correct problems like bias.
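To make the idea concrete, here is a minimal, purely illustrative sketch of a machine "writing its own code" by adjusting a numeric parameter to fit data. The data, learning rate, and step count are all made up for illustration; the point is that the resulting "program" is just a number, with no human-readable explanation attached.

```python
def train(data, steps=1000, lr=0.01):
    """Fit y = w * x to (x, y) pairs by gradient descent on squared error."""
    w = 0.0
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

# The learned "logic" is just the value of w; nothing in the process
# records *why* this value was chosen, which is the black box problem
# in miniature.
data = [(1, 2), (2, 4), (3, 6)]
w = train(data)
print(round(w, 2))  # converges to 2.0 on this data
```

Real systems have millions of such parameters rather than one, which is why their decisions are so much harder to audit.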
Large companies like Facebook already have tools for analyzing the inner workings of their machine learning systems. For example, Facebook has a tool dubbed Fairness Flow, intended to ensure that ads recommending jobs reach people from all different backgrounds. However, many companies are unlikely to invest the time to build such tools in-house, so a business opportunity exists for startups that create monitoring tools for AI companies to use.
Arthur is focused on creating tools that enable companies to better maintain and monitor AI systems after deployment. Arthur’s tools are intended to let companies see how their system’s performance shifts over time, which would theoretically let them pick up on emerging bias. If a company’s loan recommendation software starts excluding certain groups of customers, a flag could be raised indicating that the system needs review to ensure it isn’t discriminating based on sensitive attributes like race or gender.
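The kind of check described above can be sketched in a few lines. This is a hypothetical illustration, not Arthur's actual method: it compares a loan model's approval rates across groups and raises a flag when any group falls well below the best-served one. The 0.8 threshold borrows the "four-fifths rule" convention from US hiring guidelines and is an assumption here.

```python
def approval_rates(decisions):
    """decisions: list of (group, approved) pairs -> approval rate per group."""
    totals, approved = {}, {}
    for group, ok in decisions:
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + (1 if ok else 0)
    return {g: approved[g] / totals[g] for g in totals}

def needs_review(decisions, threshold=0.8):
    """Flag the system when any group's approval rate drops below
    `threshold` times the highest group's rate (four-fifths rule)."""
    rates = approval_rates(decisions)
    best = max(rates.values())
    return any(rate < threshold * best for rate in rates.values())

# Toy decision log: group B is approved far less often than group A.
decisions = [("A", True), ("A", True), ("A", False),
             ("B", True), ("B", False), ("B", False)]
print(needs_review(decisions))  # True: B's rate (0.33) < 0.8 * A's rate (0.67)
```

In a production monitor this comparison would run continuously over a sliding window of recent decisions, so a gradual drift would trip the flag long before a human noticed it in aggregate statistics.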
However, Arthur isn’t the only company creating tools that let AI companies review the performance of their algorithms. Many startups are investing in tools to fight bias and ensure that AI algorithms are used ethically. Weights & Biases is another startup building tools that help machine learning engineers analyze potential problems with their networks; Toyota has used Weights & Biases to monitor its machine learning models as they train. Meanwhile, the startup Fiddler is working on its own set of AI monitoring tools, and IBM has even created a monitoring service called OpenScale.
Liz O’Sullivan, one of the co-founders of Arthur, explained that the interest in building tools to help solve the black box problem is driven by a growing awareness of the power of AI.
“People are starting to realize how powerful these systems can be, and that they need to take advantage of the benefits in a way that is responsible,” O’Sullivan said.