AI/ML systems are growing rapidly to process the oceans of data generated by the new digital economy. With this growth comes a need to seriously consider the ethical and legal implications of AI.
As we entrust increasingly sophisticated and important tasks to AI systems, such as automatic loan approval, we need strong evidence that these systems are responsible and trustworthy. Reducing bias in AI has become a major area of focus for many researchers and has significant ethical implications, as does the amount of autonomy that we give these systems.
Responsible AI is a framework that can help build trust in your AI deployments. It rests on five foundational pillars: reproducibility, transparency, accountability, security, and privacy. This article explores each of them to help you build better systems.
There’s an old saying in the software development world that goes: “Hey, it works on my machine.” In ML and AI, the phrase could be tweaked to: “Hey, it works on my dataset.” In other words, machine learning models are often a black box, and results that hold on one dataset may not hold on another. Many training datasets carry inherent biases, such as sampling bias or confirmation bias, that reduce the accuracy of the final product.
To make AI/ML systems more reproducible, and therefore easier to verify and trust, the first step is to standardize the MLOps pipeline. Even the smartest data scientists have their favorite technologies and libraries, which means that feature engineering and the resulting models are not uniform from person to person. Tools such as MLflow can help standardize the MLOps pipeline and reduce these differences.
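As a rough illustration of what a standardized pipeline needs to capture, here is a minimal sketch using only the Python standard library. It is not MLflow's API; the helper names `fingerprint`, `train`, and `run_record` are hypothetical, and the "training" step is a toy stand-in. The idea is simply that a run is reproducible when the same data, parameters, and seed yield the same model.

```python
import hashlib
import json
import random


def fingerprint(obj):
    """Return a stable SHA-256 hex digest of a JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def train(rows, params):
    """Toy stand-in for training: a seeded shuffle, then a running average."""
    rng = random.Random(params["seed"])  # all randomness comes from the seed
    shuffled = rows[:]
    rng.shuffle(shuffled)
    weight = 0.0
    for x in shuffled:
        weight += params["learning_rate"] * (x - weight)
    return weight


def run_record(rows, params):
    """Collect what someone else would need to reproduce and verify a run."""
    weight = train(rows, params)
    return {
        "data_sha256": fingerprint(rows),
        "params": params,
        "model_sha256": fingerprint(round(weight, 12)),
    }


rows = [0.2, 0.9, 0.4, 0.7, 0.1]
params = {"seed": 7, "learning_rate": 0.1}

first = run_record(rows, params)
second = run_record(rows, params)
print("reproducible:", first == second)
```

A tracking tool would typically store records like this for every run, so that two people using different libraries can still compare like with like.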
Another way to make AI/ML systems more reproducible is to use what are called “gold datasets.” These are representative datasets that act as a fixed test and validation suite that new models must pass before they are released to production.
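A gold dataset can be wired into a release gate. The sketch below assumes a model is just a function from features to a label; the loan examples, the 0.9 threshold, and the names `passes_gold_gate`, `current_model`, and `candidate_model` are all illustrative assumptions rather than a prescribed process.

```python
def accuracy(predict, examples):
    """Fraction of gold examples the model labels correctly."""
    correct = sum(1 for features, label in examples if predict(features) == label)
    return correct / len(examples)


def passes_gold_gate(candidate, current, gold_examples, min_accuracy=0.9):
    """Allow a release only if the candidate clears an absolute accuracy bar
    and is at least as accurate as the model currently in production."""
    candidate_acc = accuracy(candidate, gold_examples)
    current_acc = accuracy(current, gold_examples)
    return candidate_acc >= min_accuracy and candidate_acc >= current_acc


# Hypothetical gold dataset: (features, expected decision).
GOLD = [
    ({"income": 80, "debt": 10}, "approve"),
    ({"income": 30, "debt": 25}, "deny"),
    ({"income": 55, "debt": 5}, "approve"),
    ({"income": 40, "debt": 38}, "deny"),
]


def current_model(f):
    return "approve" if f["income"] > 50 else "deny"


def candidate_model(f):
    return "approve" if f["income"] - f["debt"] > 20 else "deny"


def broken_model(f):
    return "approve"  # ignores its input entirely


print(passes_gold_gate(candidate_model, current_model, GOLD))
print(passes_gold_gate(broken_model, current_model, GOLD))
```

Because the gold dataset stays fixed between releases, a drop in its score points to the model rather than to a change in the data.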
As stated earlier, many ML models, particularly neural networks, are black boxes. To make them more accountable, we need to make them more interpretable. For simple systems such as decision trees, it’s quite easy to understand how and why the system made a certain decision, but as the accuracy and complexity of an AI system go up, its interpretability often goes down.
An area of research called “explainability” is trying to bring transparency even to complex AI systems such as neural networks and other deep learning models. One common approach fits a simpler proxy model that mimics the predictions of the complex model, and then uses the proxy to give an approximate explanation of which features matter most.
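To make the proxy-model idea concrete, here is a minimal sketch in plain Python. The `black_box` function is a made-up stand-in for an opaque model, and the proxy is deliberately the simplest possible one: a single-feature threshold rule (a decision stump) that only predicts 1 above the threshold. Real explainability tools are considerably more sophisticated; this only shows the principle of fitting a simple model to the complex model's outputs and measuring how faithfully it agrees.

```python
import random


def black_box(x):
    """Stand-in for an opaque model: returns 1 or 0 for a 3-feature input."""
    score = 3.0 * x[0] + 0.2 * x[1] - 0.1 * x[2]
    return 1 if score > 1.5 else 0


def fit_stump(samples, labels):
    """Search every feature and threshold for the rule 'x[j] > t' that agrees
    most often with the black-box labels. Returns (fidelity, feature, threshold)."""
    best = None
    for j in range(len(samples[0])):
        for threshold in sorted({s[j] for s in samples}):
            agree = sum(
                1 for s, y in zip(samples, labels)
                if (1 if s[j] > threshold else 0) == y
            )
            fidelity = agree / len(samples)
            if best is None or fidelity > best[0]:
                best = (fidelity, j, threshold)
    return best


rng = random.Random(0)
samples = [[rng.random() for _ in range(3)] for _ in range(200)]
labels = [black_box(s) for s in samples]  # query the black box, not the truth

fidelity, feature, threshold = fit_stump(samples, labels)
print(f"proxy rule: feature {feature} > {threshold:.3f} (fidelity {fidelity:.2f})")
```

The fidelity score matters as much as the rule itself: an explanation drawn from a proxy that often disagrees with the original model says little about that model.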
This all leads to fairness: you want to know why a certain decision was made and be able to confirm that the decision is fair. You also want to ensure that inappropriate features, such as protected attributes, are not considered, so that bias doesn’t creep into your model.
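One simple, widely used fairness check compares outcome rates across groups. The sketch below computes the gap in approval rates (often called the demographic parity difference); the group labels and decisions are invented for illustration, and what counts as an acceptable gap is a policy question that the code does not answer.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Map each group to its approval rate, given (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def demographic_parity_gap(decisions):
    """Difference between the highest and lowest group approval rates."""
    rates = approval_rates(decisions)
    return max(rates.values()) - min(rates.values())


# Hypothetical model decisions for two groups of applicants.
decisions = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

print(approval_rates(decisions))
print(demographic_parity_gap(decisions))
```

A large gap does not prove the model is unfair, but it is a signal that the decision logic and its input features deserve a closer look.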
Perhaps the most important aspect of Responsible AI is accountability. There’s a lot of conversation about this topic, even in the government sector, as it deals with what policies will drive AI outcomes. This policy-driven approach determines at what stage humans should be in the loop.
Accountability requires robust monitors and metrics to help guide policy-makers and control AI/ML systems. Accountability really ties together reproducibility and transparency, but it needs effective oversight in the form of AI ethics committees. These committees can handle policy decisions, decide what is important to measure, and conduct fairness reviews.
AI security focuses on the confidentiality and integrity of data. When systems are processing data, you want them to be in a secure environment. You want the data to be encrypted both at rest in your database and in transit through the pipeline, but it is still exposed when it is fed into a machine learning model as plain text. Technologies such as homomorphic encryption address this gap by allowing computation, including some forms of machine learning training, to run directly on encrypted data, although typically at a significant performance cost.
Another aspect is the security of the model itself. For instance, model inversion attacks can let attackers reconstruct parts of the training data that was used to build the model. There are also model poisoning attacks, which inject malicious data into the training process and can severely degrade the model’s performance. Testing your model against adversarial attacks such as these helps keep it safe and secure.
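To give a feel for what computing on encrypted data means, here is a toy sketch of the Paillier cryptosystem, which supports addition on ciphertexts. This is an assumption-laden illustration only: the primes are tiny and offer no real security, the scheme is additive rather than fully homomorphic, and it is a long way from encrypted model training. Production systems should rely on a vetted library.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy key generation with tiny primes (insecure, for illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because we use g = n + 1
    return n, (lam, mu)


def encrypt(n, message, rng):
    """Encrypt an integer 0 <= message < n under public key n."""
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    n_sq = n * n
    # With g = n + 1, g**message mod n^2 equals 1 + message * n.
    return ((1 + message * n) * pow(r, n, n_sq)) % n_sq


def decrypt(n, private_key, ciphertext):
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


rng = random.Random(42)
n, private_key = keygen()

c1 = encrypt(n, 52000, rng)  # e.g. one customer's salary
c2 = encrypt(n, 61000, rng)  # another customer's salary
encrypted_total = add_encrypted(n, c1, c2)  # computed without decrypting

print(decrypt(n, private_key, encrypted_total))
```

The party that computes `encrypted_total` never sees either salary; only the holder of the private key can read the sum.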
Google and OpenMined are two organizations that have recently been prioritizing AI privacy, and OpenMined hosted a recent conference on this very topic. With new regulations such as GDPR and CCPA, and potentially more coming down the line, privacy will play a central role in how we train machine learning models.
One way to ensure that you are handling your customers’ data in a privacy-aware manner is to use federated learning. This decentralized method of machine learning trains models locally, where the data lives, and then sends only the resulting model updates to a central hub for aggregation, so the raw data stays private at its source. Another method is to introduce statistical noise so that the values of individual customers are not leaked, which is the idea behind differential privacy. This keeps you working with aggregates, so that no individual’s data can be singled out by the algorithm or recovered from its output.
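Both ideas can be sketched in a few lines of plain Python. The first function shows the aggregation step of federated averaging, where the hub sees only each client's weights and sample count. The second adds Laplace noise to a clipped mean, which is the textbook way to release an aggregate with differential privacy. The function names, the clipping bounds, and the `epsilon` values are illustrative assumptions, not recommendations.

```python
import random


def federated_average(updates):
    """Combine locally trained weights, weighting each client by how many
    samples it trained on. `updates` is a list of (weights, n_samples)."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


def noisy_mean(values, lower, upper, epsilon, rng):
    """Release the mean of `values` with Laplace noise.

    Values are clipped to [lower, upper], so changing one person's value moves
    the mean by at most (upper - lower) / len(values). Smaller epsilon means
    more noise and stronger privacy."""
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    scale = (upper - lower) / (len(clipped) * epsilon)
    # The difference of two exponential draws follows a Laplace distribution.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_mean + noise


# Two hypothetical clients share model weights, never their raw data.
updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30)]
print(federated_average(updates))

rng = random.Random(1)
ages = [23, 35, 41, 58]
print(noisy_mean(ages, lower=0, upper=100, epsilon=0.5, rng=rng))
```

The two techniques are complementary: federated learning limits where raw data travels, while added noise limits what the released aggregate can reveal about any one person.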
Keeping AI Responsible
Ultimately, keeping AI responsible is up to each organization that designs AI/ML systems. By intentionally pursuing technologies within each of these five facets of Responsible AI, you can not only benefit from the power of artificial intelligence but also do so in a trustworthy and straightforward way that will reassure your organization, customers, and regulators.