Connect with us

AI-First Means Safety-First

Thought Leaders

AI-First Means Safety-First

mm

Buy a kid a brand new bike, and the bike will get all the attention—not the shiny helmet that accompanies it. But parents appreciate the helmet.

I’m afraid many of us today are more like kids when it comes to AI. We’re focused on how cool it is and how fast we can go with it. Not so much on what we can do to stay safe as we use it. It’s a pity because you can’t have the benefit of one without the other.

Simply put, applying AI without carefully planning for safety first isn’t just risky. It’s a straight path off a cliff.

What Does AI Safety Even Mean?

AI safety involves a host of steps. But perhaps the most important element is when to take them. To be effective, AI Safety must be by design.

That means that we consider how to prevent harm before we take it for a test drive. We figure out how to make sure the AI operates and generates results in line with our values and social expectations first—not after we get some horrible results.

Designing for AI safety also includes thinking about how to make it robust, or able to perform predictably even in adverse situations. It means making AI transparent, so the decisions AI makes are understandable, auditable, and unbiased.

But it also includes taking a look at the world in which the AI will function. What institutional and legal safeguards do we need, especially to comply with applicable government regulations? And I can’t overemphasize the people component: What will the impact of the use of AI be on the people who interact with it?

Safety by design means embedding AI safety into all our processes, workflows, and operations before we type our first prompt.

The Risks Outweigh Concerns

Not everyone agrees. When they hear “safety-first,” some hear “step so carefully and slowly that you get left behind.” Of course, that’s not what safety first means. It doesn’t have to stifle innovation or slow time-to-market. And it doesn’t mean an endless stream of pilots that never scale. Quite the contrary.

It does mean understanding the risks of not designing safety into AI. Consider just a few.

  • Deloitte’s Center for Financial Services predicts that GenAI could be responsible for fraud losses reaching US$40 billion in the U.S. alone by 2027, from US$12.3 billion in 2023, a 32% CAGR.
  • Biased decisions. Cases document biased medical care due to AI that had been trained on biased data.
  • Bad decisions that inspire more bad decisions. Worse than an initial bad decision spurred by faulty AI, studies indicate that those faulty decisions can become part of how we think and make future decisions.
  • Real consequences. AI that gives bad medical advice has been responsible for deadly patient outcomes. Legal issues have resulted from citing an AI’s hallucination as legal precedent. And software errors resulting from an AI assistant giving misinformation have tainted company products and their reputation and led to widespread user dissatisfaction.

And things are about to get even more interesting.

The advent and rapid adoption of agentic AI, AI that can function autonomously to take action based on decisions it’s made, will magnify the importance of designing for AI safety.

An AI agent that can act on your behalf could be tremendously useful. Instead of it telling you about the best flights for a trip, it could find them and book them for you. If you want to return a product, a company’s AI agent could not just tell you the return policy and how to file a return, but also handle the entire transaction for you.

Great—as long as the agent doesn’t hallucinate a flight or mishandle your financial information. Or get the company’s return policy wrong and refuse valid returns.

It’s not too difficult to see how the present AI safety risks could easily cascade with a host of AI agents running around making decisions and acting, especially since they won’t likely be acting alone. Much of the real value in agentic  AI will come from teams of agents, where individual agents handle parts of tasks and collaborate—agent to agent—to get work done.

So how do you embrace AI safety by design without hampering innovation and killing its potential value?

Safety by Design in Action

Ad hoc safety checks aren’t the answer. But integrating safety practices into every phase of an AI implementation is.

Begin with data. Make sure data is labeled, annotated where needed, bias-free, and high quality. This is especially true for training data.

Train your models with human feedback, as human judgment is essential to shape model behavior. Reinforcement Learning with Human Feedback (RLHF) and other similar techniques allow annotators to rate and guide responses, helping LLMs generate outputs that are safe and aligned with human values.

Then, before you release a model, stress test it. Red teams that try to provoke unsafe behavior by using adversarial prompts, edge cases, and attempted jailbreaks can expose vulnerabilities. Getting them fixed before they reach the public keeps things safe before there is a problem.

While this testing makes sure your AI models are robust, keep monitoring them with an eye on emerging threats and adjustments that might be needed to the models.

In a similar vein, regularly monitor content sources and digital interactions for signs of fraud. Critically, use a hybrid AI-human approach, letting AI-automation take care of the enormous volume of data to be monitored, and skilled humans handle reviews for enforcement and to ensure accuracy.

Applying agentic AI requires even more care. A basic requirement: train the agent to know its limitations. When it encounters uncertainty, ethical dilemmas, new situations, or particularly high-stakes decisions, ensure it knows how to ask for help.

Also, design traceability into your agents. This is especially important so that its interactions occur only with verified users, to avoid fraudulent actors influencing an agent’s actions.

If they seem to be working effectively, it could be tempting to turn the agents loose and let them do their thing. Our experience says to keep monitoring them and the tasks they’re fulfilling to watch for errors or unexpected behavior. Use both automated checks and human review.

In fact, an essential element of AI safety is regular human involvement. Humans should be intentionally engaged where critical judgment, empathy, or nuance and ambiguity are involved in a decision or action.

Again, to be clear, these are all practices that you build into the AI implementation in advance, by design. They are not the result of something going wrong and then rushing to figure out how to minimize the damage.

Does It Work?

We have been applying an AI Safety First philosophy and “by design” framework with our clients throughout the emergence of GenAI and now on the fast track to agentic AI. We’re finding that, contrary to worries about it slowing things down, it actually helps accelerate them.

Agentic AI has the potential to lower the cost of customer support by 25-50%, for example, while driving up customer satisfaction. But that all depends on trust.

Humans using AI must trust it, and the customers interacting with AI-enabled human agents or with actual AI agents can’t experience a single interaction that would undermine their trust. One bad experience can obliterate confidence in a brand.

We don’t trust what isn’t safe. So, when we build safety into every layer of the AI we’re about to roll out, we can do so with confidence. And when we’re ready to scale it, we’re able to do so rapidly—with confidence.

While putting AI Safety First into practice may seem overwhelming, you’re not alone. There are many experts to help and partners who can share what they’ve learned and are learning so you can tap into the value of AI safely without slowing you down.

AI has been an exciting ride so far, and as the ride accelerates, I find it exhilarating. But I’m also glad I’m wearing my helmet.

Joe Anderson is the Senior Director of Consulting and Digital Transformation at TaskUs, where he leads go-to-market strategy and innovation. He focuses on the intersection of AI, customer experience, and digital operations, and heads TaskUs’ new agentic AI consulting practice.