Thought Leaders

Why AI Governance Keeps Failing

mm

The problem isn’t that organizations don’t have AI policies. It’s that those policies don’t actually do anything.

Somewhere between the neatly formatted PDF and the deployed model, intent evaporates. Teams improvise. Exceptions accumulate. Governance devolves from a system into a negotiation — and in regulated industries like healthcare and life sciences, that gap isn’t just embarrassing. It’s an operational liability.

The fix isn’t more documentation. It’s treating governance like software.

The Governance Gap Is Already Measurable

AI adoption has accelerated dramatically while governance infrastructure hasn’t kept pace. A September 2025 study by Ernst & Young found that just 10% of companies are fully prepared to audit AI systems. At the same time, new Ponemon research found that 92% of organisations say generative AI has changed how employees access and share information, yet only 18% have fully integrated AI governance into insider risk programs.

The pattern is consistent: AI is already embedded in daily work. Oversight is still catching up. And the longer governance stays in document form, the worse that gap gets.

Governance That Ships

The concept is deceptively simple: if a governance requirement can’t fail a build, it can’t protect production.

Real governance has inputs, outputs, enforcement points, and observable results. It runs continuously — not quarterly. And critically, it produces evidence as a byproduct of doing the work, not as a separate compliance ritual bolted on afterward.

The operating model looks like this:

Policy → Controls → Evidence → Metrics

Policies define intent. Controls enforce behaviour. Evidence proves execution. Metrics validate outcomes. This isn’t a new idea — it’s exactly how mature security and compliance systems already operate. The shift is applying the same logic to AI.

Controls aren’t suggestions. Evidence isn’t documentation. And if a control requires manual effort to produce evidence, it’s not a control. It’s a hope.

Risk Tiers, Not Risk Theater

Not every AI system deserves the same scrutiny. Treating a low-stakes internal tool with the same rigor as a clinical decision-support model is how organisations either grind to a halt or expose themselves unnecessarily.

The NIST AI Risk Management Framework, released in 2023, provides a foundational structure for thinking about this — mapping AI risk across four functions: Govern, Map, Measure, and Manage. A functional enterprise governance model builds on this logic with practical risk tiers:

Tier Scope Controls
Minimal Internal tools, no sensitive data Registration, lightweight checks
Limited User-facing, moderate risk Documentation, prompt review, security testing
High Regulated or high-impact decisions Formal risk assessment, audit logging, strict change control
Prohibited Unacceptable use cases Blocked at design and deployment

What this gives engineering teams is something they rarely get from governance processes: clarity. Not “what should we do?” but “which tier is this, and what does that trigger?”

Good governance removes ambiguity. Great governance removes debate.

Policy-as-Code: From Advisory to Executable

Policies written in documents are advisory. Policies encoded into pipelines are enforceable.

The same way infrastructure is validated before deployment, AI systems can be gated by automated checks that verify whether a use case is registered, whether required documentation exists, whether evaluation results meet defined thresholds, and whether access to sensitive data follows least privilege. These checks run in CI/CD. They don’t wait for a committee. They don’t depend on anyone’s memory or goodwill.

Open Policy Agent — a graduated Cloud Native Computing Foundation project — demonstrates exactly how rules can be versioned, reviewed, and consistently enforced across engineering ecosystems. The pattern is understood. The gap is that AI teams are not applying it.

The safest AI system isn’t the one with the best policies. It’s the one that is technically unable to break them.

LLM-Specific Controls: Where It Gets Interesting

Generative AI introduces a category of risk that traditional governance frameworks weren’t designed for — prompt injection, output manipulation, tool misuse. These aren’t edge cases. They’re structural properties of how LLMs work, and as Unite.AI’s coverage of agentic AI governance has noted, the governance gap becomes even more pronounced as AI systems move from answering questions to taking actions.

Effective governance for GenAI systems requires controls built specifically for LLM behaviour: strict separation of system instructions and user input, controlled tool access and allowlists, output validation before execution, safeguards against data exfiltration, and safe defaults for graceful failure.

These map directly to documented vulnerability classes in the OWASP Top 10 for LLM Applications – a community-driven framework now covering over 600 contributing experts across 18 countries. LLM governance is less about what the model knows and more about what the system allows it to do.

Evidence Is Infrastructure, Not Paperwork

Auditors don’t trust intent. They trust records.

In a system where governance ships, evidence is generated automatically: model cards describing intended use and limitations, data documentation covering provenance, evaluation reports showing performance and known risks, logs capturing decisions and changes. These artifacts don’t exist for audits. They exist because the system requires them to function.

The strongest audit position is when evidence already exists before anyone asks for it. This is not theoretical — regulators are already moving in this direction. As recent analysis on defensible AI governance notes, the questions regulators will soon ask are no longer just “did you keep it?” but “can you prove what happened, under which policy, using which data, and with whose authority?”

The Real Argument: Governance as Accelerant

The persistent myth is that governance and speed are in opposition. In practice, poorly designed governance slows teams down. Well-designed governance removes friction.

When controls are standardized, checks are automated, and expectations are codified, teams stop negotiating and start building. Releases become more predictable. Decisions stop requiring heroics from a small group of specialists who’ve memorized the policy documents.

Governance scales when it’s infrastructure. It doesn’t scale when it’s vibes.

The goal was never control for its own sake. It’s momentum without chaos – and the organizations getting this right aren’t the ones with the most thorough PDF. They’re the ones who made the right behavior the easiest path.

Sitaram Srivatsavai is a thought leader in CRM engineering with 18+ years across CRM, iOS, and web platforms. Leads global teams delivering large-scale enterprise software, with a focus on architecture reviews, automation modernization, and ensuring reliability, regulatory compliance, and scalable performance.