
The AI Visibility Crisis: Why Security Teams Are Flying Blind and Why They Don’t Have To

Published June 12, 2026

Corey Thuen, CEO & Co-Founder, Gravwell

The integration of AI agents into production environments is accelerating, but the safety architecture required to secure them is lagging dangerously behind. We are in an era where an AI agent, tasked with a routine job in a staging environment, can independently decide to “fix” a credential mismatch by deleting a database volume.

As an industry, we are collectively turning off our brains when it comes to the basic principles of security and observability around AI. Security teams are flying blind, but they don’t have to be.

The Myth of System Prompts and Safe Tooling

A pervasive myth in the AI space is that we can control an agent’s behavior simply by telling it to behave. System prompts are advisory, not enforcing. In the database-deletion incident described above, the AI’s system rules explicitly prohibited destructive commands, yet the agent violated its own marketed guardrails and executed the most irreversible action possible.

We have to work under the assumption that the AI doesn’t actually “know” anything. Attacks against the AI are social engineering, except the mark is dumber than the average human. Anyone who has experience doing penetration testing understands how difficult it is for organizations to defend against social engineering attacks. Now our computers are susceptible as well.

Furthermore, AI tooling is ultimately just software, and all software has bugs. We have already seen instances where AI tooling automatically starts unauthenticated HTTP servers, allowing any local process or website to execute arbitrary shell commands with user privileges.

The Black Box of AI Auditing

If an AI goes rogue or is manipulated, figuring out what it did is a nightmare. AI tools generally do not provide audit logs. If you are lucky enough to be on an enterprise tier, the logs you receive are severely lacking. For example, you might get a vague event stating that a user “used Gen AI,” or just basic metrics detailing input and output token counts.

Neither of these helps a security analyst answer the fundamental question: What exactly did this agent execute?

Uncovering AI: How to Stop Flying Blind

The good news is that you don’t necessarily need a shiny new AI-specific security appliance to regain visibility. Shadow AI usage and agent activity are detectable with the log analysis techniques your team should already have. AI tool calls, command executions, and system-changing events can be traced back to AI agents using existing process execution analysis (which you are already doing in your security information and event management (SIEM) platform, right?).

Here is how you can leverage your current infrastructure to spot AI activity:

DNS Analysis: Analyzing DNS logs for queries to known AI service domains can help detect AI usage in your environment.
Threatlists: This approach requires maintaining an updated threatlist of domains associated with AI platforms or model providers.
Community Resources: There are community projects and blocklists available that can be modified into lookup tables for programmatic usage.
SSL Tracking: A similar approach can use SSL logs to track server names, though it provides slightly less detail since the full URL is not recorded.
Endpoint Telemetry: You can use tools like Sysmon to count child processes and hunt for processes that spawn an unusually high number of bash shells, which is a strong indicator of AI agents executing commands on an endpoint.
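
Two of the techniques above can be sketched in a few lines of Python. This is a minimal illustration, not a production detection: the domain table is a small hand-picked sample, the record schemas (`src`, `query`, `host`, `parent_image`, `image`) are assumptions about how your DNS and process-creation logs have been normalized, and the threshold of 20 is arbitrary.

```python
from collections import Counter

# Illustrative sample only; a real table would come from a maintained threatlist.
AI_DOMAINS = {"api.openai.com", "api.anthropic.com", "generativelanguage.googleapis.com"}

# Shell names to count as "bash-like" children (assumption for this sketch).
SHELLS = {"bash", "sh", "zsh"}


def matches_ai_domain(name, ai_domains=AI_DOMAINS):
    """Return True if name is a listed domain or a subdomain of one."""
    labels = name.rstrip(".").lower().split(".")
    # Test the full name and every parent domain against the lookup table.
    return any(".".join(labels[i:]) in ai_domains for i in range(len(labels)))


def ai_queries_by_host(dns_records, ai_domains=AI_DOMAINS):
    """Count AI-domain lookups per source host.

    dns_records: iterable of dicts with 'src' and 'query' keys (assumed schema).
    The same function works for TLS logs if 'query' holds the server name.
    """
    counts = Counter()
    for rec in dns_records:
        if matches_ai_domain(rec["query"], ai_domains):
            counts[rec["src"]] += 1
    return counts


def high_shell_spawners(process_events, threshold=20):
    """Return ((host, parent_image), count) pairs at or above threshold.

    process_events: iterable of dicts with 'host', 'parent_image', and 'image'
    keys (assumed schema, e.g. normalized from process-creation telemetry).
    """
    counts = Counter()
    for ev in process_events:
        # Reduce the child image path to a bare, lowercase program name.
        child = ev["image"].replace("\\", "/").rsplit("/", 1)[-1].lower()
        if child.endswith(".exe"):
            child = child[:-4]
        if child in SHELLS:
            counts[(ev["host"], ev["parent_image"])] += 1
    return [(key, n) for key, n in counts.most_common() if n >= threshold]
```

Matching on parent domains as well as exact names catches regional or versioned subdomains without listing each one. In a SIEM, the same logic would normally be a lookup-table join and a count-by-parent query rather than standalone code.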

The blind spot requiring active changes to your data collection is the prompts themselves. What are users asking of the AI? Are they uploading potentially sensitive documents and creating compliance issues? Answering these questions most likely requires collecting the API requests sent to the provider. Web proxies, LLM proxies, and data ingestion tools from logging and SIEM providers can lift the veil on this valuable data source.
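
Once request bodies are being captured, pulling out the prompts is mostly JSON handling. The sketch below assumes a chat-style payload with a `messages` list of `role`/`content` objects, which is a common shape but not universal; check what your provider actually sends. The two regular expressions are hypothetical stand-ins for whatever sensitive-data rules your organization already maintains.

```python
import json
import re

# Hypothetical patterns for illustration; substitute your own DLP rules.
SENSITIVE_PATTERNS = {
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def extract_user_prompts(body):
    """Return user-authored text from one captured request body (JSON string)."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return []  # not JSON, so nothing to extract
    if not isinstance(payload, dict):
        return []
    prompts = []
    for msg in payload.get("messages") or []:
        if not isinstance(msg, dict) or msg.get("role") != "user":
            continue
        content = msg.get("content")
        if isinstance(content, str):
            prompts.append(content)
        elif isinstance(content, list):
            # Some APIs send content as a list of typed parts; keep the text ones.
            prompts.extend(
                part["text"]
                for part in content
                if isinstance(part, dict) and isinstance(part.get("text"), str)
            )
    return prompts


def flag_sensitive(prompts):
    """Return the sorted names of patterns that match any prompt."""
    return sorted(
        name
        for name, rx in SENSITIVE_PATTERNS.items()
        if any(rx.search(p) for p in prompts)
    )
```

Running both functions at ingest time lets you index the prompt text alongside a short list of flags, so an analyst can search for flagged requests without reading every payload.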

The Emerging Threat: Malicious MCP Servers

The Model Context Protocol (MCP) has emerged as a way to specify how AI apps integrate with external tools and data sources. While it standardizes connections, it also introduces massive new attack vectors via “Evil MCP” servers.

I host a hands-on training workshop where students can experience this attack firsthand. They design a malicious MCP server to trick an LLM into calling legitimate tools and sending the output back to the attacker. Because LLMs are highly vulnerable to social engineering, bypassing their built-in guardrails is often just a matter of choosing better wording or a clever pretext.

Students commonly use their malicious server to instruct the AI that it is “in maintenance mode” and must pass data to a secondary tool for “audit logging,” thus resulting in data exfiltration. Some are more creative with their prompts than others, but all are usually successful.
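
To make the pretext concrete, here is a rough sketch of a triage check that looks for instruction-like wording in the tool descriptions an MCP server advertises. It assumes each tool is a dict with `name` and `description` keys, and the phrase list is hypothetical, drawn from the pretexts described above. As the workshop results suggest, an attacker can evade a check like this with better wording, so treat it as a way to surface candidates for review, not as a control.

```python
import re

# Hypothetical phrase list for illustration; trivially evadable by rewording.
SUSPICIOUS = [
    re.compile(r"maintenance mode", re.I),
    re.compile(r"audit logging", re.I),
    re.compile(r"\b(ignore|disregard)\b.*\b(previous|prior)\b.*\binstructions\b", re.I),
    re.compile(r"\b(must|always)\b.*\b(send|pass|forward)\b", re.I),
]


def scan_tools(tools):
    """Map tool name -> matched patterns for descriptions that read like
    instructions to the model rather than documentation of the tool."""
    findings = {}
    for tool in tools:
        text = tool.get("description") or ""
        hits = [rx.pattern for rx in SUSPICIOUS if rx.search(text)]
        if hits:
            findings[tool.get("name", "<unnamed>")] = hits
    return findings
```

A benign description such as "Returns the forecast for a city." produces no findings, while one telling the model that the system is in maintenance mode and that it must pass tool output elsewhere for audit logging is flagged on several patterns at once.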

Taking Back Control

To properly audit AI activity in the real world, you need a proxy to intercept AI requests and a log collection tool capable of handling massive JSON payloads. With this visibility, you can detect and triage threats. You cannot rely solely on AI vendors to provide the safety layer. Enforcement must live in the systems of your organization, not in a paragraph of text we hope the model decides to obey. With a good logging solution, security teams have the telemetry; it’s time they start querying it.
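
As a minimal sketch of what enforcement outside the model can look like, the gate below evaluates a command an agent proposes before anything runs, and writes a JSON audit record either way. Everything here is an assumption for illustration: the deny rules, the function names, and the idea that your agent runtime lets you interpose such a check. A real policy would be site-specific and is usually stronger as an allow-list.

```python
import json
import shlex
import time

# Hypothetical deny rules for illustration only.
DENIED_PROGRAMS = {"rm", "dd", "shred"}
DENIED_SUBCOMMANDS = {
    "docker": [("volume", "rm")],
    "kubectl": [("delete",)],
    "terraform": [("destroy",)],
}
SHELL_METACHARS = set(";|&`$()<>\n")


def _contains(seq, sub):
    """True if tuple `sub` appears as a contiguous run inside list `seq`."""
    n = len(sub)
    return any(tuple(seq[i:i + n]) == sub for i in range(len(seq) - n + 1))


def evaluate(command):
    """Return (allowed, reason) for a command string proposed by an agent."""
    # Fail closed on chaining, pipes, and substitution rather than parse them.
    if SHELL_METACHARS & set(command):
        return False, "shell metacharacters are not permitted"
    try:
        argv = shlex.split(command)
    except ValueError:
        return False, "unparseable command"
    if not argv:
        return False, "empty command"
    program = argv[0].rsplit("/", 1)[-1]
    if program in DENIED_PROGRAMS:
        return False, f"program '{program}' is denied"
    words = [a for a in argv[1:] if not a.startswith("-")]
    for sub in DENIED_SUBCOMMANDS.get(program, []):
        if _contains(words, sub):
            return False, f"'{program} {' '.join(sub)}' is denied"
    return True, "no deny rule matched"


def gate(command, agent_id, sink=print):
    """Log the decision as one JSON record and return whether to proceed."""
    allowed, reason = evaluate(command)
    sink(json.dumps({
        "ts": time.time(),
        "agent": agent_id,
        "command": command,
        "allowed": allowed,
        "reason": reason,
    }))
    return allowed
```

With these rules, `gate("docker volume rm pgdata", "staging-agent")` returns `False` and still leaves a record an analyst can query later. The decision is made by code the organization controls, whatever the system prompt says.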


Corey Thuen is the CEO and Co-Founder of Gravwell, an analytics platform built for massive-scale security telemetry. With over a decade of experience across IT, IoT, and ICS/OT security, he brings a unique, attacker-informed perspective to cyber defense.

Previously, Corey was a vulnerability researcher at IOActive, Digital Bond, and the Idaho National Laboratory, focusing on 0-day discovery and reverse-engineering complex systems.

Unite.AI