Unchecked LLMs and the Healthcare Compliance Conundrum

Across industries, generative AI (GenAI) has achieved rapid breakthroughs in a relatively short period of time. These advancements are driven by foundation models, which The California Report on Frontier AI Policy defines as “a class of general-purpose technologies that are resource-intensive to produce, requiring significant amounts of data and compute to yield capabilities that can power a variety of downstream AI applications.”
These general-purpose large language models (LLMs), such as Gemini and ChatGPT, are increasingly able to match, and in some cases exceed, human cognitive capabilities in areas such as data analysis, writing, and reasoning. In healthcare specifically, GenAI adoption is rising as clinicians and other healthcare professionals look to the technology to reduce administrative burden, accelerate operations, and even support clinical decision-making.
However, while the technology offers great promise, GenAI adoption in healthcare raises key compliance risks if it is not implemented and used responsibly. In particular, the use of general-purpose LLMs comes with specific compliance concerns that healthcare organizations must fully understand to prevent privacy or security breaches. These models may rely on unverified data sources, leverage patient health information in unauthorized ways, or perpetuate bias and inaccurate information.
To uphold patient data privacy, remain in compliance with evolving regulations, and minimize costly risks, healthcare leaders must take a decisive approach to defuse the ticking compliance “time bomb” of “unchecked” LLM use.
The Current State of General-Purpose LLM Use in Healthcare
Across healthcare, staff are increasingly leveraging LLMs to support everyday tasks, from administrative work to patient communication. Multimodal LLMs further expand these applications with their ability to process text, images, and audio. Beyond administrative support, providers are also turning to the technology for clinical tasks, not just clerical work.
These models are already demonstrating what some may view as impressive results, with several studies showing that LLM performance meets or even exceeds human capabilities in specific areas. For example, the GPT-4 model passed the United States Medical Licensing Examination with an overall score of 86.7%.
Hybrid AI is another emerging approach to GenAI use in healthcare that combines machine learning (ML) and LLMs to handle complex analysis and translate results into plain language. By combining the two, this approach seeks to overcome the shortcomings of LLMs, including hallucinations, inaccuracies, and bias, while playing to the strengths of each technology. Agentic AI is also rising in adoption for its ability to automate key tasks without human input, such as responding to patient messages or scheduling appointments.
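To make the hybrid pattern more concrete, here is a minimal sketch, assuming scikit-learn for the ML step and a placeholder call_llm function standing in for whatever approved LLM endpoint an organization uses. The toy features, data, and function names are illustrative assumptions, not a reference to any real product or dataset.

```python
# Minimal sketch of a hybrid ML + LLM pipeline (illustrative only).
# Assumes scikit-learn for the deterministic ML step; `call_llm` is a
# placeholder for an organization-approved, governed LLM client.

import numpy as np
from sklearn.linear_model import LogisticRegression

def call_llm(prompt: str) -> str:
    """Placeholder for an approved LLM endpoint (not implemented here)."""
    raise NotImplementedError

# 1) ML step: a classifier trained on structured, de-identified data.
#    Toy features: age, smoker flag, systolic blood pressure.
X_train = np.array([[62, 1, 140], [45, 0, 118], [70, 1, 155], [33, 0, 110]])
y_train = np.array([1, 0, 1, 0])  # 1 = elevated risk
model = LogisticRegression().fit(X_train, y_train)

def explain_risk(features: list) -> str:
    risk = model.predict_proba([features])[0][1]
    # 2) LLM step: rephrase the structured result in plain language only.
    #    The LLM never sees identifiers and never makes the clinical call.
    prompt = (
        "Rewrite this screening result in plain, neutral language for a patient, "
        f"without adding new medical claims: estimated risk score = {risk:.2f}."
    )
    return call_llm(prompt)
```

Once call_llm is wired to an approved model, a call such as explain_risk([58, 1, 150]) would return a plain-language summary while the risk estimate itself remains the product of the deterministic ML step.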
However, the potential AI holds also highlights a pressing need for more proactive governance. The more embedded these tools become in healthcare operations, the higher the stakes for ensuring accuracy, safety, and compliance.
The Compliance Risks of General-Purpose LLMs in Healthcare
While digital adoption in healthcare has unlocked a plethora of new possibilities, it has also exposed key vulnerabilities. Between November 1, 2023, and October 31, 2024, for example, the healthcare sector experienced 1,710 security incidents, 1,542 of which involved confirmed data disclosures.
The AI age deepens these cracks, adding a new layer of complexity to data privacy and security. More specifically, the use of general-purpose LLMs in healthcare raises several key compliance risks:
Risk #1: Opaque-box development prevents continuous monitoring or verification
Closed models lack transparency about their development process, such as what specific data the model was trained on or how updates are made. This opacity prevents developers and researchers from digging into the model to determine the origin of safety risks or discern decision-making processes. As a result, closed LLMs can enable the use of unverified medical data sources and allow safety vulnerabilities to go unchecked.
Risk #2: Patient data leakage
LLMs do not always rely on deidentified patient data. Specialized prompts or interactions could inadvertently reveal identifiable health information, creating potential HIPAA violations.
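As a simple illustration of the kind of safeguard this risk calls for, the sketch below scrubs a few obvious identifiers from a prompt before it leaves the organization. The patterns are assumptions for illustration only; a production deployment would rely on a vetted de-identification service covering every HIPAA identifier category, including names, which a regex pass like this does not catch.

```python
import re

# Illustrative redaction of a few obvious identifiers before a prompt is sent
# to an external LLM. Not a complete de-identification solution.
REDACTION_PATTERNS = {
    "MRN":   re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "DOB":   re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}

def scrub_prompt(text: str) -> str:
    """Replace obvious identifiers with tags before the text reaches an LLM."""
    for label, pattern in REDACTION_PATTERNS.items():
        text = pattern.sub(f"[{label} REMOVED]", text)
    return text

print(scrub_prompt("Summarize the note for John Doe, DOB 04/12/1969, MRN: 00482913, call 555-234-9876."))
# Output: "Summarize the note for John Doe, DOB [DOB REMOVED], [MRN REMOVED], call [PHONE REMOVED]."
# Note that the patient's name still leaks, which is exactly why simple pattern
# matching is not sufficient on its own.
```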
Risk #3: Perpetuation of bias and inaccurate information
In one experiment, researchers injected a small percentage of incorrect facts into one category of a biomedical model’s knowledge base while preserving its behavior in all other domains. They found that the misinformation propagated across the model’s output, highlighting LLMs’ vulnerability to misinformation attacks.
Any defects in a foundation model are inherited by every downstream model and application built on it. Disparities in outputs, such as inaccurate advice for underrepresented groups, may also worsen health inequities.
Risk #4: Regulatory misalignment
The use of general-purpose LLMs may not comply with HIPAA, GDPR, or evolving AI-specific regulations, especially if vendors cannot validate training data. These risks are compounded by healthcare organization employees using unapproved or unmonitored AI tools, known as shadow AI. According to IBM, 20% of surveyed organizations across all sectors suffered a breach due to security incidents involving shadow AI.
Ultimately, general-purpose LLM risks in healthcare have real-world implications, including legal action, reputational damage, loss of patient trust, and litigation costs.
Best Practices: LLM Guidelines and Considerations
To responsibly adopt GenAI, healthcare leaders must establish clear guardrails that protect patients and organizations alike. The following best practices can help healthcare organizations set a foundation for responsible, compliant AI use:
Best Practice #1: Choose AI Tech Wisely
Require clarity from vendors on how AI technology is developed and what data sources are used in the development process. Prioritize tools that leverage only expert-validated medical content, have transparent decision-making processes, and avoid training models on patient health information.
Best Practice #2: Build Human-in-the-Loop Safeguards
Ensure clinicians review any AI-generated outputs that could impact care decisions. AI can be a powerful tool, but in an industry that has a direct impact on patient lives, clinical oversight is key to ensuring responsible use and the accuracy of any AI-assisted information.
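One way to make that oversight structural rather than optional is to gate AI output behind an explicit approval step. The sketch below is a hypothetical example of such a gate; the class and field names are illustrative, not drawn from any particular system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class ReviewStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"

@dataclass
class AIDraft:
    """An AI-generated draft that cannot reach a patient until a clinician signs off."""
    draft_text: str
    reviewer: Optional[str] = None
    status: ReviewStatus = ReviewStatus.PENDING

    def approve(self, clinician_id: str, edited_text: Optional[str] = None) -> None:
        """Record the reviewing clinician and any edits they made."""
        self.reviewer = clinician_id
        self.draft_text = edited_text or self.draft_text
        self.status = ReviewStatus.APPROVED

def release(draft: AIDraft) -> str:
    # The send path accepts only approved drafts, so clinician review is
    # enforced by the system rather than left to individual discretion.
    if draft.status is not ReviewStatus.APPROVED:
        raise PermissionError("AI-generated content requires clinician approval before release.")
    return draft.draft_text
```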
Best Practice #3: Training & Workforce Readiness
Educate clinicians and staff on both the benefits and risks of AI use to reduce shadow AI adoption. Healthcare staff are navigating a complex work environment, strained by staffing shortages and high rates of burnout. Simplifying the AI education process helps ensure compliance without adding further burden to their workload.
Best Practice #4: Establish a Culture of Governance
Integrate third-party evaluations of AI solutions to verify safety, reliability, and compliance. In tandem, implement a clear, organization-wide framework for AI oversight that defines approval, usage, and monitoring to further enhance trust in the technology and prevent staff from turning to unauthorized tools.
Best Practice #5: Align with Leadership on AI Stewardship
Collaborate with leadership to stay ahead of evolving regulations, as well as guidance from the FDA and ONC. Regulatory efforts are also emerging at the state level: California’s Transparency in Frontier AI Act emphasizes risk disclosure, transparency, and mitigation, especially in healthcare settings, while the Colorado Artificial Intelligence Act (CAIA) is designed to prevent algorithmic discrimination.
Best Practice #6: Continuous Monitoring & Feedback Loops
The use of AI within a healthcare setting should never be approached with the “set it and forget it” mindset. Setting up a framework for ongoing monitoring can help ensure accuracy of AI tools, strengthen accountability, and maintain compliance over time.
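In practice, ongoing monitoring can start with something as simple as structured logging plus a sampling rule that routes a fraction of outputs to human review. The sketch below illustrates the idea; the field names, sample rate, and file-based log are assumptions, not a prescribed design.

```python
import json
import random
import time

# Illustrative audit log for AI outputs. A fraction of interactions is flagged
# for periodic human review; the rate and storage backend are placeholders.
AUDIT_SAMPLE_RATE = 0.10

def log_ai_interaction(tool: str, prompt_hash: str, output: str,
                       log_path: str = "ai_audit.jsonl") -> bool:
    """Append a structured record and flag a sample of outputs for review."""
    flagged = random.random() < AUDIT_SAMPLE_RATE
    record = {
        "timestamp": time.time(),
        "tool": tool,
        "prompt_hash": prompt_hash,   # store a hash only, so the log holds no PHI
        "output_chars": len(output),
        "flagged_for_review": flagged,
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return flagged
```

Reviewing the flagged records on a regular cadence, and feeding the findings back into tool selection and staff training, is what turns a log file into a genuine feedback loop.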
Best Practice #7: Pursue Partnerships to Optimize Oversight and Research
Healthcare organizations should leverage partnerships with regulators and the public sector to maximize oversight, contribute their industry perspective to safety standards, and combine expert resources.
Building Trust Through Compliance Leadership
The differentiation of AI solutions in healthcare will increasingly depend on the quality of their expert content, the integrity of their evaluation processes, and responsible integration into clinical workflows. The next phase of AI adoption will hinge less on code and more on compliance leadership.
Trust is as critical as compliance itself. For the technology to truly be effective, patients and providers must believe AI is safe and aligned with high-quality, ethical care. Compliance leadership is a strategic advantage, not just a defensive measure. Forward-looking organizations that establish guardrails early on, before harmful incidents occur, will differentiate themselves in an AI-powered healthcare future.