Most Healthcare Organizations Can't Answer These 4 Questions About AI
AI Governance · 6 min read


How to assess AI readiness in healthcare beyond checklists. Four questions that measure whether your organization can identify PHI in AI prompts, enforce controls, detect failures, and produce an audit trail.

Three Gates Team

A lot of healthcare organizations will say they are working on AI readiness.

Far fewer can explain what they have actually measured.

The gap matters more than most people realize. Because AI readiness is not a policy PDF. It is not a vendor review checklist. It is not one annual training module with a slide about responsible use.

It is whether your organization can recognize risk in the moment, apply controls before data leaves, respond when those controls fail, and reconstruct what happened when accountability requires it.

If you want to know whether an organization is actually ready for AI use in a regulated environment, four questions get you much closer than the usual talking points.


Question 1: Can Your Staff Identify PHI in an AI Prompt?

Most people in healthcare know the basics. They know the usual identifiers. They know they should not paste patient information into the wrong system. But the real test is what happens when someone sees a dense paragraph of clinical text, is in a hurry, and wants help rewriting, summarizing, or organizing it.

A patient name is obvious. A medical record number embedded in a longer note is less obvious. And here is the part that trips up even experienced compliance professionals: in a clinical note tied to an identifiable patient, the diagnoses and medications are also PHI. Not just the names and dates. Most organizations overestimate their staff on this point because they assume training produced recognition. They have not actually measured whether staff can identify risk in context, including the clinical content most people dismiss as "just medical terminology."
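
To make that gap concrete, here is a minimal Python sketch of the kind of pattern-only scanner many organizations implicitly rely on. The patterns and the sample note are illustrative assumptions, not from any real system; the point is what a naive scanner flags and what it silently passes.

```python
import re

# Hypothetical pattern-only scanner: catches obvious identifiers,
# knows nothing about clinical context.
NAIVE_PATTERNS = {
    "mrn": re.compile(r"\bMRN[:\s#]*\d{6,10}\b", re.IGNORECASE),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def naive_scan(prompt: str) -> list[str]:
    """Return the identifier types a pattern-only scanner would flag."""
    return [name for name, pat in NAIVE_PATTERNS.items() if pat.search(prompt)]

note = (
    "Pt seen 3/14/24, MRN: 4821973. A1c 9.2, started metformin 500mg BID "
    "for type 2 diabetes. Please summarize for the referral letter."
)

print(naive_scan(note))  # ['mrn', 'date']
# The diagnosis and medication, which are also PHI in a note tied to an
# identifiable patient, go completely undetected.
```

The same blind spot exists in people: staff trained on identifier lists will catch the MRN and miss the rest, which is exactly why recognition has to be measured in context rather than assumed.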


Question 2: Can You Describe What Happens Between an Employee Typing a Prompt and the AI Receiving It?

For many organizations, the honest answer is: not much.

The user types. The user sends. The data leaves.

The industry is beginning to converge on a pattern for solving this. Some people call it an AI control plane. Not a compliant chatbot. Not a wrapper around a single vendor's API. An independent enforcement layer that classifies what is in the data, evaluates whether the user is authorized to send it, and determines where it is safe to go. It works regardless of which AI tool the employee is using.

A stronger answer to this question explains four things:

  • Whether classification runs on the content before it leaves the environment, and whether that classification understands clinical context rather than just matching patterns
  • What policies determine whether data should be allowed, transformed, or blocked, and whether those policies account for the user, the data type, and the destination
  • What routing logic determines which AI provider receives the request based on the sensitivity of what was found
  • Where the audit record lives and whether it captures the policy decision, not just the event
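
To give that path a concrete shape, here is a minimal sketch of what such a decision pipeline could look like. This is an illustration under our own assumptions, not Three Gates' actual implementation; every name in it (`classify`, `evaluate`, `Decision`, the destination labels) is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class Action(Enum):
    ALLOW = "allow"
    TRANSFORM = "transform"  # e.g. redact identifiers before sending
    BLOCK = "block"

@dataclass
class Decision:
    action: Action
    destination: str | None  # which AI provider receives the request, if any
    rationale: str           # the policy decision, recorded, not just the event

def classify(prompt: str) -> set[str]:
    """Stand-in for context-aware classification. A real classifier would
    recognize clinical content (diagnoses, medications) tied to an
    identifiable patient, not just match identifier patterns."""
    labels: set[str] = set()
    if "MRN" in prompt:
        labels.add("identifier")
    if "diabetes" in prompt or "metformin" in prompt:
        labels.add("clinical_content")
    return labels

def evaluate(user_role: str, labels: set[str], destination: str) -> Decision:
    """Policy accounts for the user, the data type, and the destination."""
    if "identifier" in labels and destination == "public_chatbot":
        return Decision(Action.BLOCK, None,
                        "identifiers may not leave for a non-covered destination")
    if "clinical_content" in labels:
        return Decision(Action.TRANSFORM, "baa_covered_provider",
                        "clinical content redacted, rerouted to covered provider")
    return Decision(Action.ALLOW, destination, "no sensitive content detected")

decision = evaluate("care_coordinator",
                    classify("Summarize: MRN 4821973, started metformin"),
                    "public_chatbot")
print(decision.action, "-", decision.rationale)
```

The structural point is that the decision happens before the data leaves, independent of which tool the employee picked, and the rationale is captured at the moment of the decision.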

If that path cannot be explained clearly, then the organization does not really have a control. It has an assumption. And if the answer is "we bought a HIPAA-compliant AI tool," that is only the third control, a compliant destination, without the two that belong in front of it: classification and authorization.


Question 3: If Your Primary Detection Missed Something Yesterday, How Would You Know?

This is where a lot of AI governance conversations get unrealistic. People talk as if the goal is perfect detection, perfect classification, perfect prevention. Mature control environments do not work that way. They assume misses will happen and plan accordingly.

Some organizations still have no detection layer at all. Some have one, but no meaningful fallback. Some have multiple layers, but no way to verify whether those layers are performing.

The most interesting architectural pattern emerging in this space is self-correcting enforcement. The idea is straightforward: run a fast primary detection, conditionally verify with an independent method, track disagreements between the two, and automatically escalate when disagreement rates cross a threshold. It is not enough to have layers. The layers need to audit each other.
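A rough sketch of that pattern follows. The sampling rate, the threshold, the fifty-check warm-up, and the class names are all illustrative assumptions, not a reference implementation.

```python
import random

class SelfCorrectingDetector:
    """Sketch of self-correcting enforcement: a fast primary detector,
    an independent verifier run on a sample of traffic, a disagreement
    counter, and automatic escalation past a threshold."""

    def __init__(self, primary, verifier, sample_rate=0.1, threshold=0.05):
        self.primary = primary          # fast, runs on every prompt
        self.verifier = verifier        # slower, independent method
        self.sample_rate = sample_rate  # fraction of traffic double-checked
        self.threshold = threshold      # disagreement rate that triggers escalation
        self.checked = 0
        self.disagreements = 0

    def detect(self, prompt: str) -> bool:
        flagged = self.primary(prompt)
        if random.random() < self.sample_rate:    # conditional verification
            self.checked += 1
            if self.verifier(prompt) != flagged:  # the layers audit each other
                self.disagreements += 1
                if (self.checked >= 50 and
                        self.disagreements / self.checked > self.threshold):
                    self.escalate()
        return flagged

    def escalate(self):
        # In practice: alert the security team, raise the sampling rate,
        # or fail closed until the primary detector is retuned.
        print(f"escalating: {self.disagreements}/{self.checked} disagreements")

# Illustrative wiring: a cheap pattern check audited by a broader method.
detector = SelfCorrectingDetector(
    primary=lambda p: "MRN" in p,                       # fast pattern check
    verifier=lambda p: "MRN" in p or "metformin" in p,  # independent, context-aware
)
```

The design choice that matters is not the specific threshold. It is that disagreement between layers is measured at all, which is what turns "we have detection" into "we can show detection is working."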

The most mature organizations are not the ones pretending the system never misses. They are the ones that can show detection, verification, and evidence that both are functioning over time.


Question 4: If You Were Audited Tomorrow, Could You Produce an Evidence Trail for AI Interactions from the Last Ninety Days?

Not just logs somewhere. Not just screenshots. An actual trail.

Could you show what was sent, what controls were applied, why a decision was made, and where that record now lives?

This is where even organizations with decent technical tooling often come up short. They may have logs, but the logs are fragmented. One tool records output but not input. Another records activity but not policy decisions. Another keeps data in a format nobody has ever tried to reconstruct under pressure. So on paper there is evidence, but in practice there is no usable evidence trail.
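As a contrast to that fragmentation, here is a sketch of what a single consolidated record could capture per interaction, so the trail can be reconstructed later. The field names, the hash format, and the storage location are assumptions for illustration only.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class AuditRecord:
    """One record per AI interaction, capturing the policy decision,
    not just the event. All field names are hypothetical."""
    timestamp: str
    user_id: str
    prompt_hash: str         # what was sent (hashed or vaulted, never raw PHI)
    labels: list[str]        # what classification found
    action: str              # allow / transform / block
    rationale: str           # why the decision was made
    destination: str | None  # which provider received the request
    record_uri: str          # where this record now lives

record = AuditRecord(
    timestamp=datetime.now(timezone.utc).isoformat(),
    user_id="u-4172",
    prompt_hash="sha256:9f2c",  # illustrative placeholder value
    labels=["identifier", "clinical_content"],
    action="transform",
    rationale="clinical content redacted before routing",
    destination="baa_covered_provider",
    record_uri="s3://audit/2025/06/evt-88341.json",  # illustrative location
)

print(json.dumps(asdict(record), indent=2))  # one queryable trail, not fragments
```

Input, decision, rationale, and destination live in the same record, which is the difference between logs that exist and a trail an auditor can actually follow within ninety days of the event.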


Why These Four Questions Matter

These four questions are useful because they move the conversation out of abstractions. They give you four dimensions you can actually evaluate:

  • Whether staff can identify PHI in context
  • Whether architecture enforces policy before data leaves
  • Whether controls are resilient when detection misses
  • Whether the organization can produce evidence later

Very few healthcare organizations are consistently strong in all four.

Most are stronger than they think in one area, weaker than they realize in another, and almost entirely unmeasured in the rest.

The architectural question is the one that matters most as AI usage expands beyond a single approved tool. When employees are using ChatGPT, Copilot, Claude, Gemini, and whatever launches next quarter, enforcement that lives inside any one of those tools is enforcement that only covers a fraction of your surface area. That is the problem the next post in this series addresses, and it is why the CISO conversation needs to start with architecture rather than vendors.


The Right Next Step

The right next step is usually not another meeting about AI strategy. It is a baseline. Something concrete enough to show where the gaps really are.

We built a free assessment around exactly these dimensions. It uses realistic scenarios instead of trivia questions and gives teams a practical readiness snapshot they can actually use.

Take the AI Readiness Assessment →

This is the second post in a three-part series. The first covers the shadow AI patterns already happening in most healthcare organizations. The third covers how CISOs should be framing AI risk to leadership.


About Three Gates

Three Gates is an AI control plane for regulated industries. Before any AI model sees your data, Three Gates classifies what's sensitive, checks whether the request is authorized, and routes it to a compliant destination. The architecture is vendor-independent and deployment-agnostic. Healthcare is the first vertical; government, legal, and financial verticals are on the roadmap. The goal is not to restrict AI usage, but to make it possible to say yes to AI safely.

Shadow AI Trilogy — Part 2 of 3

Ready to Get Started?

See where your organization stands on AI readiness, or talk to us about what a compliant AI pathway looks like for your team.