
Your AI Vendor Says Your Data Is Isolated. Here's How to Verify It.

FERPA compliance doesn't mean your data is isolated. Here are five questions every institution should ask before adopting an AI knowledge platform.


Todd Burner

Founder, Hone Labs

Every AI platform selling to higher education makes the same claim: your data is secure. Most of them add “FERPA compliant” for good measure. And most institutions stop asking questions there.

That's a problem. Because FERPA compliance and actual data isolation are two very different things — and the gap between them is where institutional risk lives.

I've spent over a decade working in privacy and security at scale, including leading the deployment of privacy-preserving contact tracing technology across 28 US states. The pattern I see in higher ed AI adoption right now is one I've seen before: institutions adopting technology faster than they're evaluating it, relying on vendor marketing language instead of architectural verification.

This post is about closing that gap. Not by turning you into a security engineer, but by giving you the right questions.

FERPA Is a Floor, Not a Ceiling

Here's something most institutions don't realize: FERPA does not mandate any specific data isolation architecture. It requires “reasonable methods” to protect student records, but it says nothing about whether your data should live in its own database, share infrastructure with other institutions, or be processed alongside someone else's documents in an AI pipeline.

FERPA compliance is a minimum bar. A vendor can be fully FERPA compliant while storing your institution's strategic plans, accreditation materials, and research documents in the same database tables as every other client — separated only by a column that says which row belongs to whom.

When a vendor says “we're FERPA compliant,” they're telling you they meet a regulatory floor. They're not telling you anything about how your data is actually separated from everyone else's.

The Isolation Spectrum: What “Secure” Actually Means

Not all isolation is created equal. When vendors say “isolated,” they typically mean one of three things:

Shared everything. Your data sits in the same database, same tables, same rows as every other client. A filter in the application code decides who sees what. This is the most common model in SaaS. It's also the most fragile — a single missing filter in a single query exposes everything. In December 2025, one missing WHERE clause in a SaaS platform leaked data across 50 tenants for months before anyone noticed.
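To make the fragility concrete, here is a minimal sketch of the shared-everything model. The schema and names are hypothetical; the point is that a single forgotten filter is the entire isolation boundary.

```python
import sqlite3

# Hypothetical shared-everything schema: every client's rows live in one
# table, separated only by a tenant_id column.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE documents (tenant_id TEXT, title TEXT)")
db.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [("university_a", "Strategic Plan 2026"),
     ("university_b", "Accreditation Self-Study")],
)

def list_documents_correct(tenant_id):
    # Correct: the application-level filter is the ONLY thing
    # enforcing isolation.
    rows = db.execute(
        "SELECT title FROM documents WHERE tenant_id = ?", (tenant_id,)
    ).fetchall()
    return [title for (title,) in rows]

def list_documents_buggy(tenant_id):
    # The bug: one forgotten WHERE clause, and every client's
    # data comes back.
    rows = db.execute("SELECT title FROM documents").fetchall()
    return [title for (title,) in rows]

print(list_documents_correct("university_a"))  # only University A's documents
print(list_documents_buggy("university_a"))    # every client's documents
```

Nothing in the database itself prevents the buggy query from running; the leak is invisible until someone notices the extra rows.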

Logical isolation. Your data gets its own schema or row-level security policies within a shared database. Better. But still one misconfiguration away from exposure. In 2025, a vulnerability in Microsoft's Entra ID (their identity platform for Azure) received a CVSS 10.0 severity rating — the maximum possible — because it allowed cross-tenant privilege escalation. If Microsoft can get multi-tenant isolation wrong, your vendor can too.

Infrastructure isolation. Your data lives in a completely separate database, on separate servers, with separate authentication. A bug in another client's environment literally cannot reach yours because there's no shared infrastructure to traverse.

Most vendors claiming “isolation” mean the first or second model. Few mean the third. The word is the same; the architecture is not.

Why AI Makes This Worse

Traditional SaaS isolation is hard enough. AI adds entirely new categories of risk that most security frameworks haven't caught up to.

When an AI platform processes your documents, it doesn't just store them — it transforms them. Your accreditation self-study becomes vector embeddings. Your strategic plan becomes tokens in a context window. Your institutional knowledge becomes retrievable fragments in a vector database.

Each of these transformations creates a new surface where data can leak:

Embeddings aren't anonymous. Research shows that attackers can recover 50–70% of original text from stolen vector embeddings. If your vendor stores your embeddings alongside other institutions' embeddings in a shared vector database, that's a leakage surface that traditional database security doesn't cover.

Shared AI pipelines mix data. If your vendor's AI processes your documents through the same inference infrastructure as other clients, context window contamination is a real risk. In 2023, a bug in ChatGPT's infrastructure leaked conversations — including payment information — between users. The OWASP Top 10 for LLM Applications now lists “Sensitive Information Disclosure” as a primary concern for exactly this reason.

Retrieval systems can be poisoned. In 2025, researchers demonstrated that Slack's AI feature could be exploited through indirect prompt injection — content in public channels could be crafted to extract data from private channels. If your vendor's retrieval system doesn't enforce strict tenant boundaries at every layer, similar attacks apply.
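One way to reason about that last risk: a retrieval layer should filter by tenant before scoring, not after, so other clients' embeddings never enter the candidate set at all. The sketch below is illustrative (the class and API are invented for this example, not any vendor's actual implementation):

```python
import math

# Hypothetical retrieval layer: every stored embedding carries a tenant
# tag, and the query path refuses to run without one.
class TenantScopedStore:
    def __init__(self):
        self._items = []  # (tenant_id, vector, text)

    def add(self, tenant_id, vector, text):
        self._items.append((tenant_id, vector, text))

    def search(self, tenant_id, query_vector, top_k=3):
        if not tenant_id:
            raise ValueError("tenant_id is required: no cross-tenant queries")

        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm_a = math.sqrt(sum(x * x for x in a))
            norm_b = math.sqrt(sum(x * x for x in b))
            return dot / (norm_a * norm_b)

        # Filter by tenant BEFORE similarity scoring, so other clients'
        # embeddings never even enter the candidate set.
        candidates = [(v, t) for tid, v, t in self._items if tid == tenant_id]
        candidates.sort(key=lambda item: cosine(item[0], query_vector),
                        reverse=True)
        return [text for _, text in candidates[:top_k]]

store = TenantScopedStore()
store.add("university_a", [1.0, 0.0], "Accreditation self-study")
store.add("university_b", [0.9, 0.1], "Another institution's strategy memo")
print(store.search("university_a", [1.0, 0.0]))  # only University A's content
```

A system that scores first and filters afterward can still be tricked by injected content into surfacing the wrong tenant's fragments; enforcing the boundary at the candidate-selection step removes that path entirely.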

The point isn't that AI is inherently unsafe. It's that AI creates security challenges that traditional isolation models weren't designed to handle. And most vendors haven't updated their architecture to match.

Five Questions to Ask Any AI Vendor

You don't need to become a security expert. You need five questions and the willingness to push past marketing language.

1. “Do we get our own database, or do we share one with other clients?”

This is the foundational question. You want infrastructure-level isolation — your own database instance, not a shared one with row-level filters. If the answer involves “logical isolation” or “row-level security policies,” understand that you're one misconfigured query away from exposure.
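For contrast with the shared-everything model, here is what infrastructure-level isolation looks like in miniature. The per-tenant databases are modeled as separate in-memory SQLite instances, and the tenant names are illustrative:

```python
import sqlite3

# Sketch of infrastructure isolation: one database per client. There is
# no shared table that a missing filter could expose.
tenant_dbs = {
    tenant: sqlite3.connect(":memory:")
    for tenant in ("university_a", "university_b")
}
for db in tenant_dbs.values():
    db.execute("CREATE TABLE documents (title TEXT)")

tenant_dbs["university_a"].execute(
    "INSERT INTO documents VALUES ('Strategic Plan 2026')"
)

def list_documents(tenant_id):
    # No WHERE clause needed: the connection itself scopes the data.
    db = tenant_dbs[tenant_id]
    return [title for (title,) in db.execute("SELECT title FROM documents")]

print(list_documents("university_a"))  # ['Strategic Plan 2026']
print(list_documents("university_b"))  # [] (no shared rows to leak)
```

Even a query with no filter at all can only ever return one tenant's rows, because the other tenant's data lives in a different database entirely.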

2. “If another client's environment is compromised, can ours be affected?”

This is the blast radius question. In a truly isolated architecture, the answer is no — there's no shared infrastructure for an attack to traverse. If your vendor can't give you an unqualified “no,” that tells you something.

3. “Where does AI processing happen? Is our data ever mixed with other clients' data in any AI pipeline?”

This is the question most vendors aren't prepared for. Even if they separate your database, they may process your documents through shared AI infrastructure — shared embedding models, shared vector stores, shared inference servers. Ask specifically about each layer.

4. “Can you show me your isolation architecture in a diagram?”

Vendors who have real isolation can explain it simply and visually. Vendors who don't will give you paragraphs of qualifications. This is a transparency test as much as a technical one.

5. “Has your isolation been independently tested?”

Penetration testing, SOC 2 Type II audits, third-party security assessments — these are the mechanisms that verify claims. If your vendor's isolation has never been tested by someone other than the people who built it, you're trusting marketing, not evidence.

The HECVAT (Higher Education Community Vendor Assessment Toolkit) from EDUCAUSE provides a structured framework for these conversations. But the five questions above cut to what matters most.

What Good Looks Like

True isolation for an AI knowledge platform means separation at every layer: separate databases, separate AI processing pipelines, separate storage, separate authentication. It means your vendor's AI never processes your documents alongside another institution's. It means zero-retention agreements with AI providers so your data doesn't persist in third-party systems. And it means the architecture is documented, diagrammed, and independently verifiable.

This is how we built Hone Labs. Not because it's the easiest architecture — it isn't. But because institutional knowledge is irreplaceable. It compounds over time. The strategic documents, accreditation materials, and research your institution builds today become the foundation for decisions years from now. That kind of data deserves more than a WHERE clause.

The 2025 EDUCAUSE AI Landscape Study found that only 9% of institutions say their cybersecurity policies adequately address AI risks. That number should concern anyone evaluating AI platforms for their institution.

The gap between “we take security seriously” and “your data is architecturally isolated” is the gap higher education needs to close before AI adoption accelerates further. You don't need to slow down — you need to ask better questions.


Want to see how Hone Labs approaches tenant isolation? Visit our Trust Center or book a demo to see the architecture firsthand.
