AI Platform Case Study

Rafia Rizwan

April 16, 2026

A practical case study on SOC 2 scope for AI platforms, showing how startups define boundaries around models, data, and support access.

AI Startup Case Study • SOC 2 Scope • Models • Data Stores • Support Access

AI Platform Case Study

How a Startup Scoped SOC 2 Around Models, Data Stores, and Support Access

For AI startups, SOC 2 scoping gets complicated fast.
The problem is usually not a lack of systems. The problem is that the environment is rarely simple, and the real service path is wider than the app alone.

An AI startup may have application infrastructure, model pipelines, vector databases, prompt and response logs, cloud storage, customer workspaces, support tooling, admin consoles, third-party model providers, and internal testing environments.

Somewhere in the middle of all that, the team has to answer one deceptively simple question: what exactly is in scope for SOC 2?

This case study shows how one fictional AI startup worked through that question and built a SOC 2 scope that was practical, defensible, and aligned to how its platform actually operated.

Why SOC 2 scoping is harder for AI platforms

Traditional SaaS scoping is already important. AI platforms add another layer of difficulty because the service often depends on a mix of application layers, data pipelines, inference services, model providers, storage systems, human review steps, and support workflows.

That means scoping cannot stop at “our app and cloud environment.”

For AI companies, customers often want to know:

where their data goes
whether prompts or outputs are stored
who can access model-related logs
how support teams interact with customer environments
whether data used by models is separated properly
what happens inside third-party AI services

Simple takeaway:
for an AI platform, SOC 2 scope needs to reflect not only the app, but the real path customer data and operational access take through the service.

Meet the startup

This example is fictional, but the scoping challenges are real. Let’s call the company SignalForge AI.

SignalForge provides an AI workflow platform for enterprise teams. Customers use it to upload internal documents, run question-answering and summarization workflows, search across knowledge bases, generate draft content, and route support or operations requests through AI-assisted workflows.

customer-facing web app

API endpoints and backend services

object storage for uploaded files

vector database for retrieval

prompt and output logging

third-party LLM provider integrations

As enterprise deals grew, SignalForge started hearing the same message over and over: “We need SOC 2.”

But before the audit could begin, the company had to answer a harder internal question: what exactly are we certifying?

The initial problem

At first, leadership made a common assumption: “We’ll scope the app, the production cloud environment, and company IT. That should be enough.”

Once readiness conversations began, several problems surfaced right away.

Problem	Why it mattered
The model workflow was not clearly separate from the app	Customer prompts moved through orchestration, retrieval, third-party LLMs, logging, and support flows.
Data lived in more places than expected	It was not only in the main app database. It also existed in storage, vector systems, logs, traces, and support workflows.
Support access was broader than leadership realized	Support staff could see tenant metadata, logs, issue context, and some debugging outputs tied to customer use.

The company quickly realized that if it scoped too narrowly, the SOC 2 report would not reflect how the platform actually worked.

What the team needed the scope to achieve

SignalForge did not just want a report. It wanted a scope that would stand up to customer scrutiny, reflect real data handling, cover the systems that mattered, avoid unnecessary sprawl, support cleaner evidence collection, and clarify ownership internally.

That meant the scoping exercise had to be practical, not theoretical.

A better way to think about scope

The company stopped asking, “Which tools do we use?”

Instead, it started asking, “Which systems, people, and processes materially affect the security, availability, and confidentiality of the customer-facing service?”

That shift led to three operational pillars:

models and model operations
data stores and data movement
support access and administrative handling

Those three pillars gave the team a much clearer way to decide what belonged in scope.

A strong SOC 2 scope follows service reality

For AI startups, the scope should mirror how the platform really works, how customer data moves, and how people can access it, not just which app screens customers see.

Scope your AI platform more clearly
Talk to a vCISO

1) Scoping around models

One of the first questions was whether “the model” itself was in scope. The answer needed more nuance.

SignalForge was not training foundation models from scratch. It used third-party LLM APIs, internal orchestration logic, prompt templates, guardrails, retrieval configuration, model selection rules, and evaluation workflows.

So the real question became: which parts of the model layer materially affect customer data handling and service trust?

Included in scope

Production orchestration services, prompt construction logic, retrieval components, model provider integrations, routing configuration, and logging tied to inference workflows.

Not automatically included

Isolated R&D sandboxes using synthetic data, local prototypes, and experiments with no production connectivity or customer data path.

This discipline mattered. Without it, the company could have over-scoped every AI-related activity simply because it involved models. Instead, it focused on model-related systems that influenced the live customer service.

2) Scoping around data stores

This became one of the most important parts of the project. At first, leadership thought of “the database” as the main sensitive store. The real data landscape was much broader.

Data store	What it held	Why it mattered
Relational production database	Accounts, tenant records, settings, metadata	Core app operation and customer segregation
Object storage	Uploaded customer documents	Source content for retrieval and workflows
Vector database	Embeddings and retrieval indexes	Key part of AI search and content access
Prompt and output logs	Troubleshooting and service diagnostics	Could contain customer-generated content
Monitoring and trace systems	Operational telemetry and debugging details	Could reveal workflow or access details
Support systems	Case notes, issue summaries, screenshots	Could contain copied customer context

The key realization was that customer information did not only live in “the app database.” It also lived in retrieval context, storage layers supporting AI workflows, operational logs, support workflows, and temporary troubleshooting evidence.

That meant any serious SOC 2 scope had to account for the full lifecycle of how customer data moved and where it rested.

3) Scoping around support access

This became one of the most valuable decisions in the whole project. At the beginning, support access was treated like a secondary operations issue. In practice, support teams had meaningful access to tenant metadata, logs, account-level settings, troubleshooting context, limited impersonation workflows, screenshots, exports, and escalation paths into engineering.

That made support access directly relevant to customer trust.

SignalForge included in scope:

support tooling used for troubleshooting
access approval rules for support roles
impersonation workflows
logging and review of support access
training expectations for handling customer content
engineering escalation procedures
periodic review of support permissions
separation between standard support and higher-risk admin capability

This made the company much better prepared for questions like: Can support staff read customer content? Do you log support access? Can support impersonate users? How do you control troubleshooting access? What prevents broad support visibility across tenants?

The startup’s final scope

After working through the environment carefully, SignalForge defined a much more defensible boundary.

In scope

production application platform and APIs
cloud infrastructure supporting the live service
production databases and object storage
vector database environments used in production
production model orchestration and inference routing
logging and monitoring systems tied to security and customer-impacting troubleshooting
support administration and issue-handling systems with access to customer context
identity and access management systems
employee endpoints and internal systems relevant to supporting the service
change management and deployment workflows for in-scope services

Explicitly out of scope

isolated experimental environments with no customer data path
personal prototypes and local-only R&D work
synthetic-data-only exploratory testing environments
marketing websites not connected to the customer service environment
non-production experiments that cannot affect customer workflows

What the startup gained from better scoping

Once scope was defined properly, several things became easier.

Customer trust conversations improved

The team could explain the environment more clearly and answer diligence questions with confidence.

Evidence gathering became more realistic

The company could focus on systems that truly supported the live service.

Ownership got clearer

Engineering, support, platform, and security teams could see their responsibilities more clearly.

Risk discussions got sharper

The company could ask better questions about prompt logs, support escalation, vector stores, and provider dependencies.

Lessons from the case study

Do not scope only the front-end app. For AI platforms, trust depends on what happens behind the interface.
Treat data stores broadly. Customer information may live in logs, embeddings, support workflows, and storage layers that are easy to overlook.
Support access is often more material than expected. It can create major trust implications, especially where prompts, outputs, or uploaded files are visible.
Not every model-related activity belongs in scope. Scope production-impacting model operations, not every experiment with “AI” attached to it.
Scope should follow the service reality. A good SOC 2 scope reflects how the platform truly works, how people access it, and how customer data moves through it.

Canadian Cyber’s take

AI startups often begin SOC 2 scoping like a traditional SaaS exercise and then run into trouble when customer diligence gets more specific. That usually happens when important areas are under-scoped, especially model operations, vector and log storage, support access, and third-party AI provider dependencies.

The strongest AI-platform scopes usually avoid both extremes. They are not so narrow that real trust boundaries are ignored, and they are not so broad that every experiment becomes audit overhead.

Instead, they focus on the systems, data paths, and operational roles that materially affect the live customer service. That is what makes the scope both practical and credible.

If your AI startup is defining SOC 2 scope right now

Canadian Cyber helps teams scope AI platforms in a way that reflects real model operations, data flows, support access, and customer trust boundaries, without creating unnecessary audit sprawl.

Get help scoping your SOC 2
Talk to a vCISO

Takeaway

For AI startups, SOC 2 scoping should not stop at the application layer. If your service depends on models, multiple data stores, and human support workflows, the scope needs to reflect that reality.

This case study shows that a cleaner scoping approach starts with three questions: which model-related systems materially affect the live service, where does customer data actually live and move, and who can access that data through support, admin, or troubleshooting workflows?

When those questions are answered clearly, the result is a SOC 2 scope that is more useful to auditors, customers, and internal teams alike. Because in the end, a good scope is not just about passing the audit. It is about proving that your control environment matches how your platform really works.

Follow Canadian Cyber

Practical cybersecurity and compliance guidance:

Website
LinkedIn
Instagram
Facebook
YouTube

2026

May

SOC 2 Evidence Gaps That Delay Reports

A practical guide to SOC 2 evidence gaps, covering access reviews, vendor records, approvals, restore testing, and audit-ready documentation.

A practical guide to SharePoint permission issues, covering access governance, external sharing, stale accounts, and audit readiness.

0 Comment

Rafia Rizwan