
AI Platform Case Study

A practical case study on SOC 2 scope for AI platforms, showing how startups define boundaries around models, data, and support access.



How a Startup Scoped SOC 2 Around Models, Data Stores, and Support Access
For AI startups, SOC 2 scoping gets complicated fast.
The problem is usually not a lack of systems. The problem is that the environment is rarely simple, and the real service path is wider than the app alone.

An AI startup may have application infrastructure, model pipelines, vector databases, prompt and response logs, cloud storage, customer workspaces, support tooling, admin consoles, third-party model providers, and internal testing environments.

Somewhere in the middle of all that, the team has to answer one deceptively simple question: what exactly is in scope for SOC 2?

This case study shows how one fictional AI startup worked through that question and built a SOC 2 scope that was practical, defensible, and aligned to how its platform actually operated.

Why SOC 2 scoping is harder for AI platforms

Traditional SaaS scoping is already important. AI platforms add another layer of difficulty because the service often depends on a mix of application layers, data pipelines, inference services, model providers, storage systems, human review steps, and support workflows.

That means scoping cannot stop at “our app and cloud environment.”

For AI companies, customers often want to know:
  • where their data goes
  • whether prompts or outputs are stored
  • who can access model-related logs
  • how support teams interact with customer environments
  • whether data used by models is separated properly
  • what happens inside third-party AI services
Simple takeaway: for an AI platform, SOC 2 scope needs to reflect not only the app, but the real path customer data and operational access take through the service.

Meet the startup

This example is fictional, but the scoping challenges are real. Let’s call the company SignalForge AI.

SignalForge provides an AI workflow platform for enterprise teams. Customers use it to upload internal documents, run question-answering and summarization workflows, search across knowledge bases, generate draft content, and route support or operations requests through AI-assisted workflows.

Under the hood, the platform runs on:
  • a customer-facing web app
  • API endpoints and backend services
  • object storage for uploaded files
  • a vector database for retrieval
  • prompt and output logging
  • third-party LLM provider integrations
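As a rough illustration of why every one of these components ends up mattering for scope, the path a single customer request takes can be sketched as an ordered trace. All names here are hypothetical, not SignalForge's real architecture:

```python
# Hypothetical sketch of the systems a customer request touches at an
# AI workflow platform like SignalForge. Component names are illustrative.

def systems_touched_by(request_kind: str) -> list[str]:
    """Return the components a request flows through.

    Every path here carries customer data, which is why each component
    ends up inside the SOC 2 boundary.
    """
    if request_kind == "upload":
        # Uploads land in object storage and are indexed into the vector DB.
        return ["web_app", "api_backend", "object_storage", "vector_db"]
    # Question-answering / summarization workflows also hit the
    # third-party model provider and the prompt/output logs.
    return ["web_app", "api_backend", "vector_db", "llm_provider", "prompt_logs"]

print(systems_touched_by("upload"))
print(systems_touched_by("question_answering"))
```

Even this toy trace makes the scoping point: no request type stays inside "the app"; each one fans out across storage, retrieval, logging, and provider layers.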

As enterprise deals grew, SignalForge started hearing the same message over and over: “We need SOC 2.”

But before the audit could begin, the company had to answer a harder internal question: what exactly are we certifying?

The initial problem

At first, leadership made a common assumption: “We’ll scope the app, the production cloud environment, and company IT. That should be enough.”

Once readiness conversations began, several problems surfaced right away.

Each problem, and why it mattered:
  • The model workflow was not clearly separate from the app. Customer prompts moved through orchestration, retrieval, third-party LLMs, logging, and support flows.
  • Data lived in more places than expected. It was not only in the main app database; it also existed in storage, vector systems, logs, traces, and support workflows.
  • Support access was broader than leadership realized. Support staff could see tenant metadata, logs, issue context, and some debugging outputs tied to customer use.

The company quickly realized that if it scoped too narrowly, the SOC 2 report would not reflect how the platform actually worked.

What the team needed the scope to achieve

SignalForge did not just want a report. It wanted a scope that would stand up to customer scrutiny, reflect real data handling, cover the systems that mattered, avoid unnecessary sprawl, support cleaner evidence collection, and clarify ownership internally.

That meant the scoping exercise had to be practical, not theoretical.

A better way to think about scope

The company stopped asking, “Which tools do we use?”

Instead, it started asking, “Which systems, people, and processes materially affect the security, availability, and confidentiality of the customer-facing service?”

That shift led to three operational pillars:
  • models and model operations
  • data stores and data movement
  • support access and administrative handling

Those three pillars gave the team a much clearer way to decide what belonged in scope.

A strong SOC 2 scope follows service reality
For AI startups, the scope should mirror how the platform really works, how customer data moves, and how people can access it, not just which app screens customers see.

1) Scoping around models

One of the first questions was whether “the model” itself was in scope. The answer needed more nuance.

SignalForge was not training foundation models from scratch. It used third-party LLM APIs, internal orchestration logic, prompt templates, guardrails, retrieval configuration, model selection rules, and evaluation workflows.

So the real question became: which parts of the model layer materially affect customer data handling and service trust?

Included in scope
Production orchestration services, prompt construction logic, retrieval components, model provider integrations, routing configuration, and logging tied to inference workflows.
Not automatically included
Isolated R&D sandboxes using synthetic data, local prototypes, and experiments with no production connectivity or customer data path.

This discipline mattered. Without it, the company could have over-scoped every AI-related activity simply because it involved models. Instead, it focused on model-related systems that influenced the live customer service.
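The inclusion rule the team applied to the model layer can be expressed as a simple predicate. This is a sketch of the decision logic described above; the class and field names are illustrative, not a real tool:

```python
from dataclasses import dataclass

@dataclass
class ModelComponent:
    name: str
    in_production: bool           # serves the live customer-facing service
    touches_customer_data: bool   # sits on a path that carries customer data

def in_soc2_scope(component: ModelComponent) -> bool:
    """A model-layer component belongs in scope if it runs in production
    OR sits on a customer data path. Isolated, synthetic-data-only
    experiments fail both tests and stay out."""
    return component.in_production or component.touches_customer_data

orchestrator = ModelComponent("prompt-orchestrator", True, True)
sandbox = ModelComponent("rnd-sandbox-synthetic-data", False, False)

assert in_soc2_scope(orchestrator)
assert not in_soc2_scope(sandbox)
```

The "or" is deliberate: a staging component that replays real customer prompts would still be in scope even though it is not production, which matches the "customer data path" test the team used.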

2) Scoping around data stores

This became one of the most important parts of the project. At first, leadership thought of “the database” as the main sensitive store. The real data landscape was much broader.

Each data store, what it held, and why it mattered:
  • Relational production database: accounts, tenant records, settings, and metadata. Core to app operation and customer segregation.
  • Object storage: uploaded customer documents. Source content for retrieval and workflows.
  • Vector database: embeddings and retrieval indexes. A key part of AI search and content access.
  • Prompt and output logs: troubleshooting and service diagnostics. Could contain customer-generated content.
  • Monitoring and trace systems: operational telemetry and debugging details. Could reveal workflow or access details.
  • Support systems: case notes, issue summaries, and screenshots. Could contain copied customer context.

The key realization was that customer information did not only live in “the app database.” It also lived in retrieval context, storage layers supporting AI workflows, operational logs, support workflows, and temporary troubleshooting evidence.

That meant any serious SOC 2 scope had to account for the full lifecycle of how customer data moved and where it rested.
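One practical way to operationalize that lifecycle view is a data-store inventory where the scoping question is asked per store, not per database. This is a hypothetical sketch; the store names and fields are illustrative:

```python
# Illustrative data-store inventory. The key insight: "kind" does not
# decide scope; the ability to hold customer content does.
DATA_STORES = {
    "relational_db":  {"holds_customer_content": True,  "kind": "primary"},
    "object_storage": {"holds_customer_content": True,  "kind": "primary"},
    "vector_db":      {"holds_customer_content": True,  "kind": "derived"},
    "prompt_logs":    {"holds_customer_content": True,  "kind": "operational"},
    "trace_system":   {"holds_customer_content": True,  "kind": "operational"},
    "support_cases":  {"holds_customer_content": True,  "kind": "workflow"},
    "marketing_cms":  {"holds_customer_content": False, "kind": "external"},
}

def stores_in_scope(inventory: dict) -> list[str]:
    """Any store that can hold customer content belongs in scope,
    regardless of whether it is a 'primary' database."""
    return sorted(name for name, attrs in inventory.items()
                  if attrs["holds_customer_content"])

print(stores_in_scope(DATA_STORES))
```

Run against this inventory, six of the seven stores land in scope; only the marketing CMS, which never touches customer content, stays out.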

3) Scoping around support access

This became one of the most valuable decisions in the whole project. At the beginning, support access was treated like a secondary operations issue. In practice, support teams had meaningful access to tenant metadata, logs, account-level settings, troubleshooting context, limited impersonation workflows, screenshots, exports, and escalation paths into engineering.

That made support access directly relevant to customer trust.

SignalForge included in scope:
  • support tooling used for troubleshooting
  • access approval rules for support roles
  • impersonation workflows
  • logging and review of support access
  • training expectations for handling customer content
  • engineering escalation procedures
  • periodic review of support permissions
  • separation between standard support and higher-risk admin capability
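The control intent behind the impersonation and access-review bullets can be sketched in a few lines: every impersonation session is tied to a ticket, explicitly approved, and logged for periodic review. This is a minimal hypothetical sketch of that pattern, not a real support tool:

```python
import time

# In-memory stand-in for a real, tamper-evident audit log.
AUDIT_LOG: list[dict] = []

def start_impersonation(agent: str, tenant: str,
                        ticket: str, approved_by: str) -> None:
    """Record an approved support impersonation session.

    Control intent: no impersonation without a named approver and a
    ticket, and every session leaves a reviewable log entry.
    """
    if not approved_by:
        raise PermissionError("impersonation requires a named approver")
    if not ticket:
        raise ValueError("impersonation must be tied to a support ticket")
    AUDIT_LOG.append({
        "ts": time.time(),
        "agent": agent,
        "tenant": tenant,
        "ticket": ticket,
        "approved_by": approved_by,
    })
```

A periodic access review then becomes a query over `AUDIT_LOG` rather than an interview exercise, which is exactly the kind of evidence auditors and enterprise customers ask for.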

This made the company much better prepared for questions like: Can support staff read customer content? Do you log support access? Can support impersonate users? How do you control troubleshooting access? What prevents broad support visibility across tenants?

The startup’s final scope

After working through the environment carefully, SignalForge defined a much more defensible boundary.

In scope
  • production application platform and APIs
  • cloud infrastructure supporting the live service
  • production databases and object storage
  • vector database environments used in production
  • production model orchestration and inference routing
  • logging and monitoring systems tied to security and customer-impacting troubleshooting
  • support administration and issue-handling systems with access to customer context
  • identity and access management systems
  • employee endpoints and internal systems relevant to supporting the service
  • change management and deployment workflows for in-scope services
Explicitly out of scope
  • isolated experimental environments with no customer data path
  • personal prototypes and local-only R&D work
  • synthetic-data-only exploratory testing environments
  • marketing websites not connected to the customer service environment
  • non-production experiments that cannot affect customer workflows

What the startup gained from better scoping

Once scope was defined properly, several things became easier.

Customer trust conversations improved
The team could explain the environment more clearly and answer diligence questions with confidence.
Evidence gathering became more realistic
The company could focus on systems that truly supported the live service.
Ownership got clearer
Engineering, support, platform, and security teams could see their responsibilities more clearly.
Risk discussions got sharper
The company could ask better questions about prompt logs, support escalation, vector stores, and provider dependencies.

Lessons from the case study

  • Do not scope only the front-end app. For AI platforms, trust depends on what happens behind the interface.
  • Treat data stores broadly. Customer information may live in logs, embeddings, support workflows, and storage layers that are easy to overlook.
  • Support access is often more material than expected. It can create major trust implications, especially where prompts, outputs, or uploaded files are visible.
  • Not every model-related activity belongs in scope. Scope production-impacting model operations, not every experiment with “AI” attached to it.
  • Scope should follow the service reality. A good SOC 2 scope reflects how the platform truly works, how people access it, and how customer data moves through it.

Canadian Cyber’s take

AI startups often begin SOC 2 scoping like a traditional SaaS exercise and then run into trouble when customer diligence gets more specific. That usually happens when important areas are under-scoped, especially model operations, vector and log storage, support access, and third-party AI provider dependencies.

The strongest AI-platform scopes usually avoid both extremes. They are not so narrow that real trust boundaries are ignored, and they are not so broad that every experiment becomes audit overhead.

Instead, they focus on the systems, data paths, and operational roles that materially affect the live customer service. That is what makes the scope both practical and credible.

If your AI startup is defining SOC 2 scope right now
Canadian Cyber helps teams scope AI platforms in a way that reflects real model operations, data flows, support access, and customer trust boundaries, without creating unnecessary audit sprawl.

Takeaway

For AI startups, SOC 2 scoping should not stop at the application layer. If your service depends on models, multiple data stores, and human support workflows, the scope needs to reflect that reality.

This case study shows that a cleaner scoping approach starts with three questions: which model-related systems materially affect the live service, where does customer data actually live and move, and who can access that data through support, admin, or troubleshooting workflows?

When those questions are answered clearly, the result is a SOC 2 scope that is more useful to auditors, customers, and internal teams alike. Because in the end, a good scope is not just about passing the audit. It is about proving that your control environment matches how your platform really works.
