[Technical]

Prompt-level DLP for the AI era

Legacy DLP was built for files, mailboxes, and SaaS APIs. AI moved the sensitive-data channel to an encrypted POST body full of free-text prompts. This paper explains why the old controls miss it, and what a transparent, on-device proxy replaces them with.

Security engineers, DevOps, CISOs·13 min·Architecture

[Key takeaways]

  • Legacy DLP fails on AI for structural reasons, not tuning reasons. The data channel it was designed for no longer carries the risk.
  • AI prompts are encrypted, ephemeral, free-text POST bodies to shared CDNs. Network CASB, endpoint agents, and email/SaaS DLP each miss a different part of that.
  • Regex and fingerprint matching cannot classify natural-language prompts. The unit of sensitivity is meaning, not a pattern.
  • The only reliable enforcement point is where TLS terminates and the prompt is readable: an inline proxy at the prompt layer.
  • A transparent, on-device proxy inspects prompts and responses locally, so control moves to the data without the data moving to a vendor.

The sensitive-data channel moved, the controls did not

Data loss prevention was architected around three channels: data at rest in storage, data in motion as files and email, and data in sanctioned SaaS reached over OAuth-mediated APIs. Every mature DLP stack, network CASB, endpoint agent, mail gateway, SaaS scanner, is a specialization of one of those channels. The channels were stable for a decade, so the controls hardened around them.

Generative AI opened a fourth channel that none of them was designed to watch. When an employee submits a prompt, the browser or desktop app composes an HTTPS POST to an LLM endpoint with a JSON body of free text. There is no file, no attachment, no recognized exfiltration signature, and no mail hop. Source code, customer records, credentials, and strategy documents flow out one paragraph at a time, and the request never touches a surface the legacy stack can read. This is not a coverage gap you close by writing more rules. It is a channel the existing tools cannot see into.

The rest of this paper is an argument in two halves: first, why each class of legacy DLP structurally cannot inspect AI traffic, and second, why terminating and inspecting at the prompt layer, on the endpoint, is the control that actually fits the channel.

Encryption to shared CDNs blinds the network

Start at the network, where CASB and secure web gateways live. When a browser opens a TLS session to chatgpt.com, claude.ai, or api.openai.com, an in-path device sees the connection metadata, SNI, destination IP, certificate fingerprint, and nothing more. The prompt is inside the encrypted body. Without a man-in-the-middle appliance that breaks and re-terminates TLS, the content is invisible by design.

Two properties of modern AI make even that MITM approach brittle. First, the major AI products sit behind shared CDNs and cloud front ends, so destination IP and SNI no longer map cleanly to a single tool: the same edge can serve an approved API and an unapproved consumer chatbot. Blocking by IP is a blunt instrument that breaks legitimate traffic. Second, a network MITM proxy has to re-encrypt for every AI destination in the world and keep pace with certificate pinning and new endpoints, which is an operational treadmill that leaks coverage every week a new model launches.

The result is that a CASB can tell you a connection to an AI destination happened. It generally cannot tell you what was in the prompt, and that is the only thing that determines whether data was lost.

Endpoint, email, and SaaS DLP watch the wrong surface

Move to the endpoint. Endpoint DLP is good at files leaving a laptop: copy to USB, upload to an unmanaged share, drag into a personal drive. But once the browser composes an AI request, the assembled POST body enters the browser's own TLS stack before it hits the wire. The endpoint agent may see keystrokes, yet it does not see the final request the way the model receives it, and it does not render the streamed response the way the user reads it in-app. The most sensitive moment, the assembled prompt, falls in a blind spot between the keyboard and the socket.

Email DLP is on a path the request never takes

Mail-gateway DLP inspects outbound messages at the SMTP boundary. The AI request path does not pass through the mail gateway at all, so email DLP has exactly zero visibility into it. It is not weak here, it is absent.

SaaS DLP and CASB API scanning were built for OAuth, not prompts

SaaS-focused DLP and CASB API connectors were designed for documents sitting in storage APIs reached through OAuth grants. Their policy surface assumes a known catalog of apps and a file-shaped object to classify after the fact. A per-request, identity-bound, free-text prompt is neither a file nor a catalogued OAuth resource, so the classification window that works for a document in Drive does not apply to a prompt in an LLM API body.

Prompts are ephemeral and rendered in-app

There is a timing problem underneath all of this. A prompt exists for the duration of one request. The response streams token by token and is rendered inside the app, never landing as a file the endpoint or SaaS scanner can pick up later. Controls that classify data at rest have nothing to scan, because the sensitive object is gone by the time they would look.

Regex matches strings, prompts carry meaning

Suppose you solved decryption and got the prompt in cleartext. Legacy DLP would still fail on it, because its classification engine matches patterns and fingerprints, and a prompt's sensitivity lives in meaning rather than in format. Structured detectors work when data presents as a structured field: a credit-card number, a national ID, a known document hash. A free-text prompt rarely cooperates.

Anything outside a predefined pattern is missed, and natural language is almost entirely outside predefined patterns. A name next to a number could be a public reference or a customer record. A paragraph could be a published summary or proprietary strategy paraphrased in the employee's own words. Source code pasted for a quick review carries no regex signature at all. Pattern matching on this channel produces false positives that train people to ignore alerts, and, far worse, false negatives at scale on exactly the content you most need to catch.

The unit of analysis has to change from the string to the semantics: what is this text about, who is sending it, to which model, and is that combination risky. That is a classification problem legacy DLP was never built to solve, and it can only be solved at a point in the path where the full prompt is readable.

Terminate and inspect at the prompt layer, on the device

Every failure above points to the same fix. The one place all AI traffic is both present and decryptable is the point where TLS terminates for the AI request. Put an inspection point there and the channel becomes legible. That is a transparent proxy at the prompt layer: it sits inline between the AI client and the model, terminates the session, reads the assembled prompt, applies policy, and brokers the request onward. The same path sees the response on the way back, so redaction and blocking are symmetric.

Where that proxy runs matters as much as that it exists. Run detection on the endpoint and two problems disappear at once. Shared CDNs stop mattering, because the split between an approved API and a consumer chatbot is obvious at the client before it ever reaches a shared edge. And the privacy objection to inspecting prompts inverts: nothing leaves the network by default, so control moves to the data instead of the data moving to a vendor's cloud for scanning.

On that single inline path you get the controls the channel actually needs. Semantic inspection classifies the prompt by meaning rather than by regex. Secrets and PII are redacted inline before the request reaches the model, with the response filtered the same way on return. Unapproved models are blocked at the request, not discovered in a log the next day. MCP servers are risk-scored and gated on the same proxy that watches the browser, desktop, and CLI. Because the proxy is transparent, users keep their tools and the security team gets an enforcement point instead of an after-the-fact report.

This is the architecture Cerbera is built on: a transparent proxy across browser LLMs, desktop apps, coding agents and CLIs, and MCP servers, detecting locally on the endpoint and mapping the evidence it produces to ISO 42001, the EU AI Act, SOC 2, and ISO 27001. For the hands-on rollout, deploying prompt-level DLP across the browser, IDE, and CLI step by step, see the companion guide DLP for the AI era. This paper is the why and the shape; that guide is the how.

Visibility

Legacy sees connection metadata to a shared CDN. Prompt-level DLP reads the assembled prompt and the streamed response, the only content that determines whether data was lost.

Granularity

Regex and fingerprints match formats and known documents. Semantic inspection classifies free text by meaning, catching paraphrased strategy and pasted source code that patterns miss.

Latency

Data-at-rest scanning looks after the fact, too late for an ephemeral prompt. Inline inspection redacts and blocks on the live request, before it reaches the model.

Privacy

Cloud DLP ships prompts to a vendor to scan them. On-device detection keeps content in the network by default, moving control to the data rather than the data to a vendor.

[Related]

Keep reading

[Get started]

Inspect AI at the prompt layer

Cerbera terminates and inspects AI traffic across browser, desktop, CLI, and MCP on one transparent proxy, redacting secrets and PII inline while detection stays on the endpoint.

Book a demo