GDPR + AI in Frontend: Safe LLM Integration Without PII Leaks in 2026
Adding smart AI features to your web app is table stakes in 2026. But the moment user-typed text hits an LLM API, you are in GDPR territory. Here is how to build it right without slowing your team down.
At Symfio we run a multi-tenant SaaS with users across the EU. When we started adding AI-powered suggestions to the dashboard — autocomplete, content generation, smart search — our legal team flagged a question nobody on the engineering side had fully thought through: what exactly are we sending to the LLM provider, and does any of it count as personal data?
It turned out the answer was "yes, sometimes, and it depends." That kicked off a three-month effort to build a privacy-safe AI layer in our React frontend. This article documents what we learned.
Why Frontend Is the Riskiest Layer
Most GDPR + AI discussions focus on the backend: data retention policies, data processing agreements (DPAs) with providers like OpenAI or Anthropic, audit logs. All of that matters. But the frontend is where the data originates, and it is the layer developers touch most casually.
Consider what flows through a typical AI feature:
- The user's free-text input — which might contain their name, address, or medical information
- Context injected by the app — often the user's profile, recent activity, or account data
- System prompts — which might inadvertently include PII from a database query
Each of these is a potential leak vector. And unlike a backend data breach, a frontend prompt leak can happen silently, in production, with no error logged anywhere.
The Three-Layer Defense Model
We settled on a model with three distinct layers of protection. None of them is sufficient alone; together they make a defensible system.
Layer 1: Input Sanitization Before the Prompt
The first line of defense is stripping or masking PII before it enters any prompt. We built a lightweight sanitizeForPrompt utility that runs on the client before any API call:
// utils/sanitize-prompt.ts
const EMAIL_RE = /[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/g;
const PHONE_RE = /(\+?\d[\d\s\-().]{7,}\d)/g;
// Simplified — production uses a more complete pattern set

export function sanitizeForPrompt(input: string): string {
  return input
    .replace(EMAIL_RE, '[EMAIL]')
    .replace(PHONE_RE, '[PHONE]');
}

// Usage in a React component
const handleSubmit = async (userInput: string) => {
  const safeInput = sanitizeForPrompt(userInput);
  const response = await callLLM({ prompt: buildPrompt(safeInput) });
};
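To see the masking in action, here is the same simplified utility as a standalone, runnable sketch (the sample string is fabricated for illustration):

```typescript
// Same simplified patterns as above, inlined so the snippet runs standalone
const EMAIL_RE = /[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/g;
const PHONE_RE = /(\+?\d[\d\s\-().]{7,}\d)/g;

function sanitizeForPrompt(input: string): string {
  return input.replace(EMAIL_RE, '[EMAIL]').replace(PHONE_RE, '[PHONE]');
}

// Both patterns fire before the prompt is ever built
console.log(sanitizeForPrompt('Reach me at jane.doe@example.com or +44 20 7946 0958'));
// → "Reach me at [EMAIL] or [PHONE]"
```

Keep in mind that regex masking is best-effort: names and street addresses need an NER-style detector, which is one reason the server-side redaction in Layer 3 exists as a backstop.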
Layer 2: Explicit Context Allowlisting
The more dangerous leak is not user input — it is the context your app injects into prompts. Developers naturally reach for "give the AI more context" and end up serializing an entire user object into a system prompt.
We introduced an explicit allowlist pattern. Instead of passing the full user object, every AI feature declares exactly which fields it needs:
// types/ai-context.ts
type AllowedUserContext = {
  accountTier: 'free' | 'pro' | 'enterprise';
  locale: string;
  preferredLanguage: string;
  // Notably absent: name, email, id, address
};

function buildSystemPrompt(ctx: AllowedUserContext): string {
  return `You are a helpful assistant for a ${ctx.accountTier} user.
Respond in ${ctx.preferredLanguage}.`;
}

// TypeScript enforces the boundary — user.email won't compile here
const context: AllowedUserContext = {
  accountTier: user.accountTier,
  locale: user.locale,
  preferredLanguage: user.settings.preferredLanguage,
};
TypeScript does the enforcement here. If someone tries to add user.email to the context object, it will not compile. This is the kind of guard that actually survives team growth — it does not rely on code review catching a subtle mistake.
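Here is a minimal, self-contained sketch of what the compiler catches; the `user` literal is hypothetical, invented for illustration:

```typescript
type AllowedUserContext = {
  accountTier: 'free' | 'pro' | 'enterprise';
  locale: string;
  preferredLanguage: string;
};

// Hypothetical user record with fields that must NOT reach a prompt
const user = {
  accountTier: 'pro' as const,
  locale: 'de-DE',
  email: 'jane@example.com',
  settings: { preferredLanguage: 'de' },
};

const context: AllowedUserContext = {
  accountTier: user.accountTier,
  locale: user.locale,
  preferredLanguage: user.settings.preferredLanguage,
  // email: user.email,
  // ^ Uncommenting fails to compile: "Object literal may only specify known
  //   properties, and 'email' does not exist in type 'AllowedUserContext'"
};
```

One caveat worth knowing: TypeScript's excess property checks apply to object literals, so a spread like `{ ...user }` would slip past the boundary and compile. A lint rule forbidding spreads into AI context types closes that gap.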
Layer 3: Network-Level Audit Logging via Proxy
The third layer is observability. We proxy all LLM API calls through our backend rather than calling the provider directly from the browser. This gives us:
- A complete audit log of every prompt and response
- The ability to redact logs before storage using server-side PII detection
- Rate limiting and cost controls per user
- A single point to rotate API keys without touching the frontend
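A minimal sketch of such a proxy, using Node's built-in `http` module; the provider URL, model name, and the naive `redactForAudit` helper are illustrative assumptions, not our production code:

```typescript
import http from 'node:http';

// Naive placeholder redaction — production would run a real PII detector
export const redactForAudit = (text: string): string =>
  text.replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, '[EMAIL]');

const server = http.createServer(async (req, res) => {
  if (req.method !== 'POST' || req.url !== '/api/ai/complete') {
    res.writeHead(404);
    res.end();
    return;
  }
  let body = '';
  for await (const chunk of req) body += chunk;
  const { prompt, userId } = JSON.parse(body);

  // Audit log is written redacted, before anything is stored
  console.log(JSON.stringify({ userId, prompt: redactForAudit(prompt) }));

  // Forward to the provider with a server-held key (never shipped to the browser)
  const upstream = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.LLM_API_KEY}`,
    },
    body: JSON.stringify({
      model: 'gpt-4o-mini',
      messages: [{ role: 'user', content: prompt }],
    }),
  });
  res.writeHead(upstream.status, { 'Content-Type': 'application/json' });
  res.end(await upstream.text());
});

// Call server.listen(3000) from your entrypoint
```

Rate limiting and per-user cost tracking would hang off the same handler, which is the point: one chokepoint for every control.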
Handling Streaming Responses Safely
Streaming is now standard for LLM UIs — users expect to see tokens arrive as they are generated. But streaming adds a wrinkle: the response arrives in chunks, so you cannot validate the full output before rendering it.
We handle this with a buffered hook that streams tokens to state while keeping the abort controller accessible for cancellation:
// hooks/use-streaming-llm.ts
import { useRef, useState } from 'react';

export function useStreamingLLM() {
  const [output, setOutput] = useState('');
  const abortRef = useRef<AbortController | null>(null);

  const stream = async (prompt: string) => {
    abortRef.current = new AbortController();
    setOutput('');
    const response = await fetch('/api/ai/stream', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ prompt }),
      signal: abortRef.current.signal,
    });
    if (!response.ok || !response.body) {
      throw new Error(`Stream failed: ${response.status}`);
    }
    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      const chunk = decoder.decode(value, { stream: true });
      setOutput(prev => prev + chunk);
    }
  };

  const cancel = () => abortRef.current?.abort();

  return { output, stream, cancel };
}
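Since the full response cannot be validated up front, one option is to scan the growing buffer as each chunk lands and trigger the abort controller when a leak pattern appears. This guard is a sketch of the idea, not our exact production check; the patterns are illustrative:

```typescript
// Sketch: leak patterns to scan the accumulated output buffer for
const LEAK_PATTERNS: RegExp[] = [
  /[\w.+-]+@[\w-]+\.[\w.]+/, // email
  /\+?\d[\d\s\-().]{7,}\d/,  // phone-like digit run
];

export function shouldAbortStream(buffer: string): boolean {
  return LEAK_PATTERNS.some(re => re.test(buffer));
}

// Inside the hook's read loop, before committing the chunk to state:
//   const next = accumulated + chunk;
//   if (shouldAbortStream(next)) {
//     abortRef.current?.abort();
//     return;
//   }
```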
Consent and Transparency
Engineering controls are necessary but not sufficient. GDPR also requires that users know their data is being processed by a third-party LLM. Our approach:
- First-use disclosure — a one-time modal the first time a user activates an AI feature, explaining what data is sent and to whom
- Persistent indicator — a subtle "AI" badge on any input that sends data to an LLM
- Opt-out support — enterprise users can disable AI features entirely at the account level, stored server-side, not in localStorage
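The gating logic reduces to a small fail-closed helper. The shape of `AiConsentState` and the endpoint path below are assumptions for illustration, not our actual account API:

```typescript
// Illustrative consent shape; the real account API may differ
type AiConsentState = {
  disclosed: boolean; // first-use modal has been shown and acknowledged
  enabled: boolean;   // false when the account-level opt-out is active
};

// Fail closed: no AI call is made unless both conditions hold
export function canUseAi(state: AiConsentState | null): boolean {
  return state !== null && state.disclosed && state.enabled;
}

// Consent lives server-side (hypothetical endpoint), never in localStorage
export async function fetchAiConsent(): Promise<AiConsentState | null> {
  const res = await fetch('/api/account/ai-consent');
  return res.ok ? res.json() : null;
}
```

Returning `null` on any fetch failure means a flaky network degrades to "AI off", which is the safe direction for a compliance control.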
What We Would Do Differently
Looking back, the biggest mistake was retrofitting these controls onto features that were already in production. It would have been significantly cheaper to establish the proxy architecture and the allowlist pattern before the first AI feature shipped.
If you are starting today: treat any user input that touches an LLM as if it were a form field being sent to a third-party analytics provider — because that is effectively what it is. Design for that from day one.
Privacy-by-design is not a legal checkbox. It is an architecture decision. And like most architecture decisions, it is much easier to build in than to bolt on.
Key Takeaways
- Sanitize user input client-side before building any prompt
- Use TypeScript to enforce an explicit allowlist of context fields — never serialize full user objects
- Proxy all LLM calls through your backend; never call the provider directly from the browser
- Ensure your provider has signed a DPA; for EU users, verify data residency
- Build consent disclosure and opt-out before shipping, not after