Hybrid VPC: Deploy an Enterprise AI Firewall Where Prompts Never Leave Your Network

June 6, 2026 · 9 min read · security

The conversation in every enterprise security review meeting sounds the same:

“We want to use GPT-4 and Claude in our applications. But our compliance team says prompt data can't leave our network. What are our options?”

Until now, the options were binary: use a cloud AI gateway and accept that prompts transit through a third-party proxy, or build your own governance stack from scratch. The first option fails compliance reviews. The second takes 6-12 months and a dedicated engineering team.

Hybrid VPC is the third option: a compiled Go proxy that runs inside your network and handles DLP scanning, PII redaction, prompt injection blocking, and budget enforcement locally, while a cloud dashboard manages policies and displays metadata-only analytics.

The Problem: AI Governance vs. Data Sovereignty

Cloud AI gateways solve the governance problem — they scan prompts for PII, enforce budgets, block injections, and log everything. But they require your prompt data to pass through their servers. For many enterprises, that's a non-starter:

✗ Healthcare (HIPAA): patient data in prompts can't transit third-party infrastructure
✗ Financial services (SOX/PCI): cardholder data and financial records must stay within the compliance boundary
✗ Government / defense: classified or sensitive data has strict data residency requirements
✗ Legal: attorney-client privilege demands that case details never leave firm infrastructure

Self-hosting an open-source AI proxy solves the data sovereignty problem, but you lose the governance layer: no dashboards, no centralized policy management, no violation tracking, no budget enforcement. You're back to building from scratch.

The Solution: Hybrid VPC Deployment

Hybrid VPC splits the AI firewall into two zones:

Runs in Your VPC

Compiled Go proxy with built-in DLP engine
30+ PII entity detection & redaction
Prompt injection blocking
Budget enforcement (per-request & monthly)
Presidio NER sidecar (open-source)
Sync agent for policy updates

Stays in Cloud

Policy management dashboard
Real-time metrics & violation analytics
Deployment health monitoring
Multi-project management
Budget configuration
Metadata-only telemetry ingestion

The critical guarantee: prompt and response content never passes through our cloud. The proxy processes everything locally and forwards cleaned requests directly to the AI provider you choose (OpenAI, Anthropic, Google, etc.) over your own network egress. The sync agent pushes only structural metadata to the cloud: token counts, latency measurements, entity type counts, and cost estimates. Content fields are rejected server-side.

How It Works: 5-Step Request Flow

1. Your application sends a request to the hybrid proxy (same network, no external hop).

2. The Go DLP engine scans for PII, applies project-specific policies, and checks budget limits, all locally in under 50 ms.

3. The proxy forwards the cleaned request directly to OpenAI / Anthropic / Google (your provider keys, your network egress).

4. The AI provider responds, and the proxy returns the response to your application.

5. The sync agent pushes metadata (token count, latency, entity types) to the cloud dashboard, with no content.
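
The steps above can be sketched in a few lines of Python. This is an illustration only: the real engine is a compiled Go binary, and everything named here (the two regex patterns, `redact`, `handle_request`, the injected `forward` callable, and the metadata keys) is a hypothetical stand-in rather than the product's API.

```python
import re
import time
from typing import Callable

# Hypothetical patterns; a production DLP engine covers far more entity types.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> tuple[str, dict[str, int]]:
    """Replace each PII match with a placeholder and count matches per type."""
    counts: dict[str, int] = {}
    for entity, pattern in PII_PATTERNS.items():
        text, n = pattern.subn(f"<{entity}>", text)
        if n:
            counts[entity] = n
    return text, counts


def handle_request(prompt: str, forward: Callable[[str], str]) -> tuple[str, dict]:
    """Scan locally, forward only the cleaned prompt, and build metadata."""
    start = time.monotonic()
    cleaned, entity_counts = redact(prompt)   # local DLP scan
    response = forward(cleaned)               # provider call via injected function
    metadata = {                              # counts and timing only, no text
        "entity_counts": entity_counts,
        "latency_ms": round((time.monotonic() - start) * 1000, 2),
    }
    return response, metadata
```

Passing the provider call in as `forward` keeps the sketch runnable without network access; in a real deployment that function would be the outbound request to your AI provider.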

Integration: Two Lines of Code

The hybrid proxy is compatible with the OpenAI SDK. Point your base URL at the proxy and use your project API key:

Python — OpenAI SDK

from openai import OpenAI

client = OpenAI(
    base_url="http://hybrid-proxy:8080/v1",  # Your local proxy
    api_key="oah_your_project_api_key",
)

# Every request is now DLP-scanned, budget-checked, and logged locally;
# prompt content never passes through a third-party gateway
response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Analyze this contract..."}],
)

That's it. No SDK changes, no middleware, no code annotations. Every request flowing through the proxy is automatically scanned for 30+ PII entity types, checked against your project's DLP policy, and budget-verified — all before the request reaches your AI provider.
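
Because the proxy accepts OpenAI SDK traffic, a client without the SDK can presumably reach it with a plain HTTP POST in the same wire format. The sketch below builds such a request with the standard library only. The `/chat/completions` path and `Bearer` header follow the OpenAI API convention; the host name and key are the placeholders from the example above, and `build_chat_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = json.dumps({
        "model": "gpt-4.1",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request(
    "http://hybrid-proxy:8080/v1",   # placeholder host from the SDK example
    "oah_your_project_api_key",      # placeholder key from the SDK example
    "Analyze this contract...",
)
```

From a host that can reach the proxy, `urllib.request.urlopen(req)` would send it.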

Security Architecture: Defense in Depth

Compiled Go Binary

The proxy ships as a statically compiled binary in a scratch Docker image. No shell, no OS packages, no runtime interpreter. Even if someone gains container access, there is no shell or package manager to build on, which sharply limits what an attacker can do.

Fail-Closed Architecture

If the DLP engine can't verify a request is safe, it blocks it. Policy errors, detection failures, and budget exhaustion all result in blocked requests — never silent pass-through.
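
A minimal sketch of the fail-closed idea, assuming a check function that returns `True` for a safe prompt. The `Decision` type and `fail_closed` wrapper are hypothetical names for illustration, not the proxy's internals. The point is that an exception or an ambiguous result is treated exactly like an explicit rejection.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    allowed: bool
    reason: str


def fail_closed(check: Callable[[str], bool], prompt: str) -> Decision:
    """Allow a prompt only when the check completes and explicitly approves it."""
    try:
        ok = check(prompt)
    except Exception as exc:
        # Any detector or policy failure blocks the request.
        return Decision(False, f"check failed: {type(exc).__name__}")
    if ok is not True:
        # None, False, or any other non-True result is also a block.
        return Decision(False, "check did not approve")
    return Decision(True, "approved")
```

The default path through the function is a block; approval is the only branch that has to be earned.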

Metadata-Only Telemetry

The sync agent transmits only structural metadata. Content fields are validated and rejected server-side. Even a misconfigured agent can't leak prompt data.
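
One way server-side rejection could work is an allowlist: the ingestion endpoint accepts only known metadata fields and refuses anything else. The sketch below assumes that design; the field names are hypothetical, chosen to mirror the metadata this article lists, and `validate_telemetry` is not the service's real function.

```python
# Hypothetical allowlist mirroring the metadata named in this article.
ALLOWED_FIELDS = {
    "token_count": int,
    "latency_ms": (int, float),
    "entity_type_counts": dict,
    "cost_estimate_usd": (int, float),
}


def validate_telemetry(payload: dict) -> dict:
    """Accept a payload only if every field is allowlisted and well-typed."""
    unknown = set(payload) - set(ALLOWED_FIELDS)
    if unknown:
        raise ValueError(f"rejected fields: {sorted(unknown)}")
    for name, value in payload.items():
        if not isinstance(value, ALLOWED_FIELDS[name]):
            raise ValueError(f"bad type for {name}")
    # Entity counts must map type names to integers, never to matched text.
    for entity, count in payload.get("entity_type_counts", {}).items():
        if not isinstance(entity, str) or not isinstance(count, int):
            raise ValueError("entity_type_counts must map names to integer counts")
    return payload
```

An allowlist is stricter than a denylist of names like `prompt` or `content`: a field the server has never heard of is rejected by default.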

Local Budget Enforcement

Monthly budgets and per-request token limits are enforced locally. No cloud round-trip for budget checks. Limits are applied instantly, even during cloud connectivity interruptions.
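
Local enforcement can be pictured as a small in-process counter that is consulted before each request. This is a sketch under assumptions: the `LocalBudget` class, its field names, and the use of US dollars are illustrative and not taken from the product.

```python
from dataclasses import dataclass


@dataclass
class LocalBudget:
    monthly_limit_usd: float
    max_tokens_per_request: int
    spent_usd: float = 0.0

    def check(self, request_tokens: int, estimated_cost_usd: float) -> bool:
        """Return True only if the request fits both limits; no network call."""
        if request_tokens > self.max_tokens_per_request:
            return False
        return self.spent_usd + estimated_cost_usd <= self.monthly_limit_usd

    def record(self, actual_cost_usd: float) -> None:
        """Add the cost of a completed request to this month's running total."""
        self.spent_usd += actual_cost_usd
```

Because both limits live in local state, the check still works when the cloud dashboard is unreachable.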

Who Is Hybrid VPC For?

Hybrid VPC is designed for organizations that need full AI governance but can't send prompt data through a third-party cloud proxy:

Enterprises in regulated industries (healthcare, finance, legal, government)
Organizations with strict data residency requirements
Teams that need to demonstrate data sovereignty for compliance audits
Companies deploying AI in environments with restricted outbound network access
Security-conscious organizations that want defense-in-depth for AI workloads

How Does It Compare?

Capability	Cloud Gateway	Self-Host OSS	Hybrid VPC
Prompt stays on-prem	No	Yes	Yes
Cloud dashboard	Yes	No	Yes
PII entity types	30+	13	30+
Budget enforcement	Yes	No	Yes
Policy management UI	Yes	No	Yes
Setup time	5 minutes	30 minutes	15 minutes
Ongoing maintenance	None	You	Minimal

Getting Started

Hybrid VPC is available on the Enterprise plan. Here's how to get started:

1. Contact us at enterprise@aisecuritygateway.ai to enable Hybrid VPC on your account
2. We provision your deployment token and container registry access
3. You deploy three Docker containers in your VPC using our Docker Compose template
4. You point your app's base URL to the proxy (two lines of code)

Ready to deploy an AI Firewall in your VPC?

Full DLP, budget enforcement, and policy management — without sending a single prompt to the cloud.
