Hybrid VPC: Deploy an Enterprise AI Firewall Where Prompts Never Leave Your Network

June 6, 2026·9 min read·security

The conversation in every enterprise security review meeting sounds the same:

“We want to use GPT-4 and Claude in our applications. But our compliance team says prompt data can't leave our network. What are our options?”

Until now, the options were binary: use a cloud AI gateway and accept that prompts transit through a third-party proxy, or build your own governance stack from scratch. The first option fails compliance reviews. The second takes 6-12 months and a dedicated engineering team.

Hybrid VPC is the third option: a compiled Go proxy that runs inside your network and handles DLP scanning, PII redaction, prompt injection blocking, and budget enforcement locally, while a cloud dashboard manages policies and displays metadata-only analytics.

The Problem: AI Governance vs. Data Sovereignty

Cloud AI gateways solve the governance problem — they scan prompts for PII, enforce budgets, block injections, and log everything. But they require your prompt data to pass through their servers. For many enterprises, that's a non-starter:

  • Healthcare (HIPAA) — patient data in prompts can't transit third-party infrastructure
  • Financial services (SOX/PCI) — cardholder data and financial records must stay within the compliance boundary
  • Government / defense — classified or sensitive data has strict data residency requirements
  • Legal — attorney-client privilege demands that case details never leave firm infrastructure

Self-hosting an open-source AI proxy solves the data sovereignty problem, but you lose the governance layer: no dashboards, no centralized policy management, no violation tracking, no budget enforcement. You're back to building from scratch.

The Solution: Hybrid VPC Deployment

Hybrid VPC splits the AI firewall into two zones:

Runs in Your VPC

  • Compiled Go proxy with built-in DLP engine
  • 30+ PII entity detection & redaction
  • Prompt injection blocking
  • Budget enforcement (per-request & monthly)
  • Presidio NER sidecar (open-source)
  • Sync agent for policy updates

Stays in Cloud

  • Policy management dashboard
  • Real-time metrics & violation analytics
  • Deployment health monitoring
  • Multi-project management
  • Budget configuration
  • Metadata-only telemetry ingestion

The critical guarantee: prompt and response content never leaves your network. The proxy processes everything locally and forwards cleaned requests directly to your AI provider (OpenAI, Anthropic, Google, etc.). The sync agent pushes only structural metadata to the cloud — token counts, latency measurements, entity type counts, and cost estimates. Content fields are rejected server-side.
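To make "cleaned requests" and "entity type counts" concrete, here is a minimal Python sketch of local redaction. It is an illustration only: the production engine is a compiled Go binary covering 30+ entity types, and the two regex patterns, the placeholder format, and the `redact` function are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration; not the product's actual rules.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text):
    """Replace each match with a type placeholder and count matches per type."""
    counts = Counter()
    for entity_type, pattern in PATTERNS.items():
        text, replaced = pattern.subn(f"<{entity_type}>", text)
        if replaced:
            counts[entity_type] += replaced
    return text, dict(counts)


cleaned, counts = redact("Email jane@example.com about SSN 123-45-6789.")
print(cleaned)  # Email <EMAIL_ADDRESS> about SSN <US_SSN>.
print(counts)   # {'EMAIL_ADDRESS': 1, 'US_SSN': 1}
```

Only `counts` resembles what the sync agent would report; the original text and the redacted text both stay local.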

How It Works: 5-Step Request Flow

  1. Your application sends a request to the hybrid proxy (same network, no external hop)
  2. Go DLP engine scans for PII, applies project-specific policies, and checks budget limits, all locally in under 50 ms
  3. Proxy forwards the cleaned request directly to OpenAI / Anthropic / Google (your provider keys, your network egress)
  4. AI provider responds and the proxy returns the response to your application
  5. Sync agent pushes metadata (token count, latency, entity types) to the cloud dashboard, with no content
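The flow above can be sketched as a single function. This is a hedged Python illustration rather than the proxy's actual Go code: `handle_request`, the `Telemetry` class, the injected callables, and the whitespace-based token approximation are all assumptions made for the example.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Telemetry:
    """Stand-in for the sync agent's outbound queue (hypothetical)."""
    records: list = field(default_factory=list)

    def push(self, record):
        self.records.append(record)


def handle_request(prompt, scan, within_budget, forward, telemetry):
    """Steps 2-5 of the flow; step 1 is the application calling this function."""
    started = time.monotonic()
    cleaned, entity_counts = scan(prompt)      # step 2: local DLP scan
    if not within_budget(cleaned):             # step 2: local budget check
        raise PermissionError("request blocked: budget exceeded")
    response = forward(cleaned)                # steps 3-4: provider round trip
    telemetry.push({                           # step 5: metadata only
        "approx_tokens": len(cleaned.split()),  # crude stand-in for a tokenizer
        "latency_ms": round((time.monotonic() - started) * 1000, 3),
        "entity_counts": entity_counts,
    })
    return response


telemetry = Telemetry()
reply = handle_request(
    "My SSN is 123-45-6789",
    scan=lambda p: (p.replace("123-45-6789", "<US_SSN>"), {"US_SSN": 1}),
    within_budget=lambda p: len(p.split()) <= 100,
    forward=lambda p: f"echo: {p}",            # fake provider for the demo
    telemetry=telemetry,
)
print(reply)  # echo: My SSN is <US_SSN>
```

Note that the telemetry record holds counts and timings but no prompt or response text, which mirrors the metadata-only guarantee described earlier.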

Integration: Two Lines of Code

The hybrid proxy is compatible with the OpenAI SDK. Point your base URL to the proxy and use your project API key:

Python — OpenAI SDK
from openai import OpenAI

client = OpenAI(
    base_url="http://hybrid-proxy:8080/v1",  # Your local proxy
    api_key="oah_your_project_api_key",
)

# Every request is now DLP-scanned, budget-checked,
# and logged — without any data leaving your network
response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Analyze this contract..."}],
)

That's it. No SDK changes, no middleware, no code annotations. Every request flowing through the proxy is automatically scanned for 30+ PII entity types, checked against your project's DLP policy, and budget-verified — all before the request reaches your AI provider.

Security Architecture: Defense in Depth

Compiled Go Binary

The proxy ships as a statically compiled binary in a scratch Docker image. No shell, no OS packages, no runtime interpreter. Even if someone gains container access, there are no system tools to run and far less to exploit.

Fail-Closed Architecture

If the DLP engine can't verify a request is safe, it blocks it. Policy errors, detection failures, and budget exhaustion all result in blocked requests — never silent pass-through.
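A fail-closed check can be sketched in a few lines of Python. This is an illustration of the general pattern, not the proxy's Go implementation; `fail_closed` and `flaky_policy_check` are hypothetical names.

```python
def fail_closed(check):
    """Wrap a check so that any exception is treated as a block decision."""
    def guarded(request):
        try:
            return bool(check(request))
        except Exception:
            # Anything we cannot verify is blocked, never passed through.
            return False
    return guarded


def flaky_policy_check(request):
    """Hypothetical check that raises when no policy has been loaded."""
    if "policy" not in request:
        raise KeyError("policy not loaded")
    return request["policy"] == "allow"


is_allowed = fail_closed(flaky_policy_check)

print(is_allowed({"policy": "allow"}))  # True
print(is_allowed({"policy": "deny"}))   # False
print(is_allowed({}))                   # False (the error became a block)
```

The design choice is that the default answer is "no": a request proceeds only when a check completes and explicitly allows it.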

Metadata-Only Telemetry

The sync agent transmits only structural metadata. Content fields are validated and rejected server-side. Even a misconfigured agent can't leak prompt data.
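One way to reject content fields server-side is an allowlist: accept only known metadata keys with numeric values, and refuse everything else. The Python sketch below assumes this approach; the field names and the `validate_telemetry` function are hypothetical and do not describe the actual ingestion schema.

```python
# Hypothetical allowlist; the real telemetry schema may differ.
ALLOWED_FIELDS = {"token_count", "latency_ms", "cost_estimate_usd", "entity_counts"}
NUMERIC_FIELDS = {"token_count", "latency_ms", "cost_estimate_usd"}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_telemetry(record):
    """Return the record if it is metadata-only, otherwise raise ValueError."""
    unexpected = set(record) - ALLOWED_FIELDS
    if unexpected:
        raise ValueError(f"rejected fields: {sorted(unexpected)}")
    for name in NUMERIC_FIELDS & set(record):
        if not _is_number(record[name]):
            raise ValueError(f"{name} must be numeric")
    counts = record.get("entity_counts", {})
    if not isinstance(counts, dict) or not all(_is_number(v) for v in counts.values()):
        raise ValueError("entity_counts must map entity types to numbers")
    return record


ok = validate_telemetry(
    {"token_count": 412, "latency_ms": 38.5, "entity_counts": {"US_SSN": 1}}
)

try:
    validate_telemetry({"token_count": 412, "prompt": "Analyze this contract..."})
except ValueError as err:
    print(err)  # rejected fields: ['prompt']
```

Because unknown keys are refused rather than ignored, a misconfigured sender that attaches a `prompt` field gets an error instead of silently storing content.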

Local Budget Enforcement

Monthly budgets and per-request token limits are enforced locally. No cloud round-trip for budget checks. Limits are applied instantly, even during cloud connectivity interruptions.
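Local enforcement of this kind amounts to keeping a running total next to the proxy and checking it before each request. The Python sketch below is a simplified illustration under that assumption; the `LocalBudget` class, its fields, and the dollar figures are hypothetical, and it leaves out persistence and monthly resets.

```python
from dataclasses import dataclass


@dataclass
class LocalBudget:
    """In-memory budget state; no network call is needed to authorize."""
    monthly_limit_usd: float
    max_tokens_per_request: int
    spent_usd: float = 0.0

    def authorize(self, request_tokens, estimated_cost_usd):
        """Return True and record the spend only if both limits hold."""
        if request_tokens > self.max_tokens_per_request:
            return False
        if self.spent_usd + estimated_cost_usd > self.monthly_limit_usd:
            return False
        self.spent_usd += estimated_cost_usd
        return True


budget = LocalBudget(monthly_limit_usd=1.0, max_tokens_per_request=1000)

first = budget.authorize(500, 0.5)       # True: within both limits
too_large = budget.authorize(2000, 0.1)  # False: exceeds per-request tokens
too_costly = budget.authorize(500, 0.75) # False: would exceed monthly limit
second = budget.authorize(500, 0.5)      # True: reaches the limit exactly
exhausted = budget.authorize(1, 0.25)    # False: budget is spent
```

Since the state lives beside the proxy, the check still works when the cloud dashboard is unreachable; a real deployment would also need to persist the total and pick up limit changes from the sync agent.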

Who Is Hybrid VPC For?

Hybrid VPC is designed for organizations that need full AI governance but can't send prompt data through a third-party cloud proxy:

  • Enterprises in regulated industries (healthcare, finance, legal, government)
  • Organizations with strict data residency requirements
  • Teams that need to demonstrate data sovereignty for compliance audits
  • Companies deploying AI in environments with restricted outbound network access
  • Security-conscious organizations that want defense-in-depth for AI workloads

How Does It Compare?

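| Capability | Cloud Gateway | Self-Host OSS | Hybrid VPC |
|---|---|---|---|
| Prompt stays on-prem | No | Yes | Yes |
| Cloud dashboard | Yes | No | Yes |
| 30+ PII entities | Yes | 13 | Yes |
| Budget enforcement | Yes | No | Yes |
| Policy management UI | Yes | No | Yes |
| Setup time | 5 minutes | 30 minutes | 15 minutes |
| Ongoing maintenance | None | You | Minimal |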

Getting Started

Hybrid VPC is available on the Enterprise plan. Here's how to get started:

  1. Contact us at enterprise@aisecuritygateway.ai to enable Hybrid VPC on your account
  2. We provision your deployment token and container registry access
  3. You deploy three Docker containers in your VPC using our Docker Compose template
  4. You point your app's base URL to the proxy (two lines of code)

Ready to deploy an AI Firewall in your VPC?

Full DLP, budget enforcement, and policy management — without sending a single prompt to the cloud.
