SHIPPED

Recursive Loop Protection

No other API proxy blocks agent loops at the gateway layer — most frameworks require you to instrument this in your application code. AISG detects and kills infinite retry loops automatically, before they reach any provider.

Autonomous AI agents can get stuck sending the same request hundreds of times, draining your budget in minutes. Gateway-level detection works across all frameworks, languages, and agent architectures with zero code changes.

The Problem

Agent frameworks like LangChain, CrewAI, AutoGPT, and custom agent loops share a common failure mode: when the model returns an unexpected response, the agent retries with the same prompt. If the model keeps returning the same response, the agent keeps retrying — indefinitely.

Real-world impact: A single misconfigured agent can send thousands of identical requests in minutes. At $3/1M input tokens for GPT-4.1, a loop sending 1,000-token prompts 10 times per second burns through $108 in an hour. With a longer context window, it's much worse.
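The arithmetic behind that figure can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name `loop_cost_usd` is made up for illustration, and it counts input tokens only, at the $3 per 1M price quoted above.

```python
def loop_cost_usd(prompt_tokens, requests_per_second, duration_seconds,
                  usd_per_million_input_tokens):
    """Input-token cost of a loop that resends the same prompt at a fixed rate."""
    total_tokens = prompt_tokens * requests_per_second * duration_seconds
    return total_tokens / 1_000_000 * usd_per_million_input_tokens


# 1,000-token prompt, 10 requests per second, one hour, $3 per 1M input tokens
print(loop_cost_usd(1_000, 10, 3_600, 3.0))   # 108.0

# Same loop with a 32,000-token context
print(loop_cost_usd(32_000, 10, 3_600, 3.0))  # 3456.0
```

Output tokens are billed on top of this, so the real cost of a loop is higher than the input-only estimate.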

How It Works

1. Fingerprint

Every request is fingerprinted using SHA-256 of: API key prefix + model name + last 3 message contents (case-insensitive, normalized). This creates a unique signature for each distinct request pattern.
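A minimal sketch of what such a fingerprint could look like is shown below. It is not AISG's actual implementation: the 8-character key prefix, the whitespace-collapsing normalization, and the field separator are all assumptions made for illustration, since this page does not specify them.

```python
import hashlib


def normalize(text: str) -> str:
    # Assumed normalization: lowercase and collapse runs of whitespace.
    return " ".join(text.lower().split())


def fingerprint(api_key: str, model: str, messages: list,
                key_prefix_len: int = 8) -> str:
    """SHA-256 over key prefix + model + the last 3 message contents."""
    last_three = [normalize(str(m.get("content", ""))) for m in messages[-3:]]
    # The unit separator keeps field boundaries unambiguous before hashing.
    payload = "\x1f".join([api_key[:key_prefix_len], model, *last_three])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# "sk-test-123456" is a made-up key used only for this example.
a = fingerprint("sk-test-123456", "oah/llama-4-maverick",
                [{"role": "user", "content": "Hello"}])
b = fingerprint("sk-test-123456", "oah/llama-4-maverick",
                [{"role": "user", "content": "  HELLO "}])
print(a == b)  # True: case and surrounding whitespace are normalized away
```

Under these assumptions, changing the model, the key prefix, or the text of any of the last three messages yields a different digest.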

2. Count

A sliding window counter tracks how many times each fingerprint has been seen within the detection window (default: 60 seconds).

3. Block

When a fingerprint exceeds the threshold (default: 5 identical requests), the request is blocked with HTTP 429 and a clear error message.

4. Cool down

After blocking, a 30-second cooldown prevents the same fingerprint from being accepted. This gives agents time to recover and operators time to intervene.
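The count, block, and cooldown steps can be sketched as a small in-memory guard. This is an illustration under stated assumptions, not the gateway's code: the `LoopGuard` class is hypothetical, state lives in a single process, and requests rejected during the cooldown are assumed not to count as new hits.

```python
import time
from collections import defaultdict, deque


class LoopGuard:
    def __init__(self, window_seconds=60, max_hits=5, cooldown_seconds=30,
                 clock=time.monotonic):
        self.window_seconds = window_seconds
        self.max_hits = max_hits
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self._hits = defaultdict(deque)  # fingerprint -> timestamps in the window
        self._blocked_until = {}         # fingerprint -> time the cooldown ends

    def check(self, fp):
        """Return (allowed, hit_count) for one incoming request."""
        now = self.clock()
        hits = self._hits[fp]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        # Still cooling down: reject without recording a new hit (assumption).
        if now < self._blocked_until.get(fp, float("-inf")):
            return False, len(hits)
        hits.append(now)
        if len(hits) > self.max_hits:
            self._blocked_until[fp] = now + self.cooldown_seconds
            return False, len(hits)
        return True, len(hits)


guard = LoopGuard()
for _ in range(6):
    allowed, count = guard.check("a3f7c2d1b8e4")
print(allowed, count)  # False 6
```

With the defaults, the first five identical requests pass and the sixth is rejected, which matches the hit count of 6 reported in the error payload. A real gateway would need shared storage so that all instances see the same counters.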

What the Client Sees

When a loop is detected, the client receives a clear, actionable error:

HTTP 429 — Loop Detected
{
  "detail": {
    "error": "recursive_loop_detected",
    "message": "Blocked: identical request sent 6 times in 60 seconds. This usually indicates an agent retry loop, infinite recursion, or misconfigured automation.",
    "fingerprint": "a3f7c2d1b8e4...",
    "hit_count": 6,
    "cooldown_seconds": 30
  }
}
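
Clients that do not use the SDK may want to tell this response apart from an ordinary rate-limit 429. The sketch below does that by inspecting the `detail.error` field shown above. The helper name `parse_loop_error` is hypothetical, and it assumes you already have the status code and raw body from whatever HTTP client you use.

```python
import json


def parse_loop_error(status_code, body):
    """Return the loop-detection detail dict, or None for any other response."""
    if status_code != 429:
        return None
    try:
        detail = json.loads(body).get("detail")
    except (json.JSONDecodeError, AttributeError):
        return None  # not JSON, or not a JSON object
    if isinstance(detail, dict) and detail.get("error") == "recursive_loop_detected":
        return detail
    return None


body = ('{"detail": {"error": "recursive_loop_detected", '
        '"hit_count": 6, "cooldown_seconds": 30}}')
detail = parse_loop_error(429, body)
print(detail["cooldown_seconds"])  # 30
```

Checking the error code rather than the status alone avoids treating every 429 as a loop.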

Handling in the AISG SDK

Python — AISG SDK
from aisg import AISG
from aisg.exceptions import LoopDetectedError

client = AISG()
try:
    response = client.chat.completions.create(
        model="oah/llama-4-maverick",
        messages=[{"role": "user", "content": "Hello"}],
    )
except LoopDetectedError as e:
    print(f"Loop detected: {e.hit_count} identical requests")
    print(f"Cooldown: {e.cooldown_seconds}s")
    print(f"Fingerprint: {e.fingerprint}")
    # Implement backoff or alert your team
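
One way to implement the backoff mentioned in that comment is a bounded wrapper that waits out the reported cooldown before trying again. The sketch below is an assumption-laden illustration: `call_with_cooldown` is a hypothetical helper, and the `LoopDetectedError` class here is a local stand-in with the same three attributes so the example runs without the SDK. In real code you would import the exception from `aisg.exceptions` instead.

```python
import time


class LoopDetectedError(Exception):
    """Local stand-in for aisg.exceptions.LoopDetectedError (illustration only)."""

    def __init__(self, hit_count, cooldown_seconds, fingerprint):
        super().__init__(f"loop detected after {hit_count} identical requests")
        self.hit_count = hit_count
        self.cooldown_seconds = cooldown_seconds
        self.fingerprint = fingerprint


def call_with_cooldown(send, max_attempts=3, sleep=time.sleep):
    """Call send(); on a loop error, wait out the cooldown, then try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except LoopDetectedError as e:
            if attempt == max_attempts:
                raise  # give up so the caller can alert or change the prompt
            sleep(e.cooldown_seconds)
```

Usage would look like `call_with_cooldown(lambda: client.chat.completions.create(...))`. Keeping `max_attempts` small matters: an unbounded wrapper would simply become another retry loop.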

Configuration

Loop detection parameters are configurable via the global config:

Parameter | Default | Description
LOOP_GUARD_WINDOW_SECONDS | 60 | Sliding window duration
LOOP_GUARD_MAX_HITS | 5 | Identical requests before blocking
LOOP_GUARD_COOLDOWN_SECONDS | 30 | Cooldown after detection
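
If these parameters are supplied as environment variables (an assumption; this page only says "global config"), loading them with the documented defaults might look like the following sketch. The `LoopGuardConfig` class is hypothetical and not part of AISG.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopGuardConfig:
    window_seconds: int = 60
    max_hits: int = 5
    cooldown_seconds: int = 30

    @classmethod
    def from_env(cls, env=None):
        # Assumption: settings arrive as environment variables with these names.
        env = os.environ if env is None else env
        return cls(
            window_seconds=int(env.get("LOOP_GUARD_WINDOW_SECONDS", 60)),
            max_hits=int(env.get("LOOP_GUARD_MAX_HITS", 5)),
            cooldown_seconds=int(env.get("LOOP_GUARD_COOLDOWN_SECONDS", 30)),
        )


print(LoopGuardConfig.from_env({"LOOP_GUARD_MAX_HITS": "50"}))
```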

What Makes Requests “Identical”

The fingerprint includes the API key prefix, model name, and the last 3 messages (content only, case-insensitive). This means:

  • Any change to the last 3 messages that survives normalization produces a different fingerprint with its own counter (case differences alone do not count)
  • Different models get independent counters
  • Different API keys get independent counters
  • Normal conversational usage with changing messages is never affected
  • Repeating the exact same prompt 6+ times in 60 seconds triggers protection

Batch Processing & Test Suites

If you're running legitimate high-frequency identical requests (batch evaluation jobs, automated test suites, benchmark runs), increase the threshold before starting. Set LOOP_GUARD_MAX_HITS to a value above your expected batch size, or temporarily set LOOP_GUARD_MAX_HITS=0 to disable detection for that project.

Common scenarios where you may need to adjust:

  • Batch evaluation with a fixed template prompt across many inputs
  • CI/CD test suites sending the same test prompt repeatedly
  • Load testing or benchmarking with identical payloads
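
The total batch size is a safe upper bound for the threshold. When a batch is spread over time, what the sliding window actually sees is the number of identical requests that land within one window. A rough estimate under that assumption is sketched below; `suggested_max_hits` and its `headroom` factor are hypothetical, and bursty traffic will need more margin than a steady rate suggests.

```python
import math


def suggested_max_hits(identical_requests_per_second, window_seconds=60,
                       headroom=1.5):
    """Estimate a threshold from the peak rate of identical requests."""
    return math.ceil(identical_requests_per_second * window_seconds * headroom)


# A test suite sending the same prompt twice per second
print(suggested_max_hits(2))  # 180
```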

Webhook Integration

Loop detection events are available as webhook notifications. Subscribe to the loop.detected event to get real-time alerts when agent loops are blocked.

Webhook payload — loop.detected
{
  "event": "loop.detected",
  "timestamp": "2026-05-22T14:30:00Z",
  "project_id": "proj_abc123",
  "request_id": "req_def456",
  "data": {
    "fingerprint": "a3f7c2d1b8e4...",
    "hit_count": 6,
    "model": "oah/llama-4-maverick",
    "cooldown_seconds": 30
  }
}
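
A receiver for this event can be as small as a function that turns the payload into an alert line. The sketch below uses only the fields shown in the sample payload. The function name is hypothetical, and delivery details such as the endpoint, retries, and signature verification are not covered on this page, so they are left out.

```python
import json
from typing import Optional


def format_loop_alert(raw_body: str) -> Optional[str]:
    """Build a one-line alert from a loop.detected payload, else return None."""
    event = json.loads(raw_body)
    if event.get("event") != "loop.detected":
        return None  # ignore other event types
    data = event.get("data", {})
    return (
        f"[{event.get('timestamp')}] Loop blocked in {event.get('project_id')}: "
        f"{data.get('hit_count')} identical requests to {data.get('model')} "
        f"(fingerprint {data.get('fingerprint')}, "
        f"cooldown {data.get('cooldown_seconds')}s)"
    )
```

The returned string can be forwarded to whatever alerting channel your team already uses.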

Related Documentation