Recursive Loop Protection
No other API proxy blocks agent loops at the gateway layer — most frameworks require you to instrument this in your application code. AISG detects and kills infinite retry loops automatically, before they reach any provider.
Autonomous AI agents can get stuck sending the same request hundreds of times, draining your budget in minutes. Gateway-level detection works across all frameworks, languages, and agent architectures with zero code changes.
The Problem
Agent frameworks like LangChain, CrewAI, AutoGPT, and custom agent loops share a common failure mode: when the model returns an unexpected response, the agent retries with the same prompt. If the model keeps returning the same response, the agent keeps retrying — indefinitely.
Real-world impact: A single misconfigured agent can send thousands of identical requests in minutes. At $3/1M input tokens for GPT-4.1, a loop sending 1,000-token prompts 10 times per second burns through $108 in an hour. Cost scales linearly with prompt length, so an agent that resends a long conversation history on every retry is proportionally more expensive.
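The arithmetic behind that figure can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the helper name `loop_cost_usd` is made up for illustration, and it counts input tokens only, ignoring output tokens and any provider-side rate limits.

```python
def loop_cost_usd(prompt_tokens: int, requests_per_second: float,
                  seconds: float, usd_per_million_input_tokens: float) -> float:
    """Input-token cost of a loop that resends the same prompt at a fixed rate."""
    total_tokens = prompt_tokens * requests_per_second * seconds
    return total_tokens / 1_000_000 * usd_per_million_input_tokens


# Figures from the text: 1,000-token prompts, 10 requests/s, one hour, $3 per 1M tokens.
hourly = loop_cost_usd(1_000, 10, 3_600, 3.0)
print(f"${hourly:.2f} per hour")  # $108.00 per hour
```

The same loop with 8,000-token prompts would cost eight times as much over the same hour.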
How It Works
Fingerprint
Every request is fingerprinted using SHA-256 of: API key prefix + model name + last 3 message contents (case-insensitive, normalized). This creates a unique signature for each distinct request pattern.
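A minimal sketch of how such a fingerprint could be computed is shown below. It is not AISG's actual implementation: the prefix length (`KEY_PREFIX_LEN`), the normalization (lowercasing and collapsing whitespace), and the newline separator are all assumptions chosen for illustration.

```python
import hashlib

KEY_PREFIX_LEN = 8  # assumption: the real prefix length is not documented


def _normalize(text: str) -> str:
    # Assumed normalization: lowercase and collapse runs of whitespace.
    return " ".join(text.lower().split())


def fingerprint(api_key: str, model: str, messages: list[dict]) -> str:
    """SHA-256 over key prefix + model + contents of the last 3 messages."""
    parts = [api_key[:KEY_PREFIX_LEN], model]
    # Only message content is hashed; roles and older messages are ignored.
    parts += [_normalize(str(m.get("content", ""))) for m in messages[-3:]]
    # Newlines cannot occur in normalized content, so they are a safe separator.
    payload = "\n".join(parts)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Under these assumptions, two requests that differ only in letter case or spacing hash to the same value, while a different model or key prefix yields a different fingerprint.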
Count
A sliding window counter tracks how many times each fingerprint has been seen within the detection window (default: 60 seconds).
Block
When a fingerprint exceeds the threshold (default: 5 identical requests), the request is blocked with HTTP 429 and a clear error message.
Cool down
After a block, the same fingerprint is rejected for a 30-second cooldown. This gives agents time to recover and operators time to intervene.
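The count, block, and cool-down steps can be sketched as a small in-memory guard. This is an illustration under stated assumptions, not the gateway's code: the class name `LoopGuard` is hypothetical, requests rejected during a cooldown are assumed not to count as new hits, and a production version would need shared storage and eviction of idle fingerprints.

```python
import time
from collections import defaultdict, deque


class LoopGuard:
    def __init__(self, window_seconds=60.0, max_hits=5, cooldown_seconds=30.0,
                 clock=time.monotonic):
        self.window = window_seconds
        self.max_hits = max_hits
        self.cooldown = cooldown_seconds
        self.clock = clock                  # injectable so the guard can be tested
        self._hits = defaultdict(deque)     # fingerprint -> timestamps in the window
        self._blocked_until = {}            # fingerprint -> end of cooldown

    def check(self, fp: str) -> tuple[bool, int]:
        """Return (allowed, hit_count) for one incoming request."""
        now = self.clock()
        hits = self._hits[fp]
        # Drop timestamps that have slid out of the detection window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        # Reject outright while the fingerprint is cooling down.
        if now < self._blocked_until.get(fp, float("-inf")):
            return False, len(hits)
        hits.append(now)
        # Exceeding the threshold blocks this request and starts a cooldown.
        if len(hits) > self.max_hits:
            self._blocked_until[fp] = now + self.cooldown
            return False, len(hits)
        return True, len(hits)
```

With the defaults, the first five identical requests inside a window pass and the sixth is rejected with a hit count of 6, matching the error payload the client receives.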
What the Client Sees
When a loop is detected, the client receives a clear, actionable error:
{
  "detail": {
    "error": "recursive_loop_detected",
    "message": "Blocked: identical request sent 6 times in 60 seconds. This usually indicates an agent retry loop, infinite recursion, or misconfigured automation.",
    "fingerprint": "a3f7c2d1b8e4...",
    "hit_count": 6,
    "cooldown_seconds": 30
  }
}
Handling in the AISG SDK
from aisg import AISG
from aisg.exceptions import LoopDetectedError

client = AISG()

try:
    response = client.chat.completions.create(
        model="oah/llama-4-maverick",
        messages=[{"role": "user", "content": "Hello"}],
    )
except LoopDetectedError as e:
    print(f"Loop detected: {e.hit_count} identical requests")
    print(f"Cooldown: {e.cooldown_seconds}s")
    print(f"Fingerprint: {e.fingerprint}")
    # Implement backoff or alert your team
Configuration
Loop detection parameters are configurable via the global config:
| Parameter | Default | Description |
|---|---|---|
| LOOP_GUARD_WINDOW_SECONDS | 60 | Sliding window duration |
| LOOP_GUARD_MAX_HITS | 5 | Identical requests before blocking |
| LOOP_GUARD_COOLDOWN_SECONDS | 30 | Cooldown after detection |
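How these parameters are supplied depends on your deployment. As one possible approach, the sketch below reads them from environment variables and falls back to the documented defaults. Using environment variables at all is an assumption here, and `load_loop_guard_config` is a hypothetical helper, not part of AISG.

```python
import os

# Documented defaults from the table above.
DEFAULTS = {
    "LOOP_GUARD_WINDOW_SECONDS": 60,
    "LOOP_GUARD_MAX_HITS": 5,
    "LOOP_GUARD_COOLDOWN_SECONDS": 30,
}


def load_loop_guard_config(env=None) -> dict[str, int]:
    """Read loop guard settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    config = {}
    for name, default in DEFAULTS.items():
        value = int(env.get(name, default))
        if value < 0:
            raise ValueError(f"{name} must be >= 0, got {value}")
        config[name] = value
    return config
```

Passing a plain dict instead of `os.environ` makes it easy to check overrides in a test without touching the process environment.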
What Makes Requests “Identical”
The fingerprint includes the API key prefix, model name, and the last 3 messages (content only, case-insensitive). This means:
- ✓ Changing the content of any of the last 3 messages (beyond letter case) produces a new fingerprint with its own counter
- ✓ Different models get independent counters
- ✓ Different API keys get independent counters
- ✓ Normal conversational usage with changing messages is never affected
- ✗ Repeating the exact same prompt 6+ times in 60 seconds triggers protection
Batch Processing & Test Suites
If you're running legitimate high-frequency identical requests (batch evaluation jobs, automated test suites, benchmark runs), increase the threshold before starting. Set LOOP_GUARD_MAX_HITS to a value above your expected batch size, or temporarily set LOOP_GUARD_MAX_HITS=0 to disable detection for that project.
Common scenarios where you may need to adjust:
- ⚠ Batch evaluation with a fixed template prompt across many inputs
- ⚠ CI/CD test suites sending the same test prompt repeatedly
- ⚠ Load testing or benchmarking with identical payloads
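Before starting such a job, a quick pre-flight check can tell you whether the batch would trip the guard. The sketch below is a conservative estimate under stated assumptions: every request carries a single message, the whole batch lands inside one detection window, and normalization means lowercasing plus whitespace collapsing. Both function names are hypothetical.

```python
from collections import Counter


def max_identical_requests(prompts: list[str]) -> int:
    """Largest number of times any single prompt appears, ignoring case and spacing."""
    counts = Counter(" ".join(p.lower().split()) for p in prompts)
    return max(counts.values(), default=0)


def needs_threshold_change(prompts: list[str], max_hits: int = 5) -> bool:
    """True if the batch would exceed max_hits in the worst case."""
    if max_hits == 0:  # 0 disables detection, per the text above
        return False
    return max_identical_requests(prompts) > max_hits


batch = ["Score this answer"] * 8 + ["Summarize the report"]
print(needs_threshold_change(batch))  # True
```

If the check returns True, raise LOOP_GUARD_MAX_HITS above the reported maximum before launching the job.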
Webhook Integration
Loop detection events are available as webhook notifications. Subscribe to the loop.detected event to get real-time alerts when agent loops are blocked.
{
  "event": "loop.detected",
  "timestamp": "2026-05-22T14:30:00Z",
  "project_id": "proj_abc123",
  "request_id": "req_def456",
  "data": {
    "fingerprint": "a3f7c2d1b8e4...",
    "hit_count": 6,
    "model": "oah/llama-4-maverick",
    "cooldown_seconds": 30
  }
}
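On the receiving side, a webhook consumer mostly needs to parse this payload and route it to an alerting channel. The following sketch turns a raw request body into a one-line alert. It assumes only the fields shown in the example payload; the function name `summarize_loop_event` is hypothetical, and signature verification and HTTP handling are left out because they are not described here.

```python
import json


def summarize_loop_event(raw_body: str):
    """Return a one-line alert for a loop.detected payload, or None for other events."""
    payload = json.loads(raw_body)
    if payload.get("event") != "loop.detected":
        return None
    data = payload.get("data", {})
    return (
        f"[{payload.get('timestamp')}] project {payload.get('project_id')}: "
        f"{data.get('hit_count')} identical requests to {data.get('model')} blocked "
        f"(cooldown {data.get('cooldown_seconds')}s, fingerprint {data.get('fingerprint')})"
    )
```

The returned string can be forwarded to whatever channel your team already monitors, such as a chat room or a paging system.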