HardRails AI — Hard Rails for Autonomous Agents

The Problem

Agents are shipping. Controls aren't.

Every AI platform is deploying autonomous agents. Those agents make API calls, spend budget, touch production systems. There is no standard layer to ask: who authorized this? how much has been spent? what exactly happened?

// Incident Log — Uncontrolled Agent Actions Live Feed

02:14:38 UTC Critical Agent spent $4,200 on API calls in 8 min — no budget ceiling enforced

11:07:22 UTC Critical Autonomous action taken by unverified agent identity — no source attribution

15:44:01 UTC Warning Agent accessed prod database — action not in audit log, compliance flag raised

23:59:17 UTC Critical Prompt injection via external tool output — agent behavior deviated from policy

The Wedge

One environment variable. That's it.

HardRails is a local HTTP proxy that sits between your agent framework and the upstream LLM API. It intercepts, meters, and blocks before the API call leaves your server.

~/.bashrc

# Before — agent talks directly to OpenAI

export OPENAI_BASE_URL=https://api.openai.com/v1

# After — HardRails sits in the middle

export OPENAI_BASE_URL=http://localhost:4100/v1

Your Agent

LangChain / CrewAI / raw

HARDRAILS

intercept · meter · block

HARDRAILS

intercept · meter · block

OpenAI API

or any compatible endpoint

Why HardRails

Four enforcement primitives. One layer.

HardrailsAI wraps any agent framework. You define the rules — we enforce them deterministically, before the action executes.

01 / IDENTITY

Agent Identity

Cryptographic identity for every agent instance. Know exactly which agent took which action — across frameworks, clouds, and teams.

02 / BUDGET

Spend Control

Hard ceilings on token spend, API calls, and compute per agent, per session, per policy class. No surprises on your cloud bill.

03 / ACCESS

Action Gating

Declarative allow/deny policies evaluated before execution. Define what each agent is permitted to do — and enforce it at the boundary.

04 / AUDIT

Execution Audit

Tamper-proof, structured logs of every agent decision and action. Compliance-ready export. Human-reviewable reasoning traces.

Architecture

Deterministic pipeline. Zero ambiguity.

⬡

Extract

Parse intent & proposed action from agent output

◈

Detect

Evaluate against identity, budget & access policies

⊗

Gate

Allow, block, or require human approval before execution

◎

Audit

Write structured, immutable record — every decision logged

Under the Hood

Real code. Real enforcement.

Not a pitch deck — production Python running between your agents and the upstream API. Every intercept is deterministic. Every decision is logged.

proxy.py — OpenAI-compatible governance intercept

@app.post("/v1/chat/completions")
async def chat_completions(request: Request):
    body = await request.json()

    # Extract governance context from headers
    agent_id   = request.headers.get("X-Agent-Id", DEFAULT_AGENT_ID)
    session_id = request.headers.get("X-Session-Id")
    prompt_text = _extract_prompt(body.get("messages", []))

    # ── Governance intercept ─────────────────────────────
    decision = _gateway.intercept(
        agent_id     = agent_id,
        request_body = prompt_text,
        session_id   = session_id,
    )

    # Hard block — request never reaches the upstream API
    if decision and decision.verdict in ("KILL", "SOFT_KILL"):
        return _governance_error(decision)

    # ── Forward to upstream ──────────────────────────────
    # Agent only changed one env var. Everything else is invisible.
    return await _forward(body, upstream_headers, stream)

Zero code changes. Agents set OPENAI_BASE_URL=localhost:4100 — governance is invisible.

gateway.py — budget enforcement & kill switch

# Gate -2 — Resource Exhaustion (Circuit Breaker)

# Kill switch — admin can shut down any agent instantly
ks = self.ledger.check_budget_kill_switch(agent_id)
if ks:
    return _finalize(_kill(
        trigger = "BLOCK_ADMIN_KILL",
        reason  = f"Agent '{agent_id}' shut down via kill switch."
    ))

# Per-agent budget ceiling — hard block, not a warning
exceeded = self.ledger.check_budget_exceeded(agent_id)
if exceeded:
    return _finalize(_kill(
        trigger = "BLOCK_BUDGET_EXCEEDED",
        reason  = f"Agent consumed ${exceeded['current_usage_usd']:,.2f}
                   of ${exceeded['limit_usd']:,.2f} budget."
    ))

# Session budget — scoped per conversation
session_spend = self.ledger.get_session_spend(session_id)
if session_spend >= self.constitution.max_session_budget_usd:
    return _finalize(_kill(
        trigger = "FAIL_OUT_OF_FUNDS",
        reason  = f"Session consumed ${session_spend:,.2f}
                   of ${session_budget:,.2f} budget."
    ))

Three layers of budget enforcement. Kill switch → agent ceiling → session cap. All deterministic.

ledger.py — tamper-evident audit log

def _init_schema(self) -> None:
    """Create audit tables — every decision gets a permanent record."""
    with self._connect() as conn:
        conn.executescript("""
            CREATE TABLE IF NOT EXISTS audit_decisions (
                id                    INTEGER PRIMARY KEY AUTOINCREMENT,
                agent_id              TEXT NOT NULL,
                verdict               TEXT NOT NULL,
                reason                TEXT NOT NULL,
                kill_trigger          TEXT,
                governance_score      REAL,
                provenance_hash       TEXT NOT NULL,
                provenance_signature  TEXT,
                timestamp             TEXT NOT NULL,
                session_id            TEXT,
                cumulative_risk_score REAL,
                drift_report          TEXT,
                inserted_at           TEXT NOT NULL
                                      DEFAULT (datetime('now'))
            );

            CREATE INDEX IF NOT EXISTS idx_agent
                ON audit_decisions(agent_id);
            CREATE INDEX IF NOT EXISTS idx_verdict
                ON audit_decisions(verdict);
            CREATE INDEX IF NOT EXISTS idx_session
                ON audit_decisions(session_id);
        """)

Every intercept writes a signed, hashed audit record. Provenance hash + governance score = compliance-ready.

Pricing

Built to scale with your fleet.

Every plan includes the full governance proxy, budget enforcement, and kill switch. Choose the tier that fits your team.