DRAFT SPEC v0.1 — OPEN PROTOCOL
IRL.
INTENT RECORD LANGUAGE
A formal language for AI agents to declare intent before acting.
Not free-form prose. Not machine code. Something new.
MCP solved transport. IRL makes agents accountable.
01
WHY THIS EXISTS
CONTEXT
# April 24, 2026. PocketOS / Railway incident.
# Cursor (Claude Opus 4.6) deleted production DB in 9 seconds.
# The agent confessed: "I guessed instead of verifying."
#
# The root problem was NOT bad reasoning by the model.
# The root problem was: infrastructure let bad reasoning execute.
#
# Current stack:
#
# AGENT ──────────────────────→ INFRASTRUCTURE
# (probabilistic reasoning)      (deterministic execution)
#
# There is NOTHING in between that speaks both languages.
# Prompts are not contracts. Comments are not enforcement.
# MCP is transport, not trust.
#
# IRL proposes a new layer:
#
# AGENT → [INTENT RECORD] → [IRL ENGINE] → INFRASTRUCTURE
#
# The agent must declare what it wants to do, why, and what the
# consequences are — in a structured, verifiable, policy-evaluable
# format — BEFORE any action is executed.
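The layer in the diagram above can be sketched in a handful of lines: the engine is a pure function from an intent record to a verdict, and the executor runs nothing without an explicit allow. A minimal Rust sketch; every name here (the toy three-field `IntentRecord`, `evaluate`, the hard-coded rule) is illustrative, not part of the spec:

```rust
// Minimal sketch of the proposed layer: nothing executes without a verdict.
// All names and the single hard-coded rule are illustrative, not normative.
#[derive(Debug, PartialEq)]
enum Decision { Allow, Deny }

struct IntentRecord {
    operation: String,   // e.g. "delete"
    environment: String, // e.g. "production"
    reversible: bool,
}

// The engine speaks both languages: it reads the agent's declared intent
// and returns a deterministic verdict the infrastructure can enforce.
fn evaluate(ir: &IntentRecord) -> Decision {
    if ir.operation == "delete" && ir.environment == "production" && !ir.reversible {
        return Decision::Deny;
    }
    Decision::Allow
}

// Infrastructure only acts on an explicit Allow.
fn execute(ir: &IntentRecord) -> Result<(), String> {
    match evaluate(ir) {
        Decision::Allow => Ok(()), // hand off to the real executor here
        Decision::Deny => Err(format!("blocked: {} in {}", ir.operation, ir.environment)),
    }
}

fn main() {
    let ir = IntentRecord {
        operation: "delete".into(),
        environment: "production".into(),
        reversible: false,
    };
    println!("verdict: {:?}", evaluate(&ir));
}
```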
02
INTENT RECORD SCHEMA
SPEC
EXAMPLE — CRITICAL RISK (BLOCKED)
{
"irl_version": "0.1",
// WHO is acting
"agent": {
"id": "cursor-agent-01",
"model": "claude-opus-4-6",
"trust_level": "medium",
"session_id": "ses_8f3k2m"
},
// WHAT the agent wants to do
"operation": {
"type": "delete",
"target_resource": "volume:prod-db-main",
"target_environment": "production",
"scope": "permanent",
"estimated_rows_affected": -1 // -1 = unknown
},
// WHY the agent thinks this is necessary
"rationale": {
"stated_goal": "Fix credential mismatch in staging",
"assumed_safe": true,
"verified": false,
"alternatives_considered": []
},
// WHAT happens if this goes wrong
"consequences": {
"reversible": false,
"data_loss_risk": "total",
"affects_backups": true,
"downstream_services": ["billing", "api", "customers"],
"rollback_plan": false
},
// RISK score (computed, not self-reported)
"risk": {
"level": "critical",
"score": 97,
"computed_by": "irl-v0.1"
},
// IRL decision
"verdict": {
"decision": "DENY",
"reason": "production delete + no rollback + affects backups",
"policy_triggered": "POL-003: no-destructive-production",
"timestamp": "2026-04-24T15:03:21Z"
}
}
| FIELD | TYPE | REQ | DESCRIPTION |
| --- | --- | --- | --- |
| agent.id | string | required | Unique agent identifier. Used for trust-level lookup and rate limiting. |
| agent.trust_level | enum | required | low / medium / high / verified. Higher trust = fewer gates. |
| operation.type | enum | required | read / write / delete / execute / network / auth |
| operation.target_environment | enum | required | local / staging / production. Cross-env ops trigger escalated policy. |
| rationale.verified | bool | required | Did the agent verify its assumption before acting? false = risk +20 pts (matches the reference engine). |
| rationale.alternatives_considered | array | optional | Non-destructive alternatives the agent evaluated. Empty array = risk +15 pts (matches the reference engine). |
| consequences.reversible | bool | required | false + production = automatic GATE or DENY, depending on trust level. |
| consequences.rollback_plan | bool | required | false on destructive ops = risk escalation. The agent must provide a snapshot_id. |
| risk.level | enum | computed | Computed by the IRL engine, never self-reported. The agent cannot manipulate this. |
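The field table maps directly onto plain types. Below is a non-normative sketch of the record as Rust structs and enums (the field name `op_type` mirrors the reference implementation, since `type` is reserved in Rust; a real implementation would also add serde derives for JSON round-tripping). The `example_record` constructor reproduces the Section 02 record:

```rust
// Non-normative sketch of the v0.1 record as plain Rust types.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TrustLevel { Low, Medium, High, Verified }

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OperationType { Read, Write, Delete, Execute, Network, Auth }

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Environment { Local, Staging, Production }

pub struct Agent {
    pub id: String,
    pub model: String,
    pub trust_level: TrustLevel,
    pub session_id: String,
}

pub struct Operation {
    pub op_type: OperationType, // `type` is reserved in Rust
    pub target_resource: String,
    pub target_environment: Environment,
    pub scope: String,
    pub estimated_rows_affected: i64, // -1 when unknown
}

pub struct Rationale {
    pub stated_goal: String,
    pub assumed_safe: bool,
    pub verified: bool,
    pub alternatives_considered: Vec<String>,
}

pub struct Consequences {
    pub reversible: bool,
    pub data_loss_risk: String,
    pub affects_backups: bool,
    pub downstream_services: Vec<String>,
    pub rollback_plan: bool,
}

pub struct IntentRecord {
    pub irl_version: String,
    pub agent: Agent,
    pub operation: Operation,
    pub rationale: Rationale,
    pub consequences: Consequences,
    // risk and verdict are produced by the engine, never by the agent
}

// Mirrors the Section 02 example record.
pub fn example_record() -> IntentRecord {
    IntentRecord {
        irl_version: "0.1".into(),
        agent: Agent {
            id: "cursor-agent-01".into(),
            model: "claude-opus-4-6".into(),
            trust_level: TrustLevel::Medium,
            session_id: "ses_8f3k2m".into(),
        },
        operation: Operation {
            op_type: OperationType::Delete,
            target_resource: "volume:prod-db-main".into(),
            target_environment: Environment::Production,
            scope: "permanent".into(),
            estimated_rows_affected: -1,
        },
        rationale: Rationale {
            stated_goal: "Fix credential mismatch in staging".into(),
            assumed_safe: true,
            verified: false,
            alternatives_considered: Vec::new(),
        },
        consequences: Consequences {
            reversible: false,
            data_loss_risk: "total".into(),
            affects_backups: true,
            downstream_services: vec!["billing".into(), "api".into(), "customers".into()],
            rollback_plan: false,
        },
    }
}

fn main() {
    let ir = example_record();
    println!("delete requested on {}", ir.operation.target_resource);
}
```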
03
RISK MATRIX
POLICY
| LEVEL | CRITERIA | ACTION |
| --- | --- | --- |
| LOW | Read-only ops. Staging only. Reversible. Verified assumption. | AUTO-ALLOW |
| MEDIUM | Write ops in staging, or read-only in production. No data-loss risk. | LOG + ALLOW |
| HIGH | Write in production, or delete in staging. Reversible but impactful. | HUMAN GATE |
| CRITICAL | Delete in production. Irreversible. No rollback. Affects backups. | AUTO-DENY |
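Because each level maps to exactly one action, the policy core is a single exhaustive match. A direct, non-normative transcription of the matrix (enum names are illustrative):

```rust
// Direct transcription of the v0.1 risk matrix: one level, one action.
#[derive(Debug, PartialEq)]
enum RiskLevel { Low, Medium, High, Critical }

#[derive(Debug, PartialEq)]
enum Decision { AutoAllow, LogAllow, HumanGate, AutoDeny }

fn decide(level: RiskLevel) -> Decision {
    match level {
        RiskLevel::Low => Decision::AutoAllow,
        RiskLevel::Medium => Decision::LogAllow,
        RiskLevel::High => Decision::HumanGate,
        RiskLevel::Critical => Decision::AutoDeny,
    }
}

fn main() {
    println!("critical -> {:?}", decide(RiskLevel::Critical));
}
```

The exhaustive match means adding a fifth risk level is a compile error until every policy site handles it, which is the point of keeping this layer deterministic.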
04
IRL SIMULATOR
LIVE
// IRL OUTPUT
00:00:00  IRL ready. Paste an intent record and evaluate.
05
REFERENCE IMPLEMENTATION — RUST
CODE
// irl-core/src/lib.rs — deterministic risk engine
// Compiles to native binary AND WebAssembly (browser demo runs this exact code)

/// Deterministic risk scoring. No LLM. No probabilities.
/// Every point is a policy decision, not a guess.
pub fn compute_risk(ir: &IntentRecord) -> RiskAssessment {
    let mut score: u32 = 0;
    let mut reasons: Vec<String> = Vec::new();

    // Operation base score
    score += match ir.operation.op_type {
        OperationType::Read => 0,
        OperationType::Write => 20,
        OperationType::Execute => 30,
        OperationType::Delete => 50,
        // The schema also defines network and auth ops; these weights are
        // provisional in v0.1.
        OperationType::Network => 25,
        OperationType::Auth => 35,
    };
    if ir.operation.target_environment == Environment::Production {
        score += 30;
        reasons.push("production environment".into());
    }
    if !ir.consequences.reversible {
        score += 25;
        reasons.push("irreversible operation".into());
    }
    if !ir.rationale.verified {
        score += 20;
        reasons.push("assumption not verified".into());
    }
    if ir.rationale.alternatives_considered.is_empty() {
        score += 15;
        reasons.push("no alternatives considered".into());
    }
    if ir.consequences.affects_backups {
        score += 30;
        reasons.push("affects backup systems".into());
    }
    if ir.operation.op_type == OperationType::Delete && !ir.consequences.rollback_plan {
        score += 20;
        reasons.push("delete without rollback plan".into());
    }

    // Trust discount — verified agents earn reduced scrutiny
    score = score.saturating_sub(match ir.agent.trust_level {
        TrustLevel::Verified => 20,
        TrustLevel::High => 10,
        _ => 0,
    });

    let score = score.min(100) as u8;
    let level = match score {
        0..=24 => RiskLevel::Low,
        25..=49 => RiskLevel::Medium,
        50..=74 => RiskLevel::High,
        _ => RiskLevel::Critical,
    };
    RiskAssessment { score, level, reasons }
}
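As a sanity check, the additive rules can be replayed against the Section 02 record: delete (50) + production (30) + irreversible (25) + unverified (20) + no alternatives (15) + affects backups (30) + delete without rollback (20) = 190 raw points, which saturates at the 100-point cap and lands in Critical. (The published example reports 97, which suggests policy terms beyond the ones listed here.) The standalone sketch below re-implements only these rules, for illustration; it is not the irl-core library itself:

```rust
// Standalone replay of the additive rules above, for a worked example only.
struct Flags {
    base: u32,           // operation base score (delete = 50)
    production: bool,
    irreversible: bool,
    unverified: bool,
    no_alternatives: bool,
    affects_backups: bool,
    delete_without_rollback: bool,
    trust_discount: u32, // medium trust = 0
}

fn score(f: &Flags) -> (u8, &'static str) {
    let mut s = f.base;
    if f.production { s += 30; }
    if f.irreversible { s += 25; }
    if f.unverified { s += 20; }
    if f.no_alternatives { s += 15; }
    if f.affects_backups { s += 30; }
    if f.delete_without_rollback { s += 20; }
    let s = s.saturating_sub(f.trust_discount).min(100) as u8;
    let level = match s {
        0..=24 => "low",
        25..=49 => "medium",
        50..=74 => "high",
        _ => "critical",
    };
    (s, level)
}

fn main() {
    // The Section 02 record: every penalty fires, no trust discount.
    let f = Flags {
        base: 50,
        production: true,
        irreversible: true,
        unverified: true,
        no_alternatives: true,
        affects_backups: true,
        delete_without_rollback: true,
        trust_discount: 0,
    };
    let (s, level) = score(&f);
    println!("score = {s}, level = {level}"); // 190 raw points, capped at 100
}
```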
// irl-server/src/main.rs — Axum HTTP server, single binary, no runtime deps
#[axum::debug_handler]
async fn evaluate_handler(
    State(state): State<AppState>,
    Json(ir): Json<IntentRecord>,
) -> (StatusCode, Json<Value>) {
    let result = evaluate(ir); // deterministic — no network call, no LLM
    log_evaluation(&state.db, &result).await;
    if result.verdict.requires_human {
        send_gate_notification(&state, &result).await; // webhook → Telegram fallback
    }
    let status = match result.verdict.decision {
        Decision::Allow | Decision::LogAllow => StatusCode::OK, // 200
        Decision::Gate => StatusCode::ACCEPTED,                 // 202
        Decision::Deny => StatusCode::FORBIDDEN,                // 403
    };
    (status, Json(json!({ "decision": ..., "risk": ..., "verdict_id": ... })))
}
// cargo build --release → single binary, copy to any Linux VM and run
// cargo build --target wasm32-unknown-unknown --features wasm → browser demo
06
BUILD ROADMAP
PLAN
Schema + Risk Engine + API
- Pydantic schema for IntentRecord
- Risk scoring engine (deterministic rules)
- FastAPI endpoint /evaluate
- Immutable append-only log
- Deploy on Proxmox VM
Human Gate + Telegram + Dashboard
- Telegram bot to approve/deny in real time
- Timeout auto-deny (5 min silence = auto-deny)
- HTML dashboard for ongoing operations
- Circuit breaker by operation rate
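The timeout auto-deny item above needs nothing more than a channel and a deadline: the gate blocks waiting for a human verdict and treats silence as a deny. A std-only sketch (names and the timeout value are illustrative; the planned phase uses Telegram as the approval surface, and a dropped channel is treated the same as a timeout):

```rust
use std::sync::mpsc;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Verdict { Approved, Denied }

// Waits for a human verdict; silence past the deadline is an automatic deny.
fn human_gate(rx: mpsc::Receiver<Verdict>, deadline: Duration) -> Verdict {
    match rx.recv_timeout(deadline) {
        Ok(v) => v,
        Err(_) => Verdict::Denied, // timed out or approver went away -> auto-deny
    }
}

fn main() {
    // Nobody answers: the sender side is dropped without replying.
    let (tx, rx) = mpsc::channel::<Verdict>();
    drop(tx);
    println!("unanswered gate: {:?}", human_gate(rx, Duration::from_millis(50)));

    // A human approves before the deadline.
    let (tx, rx) = mpsc::channel();
    tx.send(Verdict::Approved).unwrap();
    println!("answered gate: {:?}", human_gate(rx, Duration::from_millis(50)));
}
```

Fail-closed is the design choice here: any error path, not only the timeout, resolves to Denied.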
MCP Server Wrapper
- IRL as MCP server
- Any MCP-compatible agent routes through IRL
- Tool schema enforcement — if not declared, does not exist
- Compatible with Claude, Cursor, and n8n agents
v1.0 — Production Ready
- Policy config file — change rules without recompiling
- Rollback snapshots API
- Agent identity via DID — signed intent records
- Audit dashboard (read-only web UI)
- v1.0 stable release + spec freeze
Support this project
SOLANA (SOL)
NC3zNzcx9gDMYWB2AQDTptbA66DJ4oWd7RBHgaJvEMC