Gatekeeper – open-source policy engine and sandbox for AI coding agents

2 pointsposted 6 hours ago
by gemini2026

3 Comments

gemini2026

6 hours ago

  AI agents (Claude Code, Cline, Aider, OpenClaw) execute real side effects — writing
  files, running shell commands, making network requests. Most security approaches
  evaluate each action in isolation against a blocklist. That misses the pattern that
  actually matters.

  Gatekeeper tracks behavioural state across the entire session. If an agent reads
  credentials, then ingests content from an untrusted source, and then attempts a network
  call — that combination triggers escalation to human review, even if each individual
  The action would normally be allowed. We call it the exfiltration trifecta:
  read_sensitive + ingested_untrusted + has_egress.

  OpenClaw is the tightest integration: Gatekeeper launches it as a managed child
  process inside an OS-native sandbox (macOS sandbox-exec, Linux unshare), generates
  its config automatically, and intercepts every tool call before it executes. One
  command: `gatekeeper run --agent openclaw --workspace /path/to/project`.

  Other things it does:
  - Policy-as-code: YAML rulepacks signed with Ed25519 (tamper-evident, auditable)
  - Approval flow: ASK decisions pause execution and wait for human approval in a UI
  - Append-only audit log with SHA-256 hash chain
  - Prompt injection scanner on tool call inputs/outputs (16 patterns, NFKC normalized)
  - Agent identity guard: blocks writes to CLAUDE.md, .cursorrules, system_prompt files
  - Claude Code, Cline, Aider, and Continue also supported via MCP or REST

  Honest limitations: operates at the execution boundary, not the cognitive layer. If
  An agent's context was poisoned before any tool call fires; Gatekeeper won't catch
  the injection — only its downstream consequences.