Guardian Checks
Guardian runs 7 deterministic checks on every tool invocation, in this order. The first failing check short-circuits the pipeline.
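The short-circuit behavior can be sketched as follows. This is a minimal illustration, not Guardian's actual code: the Decision type, the Check signature, and the two stand-in checks are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Decision:
    allowed: bool
    reason: Optional[str] = None  # populated only when the call is denied


# Hypothetical check signature: takes the invocation, returns a deny reason or None.
Check = Callable[[dict], Optional[str]]


def run_pipeline(invocation: dict, checks: Sequence[Check]) -> Decision:
    """Run checks in order and stop at the first one that returns a reason."""
    for check in checks:
        reason = check(invocation)
        if reason is not None:
            # Short-circuit: the remaining checks are never evaluated.
            return Decision(allowed=False, reason=reason)
    return Decision(allowed=True)


# Stand-in checks that record when they run, to make the short-circuit visible.
calls: List[str] = []


def always_deny(invocation: dict) -> Optional[str]:
    calls.append("first")
    return "tool_revoked: example"


def never_reached(invocation: dict) -> Optional[str]:
    calls.append("second")
    return None


result = run_pipeline({"tool_id": "demo"}, [always_deny, never_reached])
```

After this runs, only the first stand-in check has been called and result carries its deny reason.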
1. Tool revocation list
What it catches: known-bad tool IDs.
When a tool is found to be malicious, vulnerable, or deprecated, its ID is added to the revocation list. The list is small and lives entirely in Guardian's in-memory cache, so each lookup takes O(1) time.
CREATE TABLE revoked_tools (
tool_id TEXT PRIMARY KEY,
revoked_at TIMESTAMPTZ DEFAULT now(),
reason TEXT NOT NULL
);
Decision: if tool_id IN revoked_tools → deny with reason = "tool_revoked: <reason>".
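A minimal sketch of the lookup, assuming the cache is a plain dictionary keyed by tool_id that mirrors the rows of revoked_tools. The cache name, the function name, and the sample entry are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical in-memory cache: tool_id -> reason, loaded from revoked_tools.
revoked_cache: Dict[str, str] = {
    "tool_abc": "malicious payload found",
}


def check_revocation(tool_id: str) -> Optional[str]:
    """Return a deny reason if the tool is revoked, otherwise None."""
    reason = revoked_cache.get(tool_id)  # average-case O(1) dictionary lookup
    if reason is None:
        return None
    return f"tool_revoked: {reason}"
```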
2. Hash validation
What it catches: supply-chain tool tampering.
Every registered tool has a SHA-256 hash of its loaded code stored in tool_registry. At invocation time, Guardian re-hashes the live code and compares the result against the stored value.
A mismatch means the tool's source changed since it was registered — possibly due to a dependency upgrade replacing a function, a malicious package update, or local file tampering.
Decision: if hash(live_code) != registry.expected_hash → deny with reason = "hash_mismatch". Threat event TOOL_HASH_MISMATCH is logged.
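The comparison can be sketched as below. How Guardian obtains the bytes of a tool's loaded code is not described here, so this sketch simply assumes they are available as a bytes value. The function names and sample code are hypothetical.

```python
import hashlib
import hmac
from typing import Optional


def sha256_hex(code: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given code bytes."""
    return hashlib.sha256(code).hexdigest()


def check_hash(live_code: bytes, expected_hash: str) -> Optional[str]:
    """Return "hash_mismatch" if the live code differs from the registered hash."""
    actual = sha256_hex(live_code)
    # compare_digest is a constant-time comparison; a plain != would also work here.
    if not hmac.compare_digest(actual, expected_hash):
        return "hash_mismatch"
    return None


# Hypothetical tool source as registered, and a tampered copy of it.
registered = b"def fetch(url):\n    return url\n"
expected = sha256_hex(registered)
tampered = registered + b"# injected\n"
```

Here check_hash(registered, expected) passes, while the tampered copy is denied even though only a comment was appended.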
3. Capability boundary
What it catches: capability creep — an agent invoking a tool outside its task's authorized scope.
Every task carries a capability scope: an array of strings like ["read", "fetch:web", "summarize"]. Every tool declares its required capability: shell_exec requires shell:safe or shell:full; db_write requires db:write.
Decision: if tool.required_capability NOT IN task.capability_scope → deny with reason = "capability_boundary: missing <capability>".
This is the layer that stops the "give me capability X just for this one thing" sliding-scale failure mode. Scope is set at task creation and never widens during execution.
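A minimal sketch of the boundary check. Because shell_exec accepts either shell:safe or shell:full, this sketch assumes each tool declares a set of acceptable capabilities, any one of which is sufficient. The REQUIRED mapping and function name are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional

# Hypothetical declaration: tool name -> capabilities, any one of which suffices.
REQUIRED: Dict[str, FrozenSet[str]] = {
    "shell_exec": frozenset({"shell:safe", "shell:full"}),
    "db_write": frozenset({"db:write"}),
}


def check_capability(tool: str, scope: List[str]) -> Optional[str]:
    """Deny when none of the tool's acceptable capabilities is in the task scope."""
    acceptable = REQUIRED[tool]  # assumes the tool has a declared requirement
    if acceptable & set(scope):
        return None
    missing = " or ".join(sorted(acceptable))
    return f"capability_boundary: missing {missing}"


# Scope fixed at task creation; the check never adds to it.
scope = ["read", "fetch:web", "summarize"]
```

With this scope, db_write and shell_exec are both denied. A task created with shell:safe in its scope would be allowed to call shell_exec.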
4. Destructive pattern detection
What it catches: dangerous arguments — usually the result of an LLM splicing user-controlled text into a tool argument.
The check is a list of regex patterns applied to JSON-serialized args:
| Pattern | Why |
|---|---|
| rm\s+-rf\s+/ | Recursive root delete |
| (DROP|TRUNCATE)\s+TABLE | Destructive SQL |
| :(){:|:&};: | Fork bomb |
| curl.+(\|sh|\|bash) | Pipe-to-shell |
| 169\.254\.169\.254 | Cloud metadata service IP |
| chmod\s+777 | World-writable permission |
The pattern set is extensible via threat_rules (action = deny, match_field = args).
Decision: if any pattern matches → deny with reason = "destructive_pattern: <pattern>".
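The check can be sketched as follows, using a subset of the patterns listed in the table. This is an illustration under assumptions: the function name is hypothetical, matching is case-sensitive, and json.dumps stands in for whatever serializer Guardian actually uses.

```python
import json
import re
from typing import Any, Dict, Optional

# Subset of the documented patterns, written as Python raw strings.
PATTERNS = [
    r"rm\s+-rf\s+/",
    r"(DROP|TRUNCATE)\s+TABLE",
    r"curl.+(\|sh|\|bash)",
    r"169\.254\.169\.254",
    r"chmod\s+777",
]
COMPILED = [re.compile(p) for p in PATTERNS]


def check_destructive(args: Dict[str, Any]) -> Optional[str]:
    """Search the JSON-serialized args; deny on the first pattern that matches."""
    serialized = json.dumps(args)
    for pattern in COMPILED:
        if pattern.search(serialized):
            return f"destructive_pattern: {pattern.pattern}"
    return None
```

Serializing the whole argument object first means nested values are scanned too, so a dangerous string inside a list or sub-object is still caught.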
5. Sequence contracts
What it catches: out-of-order tool calls that violate stated contracts.
Some tool sequences have implicit ordering. Examples:
- delete_file should be preceded by read_file (so the agent has acknowledged what it's deleting)
- db_commit should be preceded by db_begin
- email_send should be preceded by email_draft (so HITL can review)
Sequence contracts are configured in the sequence_contracts table:
CREATE TABLE sequence_contracts (
contract_name TEXT PRIMARY KEY,
requires_prior TEXT NOT NULL, -- tool name that must precede
triggers_for TEXT NOT NULL, -- tool name that triggers the check
within_steps INTEGER NOT NULL DEFAULT 5
);
Decision: if triggers_for is invoked and requires_prior was not invoked within the last within_steps → deny with reason = "sequence_contract: <contract_name>".
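A minimal sketch of the contract check, assuming the task's history is an ordered list of prior tool names (oldest first) that excludes the call being evaluated. The dataclass mirrors the table columns. The function name, the history representation, and the contract name read_before_delete are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SequenceContract:
    contract_name: str
    requires_prior: str   # tool name that must precede
    triggers_for: str     # tool name that triggers the check
    within_steps: int = 5


def check_sequence(
    tool: str, history: List[str], contracts: List[SequenceContract]
) -> Optional[str]:
    """Deny if a triggering tool runs without its required predecessor in the window."""
    for contract in contracts:
        if contract.triggers_for != tool:
            continue
        # Only the most recent within_steps calls count toward the contract.
        recent = history[-contract.within_steps:] if contract.within_steps > 0 else []
        if contract.requires_prior not in recent:
            return f"sequence_contract: {contract.contract_name}"
    return None


contracts = [SequenceContract("read_before_delete", "read_file", "delete_file")]
```

Note that a read_file call made more than within_steps calls ago no longer satisfies the contract, so the agent has to read the file again before deleting it.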
6. Ed25519 signature verification
What it catches: unauthorized tool registration.
Tools aren't just hashed — they're signed. The Ed25519 private key lives in macOS Keychain (legionforge_tool_signer) and is injected as TOOL_SIGNING_PRIVATE_KEY env var only at signing time. The public key is in Guardian's config.
Every row in tool_registry has a signature column. Guardian uses the public key to verify that registry.signature is a valid Ed25519 signature over the tool's hash.
Decision: if signature verification fails → deny with reason = "signature_invalid".
This catches the case where someone with PostgreSQL write access inserts a malicious tool registration. They can write the row, but they can't produce a valid signature without the private key.
7. Adaptive threat rules
What it catches: novel patterns discovered post-deployment.
This is the extension point. The threat_rules table holds operator-defined rules that fire on:
- Tool name
- Argument content
- Code hash
- Combinations
Rules are hot-reloaded every 10 seconds, so a new rule goes live without redeploying anything.
INSERT INTO threat_rules (rule_name, match_pattern, match_field, action, reason)
VALUES ('block_external_drive_writes',
'/Volumes/(?!MAC_MINI_1TB)',
'args',
'deny',
'Writes to non-canonical external drives are blocked');
Decision: if any enabled rule matches and action = deny → deny with reason = "adaptive_rule: <rule_name>".
Rules with action = flag are logged as threat_events, but the call is allowed. This is useful for measuring how often a candidate rule would fire before turning it into a deny.
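The deny and flag behaviors can be sketched together. This is an illustration under assumptions: the ThreatRule dataclass, the enabled field, the evaluate_rules function, and the flag_tmp_writes rule are hypothetical, only match_field = args is handled, and Python's re module stands in for whatever regex engine Guardian uses.

```python
import json
import re
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ThreatRule:
    rule_name: str
    match_pattern: str
    match_field: str      # this sketch only handles "args"
    action: str           # "deny" or "flag"
    enabled: bool = True  # assumed column; not shown in the INSERT above


def evaluate_rules(
    args: Dict[str, Any], rules: List[ThreatRule]
) -> Tuple[Optional[str], List[str]]:
    """Return (deny_reason or None, names of flag rules that matched so far)."""
    serialized = json.dumps(args)
    flagged: List[str] = []
    for rule in rules:
        if not rule.enabled or rule.match_field != "args":
            continue
        if not re.search(rule.match_pattern, serialized):
            continue
        if rule.action == "deny":
            return f"adaptive_rule: {rule.rule_name}", flagged
        flagged.append(rule.rule_name)  # flag: record the match, allow the call
    return None, flagged


rules = [
    ThreatRule("block_external_drive_writes", r"/Volumes/(?!MAC_MINI_1TB)", "args", "deny"),
    ThreatRule("flag_tmp_writes", r"/tmp/", "args", "flag"),  # hypothetical candidate rule
]
```

A write under /Volumes/USB_STICK is denied, a write under /Volumes/MAC_MINI_1TB passes, and a write under /tmp is allowed but reported in the flagged list, which a caller would then log as a threat event.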
Performance characteristics
| Check | Typical latency |
|---|---|
| 1. Revocation list | ~0.05ms (in-memory lookup) |
| 2. Hash validation | ~0.5ms (SHA-256 of typical tool code) |
| 3. Capability boundary | ~0.05ms (set membership) |
| 4. Destructive patterns | ~0.5ms (regex pipeline) |
| 5. Sequence contracts | ~0.3ms (history query) |
| 6. Signature verification | ~1ms (Ed25519 verify) |
| 7. Adaptive rules | ~1-3ms (regex pipeline, depends on rule count) |
| Total | ~3-5ms in practice |
The total budget is small enough that Guardian sits in the hot path of every tool call without measurable user-visible impact.