TrapDefense Docs

Practical runtime security docs for real agent teams.

TrapDefense focuses on one narrow but important layer: controlling risky tool use, protecting sensitive outputs, and keeping audit trails for tool-using AI agents. This page is the fastest way to understand the runtime model without digging through internal design notes.

Guard tool calls before execution.
Redact PII in tool arguments and results.
Roll out safely with shadow, warn, and enforce.
Protect MCP handlers without building a proxy first.

Overview

What the SDK actually does.

TrapDefense does not try to be a full AI safety platform. It stays focused on runtime checks that matter when an agent is about to act or has just produced sensitive output.

Guard

Evaluate tool calls before they run.

Guard checks tool blocklists, egress rules, file path controls, PII in args, capability fallback, and default action handling.
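The order of checks described above can be sketched as follows. This is a minimal illustration, not the SDK's implementation: the `Policy` class, the `evaluate` function, and the `(action, reason)` return shape are all hypothetical, and only the setting names come from the policy reference. It assumes a call that passes every specific check falls through to `default_action`.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Policy:
    # Hypothetical container; field names mirror the documented settings.
    tool_blocklist: list = field(default_factory=list)
    block_egress: bool = False
    domain_allowlist: list = field(default_factory=list)
    capability_policy: dict = field(default_factory=dict)
    default_action: str = "warn"


def evaluate(policy, tool_name, args, capabilities=()):
    """Return (action, reason) for one tool call. Illustrative ordering only."""
    # 1. Blocklisted tools are rejected before any other check runs.
    if tool_name in policy.tool_blocklist:
        return "block", "tool_blocklist"

    # 2. Egress: every URL-looking string argument must target an allowed domain.
    if policy.block_egress:
        for value in args.values():
            if isinstance(value, str) and value.startswith(("http://", "https://")):
                host = urlparse(value).hostname or ""
                if host not in policy.domain_allowlist:
                    return "block", f"egress:{host}"

    # 3. Capability fallback for tools tagged with a capability.
    for cap in capabilities:
        if cap in policy.capability_policy:
            return policy.capability_policy[cap], f"capability:{cap}"

    # 4. Nothing specific matched, so the default action applies.
    return policy.default_action, "default_action"


policy = Policy(
    tool_blocklist=["rm_rf"],
    block_egress=True,
    domain_allowlist=["api.internal.com"],
    capability_policy={"shell_exec": "block"},
)
print(evaluate(policy, "http_get", {"url": "https://evil.example/x"}))
```

File path controls and PII checks are left out to keep the sketch short; they would slot in as further steps before the capability fallback.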

Audit

Record runtime evidence.

AuditLogger writes structured JSONL events for scanner findings, before_tool and after_tool decisions, redaction, and errors.
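To make the JSONL format concrete, here is a small sketch of writing and reading one-event-per-line records with the standard library. The `write_event` helper and the field names (`ts`, `event`, `tool`, `decision`) are assumptions for illustration; the actual AuditLogger schema is not documented on this page.

```python
import io
import json
import time


def write_event(stream, event_type, **fields):
    """Append one audit event as a single JSON line (hypothetical schema)."""
    record = {"ts": time.time(), "event": event_type, **fields}
    stream.write(json.dumps(record, sort_keys=True) + "\n")


# In-memory stream for the example; a real logger would append to a file.
log = io.StringIO()
write_event(log, "before_tool", tool="send_email", decision="block", reason="egress")
write_event(log, "after_tool", tool="read_file", decision="allow", redacted=True)

# Each line parses on its own, which is what makes JSONL easy to tail and grep.
events = [json.loads(line) for line in log.getvalue().splitlines()]
print(len(events), events[0]["event"])
```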

Scanner

Catch basic content injection signals.

Scanner is intentionally lightweight. As a first-pass filter, it detects hidden text, HTML comments, metadata payloads, base64-encoded instructions, and prompt injection keywords.
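A first-pass filter of this kind can be approximated with a few regular expressions. The sketch below is not the Scanner's actual rule set: the `scan` function, the keyword list, and the finding labels are hypothetical, and it covers only three of the listed signals (HTML comments, base64-encoded instructions, and keywords).

```python
import base64
import binascii
import re

HTML_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)
# Runs of 24+ base64 characters, optionally padded; shorter runs are ignored.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")
# Hypothetical keyword list, for illustration only.
KEYWORDS = ("ignore previous instructions", "disregard the above", "system prompt")


def scan(text):
    """Return a list of (kind, detail) findings for one piece of content."""
    findings = []
    for match in HTML_COMMENT.finditer(text):
        findings.append(("html_comment", match.group(1).strip()))
    for match in BASE64_RUN.finditer(text):
        try:
            decoded = base64.b64decode(match.group(0), validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # Not valid base64 text, so not treated as a payload.
        if any(keyword in decoded.lower() for keyword in KEYWORDS):
            findings.append(("base64_instruction", decoded))
    lowered = text.lower()
    for keyword in KEYWORDS:
        if keyword in lowered:
            findings.append(("keyword", keyword))
    return findings


payload = base64.b64encode(b"ignore previous instructions").decode()
print(scan("Hello <!-- hidden note --> please decode " + payload))
```

Pattern matching like this is easy to evade, which is consistent with treating the scanner as a first pass rather than a complete defense.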

Policy reference

Core Guard settings.

The policy model is intentionally small so teams can understand and tune it quickly during early rollout.

| Setting | Purpose | Example |
| --- | --- | --- |
| mode | Choose rollout behavior: shadow, warn, or enforce. | shadow, warn, enforce |
| domain_allowlist | Define allowed outbound URL or email destination domains. | ["api.internal.com"] |
| block_egress | Turn on outbound domain checks for URLs and email recipients. | true |
| file_path_allowlist | Limit tools to approved file system paths. | ["/tmp/safe"] |
| tool_blocklist | Hard-block named tools before all other policy checks. | ["rm_rf"] |
| pii_action | Control how PII in args or results is handled. | off, warn, block |
| capability_policy | Fallback policy for capability-tagged tools. | {"shell_exec": "block"} |
| default_action | Final fallback when no specific policy matched. | warn |
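As an illustration, the example values from the table can be collected into a single policy object, shown here as a plain Python dict. How the SDK actually loads a policy (dict, YAML, or a class) is not stated on this page, so the dict shape is an assumption. The `path_allowed` helper is likewise a hypothetical sketch of what a `file_path_allowlist` check might involve.

```python
import posixpath

# Hypothetical policy dict built from the example column of the table.
policy = {
    "mode": "shadow",
    "domain_allowlist": ["api.internal.com"],
    "block_egress": True,
    "file_path_allowlist": ["/tmp/safe"],
    "tool_blocklist": ["rm_rf"],
    "pii_action": "warn",
    "capability_policy": {"shell_exec": "block"},
    "default_action": "warn",
}


def path_allowed(path, allowlist):
    """True if an absolute POSIX path stays inside an allowlisted directory."""
    # normpath collapses ".." segments, so "/tmp/safe/../etc" is not accepted.
    # Symlinks are not resolved here; a real check would likely need that too.
    candidate = posixpath.normpath(path)
    if not posixpath.isabs(candidate):
        return False
    for root in allowlist:
        root = posixpath.normpath(root)
        if posixpath.commonpath([candidate, root]) == root:
            return True
    return False


print(path_allowed("/tmp/safe/report.txt", policy["file_path_allowlist"]))
print(path_allowed("/tmp/safe/../etc/passwd", policy["file_path_allowlist"]))
```

Comparing with `commonpath` rather than a string prefix avoids accepting sibling directories such as `/tmp/safe2`.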

Rollout note
shadow always allows execution but still records the original decision.
warn downgrades block to warn so teams can observe impact first.
after_tool redaction still applies even in shadow mode.
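
The three modes in the rollout note can be read as a mapping from Guard's original decision to the action that is actually enforced. The sketch below is a hypothetical `apply_mode` helper written from that description, not code from the SDK. It returns both values so the original decision stays available for auditing.

```python
def apply_mode(mode, decision):
    """Map an original decision to (effective_action, original_decision)."""
    if mode == "shadow":
        # Never interfere with execution, but keep the original decision.
        return "allow", decision
    if mode == "warn":
        # A block is downgraded to a warning; other decisions pass through.
        return ("warn" if decision == "block" else decision), decision
    if mode == "enforce":
        return decision, decision
    raise ValueError(f"unknown mode: {mode!r}")


for mode in ("shadow", "warn", "enforce"):
    print(mode, apply_mode(mode, "block"))
```

Redaction of tool results is a separate after_tool step and is not modeled here; per the note above, it applies regardless of mode.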

MCP guide

Protect MCP handlers in-process.

MCP is a strong starting point because the tool boundary is clear. TrapDefense keeps Guard synchronous and wraps only the MCP handler with a lightweight async decorator.

Install
pip install "agent-runtime-security[mcp]"

Then import the decorator:
from asr.mcp import mcp_guard

The MCP adapter is optional and stays out of the base dependency path.

Behavior
Maps blocked tool calls into MCP-compatible tool errors.
Records before and after decisions when audit is configured.
Returns redacted results when PII protection triggers.
Rejects sync handlers at decoration time.
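
The behaviors listed above can be illustrated with a self-contained decorator. This is not `mcp_guard` itself, whose signature is not shown on this page. `guard_handler`, `ToolBlocked`, the `check` callback, and the email-only redaction are all hypothetical stand-ins. The sketch shows a synchronous check wrapped by an async decorator that rejects sync handlers when it is applied.

```python
import asyncio
import functools
import inspect
import re

# Simplified email pattern, used here as the only PII rule (an assumption).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class ToolBlocked(Exception):
    """Hypothetical stand-in for an MCP-compatible tool error."""


def guard_handler(check):
    """`check(tool_name, kwargs)` is a sync callable returning "allow" or "block"."""
    def decorate(handler):
        # Fail fast: sync handlers are rejected when the decorator is applied.
        if not inspect.iscoroutinefunction(handler):
            raise TypeError(f"{handler.__name__} must be an async function")

        @functools.wraps(handler)
        async def wrapper(**kwargs):
            # before_tool: the synchronous policy check runs first.
            if check(handler.__name__, kwargs) == "block":
                raise ToolBlocked(f"{handler.__name__} blocked by policy")
            result = await handler(**kwargs)
            # after_tool: redact email addresses in string results.
            if isinstance(result, str):
                result = EMAIL.sub("[REDACTED_EMAIL]", result)
            return result

        return wrapper
    return decorate


def deny_rm(tool_name, kwargs):
    return "block" if tool_name == "rm_rf" else "allow"


@guard_handler(deny_rm)
async def lookup_owner(path):
    return f"owner of {path}: alice@example.com"


@guard_handler(deny_rm)
async def rm_rf(path):
    return "deleted"


print(asyncio.run(lookup_owner(path="/tmp/safe/a.txt")))
```

Checking `inspect.iscoroutinefunction` inside `decorate` means a misconfigured handler fails at import time rather than on its first call.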

Scope

What TrapDefense does and does not do.

A clear boundary makes the SDK easier to adopt and easier to trust.

Good fit
Internal assistants that can call APIs, send email, or read files.
MCP servers that need enforcement and audit logging quickly.
Teams rolling out runtime policies through shadow, warn, then enforce.
Security reviews that need structured evidence of decisions.

Not the goal
Full model safety, evaluation, or jailbreak benchmarking platforms.
Standalone traffic proxy infrastructure out of the box.
Centralized multi-team governance in the open-source SDK alone.
Every agent risk category across every framework on day one.

Next step

Need centralized policies and rollout support?

TrapDefense Enterprise is the next layer for teams that want shared policy workflows, audit operations, and onboarding support around agent runtime controls.