TrapDefense Docs
Public docs

Practical runtime security docs for real agent teams.

TrapDefense focuses on practical runtime defenses against content injection and behavioral-control risks in tool-using AI agents, with an emphasis on risky tool use, sensitive outputs, and audit evidence. This page is the fastest way to understand the runtime model without digging through internal design notes.

Guard tool calls before execution.
Redact PII in tool arguments and results.
Roll out safely with shadow, warn, and enforce.
Protect MCP, LangChain, and LangGraph workflows.
Guided start

Follow the path that matches your team.

Start with the HTTP quickstart if you want to evaluate the API quickly. Start with the MCP pack if you are protecting tool handlers in production. Then move on to the API reference library for endpoint, preset, and PII details.

Quickstart

Try `/scan`, `/decide`, and `/redact` in five minutes.

Copy three curl requests, inspect the planned request and response flow, and understand where the API fits before public hosted access opens.

MCP Pack

Protect an MCP server with a recommended policy bundle.

Use the `mcp-server` preset, regional PII profiles, rollout guidance, and integration examples built for tool-using agents.

API Library

Developer-facing API content, organized by decision point.

These pages are the practical reference layer behind TrapDefense: endpoint schemas, measured examples, policy presets, PII profiles, coverage boundaries, and minimal integration guidance.

Reference

API Reference

Field-level docs for `/scan`, `/decide`, `/redact`, auth, and current error behavior.

Examples

Use-case Cookbook

Measured request and response examples for MCP servers, internal assistants, support, and finance flows.

Policy

Preset Catalog

Understand the 17 presets by mode, egress posture, PII handling, and capability-level controls.

PII

PII Profile Catalog

See exactly what each regional or payment profile detects, redacts, and where false positives may show up.

Coverage

Coverage Matrix

Check what TrapDefense blocks, what it warns on, and what is still intentionally out of scope.

Integration

Integration Guide

Minimal Python and MCP insertion guidance for teams wiring the API around real tool handlers.

Overview

What the SDK actually does.

TrapDefense does not try to be a full AI safety platform. It stays focused on practical runtime defenses in the perception and action layers, where an agent is about to act or has just produced sensitive output.

Guard

Evaluate tool calls before they run.

Guard checks tool blocklists, egress rules, file path controls, PII in args, capability fallback, and default action handling.
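To make two of these checks concrete, here is a minimal sketch of an egress domain check and a file path check. It is an illustration using only the Python standard library, not the SDK's implementation; the function names `domain_allowed` and `path_allowed` are hypothetical, and exact-match domain comparison is an assumption.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def domain_allowed(destination: str, allowlist: list[str]) -> bool:
    """Return True if a URL or email address targets an allowlisted domain."""
    if "://" in destination:
        host = urlparse(destination).hostname or ""
    else:
        # Treat anything else as an email address and keep the part after "@".
        host = destination.rpartition("@")[2]
    # Assumption: exact, case-insensitive match (no subdomain wildcards).
    return host.lower() in {domain.lower() for domain in allowlist}


def path_allowed(path: str, allowlist: list[str]) -> bool:
    """Return True if path sits inside one of the approved directories."""
    candidate = PurePosixPath(path)
    if ".." in candidate.parts:
        # Refuse traversal segments instead of trying to resolve them.
        return False
    return any(candidate.is_relative_to(root) for root in allowlist)


print(domain_allowed("https://api.internal.com/v1/items", ["api.internal.com"]))
print(path_allowed("/tmp/safe/../etc/passwd", ["/tmp/safe"]))
```

The path check compares whole path components, so `/tmp/safer` does not pass as a child of `/tmp/safe`.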

Audit

Record runtime evidence.

AuditLogger writes structured JSONL events for scanner findings, before_tool and after_tool decisions, redaction, and errors.
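For readers unfamiliar with JSONL, the sketch below shows the general shape of such a log: one JSON object per line, appended as events occur. The helper `write_audit_event` and the field names (`ts`, `event`, `tool`, `decision`, `redacted`) are assumptions made for illustration and may not match the fields AuditLogger actually writes.

```python
import io
import json
from datetime import datetime, timezone
from typing import TextIO


def write_audit_event(stream: TextIO, event_type: str, **fields) -> dict:
    """Append one JSON object as a single line (JSONL) to an open text stream."""
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "event": event_type,
        **fields,
    }
    stream.write(json.dumps(event, sort_keys=True) + "\n")
    return event


# An in-memory stream keeps the example self-contained;
# a real logger would open a file in append mode instead.
log = io.StringIO()
write_audit_event(log, "before_tool", tool="send_email", decision="block")
write_audit_event(log, "after_tool", tool="send_email", redacted=True)

# Each line parses on its own, which is what makes JSONL easy to tail and filter.
events = [json.loads(line) for line in log.getvalue().splitlines()]
print(len(events))
```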

Scanner

Detect hidden prompt payloads and content injection signals.

Scanner is intentionally lightweight. As a first-pass layer, it detects hidden text, HTML comments, metadata payloads, base64-encoded instructions, and prompt injection keywords.
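A first-pass scan of this kind can be approximated with a few regular expressions. The sketch below covers three of the listed signals (HTML comments, injection keywords, and base64-encoded instructions). It is not the SDK's Scanner: the function names, finding labels, and keyword list are hypothetical and chosen only for illustration.

```python
import base64
import binascii
import re

HTML_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")
# Hypothetical keyword list, for illustration only.
KEYWORDS = ("ignore previous instructions", "system prompt")


def has_keyword(text: str) -> bool:
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def scan(text: str) -> list[str]:
    """Return the names of the signals found in text."""
    findings = []
    if HTML_COMMENT.search(text):
        findings.append("html_comment")
    if has_keyword(text):
        findings.append("injection_keyword")
    for run in BASE64_RUN.findall(text):
        try:
            decoded = base64.b64decode(run, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not decodable as base64 text, so skip it
        if has_keyword(decoded):
            findings.append("base64_instruction")
            break
    return findings


print(scan("Quarterly report <!-- ignore previous instructions -->"))
```

Keyword matching like this produces false positives and misses paraphrased attacks, which is why it suits a first-pass layer rather than a final decision.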

Policy reference

Core Guard settings.

The policy model is intentionally small so teams can understand and tune it quickly during early rollout.

| Setting | Purpose | Example |
| --- | --- | --- |
| mode | Choose rollout behavior: shadow, warn, or enforce. | shadow, warn, enforce |
| domain_allowlist | Define allowed outbound URL or email destination domains. | ["api.internal.com"] |
| block_egress | Turn outbound domain checks on for URLs and email recipients. | true |
| file_path_allowlist | Limit tools to approved file system paths. | ["/tmp/safe"] |
| tool_blocklist | Hard-block named tools before all other policy checks. | ["rm_rf"] |
| pii_action | Control how PII in args or results is handled. | off, warn, block |
| capability_policy | Fallback policy for capability-tagged tools. | {"shell_exec": "block"} |
| default_action | Final fallback when no specific policy matched. | warn |
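The settings imply a precedence: the tool blocklist is checked first, capability policy acts as a fallback, and the default action applies last. The sketch below illustrates that ordering with a plain dictionary. It is not the SDK's evaluation logic; the `decide` function is hypothetical, the egress, file path, and PII checks are omitted, and letting the strictest capability action win is an assumption.

```python
# Example values taken from the settings above.
policy = {
    "tool_blocklist": ["rm_rf"],
    "capability_policy": {"shell_exec": "block"},
    "default_action": "warn",
}

SEVERITY = {"allow": 0, "warn": 1, "block": 2}


def decide(tool_name: str, capabilities: list[str], policy: dict) -> str:
    """Return "allow", "warn", or "block" for one tool call."""
    # 1. The blocklist is checked before anything else.
    if tool_name in policy.get("tool_blocklist", []):
        return "block"
    # 2. Egress, file path, and PII checks would run here (omitted).
    # 3. Capability fallback. Assumption: the strictest matching action wins.
    capability_policy = policy.get("capability_policy", {})
    matched = [capability_policy[c] for c in capabilities if c in capability_policy]
    if matched:
        return max(matched, key=lambda action: SEVERITY[action])
    # 4. Nothing specific matched, so use the final fallback.
    return policy.get("default_action", "warn")


print(decide("rm_rf", [], policy))
print(decide("run_shell", ["shell_exec"], policy))
print(decide("search", [], policy))
```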
Rollout note
shadow always allows execution but keeps the original decision.
warn downgrades block to warn so teams can observe impact first.
after_tool redaction still applies even in shadow mode.
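The three modes can be read as a mapping from the policy's original decision to the action that is actually applied. Here is a minimal sketch of that mapping; the `Decision` class and `apply_mode` function are hypothetical names, not part of the SDK.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    original: str   # what the policy decided: "allow", "warn", or "block"
    effective: str  # what is actually applied to the tool call


def apply_mode(original: str, mode: str) -> Decision:
    """Translate a policy decision into the action applied under a rollout mode."""
    if mode not in ("shadow", "warn", "enforce"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "shadow":
        # Always allow execution, but keep the original decision for review.
        return Decision(original, "allow")
    if mode == "warn" and original == "block":
        # Downgrade block to warn so impact can be observed first.
        return Decision(original, "warn")
    return Decision(original, original)


for mode in ("shadow", "warn", "enforce"):
    print(mode, apply_mode("block", mode))
```

Keeping both fields means a shadow-mode rollout still shows which calls would have been blocked under enforce.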
MCP guide

Protect MCP handlers in-process.

MCP is a strong starting point because the tool boundary is clear. TrapDefense keeps Guard synchronous and wraps only the MCP handler with a lightweight async decorator.

Install
pip install agent-runtime-security[mcp]

Start with @guard.tool() for normal Python tools. Use mcp_guard only when you specifically need MCP-native ToolError conversion during the current transition window.

Behavior
Maps blocked tool calls into MCP-compatible tool errors.
Records before and after decisions when audit is configured.
Returns redacted results when PII protection triggers.
Rejects sync handlers at decoration time.
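The behaviors above can be sketched as a small async decorator around a synchronous check. This is an illustration of the pattern only, not the SDK's decorator: `guarded`, `ToolBlocked`, `check_call`, and the email regex are all hypothetical, and a real integration would raise the MCP library's own tool error type.

```python
import asyncio
import functools
import inspect
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class ToolBlocked(Exception):
    """Stand-in for an MCP-compatible tool error (hypothetical name)."""


def guarded(check):
    """Wrap an async handler with a synchronous check that runs first."""
    def decorator(handler):
        if not inspect.iscoroutinefunction(handler):
            # Fail at decoration time rather than on the first call.
            raise TypeError(f"{handler.__name__} must be an async function")

        @functools.wraps(handler)
        async def wrapper(**kwargs):  # keyword arguments only, for simplicity
            if check(handler.__name__, kwargs) == "block":
                raise ToolBlocked(f"{handler.__name__} blocked by policy")
            result = await handler(**kwargs)
            # After the tool runs, redact email addresses in string results.
            if isinstance(result, str):
                result = EMAIL.sub("[REDACTED_EMAIL]", result)
            return result

        return wrapper
    return decorator


def check_call(tool_name: str, args: dict) -> str:
    """Hypothetical synchronous check: block one named tool, allow the rest."""
    return "block" if tool_name == "delete_all" else "allow"


@guarded(check_call)
async def lookup_contact(name: str) -> str:
    return f"{name}: ada@example.com"


@guarded(check_call)
async def delete_all() -> str:
    return "done"


print(asyncio.run(lookup_contact(name="Ada")))
```

Because the check itself is synchronous, only the thin wrapper needs to be async, which mirrors the design described for the MCP integration.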
Framework adapters

Guard LangChain and LangGraph tools.

The same Guard engine that protects MCP handlers also works with LangChain tools and LangGraph ToolNodes, using lightweight adapters that keep the integration surface minimal.

LangChain
pip install agent-runtime-security[langchain]

from asr import Guard
from asr.adapters.langchain import guard_tool

guard = Guard.from_policy_file("policy.yaml")

protected = guard_tool(
    my_tool, guard=guard,
    capabilities=["network_send"],
)

guard_tool() wraps any LangChain BaseTool with before_tool policy checks and after_tool PII redaction. Blocked calls raise a ToolException, which the agent can handle like any other tool error.

LangGraph
pip install agent-runtime-security[langgraph]

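from asr import Guard
from asr.adapters.langgraph import (
    create_guarded_tool_node,
)

# Load the policy once and share the Guard across tools.
guard = Guard.from_policy_file("policy.yaml")

# search and file_reader are your existing tools.
tool_node = create_guarded_tool_node(
    tools=[search, file_reader],
    guard=guard,
    capabilities_map={
        "search": ["network_send"],
    },
)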

create_guarded_tool_node() wraps every tool inside a ToolNode with Guard policies. Plug it into any LangGraph state graph as a drop-in replacement.

Scope

What TrapDefense does and does not do.

A clear boundary makes the SDK easier to adopt and easier to trust.

Good fit
Internal assistants that can call APIs, send email, or read files.
MCP servers that need enforcement and audit logging quickly.
Teams rolling out runtime policies through shadow, warn, then enforce.
Security reviews that need structured evidence of decisions.
Not the goal
Full model safety, evaluation, or jailbreak benchmarking platforms.
Standalone traffic proxy infrastructure out of the box.
Centralized multi-team governance in the open-source SDK alone.
Every agent risk category across every framework on day one.
Next step

Need centralized policies and rollout support?

TrapDefense Enterprise is the next layer for teams that want shared policy workflows, audit operations, and onboarding support around agent runtime controls.