Coverage Matrix

What TrapDefense blocks, what it warns on, and what it does not try to solve yet.

Coverage by Threat Type

Action Layer — strongest coverage

| Threat | Response | How |
| --- | --- | --- |
| Unauthorized outbound data transfer | Block | /decide egress policy with domain_allowlist |
| High-risk tools such as shell, eval, or destructive DB actions | Block | /decide tool blocklist |
| Unauthorized file access | Block | /decide file path allowlist |
| Sensitive capabilities such as credential access or admin actions | Block/Warn | /decide capability policy |
| Tool arguments containing PII | Block/Warn | /decide PII detection |
| Tool results containing PII | Redact | /redact output redaction |
| Webhook-based exfiltration | Detect | /scan webhook_exfil |
| Channel-based exfiltration to Discord, Telegram, Gist, cloud upload URLs, or presigned URLs | Detect | /scan premium exfil patterns when sensitive exfil context is present |
| Credential harvesting phrases | Detect | /scan credential_harvest |
| Secret bundle or secret-file references | Detect | /scan credential_bundle_dump, env_secret_reference |
| Privilege-escalation phrases | Detect | /scan privilege_escalation |
| Consent bypass and bulk archive export prompts | Detect | /scan consent_bypass_phrase, bulk_archive_export |
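
To make the egress row concrete, here is a minimal Python sketch of how a domain allowlist check can work in principle. It is not TrapDefense's implementation: the function name `is_egress_allowed` is hypothetical, and the choice to accept subdomains of an allowlisted domain is an assumption that the real `domain_allowlist` policy may not share.

```python
from urllib.parse import urlsplit


def is_egress_allowed(url: str, domain_allowlist: list[str]) -> bool:
    """Return True only if the URL's host is an allowlisted domain or a subdomain of one.

    Hypothetical helper for illustration; subdomain matching is an assumption.
    """
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    if not host:
        return False  # no parseable host: fail closed
    for domain in domain_allowlist:
        domain = domain.lower().strip(".")
        # Match on a label boundary so "evil-example.com" does not pass for "example.com".
        if host == domain or host.endswith("." + domain):
            return True
    return False
```

Matching on a label boundary rather than a plain substring is the important design choice here: a substring check would let `example.com.attacker.net` through.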

Perception Layer — strong coverage

| Threat | Response | How |
| --- | --- | --- |
| Injection hidden in CSS-hidden text | Detect | /scan css_hidden_text |
| Injection hidden in HTML comments | Detect | /scan html_comment_injection |
| Injection hidden in metadata | Detect | /scan metadata_injection |
| Injection hidden in markdown links | Detect | /scan markdown_link_payload |
| Prompt-injection keywords | Detect | /scan prompt_injection_keywords |
| Base64-encoded instructions | Detect | /scan base64_encoded_instruction |
| Invisible Unicode manipulation | Detect | /scan invisible_unicode |
| Role override attempts | Detect | /scan role_override_attempt |
| Exfiltration prompts | Detect | /scan data_exfil_phrase |
| Encoding-bypass attempts | Detect | /scan encoded_bypass |
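
As an illustration of what an invisible-Unicode check can look for, the sketch below flags characters in the Unicode "Cf" (format) category, which includes zero-width spaces, joiners, and bidirectional overrides. This is an assumption about one reasonable approach, not a description of how the `invisible_unicode` pattern is actually built; `find_invisible_chars` is a hypothetical name.

```python
import unicodedata


def find_invisible_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for each format-category (Cf) character in text."""
    hits = []
    for i, ch in enumerate(text):
        if unicodedata.category(ch) == "Cf":
            # Fall back to the code point when the character has no assigned name.
            hits.append((i, unicodedata.name(ch, f"U+{ord(ch):04X}")))
    return hits
```

A check this broad will also flag legitimate uses, such as the zero-width joiner inside emoji sequences, so a production scanner would likely need context rules on top of it.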

Injection Defense — conditional coverage

| Threat | Response | How | Constraint |
| --- | --- | --- | --- |
| SQL injection patterns | Detect | /scan sql_injection | Requires stronger injection indicators, not generic DML |
| NoSQL injection patterns | Detect | /scan nosql_injection | Limited to JSON-object style contexts |
| Shell command injection | Detect | /scan command_injection | |
| Directory traversal | Detect | /scan path_traversal | |
| SSRF | Detect | /scan ssrf_attempt | Focused on metadata endpoints and internal services |
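
The SQL constraint above ("stronger injection indicators, not generic DML") can be illustrated with a small Python sketch. The three patterns and the name `looks_like_sql_injection` are hypothetical and chosen only to show the idea: ordinary statements pass, while quote break-outs with tautologies, `UNION SELECT`, and stacked statements ending in a comment are flagged. The actual `sql_injection` rules are not documented here and may differ.

```python
import re

# Hypothetical "strong indicator" patterns; not TrapDefense's actual rule set.
STRONG_SQLI_PATTERNS = [
    # Quote break-out followed by a numeric tautology, e.g. ' OR 1=1
    re.compile(r"'\s*or\s+'?\d+'?\s*=\s*'?\d+", re.IGNORECASE),
    # UNION-based extraction
    re.compile(r"\bunion\s+(all\s+)?select\b", re.IGNORECASE),
    # Stacked statement followed by a trailing SQL comment
    re.compile(r";\s*(drop|delete|update|insert)\b.*--", re.IGNORECASE),
]


def looks_like_sql_injection(text: str) -> bool:
    """Return True if any strong-indicator pattern matches; plain DML alone does not."""
    return any(p.search(text) for p in STRONG_SQLI_PATTERNS)
```

Requiring a structural indicator instead of keywords such as `SELECT` or `DELETE` reduces false positives on text that merely discusses or contains legitimate SQL.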

Infrastructure Signals — partial coverage

| Threat | Response | How |
| --- | --- | --- |
| JWT exposure | Detect | /scan jwt_exposure |
| Internal IP URL targets | Detect | /scan internal_ip_reference in URL contexts |
| Log injection or forging phrases | Detect | /scan log_injection |
| Suspicious shortened or direct-IP URLs | Detect | /scan suspicious_url |
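
For the internal-IP row, the following Python sketch shows one way to test whether a URL points at a private, loopback, or link-local address using only the standard library. It is an illustrative assumption, not the `internal_ip_reference` implementation, and `targets_internal_ip` is a hypothetical name. It inspects IP literals only and does not resolve hostnames.

```python
import ipaddress
from urllib.parse import urlsplit


def targets_internal_ip(url: str) -> bool:
    """Return True if the URL's host is an IP literal in a private, loopback, or link-local range."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a hostname, not an IP literal; DNS resolution is out of scope here
    return ip.is_private or ip.is_loopback or ip.is_link_local
```

The link-local range matters because it contains `169.254.169.254`, the address commonly used for cloud instance metadata services.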

What TrapDefense Does Not Cover Yet

| Area | Status | Why |
| --- | --- | --- |
| Memory poisoning | Not covered | Runtime access to agent memory layers is outside the current scope |
| Multi-agent systemic attacks | Not covered | The system does not observe interactions across separate agents |
| Dynamic cloaking | Not covered | Static pattern checks cannot fully catch payloads that mutate at execution time |
| Persona manipulation and subtle biased phrasing | Not covered | This would require deeper semantic analysis beyond the current regex-focused layer |
| Steganographic payloads | Not covered | Image and media steganography is out of scope |
| Zero-day injection techniques | Partial | Known patterns are covered; novel attacks need pattern updates |
| Full semantic understanding of intent | Not covered | The API does not currently use LLM-based semantic classification |

Coverage by Endpoint

| Endpoint | Primary role | Main coverage |
| --- | --- | --- |
| /v1/scan | Input inspection | Injection signals, bypass attempts, exfiltration prompts, suspicious URLs |
| /v1/decide | Pre-execution decision | Outbound transfer control, tool gating, file path control, argument-side PII exposure |
| /v1/redact | Output protection | PII masking across 17 profiles and multiple regional/payment formats |
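
The three roles above suggest a natural ordering around a tool call: inspect input, decide before execution, redact the output. The Python sketch below shows that ordering only. The `scan`, `decide`, and `redact` parameters are hypothetical callables standing in for whatever client code wraps each endpoint; their signatures and return types are assumptions, as is the choice to reject flagged input outright instead of warning.

```python
from typing import Callable


def guarded_tool_call(
    untrusted_input: str,
    tool: Callable[[str], str],
    scan: Callable[[str], bool],    # hypothetical /v1/scan wrapper: True if input is flagged
    decide: Callable[[str], bool],  # hypothetical /v1/decide wrapper: True if the call is allowed
    redact: Callable[[str], str],   # hypothetical /v1/redact wrapper: returns masked text
) -> str:
    """Run a tool only after input inspection and a policy decision, then redact its output."""
    if scan(untrusted_input):
        raise PermissionError("input flagged by scan")
    if not decide(untrusted_input):
        raise PermissionError("tool call denied by policy")
    return redact(tool(untrusted_input))
```

Passing the checks in as callables keeps the sketch independent of any particular request or response format.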

Alignment with DeepMind's Framework

The table below maps TrapDefense coverage to the six attack layers described in Google DeepMind's AI Agent Traps:

| Layer | Coverage | Notes |
| --- | --- | --- |
| Perception | Strong | Hidden payload and injection pattern coverage |
| Action | Strongest | Policy evaluation, egress control, and result redaction |
| Reasoning | Partial | Indirect protection through prompt-injection pattern checks |
| Memory | Not covered | No direct runtime access to memory layers |
| Multi-agent | Not covered | No cross-agent visibility |
| Human oversight | Not covered | Outside the current runtime API scope |

Bottom line: TrapDefense is strongest in the Action layer, where tools run, data moves, and damage becomes operationally real, and strong in the Perception layer, where hidden payloads and injection patterns are detected.