How Detection Works
Deadfall uses a novel “cognitive honeypot” approach to detect AI agent compromise — exploiting the instruction-following behavior of AI models.The Cognitive Honeypot Strategy
Unlike traditional honeypots that rely on file access monitoring (which requires kernel-level access), Deadfall takes a different approach:- Trap Creation — Plant files that look like valuable context (
.cursorrules,CLAUDE.md, etc.) - Instruction Embedding — These files contain instructions for AI agents to call a verification tool
- Token Correlation — Each trap has a unique token that identifies which file was read
- Alert Triggering — When an AI agent follows the instruction, the Honey-MCP server alerts you
Why This Works
AI coding assistants are designed to:- Read context files automatically (
.cursorrules,CLAUDE.md,CONTEXT.md) - Follow instructions they find in these files
- Use available MCP tools when instructed
deadfall_ping, triggering an alert that confirms the agent read the file.
Token Correlation
Each trap gets a unique 32-character token:deadfall_ping is called with a token:
- The Honey-MCP server looks up the token in
deadfall.json - If found, the alert includes the file path, trap type, and creation time
- If not found, a warning is logged (possible tampering or manual invocation)
Alert Information
When a trap triggers, the alert includes:| Field | Description |
|---|---|
token | The unique trap token |
file | Path to the trapped file that was read |
trap_type | Type of trap (cursor-rules, claude-context, etc.) |
created_at | When the trap was deployed |
timestamp | When the alert was triggered |
Honeypot Tools
Beyond trap detection, the Honey-MCP server provides honeypot tools that attract malicious agents:| Tool | Purpose |
|---|---|
admin_get_secrets | Attracts credential-seeking agents |
db_full_dump | Attracts data exfiltration attempts |
execute_shell_command | Attracts command execution attempts |
get_api_keys | Attracts API key theft attempts |
Detection vs. Prevention
Deadfall is a detection tool, not a prevention tool. It tells you when an AI agent:- Read a trapped configuration file
- Attempted to use a honeypot tool
- Block malicious actions
- Modify AI agent behavior
- Require kernel-level access