Custom Analyzer SDK

Tollgate provides an SDK for creating custom content analyzers that integrate seamlessly with the policy engine. Custom analyzers let you add specialized risk analysis for protocols, APIs, or data formats not covered by the built-in analyzers.

Quick Start

Create a custom analyzer in a TypeScript or JavaScript file:
// analyzers/graphql.ts
import { defineAnalyzer } from '@dotsetlabs/tollgate';

export default defineAnalyzer({
  name: 'graphql',
  description: 'Analyzes GraphQL queries and mutations',

  analyze(content, context) {
    if (content.includes('mutation')) {
      return { risk: 'write', reason: 'GraphQL mutation detected' };
    }
    if (content.includes('subscription')) {
      return { risk: 'read', reason: 'GraphQL subscription' };
    }
    return { risk: 'read', reason: 'GraphQL query' };
  }
});
Register it in your configuration:
version: "1"

analyzers:
  - ./analyzers/graphql.ts

servers:
  hasura:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-hasura"]
    tools:
      "execute":
        analyzer: graphql
        risks:
          read: allow
          write: prompt

defineAnalyzer

The primary API for creating custom analyzers.
import { defineAnalyzer } from '@dotsetlabs/tollgate';

export default defineAnalyzer({
  // Required: Unique name (alphanumeric, hyphens, underscores)
  name: 'my-analyzer',

  // Optional: Description for documentation
  description: 'Analyzes my custom protocol',

  // Required: The analysis function
  analyze(content, context) {
    // Return risk assessment
    return {
      risk: 'read',  // 'safe' | 'read' | 'write' | 'destructive' | 'dangerous'
      reason: 'Explanation for the classification',
      triggers: ['pattern1', 'pattern2'],  // Optional: what triggered this
      metadata: { key: 'value' }  // Optional: additional data
    };
  },

  // Optional: Initialize resources
  init() {
    // Called when analyzer is registered
  },

  // Optional: Cleanup resources
  cleanup() {
    // Called when analyzer is unregistered
  },

  // Optional: Custom content extraction from tool arguments
  extractContent(tool, args) {
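    // Return the query only if it is actually a string; a bare cast would let other types through
    return typeof args.query === 'string' ? args.query : null;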
  }
});

Analysis Context

The context parameter provides additional information:
analyze(content, context) {
  if (context?.server === 'production') {
    // Stricter checks for production
    return { risk: 'dangerous', reason: 'Production access requires approval' };
  }
  // ...
}
Available context properties:
  • server: The MCP server name
  • tool: The tool being called
  • args: Full tool arguments (for reference)
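
As a rough illustration of how these properties might be used together, the sketch below types the context with a local `AnalysisContext` interface. That interface, the `AnalysisResult` type, and the `analyzeWithContext` function are assumptions made for this example, inferred from the three properties listed above; the SDK's real types may differ.

```typescript
// Assumed shapes, written locally for this sketch (not imported from the SDK).
type Risk = 'safe' | 'read' | 'write' | 'destructive' | 'dangerous';

interface AnalysisContext {
  server: string;                 // The MCP server name
  tool: string;                   // The tool being called
  args: Record<string, unknown>;  // Full tool arguments
}

interface AnalysisResult {
  risk: Risk;
  reason: string;
}

// Hypothetical analyze function: escalates write statements on a server named "production".
function analyzeWithContext(content: string, context?: AnalysisContext): AnalysisResult {
  const isWrite = /\b(insert|update|delete)\b/i.test(content);

  if (isWrite && context && context.server === 'production') {
    return {
      risk: 'dangerous',
      reason: `Write via tool "${context.tool}" on the production server`,
    };
  }

  return isWrite
    ? { risk: 'write', reason: 'Write statement' }
    : { risk: 'read', reason: 'Read statement' };
}
```

Because the context is treated as optional here, the same function still returns a sensible result when it is called without one.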

defineAsyncAnalyzer

For analyzers that need to perform async operations:
import { defineAsyncAnalyzer } from '@dotsetlabs/tollgate';

export default defineAsyncAnalyzer({
  name: 'ml-classifier',

  async init() {
    await loadModel();
  },

  async analyze(content) {
    const result = await classifyWithML(content);
    return {
      risk: result.risk,
      reason: result.explanation,
      metadata: { confidence: result.confidence }
    };
  },

  async cleanup() {
    await unloadModel();
  }
});
Async analyzers may add latency to tool call processing. Use caching or fast inference for production workloads.

createPatternAnalyzer

A helper for simple pattern-based analyzers:
import { createPatternAnalyzer } from '@dotsetlabs/tollgate';

export default createPatternAnalyzer('redis', {
  dangerous: [/FLUSHALL/i, /FLUSHDB/i, /CONFIG\s+SET/i],
  destructive: [/DEL\s/i, /EXPIRE\s/i, /UNLINK\s/i],
  write: [/SET\s/i, /HSET\s/i, /LPUSH\s/i, /SADD\s/i],
  read: [/GET\s/i, /HGET\s/i, /LRANGE\s/i, /SMEMBERS\s/i],
}, 'read');  // Default risk if no patterns match
Patterns are checked in order of severity (dangerous first), and the first match wins.
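
The ordering rule can be sketched as a small standalone function. This is an illustrative stand-in, not the SDK's implementation: `classifyByPattern`, `SEVERITY_ORDER`, and `PatternMap` are names invented for this example, and the position of `safe` at the bottom of the order is an assumption.

```typescript
// Illustrative stand-in for severity-ordered matching; not taken from the SDK source.
type Risk = 'safe' | 'read' | 'write' | 'destructive' | 'dangerous';

// Highest severity first, so the first hit is also the most severe one.
const SEVERITY_ORDER: Risk[] = ['dangerous', 'destructive', 'write', 'read', 'safe'];

type PatternMap = Partial<Record<Risk, RegExp[]>>;

function classifyByPattern(content: string, patterns: PatternMap, defaultRisk: Risk): Risk {
  for (const risk of SEVERITY_ORDER) {
    const candidates = patterns[risk] ?? [];
    if (candidates.some((pattern) => pattern.test(content))) {
      return risk; // First matching level wins
    }
  }
  return defaultRisk; // Nothing matched
}

const redisPatterns: PatternMap = {
  dangerous: [/FLUSHALL/i, /CONFIG\s+SET/i],
  write: [/SET\s/i],
  read: [/GET\s/i],
};
```

With these patterns, `CONFIG SET maxmemory 1gb` matches both the `dangerous` and the `write` level, and checking `dangerous` first is what keeps it from being classified as an ordinary write.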

Custom Content Extraction

By default, Tollgate tries common argument names to extract content for analysis. For custom protocols, you may need to specify how to extract the relevant content:
defineAnalyzer({
  name: 'json-rpc',

  extractContent(tool, args) {
    // Extract method from JSON-RPC calls
    if (args.jsonrpc && args.method) {
      return args.method as string;
    }
    return null;
  },

  analyze(content) {
    // content is now the RPC method name
    if (content.startsWith('admin.')) {
      return { risk: 'dangerous', reason: 'Admin RPC method' };
    }
    return { risk: 'safe', reason: 'Regular RPC method' };
  }
});
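
For comparison, a default strategy of "try common argument names" could look roughly like the sketch below. The key list (`query`, `command`, `content`) and the `extractFirstString` helper are hypothetical; this page does not say which names Tollgate actually tries.

```typescript
// Hypothetical key list: the names Tollgate really checks are not documented here.
const CANDIDATE_KEYS = ['query', 'command', 'content'];

// Returns the first non-empty string found under one of the candidate keys, or null.
function extractFirstString(
  args: Record<string, unknown>,
  keys: string[] = CANDIDATE_KEYS,
): string | null {
  for (const key of keys) {
    const value = args[key];
    if (typeof value === 'string' && value.length > 0) {
      return value;
    }
  }
  return null; // Nothing usable: there is no content to analyze
}
```

A JSON-RPC call such as `{ jsonrpc: '2.0', method: 'admin.reset' }` has none of these keys, which is the kind of case a custom `extractContent` is meant to cover.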

Registration and Lifecycle

Config-Based Registration

Add analyzers to your tollgate.yaml:
analyzers:
  - ./local-analyzers/graphql.ts
  - ./local-analyzers/redis.js
  - "@myorg/tollgate-analyzers/pii"  # Quoted: an unquoted YAML value cannot start with "@"
Paths can be:
  • Relative paths (./analyzers/custom.ts)
  • Absolute paths (/path/to/analyzer.js)
  • Package names (@myorg/package-name)

Programmatic Registration

import { analyzerRegistry, defineAnalyzer } from '@dotsetlabs/tollgate';

const myAnalyzer = defineAnalyzer({
  name: 'custom',
  analyze: (content) => ({ risk: 'safe', reason: 'ok' })
});

analyzerRegistry.register(myAnalyzer);

// Initialize all custom analyzers
await analyzerRegistry.initializeCustomAnalyzers();

// Later: cleanup
await analyzerRegistry.cleanupCustomAnalyzers();

Examples

MongoDB Query Analyzer

import { defineAnalyzer } from '@dotsetlabs/tollgate';

export default defineAnalyzer({
  name: 'mongodb',
  description: 'Analyzes MongoDB operations',

  analyze(content) {
    // Match case-insensitively against the original content. Lowercasing the
    // content first would never match camelCase names such as deleteMany.

    // Dangerous operations
    if (/db\.dropDatabase|drop\s*\(/i.test(content)) {
      return { risk: 'dangerous', reason: 'Database drop operation' };
    }

    // Destructive operations
    if (/deleteMany|deleteOne|remove\s*\(/i.test(content)) {
      return { risk: 'destructive', reason: 'Delete operation' };
    }

    // Write operations
    if (/insertOne|insertMany|updateOne|updateMany|replaceOne/i.test(content)) {
      return { risk: 'write', reason: 'Write operation' };
    }

    // Aggregations can be expensive
    if (/aggregate\s*\(/i.test(content)) {
      return {
        risk: 'read',
        reason: 'Aggregation pipeline',
        metadata: { potentiallyExpensive: true }
      };
    }

    return { risk: 'read', reason: 'Read operation' };
  }
});

PII Detection Analyzer

import { defineAnalyzer } from '@dotsetlabs/tollgate';

const PII_PATTERNS = {
  ssn: /\b\d{3}-\d{2}-\d{4}\b/,
  email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/,
  creditCard: /\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/,
  phone: /\b\d{3}[\s.-]?\d{3}[\s.-]?\d{4}\b/,
};

export default defineAnalyzer({
  name: 'pii-detector',
  description: 'Detects personally identifiable information',

  analyze(content) {
    const detectedPii: string[] = [];

    for (const [type, pattern] of Object.entries(PII_PATTERNS)) {
      if (pattern.test(content)) {
        detectedPii.push(type);
      }
    }

    if (detectedPii.length > 0) {
      return {
        risk: 'dangerous',
        reason: `PII detected: ${detectedPii.join(', ')}`,
        triggers: detectedPii,
        metadata: { piiTypes: detectedPii }
      };
    }

    return { risk: 'safe', reason: 'No PII detected' };
  }
});

Best Practices

Be Specific

Focus on one protocol or data type per analyzer. This makes policies clearer and more maintainable.

Return Clear Reasons

Always provide descriptive reasons. They appear in logs and help users understand decisions.

Handle Errors Gracefully

If parsing fails, return a conservative risk level rather than crashing.

Use Triggers

Include triggers in results to show which patterns matched. This makes classifications easier to debug.

Error Handling

analyze(content) {
  try {
    const parsed = JSON.parse(content);
    // analyze parsed content...
  } catch {
    // Can't parse - treat as potentially risky
    return {
      risk: 'write',
      reason: 'Could not parse content, defaulting to write risk'
    };
  }
}

Performance Tips

  1. Compile patterns once: Store compiled RegExp objects at module level
  2. Short-circuit evaluation: Check dangerous patterns first
  3. Avoid blocking: Use defineAsyncAnalyzer for slow operations
  4. Cache results: For expensive analysis, consider caching by content hash
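
The caching tip can be sketched as a small memoizing wrapper. `withCache` is a hypothetical helper, not part of the SDK. By default it keys the cache on the content string itself; for large payloads, a hash function (for example SHA-256) could be passed as `keyOf` instead. The FIFO eviction and the `maxEntries` default are arbitrary choices for illustration.

```typescript
// Hypothetical helper, not part of the Tollgate SDK.
type Risk = 'safe' | 'read' | 'write' | 'destructive' | 'dangerous';

interface AnalysisResult {
  risk: Risk;
  reason: string;
}

function withCache<R>(
  analyze: (content: string) => R,
  maxEntries = 1000,
  keyOf: (content: string) => string = (content) => content,
): (content: string) => R {
  const cache = new Map<string, R>();

  return (content) => {
    const key = keyOf(content);
    if (cache.has(key)) {
      return cache.get(key) as R; // Cache hit: skip the expensive analysis
    }

    const result = analyze(content);

    // Map preserves insertion order, so the first key is the oldest entry.
    if (cache.size >= maxEntries) {
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    cache.set(key, result);
    return result;
  };
}

// Stand-in for an expensive analysis; counts how often it really runs.
let calls = 0;
const expensiveAnalyze = (content: string): AnalysisResult => {
  calls += 1;
  return content.includes('DROP')
    ? { risk: 'dangerous', reason: 'DROP statement' }
    : { risk: 'read', reason: 'No risky keywords' };
};

const cachedAnalyze = withCache(expensiveAnalyze);
```

The wrapper also works with an async `analyze`, because it then caches the returned promise. In that case a rejected promise stays cached too, so a production version would likely evict entries on failure.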