Tracerney
Lightweight prompt injection detection for LLM applications. Runs 100% locally with zero data leaving your server.
🚀 Explore the full platform at tracerney.com — includes dashboard, analytics, and API management.
Install
```
npm install @sandrobuilds/tracerney
```
Usage
```js
import { Tracerney } from '@sandrobuilds/tracerney';

const shield = new Tracerney();
const result = await shield.scanPrompt(userInput);

if (result.suspicious) {
  console.log('⚠️ Suspicious:', result.patternName);
  // Handle flagged prompt (log, block, rate-limit, etc.)
}
```
What's Included
- 933 attack patterns — comprehensive prompt injection and jailbreak detection
- 259 core forensic patterns (system overrides, prompt leaks, code execution, etc.)
- 675 real-world variants from Garak security research
- Local detection — <0.021ms latency per prompt, zero network overhead
- Zero dependencies — single npm package, 100% local processing
- Privacy-first — no data leaves your server, zero data storage
- Egress & PII scanning — detects API keys, secrets, emails, and data exfiltration attempts
Result Object
Layer 1 (Pattern Detection)
```ts
{
  suspicious: boolean;   // true if pattern matched
  patternName?: string;  // e.g., "Ignore Instructions"
  severity?: string;     // "CRITICAL" | "HIGH" | "MEDIUM" | "LOW"
  blocked: boolean;      // false (Layer 1 only marks suspicious)
}
```
Layer 2 (LLM Sentinel)
```ts
{
  action: "BLOCK" | "ALLOW"; // Final decision from LLM Sentinel
  confidence: number;        // 0.0 to 1.0 confidence score
  class: string;             // Threat classification (e.g., "jailbreak_llm_detected")
  fingerprint: string;       // Unique threat identifier for tracking
}
```
Detected Patterns
Core Patterns (259)
- Instruction overrides ("ignore all instructions")
- Role-play jailbreaks ("act as unrestricted AI")
- Hypothetical constraint bypass ("what would you do without constraints?")
- System prompt exfiltration
- Context confusion attacks
- Data extraction attempts
- Code execution risks
- And 251 more forensic patterns...
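To make these categories concrete, here is a minimal, hypothetical sketch of how deterministic patterns like these can be tabulated and matched. The names and regexes are illustrative only, not the SDK's internal data or implementation:

```javascript
// Illustrative only: two toy entries in the spirit of the core pattern set.
// These are NOT the SDK's actual patterns or data structures.
const PATTERNS = [
  { name: 'Ignore Instructions', severity: 'CRITICAL', re: /ignore (all|previous) instructions/i },
  { name: 'Unrestricted Role-Play', severity: 'HIGH', re: /act as (an? )?unrestricted ai/i },
];

// First match wins; Layer 1 only marks the prompt, it never blocks.
function matchPatterns(prompt) {
  for (const p of PATTERNS) {
    if (p.re.test(prompt)) {
      return { suspicious: true, patternName: p.name, severity: p.severity, blocked: false };
    }
  }
  return { suspicious: false, blocked: false };
}
```

Because every entry is a plain regex test, a scan is a single pass over a static table, which is what keeps detection local and sub-millisecond.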
Garak Research Patterns (675)
- Advanced jailbreak variants from real-world research
- DAN (Do Anything Now) attack variations
- Sophisticated prompt injection techniques
- Encoding-based evasion patterns
- Character-based constraint bypass
- Multi-turn attack sequences
- And 670+ more variants from security research...
Multi-Layer Runtime Defense
Layer 1: Pattern Matching
- 933 total patterns — 259 core + 675 Garak research patterns
- <0.021ms detection on modern hardware (238x faster than target)
- Zero network overhead
- 100% local processing
- Detects: instruction overrides, role-play jailbreaks, context confusion, code execution risks, data extraction attempts, and more
Layer 1 also runs a deterministic egress and PII scan on every prompt before the injection patterns fire. If a match is found, it returns suspicious: true with a label and reason — the SDK never decides the penalty, the developer does.
```js
const result = await shield.scanPrompt(input);

if (result.suspicious) {
  console.log(result.label);  // "SUSPICIOUS_EGRESS" | "SUSPICIOUS_SECRET" | "SUSPICIOUS_PII"
  console.log(result.reason); // "Detected 1 finding(s): Markdown Image with URL Query Params"

  // Your policy, your call:
  if (result.label === 'SUSPICIOUS_EGRESS') {
    return NextResponse.json({ error: 'Security violation' }, { status: 400 });
  }
}
```
Egress findings never reach Layer 2 — they are binary and deterministic. A markdown image tag smuggling data in query params either exists or it doesn't. Layer 2 is reserved for probabilistic threats where a regex alone cannot make a confident call.
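The binary nature of an egress finding can be illustrated with a hypothetical check for the markdown-image exfiltration case mentioned above. This is a sketch, not the SDK's actual regex or result shape:

```javascript
// Hypothetical sketch: a deterministic check for markdown image tags that
// smuggle data through URL query parameters. Not the SDK's actual pattern.
const MD_IMAGE_WITH_QUERY = /!\[[^\]]*\]\((https?:\/\/[^)\s]+\?[^)\s]+)\)/;

function scanEgress(prompt) {
  if (!MD_IMAGE_WITH_QUERY.test(prompt)) return { suspicious: false };
  return {
    suspicious: true,
    label: 'SUSPICIOUS_EGRESS',
    reason: 'Detected 1 finding(s): Markdown Image with URL Query Params',
  };
}
```

Either the tag with query params is present or it is not; there is no confidence score to weigh, which is why findings like this never need a Layer 2 round trip.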
Garak Attack Pattern Dataset
Includes 675 patterns from the Garak security research dataset, covering real-world prompt injection variants discovered through automated fuzzing and empirical testing.
Coverage includes:
- 648 real-world variants from in-the-wild attacks
- 12 DAN (Do Anything Now) attack variations
- 3 AutoDAN patterns
- 12 advanced prompt injection techniques
All patterns are deterministic regex matches — no behavioral changes, sub-millisecond latency. The SDK remains 100% local with zero data storage.
Layer 2: LLM Sentinel (Optional)
- AI-powered response verification — LLM-based analysis for novel attack patterns
- Context-aware scanning — understands your application's specific security policies
- Delimiter salting — prevents prompt injection through response boundaries
- Zero prompt storage — responses are analyzed in-memory, never saved or logged
- Structured threat metadata — detailed fingerprints for audit trails and tracking
- Advanced rate limiting — prevents cost spikes with intelligent throttling
Layer 2: LLM Sentinel Deep Dive
Layer 2 adds advanced security with LLM Sentinel, an AI-powered verification system that analyzes LLM responses for injection patterns and validates output safety. It combines local pattern detection (Layer 1) with server-side verification for defense-in-depth protection.
How Layer 1 & Layer 2 Work Together
| Layer 1: Pattern Detection | Layer 2: LLM Sentinel (Optional) |
|---|---|
| 933 patterns (local) | Server-side verification |
| Pattern matching | Output validation |
| <0.021ms latency | JSON safety checks |
| No data leaves device | Delimiter salting |
| Zero network calls | Context-aware analysis |
Enabling Layer 2 (Optional)
Layer 2 is optional. Initialize with LLM Sentinel for additional AI-powered verification:
```js
const shield = new Tracerney({
  apiKey: process.env.TRACERNEY_API_KEY,
  sentinelEnabled: true,
});
```
Layer 2 is automatically configured to use the hosted LLM Sentinel service.
Custom Layer 2 Configuration (Self-Hosted)
Want to self-host or use a custom backend? Override the sentinel endpoint:
```js
const shield = new Tracerney({
  apiKey: process.env.TRACERNEY_API_KEY,
  sentinelEnabled: true,
  baseUrl: process.env.TRACERNEY_BASE_URL, // e.g., http://localhost:3000
  sentinelEndpoint: process.env.TRACERNEY_SENTINEL_ENDPOINT, // e.g., /api/v1/verify-prompt
});
```
You can build your own verification endpoint using the same pattern as our hosted service.
Scanning with Layer 2
With Layer 2 enabled, scanPrompt validates both input and LLM responses. Handle errors appropriately:
```js
try {
  // Scan input (Layer 1 + Layer 2)
  const result = await shield.scanPrompt(userInput);

  // If we get here, the input is safe. Call the LLM.
  const llmResponse = await llm.chat(userInput);

  // Verify the LLM output wasn't compromised
  const outputCheck = await shield.verifyOutput(llmResponse);

  return llmResponse;
} catch (err) {
  if (err instanceof ShieldBlockError) {
    return NextResponse.json(
      { error: 'Input content is flagged as suspicious' },
      { status: 400 }
    );
  }
  throw err;
}
```
API Response Format
The verify-prompt endpoint returns structured responses. Success (HTTP 200) includes classification, confidence, and fingerprint. Errors include specific error codes and messages.
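A client typically branches on both the HTTP status code and the body fields. As a hedged sketch (`handleSentinelResponse` is not part of the SDK; the field names follow the response formats documented in this section):

```javascript
// Hypothetical client-side handler for verify-prompt responses.
// Field names follow the documented formats; the policy is up to you.
function handleSentinelResponse(status, body) {
  if (status === 402 && body.reason === 'scan_limit_exceeded') {
    // One possible policy: degrade gracefully to Layer 1-only scanning.
    return { allow: true, degraded: true };
  }
  if (status === 200) {
    return { allow: body.action === 'ALLOW', degraded: false };
  }
  throw new Error(`Unexpected sentinel status ${status}`);
}
```

Whether a quota-exceeded response should fail open (Layer 1 only) or fail closed is a policy decision; the handler above just makes that decision explicit in one place.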
✅ Content is Safe (HTTP 200)
```json
{
  "action": "ALLOW",
  "confidence": 0.15,
  "class": "safe_content",
  "fingerprint": "a3f7k2"
}
```
🔴 Content is Blocked (HTTP 200)
```json
{
  "action": "BLOCK",
  "confidence": 0.99,
  "class": "jailbreak_semantic_pattern",
  "fingerprint": "c1p5n3"
}
```
⚠️ Quota Exceeded (HTTP 402)
```json
{
  "blocked": true,
  "reason": "scan_limit_exceeded",
  "scansUsed": 50,
  "limit": 50,
  "message": "Free plan limit reached (50/month)..."
}
```
Production Usage
Basic Setup (Layer 1 only)
```js
const shield = new Tracerney();
```
Optimized for Production
```js
const shield = new Tracerney({
  enableTelemetry: false, // Disable if not using backend
  sentinelEnabled: false  // Disable if not using Layer 2
});
```
With Layer 2 (Advanced)
```js
const shield = new Tracerney({
  sentinelEnabled: true,
  apiKey: process.env.TRACERNEY_API_KEY
});
```
License
MIT