JSPM

crowbar-security

0.1.3
  • License MIT

autonomous black-box web penetration testing. give it a URL, it finds everything exploitable.

Package Exports

    This package does not declare an exports field, so its exports have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (crowbar-security) asking it to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

    Readme

    crowbar

    autonomous web penetration testing. give it a url, it finds everything exploitable.

    what it does

    crowbar crawls a live website, fingerprints the stack, selects attack vectors, exploits vulnerabilities, and hands you a proof-of-concept report. no source code needed. no manual endpoint mapping. point and shoot.

    npx crowbar-security scan https://target.com
    crowbar v0.1.0
    
    target: https://target.com
    [recon] passive: 14 subdomains, 23 historical endpoints
    [recon] complete: 47 endpoints, 12 forms, 8 JS files
    [mapper] stack: server=nginx, language=Node.js, framework=Express, database=PostgreSQL, waf=Cloudflare
    [mapper] found 6 hidden parameters across 3 endpoints
    [mapper] 142 attack vectors planned
    [mapper] payload order randomized (anti-fingerprinting)
    [attack] 142/142 (5 vulns found)
    [verify] 4 vulnerabilities confirmed
    [chain] 2 attack chains identified
    [report] saved: ./crowbar-report.md, ./crowbar-report.json
    
    crowbar scan complete in 27.3s
    4 vulnerabilities confirmed across 47 endpoints

    install

    npm install -g crowbar-security
    npx playwright install chromium

    requires node 20+.

    api keys

    on first run, crowbar prompts for an API key automatically. or set it manually:

    crowbar config set anthropic-key sk-ant-YOUR_KEY

    max settings

    crowbar scan https://target.com \
      --ai-aggressive \
      --swarm \
      --max-requests 5000 \
      --max-depth 5 \
      --rate-limit 50 \
      --format json,md,html,sarif \
      --output ./crowbar-report

    cli

    9 commands:

    # full autonomous scan
    crowbar scan https://target.com
    
    # recon only (no attacks)
    crowbar recon https://target.com
    
    # attack specific endpoint
    crowbar attack https://target.com --endpoint /api/users
    
    # re-verify findings from a previous report
    crowbar verify ./crowbar-report.json
    
    # CI/CD pipeline scan (non-interactive, exits with status code)
    crowbar ci https://target.com --fail-on high --webhook https://hooks.slack.com/xxx
    
    # scan multiple targets from a file
    crowbar multi targets.txt --parallel 3
    
    # continuous monitoring (periodic scans, delta alerts)
    crowbar watch https://target.com --interval 6h --webhook https://hooks.slack.com/xxx
    
    # web dashboard (localhost only)
    crowbar serve --port 3333
    
    # manage API keys and config
    crowbar config set anthropic-key sk-ant-...
    crowbar config list

    scan options

    # authentication
    crowbar scan https://target.com --cookie "session=abc123"
    crowbar scan https://target.com --bearer "eyJ..."
    
    # modes
    crowbar scan https://target.com --mode stealth
    crowbar scan https://target.com --mode aggressive
    
    # output formats
    crowbar scan https://target.com --format md,json,html,sarif,csv
    
    # external knowledge bases
    crowbar scan https://target.com --nuclei-templates ~/nuclei-templates
    crowbar scan https://target.com --wordlist ~/SecLists
    crowbar scan https://target.com --payloads ~/PayloadsAllTheThings
    
    # incremental scanning (only test new/changed endpoints)
    crowbar scan https://target.com --incremental
    
    # resume interrupted scan
    crowbar scan https://target.com --resume
    
    # with pinata integration (white-box-guided)
    crowbar scan https://target.com --gaps gaps.json
    
    # dry run (recon + mapping, no attacks)
    crowbar scan https://target.com --dry-run
    
    # scope and safety
    crowbar scan https://target.com --rate-limit 5 --max-requests 1000
    crowbar scan https://target.com --scope "target.com,api.target.com"

    ci/cd integration

    # .github/workflows/security.yml
    - uses: crowbar-security/scan@v1
      with:
        target: https://staging.yourapp.com
        fail-on: high
        format: sarif
        auth-cookie: ${{ secrets.SESSION_COOKIE }}

    the GitHub Action runs crowbar ci, uploads SARIF to the GitHub Security tab, and fails the build if findings exceed the severity threshold.
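
    the severity gate can be pictured as a comparison against an ordered scale. this is a minimal sketch, not crowbar's implementation: the GateFinding shape, the five severity names, and the helper names are hypothetical, and it assumes a finding at or above the --fail-on level fails the build.

```typescript
// Hypothetical severity scale and finding shape; crowbar's real report schema may differ.
const SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"] as const;
type Severity = (typeof SEVERITY_ORDER)[number];

interface GateFinding {
  id: string;
  severity: Severity;
}

// True when at least one finding is at or above the threshold (assumed semantics).
function shouldFailBuild(findings: GateFinding[], failOn: Severity): boolean {
  const threshold = SEVERITY_ORDER.indexOf(failOn);
  return findings.some((f) => SEVERITY_ORDER.indexOf(f.severity) >= threshold);
}

// Exit code a CI wrapper might return: 0 = pass, 1 = threshold reached.
function exitCodeFor(findings: GateFinding[], failOn: Severity): number {
  return shouldFailBuild(findings, failOn) ? 1 : 0;
}
```

    a CI wrapper would pass the parsed report findings and the --fail-on value, then hand the result to process.exit.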

    how it works

    URL
     |
    RECON --> MAPPER --> ATTACKER --> VERIFIER --> REPORTER
     |          |          |            |
     v          v          v            v
              KNOWLEDGE GRAPH
              (queryable by all phases)
                  |
               AI BRAIN
       (strategy, adaptation, chains)

    12-phase pipeline:

    1. passive recon: DNS enumeration (100+ subdomain wordlist), certificate transparency (crt.sh), Wayback Machine historical endpoints
    2. active recon: Playwright crawl with network interception (captures every real API call SPAs make), JS bundle analysis, source map harvesting, path probing (~150 common paths), SPA interaction (clicks buttons, navigates Angular/React routes)
    3. tech fingerprinting: 60+ detection rules across headers, cookies, error messages, body patterns, WAF signatures
    4. auto-authentication: discovers login endpoints, tries SQLi bypass + default credentials, registers test accounts, stores JWT for authenticated testing
    5. attack planning: context-aware attack selection. source-aware priority boosting when --repo is provided
    6. attack execution: 41 attack plugins + targeted probes against well-known API paths. 5-layer WAF evasion
    7. verification: deterministic plugin verification + AI fallback for ambiguous cases
    8. proof-by-exploitation: replays every confirmed vuln in a real Playwright browser, captures screenshots as evidence
    9. autonomous exploitation: ReAct agent loop uses Claude tool_use to escalate each vuln to maximum impact (UNION data extraction, IDOR enumeration, credential theft)
    10. swarm deep-dive (--swarm): 6 specialist AI agents attack in parallel, adversarial QA agent filters false positives
    11. chain discovery: 8 chain templates plus AI-driven novel chain discovery
    12. reporting: markdown, JSON, HTML, SARIF, CSV with curl PoCs and compliance mapping

    41 attack plugins

    injection: SQL injection (error + blind boolean + blind timing + UNION), NoSQL injection (MongoDB operators), command injection (output + timing), server-side template injection (Jinja2, Twig, Freemarker, Velocity, ERB, Pug), XML external entity (file + SSRF + parameter entity), second-order SQLi/XSS (cross-endpoint correlation)

    cross-site: reflected XSS (context-aware: HTML body, attribute, script, URL), stored XSS, DOM XSS (Playwright source-sink tracing), CSRF (cross-origin token check), postMessage origin validation

    access control: IDOR (sequential + UUID probing), BOLA (cross-object authorization, method override), CORS misconfiguration (origin reflection, null trust, regex bypass), forced browsing, auth bypass (default credentials), mass assignment (27 sensitive fields), broken access control (method + header injection), rate limit bypass (header-based IP spoofing)

    infrastructure: SSRF (localhost, cloud metadata AWS/GCP/Azure/DigitalOcean, 12 IP bypass variants), path traversal (encoding variants, null byte), subdomain takeover (13 service fingerprints), host header injection, WebSocket security (cross-site hijacking, origin validation)

    code execution: malicious file upload (web shells, polyglot, extension bypass), prototype pollution (server-side + client-side via Playwright), insecure deserialization

    auth: JWT algorithm confusion (alg none + admin forgery), OAuth/OIDC (state parameter CSRF, redirect_uri bypass with 10 evasion variants, scope escalation, implicit flow token exposure)

    logic: race conditions (parallel TOCTOU), workflow bypass (step skipping, state machine violation), open redirect, GraphQL (introspection dump, batching, field suggestion)

    caching: web cache poisoning (unkeyed header injection, unkeyed parameter injection, path-based cache deception, delimiter discrepancy, hop-by-hop header abuse)

    external: known CVE detection via nuclei templates

    waf evasion

    5 escalating layers, triggered automatically when a WAF blocks:

    1. encoding: URL, double URL, unicode, HTML entity
    2. structural: case randomization, SQL comment insertion, null bytes
    3. http-level: content-type switching, HTTP parameter pollution
    4. protocol-level: chunked transfer encoding obfuscation
    5. network-level: IP spoofing headers (10 variants), proxy rotation

    detects Cloudflare, AWS WAF, ModSecurity, Akamai, Imperva, F5, Azure WAF.

    ai brain

    uses Anthropic Claude and OpenAI GPT with model routing: cheap models (gpt-4o-mini) for response parsing and payload generation, expensive models (claude sonnet) for strategy planning, verification, and chain discovery.

    cost tracking: every AI call tracked. configurable budget cap (default $10). typical scan costs $1-5. works without AI too.
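
    as a rough illustration of routing plus a budget cap, the sketch below maps task categories to a model tier and refuses any charge that would push spend past the cap. the task names, tier labels, and CostTracker class are hypothetical; only the cheap/expensive split and the $10 default come from the text above.

```typescript
// Hypothetical task categories; the tier assignment mirrors the prose above.
type AiTask = "parse-response" | "generate-payload" | "plan-strategy" | "verify" | "discover-chains";
type ModelTier = "cheap" | "expensive";

const MODEL_ROUTES: Record<AiTask, ModelTier> = {
  "parse-response": "cheap",
  "generate-payload": "cheap",
  "plan-strategy": "expensive",
  "verify": "expensive",
  "discover-chains": "expensive",
};

function routeModel(task: AiTask): ModelTier {
  return MODEL_ROUTES[task];
}

// Tracks cumulative spend and rejects calls that would exceed the budget.
class CostTracker {
  private spentUsd = 0;
  private readonly budgetUsd: number;

  constructor(budgetUsd = 10) {
    this.budgetUsd = budgetUsd;
  }

  // Returns false (and records nothing) if the charge would exceed the cap.
  tryCharge(costUsd: number): boolean {
    if (this.spentUsd + costUsd > this.budgetUsd) return false;
    this.spentUsd += costUsd;
    return true;
  }

  get spent(): number {
    return this.spentUsd;
  }
}
```

    a caller would check tryCharge before each AI request and fall back to non-AI behavior when it returns false.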

    --ai-aggressive mode

    crowbar scan https://target.com --ai-aggressive

    enables 4 AI-powered capabilities that turn crowbar from a template matcher into an adaptive hacker:

    • AI recon expansion: after crawling, the AI analyzes discovered URL patterns and suggests hidden endpoints (admin panels, API versioning, debug endpoints). crowbar probes each suggestion and adds confirmed ones to the attack surface.
    • AI target prioritization: the AI ranks which endpoints are most likely exploitable based on the full knowledge graph (tech stack, parameter names, response patterns).
    • AI novel payload generation: when a plugin's template payloads all fail, the AI generates 3 creative payloads that differ structurally from the failures, considering the detected tech stack and WAF. research shows 80%+ WAF bypass rates with LLM-generated payloads.
    • AI response analysis: each AI-generated payload's response is analyzed by the AI to determine if exploitation succeeded, catching edge cases that regex patterns miss.

    per-endpoint cost guard (max 3 AI calls per endpoint) prevents runaway spend. all AI decisions are logged with [ai] prefix for auditability.
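
    the per-endpoint guard amounts to a counter keyed by endpoint. a minimal sketch, assuming a hypothetical EndpointCallGuard class and aiLog helper (neither name comes from crowbar):

```typescript
// Counts AI calls per endpoint and refuses any call past the limit.
class EndpointCallGuard {
  private readonly calls = new Map<string, number>();
  private readonly maxPerEndpoint: number;

  constructor(maxPerEndpoint = 3) {
    this.maxPerEndpoint = maxPerEndpoint;
  }

  // Returns true and records the call if the endpoint still has budget.
  allow(endpoint: string): boolean {
    const used = this.calls.get(endpoint) ?? 0;
    if (used >= this.maxPerEndpoint) return false;
    this.calls.set(endpoint, used + 1);
    return true;
  }
}

// Formats an audit line with the [ai] prefix mentioned above.
function aiLog(message: string): string {
  return `[ai] ${message}`;
}
```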

    autonomous exploitation (ReAct agent)

    when --ai-aggressive is enabled, crowbar doesn't just detect vulnerabilities -- it exploits them to demonstrate maximum impact. after confirming a finding, a ReAct (reasoning + acting) agent loop takes over:

    1. the agent reasons about what exploitation steps to take
    2. executes HTTP requests as actions (UNION SELECT enumeration, data extraction, privilege escalation)
    3. observes the response and adapts its strategy
    4. repeats until it achieves concrete impact or exhausts approaches (max 15 steps)

    for SQLi, this means going from "error-based detection" to "extracted 3 user records including password hashes via UNION SELECT." for SSRF, from "internal service accessible" to "read AWS IAM credentials from metadata endpoint." for IDOR, from "sequential ID accepted" to "enumerated 10 user records with PII."

    the agent uses Claude's tool_use API for structured action execution. each exploit attempt costs ~$0.50-1.00. exploit logs saved to {output}/exploits/exploit-results.json with full step-by-step reasoning traces.

    this is the "no exploit, no report" philosophy: every finding in the report has proven, demonstrated impact, not just pattern-matched detection.

    external knowledge bases

    crowbar ships with its own payloads and wordlists, but coverage grows substantially with external repos:

    • nuclei-templates: --nuclei-templates ~/nuclei-templates loads YAML templates as a fast-pass known-CVE layer. runs before the AI engine. supports status/word/regex matchers, extractors, AND/OR conditions
    • SecLists: --wordlist ~/SecLists expands path discovery, parameter enumeration, subdomain bruteforce, and attack payloads. 10-100x coverage boost over built-in lists
    • PayloadsAllTheThings: --payloads ~/PayloadsAllTheThings imports comprehensive attack payloads organized by type (SQLi, XSS, SSRF, SSTI, XXE, LFI, RCE, NoSQLi)

    continuous monitoring

    crowbar watch https://target.com --interval 6h --webhook https://hooks.slack.com/xxx

    runs periodic scans, maintains a history of findings, alerts when new vulnerabilities appear, and tracks when old ones get fixed. configurable interval (1h, 6h, 1d, 7d), safety-capped at a maximum number of runs. webhook alerts fire only on deltas, not on every scan.
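
    delta alerting reduces to a set difference between two scans. the sketch below assumes a hypothetical WatchFinding shape and that plugin + endpoint + parameter identifies the same issue across runs; crowbar's actual matching key is not documented here.

```typescript
// Hypothetical finding shape used only for this illustration.
interface WatchFinding {
  plugin: string;
  endpoint: string;
  parameter?: string;
}

interface ScanDelta {
  added: WatchFinding[]; // present now, absent in the previous scan
  fixed: WatchFinding[]; // present in the previous scan, absent now
}

// Stable identity for a finding across scans (an assumption).
function findingKey(f: WatchFinding): string {
  return [f.plugin, f.endpoint, f.parameter ?? ""].join("|");
}

function diffScans(previous: WatchFinding[], current: WatchFinding[]): ScanDelta {
  const prevKeys = new Set(previous.map(findingKey));
  const currKeys = new Set(current.map(findingKey));
  return {
    added: current.filter((f) => !prevKeys.has(findingKey(f))),
    fixed: previous.filter((f) => !currKeys.has(findingKey(f))),
  };
}

// A webhook would fire only when something changed.
function shouldAlert(delta: ScanDelta): boolean {
  return delta.added.length > 0 || delta.fixed.length > 0;
}
```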

    web dashboard

    crowbar serve --port 3333

    dark-theme web UI at http://127.0.0.1:3333. start scans, view history, inspect vulnerabilities, track remediation. binds to localhost only -- never exposed to the network. optional API key auth via CROWBAR_API_KEY env var. REST API at /api/ for programmatic access. rate limited to 60 req/min. max 3 concurrent scans.
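
    one common way to enforce a limit like 60 req/min is a sliding window per client. this is an illustrative sketch, not the dashboard's actual code; the class name and the choice of a sliding window are assumptions.

```typescript
// Sliding-window limiter: allows at most `limit` requests per `windowMs` per client.
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit = 60, windowMs = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // nowMs is injectable so the behavior can be checked without real timers.
  allow(client: string, nowMs: number = Date.now()): boolean {
    const cutoff = nowMs - this.windowMs;
    // Keep only timestamps still inside the window.
    const recent = (this.hits.get(client) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(client, recent);
      return false;
    }
    recent.push(nowMs);
    this.hits.set(client, recent);
    return true;
  }
}
```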

    benchmark results

    tested against OWASP Juice Shop (110 challenges, the industry standard):

    metric             crowbar                                            ZAP         human pentest
    vulns found        46                                                 13          18
    challenges solved  35-42 / 110                                        ~8          ~20
    vuln types         12                                                 3-4         5+
    cost               $0.04                                              free        $15-30k
    time               15-25 min                                          15-30 min   2-4 weeks
    auto-auth          SQLi bypass                                        none        manual
    logic flaws        param omission, race conditions, boundary values   no          yes

    crowbar autonomously broke into Juice Shop via SQLi auth bypass, discovered 300+ endpoints across 91 JS files, solved 35-42 challenges including DOM XSS, UNION credential extraction, null byte file access, JWT forgery, nOAuth password derivation, race condition exploitation, and CAPTCHA bypass. 46 vulnerability findings across 12 types with 10 attack chains. no source code, no manual configuration, no OpenAPI spec.

    swarm mode

    crowbar scan https://target.com --swarm

    6 specialist AI agents attack in parallel, each with deep domain knowledge:

    specialist      focus
    injection       SQLi (error/blind/UNION), NoSQLi, command injection, SSTI, XXE
    xss             reflected, stored, DOM, postMessage, CSP bypass
    access-control  IDOR, BOLA, CORS, forced browsing, mass assignment
    infrastructure  SSRF, path traversal, subdomain takeover, cache poisoning
    auth            JWT, OAuth/OIDC, default credentials, rate limit bypass
    logic           race conditions, workflow bypass, GraphQL, file upload

    after specialists finish, an adversarial QA agent reviews every finding and tries to disprove it. only findings that survive scrutiny reach the report.

    interactive mode (--swarm --interactive) generates a prompt for Claude Code Agent Teams, spawning each specialist in its own tmux pane for real-time observation.

    validation and benchmarks

    five benchmark suites covering every major standard:

    validation ladder (quick smoke tests)

    ./scripts/validation.sh fixture          # built-in test fixture
    ./scripts/validation.sh dvwa-low         # DVWA security=low
    ./scripts/validation.sh juice-shop       # OWASP Juice Shop
    ./scripts/validation.sh all              # run everything

    XBOW benchmark (the gold standard for AI pentesters, 104 CTF challenges)

    ./scripts/xbow-benchmark.sh --limit 10  # first 10 challenges
    ./scripts/xbow-benchmark.sh             # all 104 (1-2 hours)

    each challenge gets a random flag injected at build time. crowbar must extract the flag through exploitation to pass. directly comparable to Shannon (96.15%) and XBOW (85%). also runs on GitHub Actions (workflow_dispatch).

    API security (OWASP API Top 10)

    ./scripts/benchmark-apis.sh crapi       # crAPI: BOLA, mass assignment, auth bypass, SSRF
    ./scripts/benchmark-apis.sh vampi       # VAmPI: SQLi, BOLA, enumeration, rate limiting
    ./scripts/benchmark-apis.sh             # both

    DAST comparison (pentest-tools.com methodology: TP/FP/FN scoring)

    ./scripts/benchmark-dast.sh dvwa        # DVWA with vuln manifest scoring
    ./scripts/benchmark-dast.sh crystals    # Broken Crystals (React/Node.js)
    ./scripts/benchmark-dast.sh             # both, produces TP/FP/FN rates

    scores against known vulnerability manifests, directly comparable to Acunetix, Burp Suite, Qualys, Rapid7, ZAP from the 2024 pentest-tools.com benchmark.
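
    TP/FP/FN scoring against a manifest is a set comparison. the sketch below assumes findings and manifest entries share string identifiers; the function name and the identifier format are hypothetical and are not taken from the benchmark scripts.

```typescript
interface ManifestScore {
  tp: number; // reported and expected
  fp: number; // reported but not expected
  fn: number; // expected but not reported
  precision: number;
  recall: number;
}

function scoreAgainstManifest(expected: string[], reported: string[]): ManifestScore {
  const expectedSet = new Set(expected);
  const reportedIds = Array.from(new Set(reported)); // de-duplicate reports
  const tp = reportedIds.filter((id) => expectedSet.has(id)).length;
  const fp = reportedIds.length - tp;
  const fn = expectedSet.size - tp;
  return {
    tp,
    fp,
    fn,
    // Guard against division by zero when either side is empty.
    precision: reportedIds.length === 0 ? 0 : tp / reportedIds.length,
    recall: expectedSet.size === 0 ? 0 : tp / expectedSet.size,
  };
}
```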

    OWASP Top 10 coverage (HTB AI Range equivalent)

    ./scripts/benchmark-htb.sh              # maps findings to all 10 OWASP categories

    tests against Juice Shop with OWASP Top 10 category mapping. also documents HackTheBox MCP integration for Cursor (see .cursor/mcp.json.example).

    honeypot detection

    analyzes target responses for 5 honeypot signals before wasting attack budget: suspiciously high vuln rate (>90%), fake SQL errors on non-SQL input, artificially consistent timing, known honeypot signatures (HFish, Cowrie, T-Pot), fingerprint mismatches.
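
    two of these signals are easy to picture numerically. the sketch below checks the >90% hit rate from the text and flags timing whose coefficient of variation is near zero; the ProbeStats shape, the 2% variation cutoff, and the five-sample minimum are assumptions for illustration only.

```typescript
// Hypothetical per-target probe statistics.
interface ProbeStats {
  attempts: number;
  hits: number;
}

// Signal: suspiciously high vulnerability rate (threshold from the text above).
function vulnRateSuspicious(stats: ProbeStats, threshold = 0.9): boolean {
  return stats.attempts > 0 && stats.hits / stats.attempts > threshold;
}

// Signal: artificially consistent timing, measured as a low coefficient of variation.
// minCv and the minimum sample size are illustrative assumptions.
function timingSuspicious(latenciesMs: number[], minCv = 0.02): boolean {
  if (latenciesMs.length < 5) return false; // too few samples to judge
  const mean = latenciesMs.reduce((sum, t) => sum + t, 0) / latenciesMs.length;
  if (mean === 0) return true;
  const variance =
    latenciesMs.reduce((sum, t) => sum + (t - mean) * (t - mean), 0) / latenciesMs.length;
  return Math.sqrt(variance) / mean < minCv;
}
```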

    pipeline integration

    crowbar is the third tool in a security pipeline:

    pinata (white-box) scans source code, outputs gaps.json

    whackamole (gray-box) attacks known endpoints, generates and verifies fixes

    crowbar (black-box) needs nothing. optionally pass --gaps gaps.json for guided priorities

    pinata (knows code)  -->  whackamole (knows gaps)  -->  crowbar (knows nothing)

    safety

    • scope enforcement: every request checked against domain allowlist. DNS pre-resolution. private IP blocking
    • banned targets: .gov, .mil, .edu, major platforms blocked by default
    • rate limiting: configurable with hard cap at 100 req/s. adaptive slowdown on 429
    • destructive prevention: no DELETE/DROP by default. explicit flag required
    • request logging: every request/response as compressed JSONL audit trail
    • confirmation prompt: explicit yes/no before first attack
    • cost cap: AI budget enforced. scan completes gracefully if exceeded
    • payload randomization: shuffled per scan to avoid fingerprinting
    • honeypot detection: aborts or warns before wasting budget on decoys
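
    a few of these checks are simple enough to sketch. the helpers below are hypothetical and only illustrate the idea: exact-match host allowlisting (matching the comma-separated --scope form shown earlier), an IPv4 private-range test, and clamping a requested rate to the 100 req/s hard cap. IPv6 and DNS pre-resolution are out of scope for this sketch.

```typescript
// True only if the URL parses and its host is exactly on the allowlist.
function inScope(url: string, allowlist: string[]): boolean {
  let host: string;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return false; // unparseable URLs are never in scope
  }
  return allowlist.some((domain) => host === domain.trim().toLowerCase());
}

// Loopback, RFC 1918, and link-local IPv4 ranges.
function isPrivateIPv4(ip: string): boolean {
  if (!/^\d{1,3}(\.\d{1,3}){3}$/.test(ip)) return false;
  const parts = ip.split(".").map(Number);
  if (parts.some((p) => p > 255)) return false;
  const a = parts[0];
  const b = parts[1];
  return (
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

// Never exceed the hard cap, whatever the user requests.
function clampRate(requestedPerSec: number, hardCap = 100): number {
  return Math.min(requestedPerSec, hardCap);
}
```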

    compliance

    vulnerability reports include regulatory compliance references:

    • PCI-DSS (payment card security)
    • OWASP (web application security)
    • HIPAA (healthcare data)
    • SOC2 (service organization controls)
    • GDPR (data protection)

    tech stack

    • TypeScript, Node.js 20+
    • Playwright (browser crawling, DOM XSS, client-side pollution, postMessage)
    • commander.js (CLI)
    • zod (runtime type validation)
    • Anthropic SDK + OpenAI SDK (AI brain)
    • vitest (testing)
    • tsup (bundling)

    development

    npm run dev      # watch mode
    npm test         # run tests (1249 tests)
    npm run build    # production build (571KB)
    npm run lint     # type check

    this tool is for authorized security testing only. users must have explicit written permission to test targets. unauthorized access to computer systems is illegal. report findings responsibly.

    license

    MIT