scuttlerun

0.3.0
Multi-turn Claude session driver

    scuttlerun

    0.x. scuttlerun is in active development; minor versions may include breaking changes until 1.0.

    CLI-only. scuttlerun is intended to be used as a command-line tool. There is no supported programmatic API; modules under dist/ are implementation details and may change without notice.

    A TypeScript CLI that drives multi-turn Claude sessions programmatically using the Claude Agent SDK. scuttlerun simulates a synthetic user powered by an LLM oracle, enabling headless, scriptable, fully-observable interactions with Claude — including interactive tools like AskUserQuestion.

    Why scuttlerun?

    The closest alternatives each leave a gap that scuttlerun fills:

    • claude -p / Claude Code one-shot mode — single-turn only; cannot answer AskUserQuestion, cannot follow up, cannot drive a back-and-forth conversation.
    • Raw Claude Agent SDK — gives you the loop, but no synthetic user, no YAML-driven session config, no built-in transcript format, no project scaffolding, no sandboxing defaults.

    scuttlerun is a thin orchestration layer on top of the Agent SDK that adds those pieces. It is a session driver, not an eval framework — it produces transcripts; scoring/grading composes downstream (see GOALS.md for the full positioning).

    Where scuttlerun fits

    scuttlerun is one tool in a small UNIX-style pipeline for evaluating Claude sessions:

    • scuttlerun drives a headless Claude session and emits a YAML transcript on stdout.
    • pincenez takes that transcript (or any text) plus a checks file and emits structured YAML verdicts.
    • craboodle orchestrates many scuttlerun + pincenez invocations across a directory of eval scenarios, averaging across repetitions.

    scuttlerun composes by pipe — scuttlerun session.yaml | pincenez checks.yaml — but is independently useful for any task that needs a scripted, observable Claude session, with or without downstream grading.

    Demo: scuttlerun running an interactive session, with the synthetic user answering an AskUserQuestion call

    Source: assets/demo.tape (re-record with vhs assets/demo.tape).

    Installation

    Prerequisites: Node.js 20 or later (LTS recommended; CI tests on 20, 22, 24).

    From npm (recommended):

    npm install -g scuttlerun
    # or run without installing:
    npx scuttlerun@latest <session.yaml>

    From source:

    git clone https://github.com/bkudria/scuttlerun.git
    cd scuttlerun
    npm install
    npm run build
    npm link          # makes `scuttlerun` available globally

    Configuration

    scuttlerun requires an Anthropic API key:

    export ANTHROPIC_API_KEY=sk-ant-...

    Get one at console.anthropic.com.

    Quick Start

    scuttlerun examples/simple.yaml
    
    # Run with overrides
    scuttlerun examples/multi-turn.yaml --timeout 120 --model claude-sonnet-4-6

    Session Config

    Sessions are defined in YAML. Only prompt is required; every other field is optional.

    version: '1'
    prompt: |
      Write a haiku about the ocean and save it to ocean.txt

    Config Reference

    Field             Type                               Default
    version           string                             "1" (config schema version; only "1" is currently accepted)
    prompt            string                             (required)
    model             string                             claude-haiku-4-5
    max_turns         number                             50
    max_budget_usd    number
    effort            low | medium | high | xhigh | max  high
    tools             string[]                           [Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion, Skill]
    additional_tools  string[]                           (none; appended to tools after defaults apply; deduped first-wins)
    disallowed_tools  string[]
    permission_mode   string                             bypassPermissions
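
    The interaction between tools and additional_tools described in the table can be sketched as below. This is an illustration under stated assumptions, not scuttlerun's actual code: resolveTools is a hypothetical helper name, and WebFetch in the usage note is just an example tool name.

```typescript
// Default tool list, copied from the Config Reference table above.
const DEFAULT_TOOLS: string[] = [
  "Read", "Write", "Edit", "Bash", "Glob", "Grep", "AskUserQuestion", "Skill",
];

// Hypothetical helper: apply defaults first, then append additional_tools,
// then dedupe so that the first occurrence of each name wins.
function resolveTools(tools?: string[], additionalTools: string[] = []): string[] {
  const combined = [...(tools ?? DEFAULT_TOOLS), ...additionalTools];
  // A Set preserves insertion order, so earlier entries keep their position.
  return Array.from(new Set(combined));
}
```

    Under this sketch, resolveTools(undefined, ["Bash", "WebFetch"]) keeps the eight defaults in order and appends only WebFetch, because Bash is already present.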

    user (synthetic user)

    Field              Type    Default
    user.persona       string
    user.oracle_model  string  claude-haiku-4-5
    user.max_turns     number  0

    project (managed project scaffolding)

    When present, scuttlerun populates the project temp directory.

    Field              Type                    Default
    project.claude_md  string
    project.skills     string[]
    project.settings   object
    project.files      Record<string, string>
    project.git_init   boolean                 false

    project.files keys are relative paths written inside the temp project dir; values are the file contents. Useful for materializing fixtures, test data, or example source files alongside scaffolded CLAUDE.md/skills/settings.

    sdk (Agent SDK passthrough)

    Field                Type                                                         Default
    sdk.system_prompt    string | {preset: "claude_code", append?: string}            {preset: "claude_code"}
    sdk.thinking         {type: "adaptive"} | {type: "enabled"} | {type: "disabled"}
    sdk.mcp_servers      object
    sdk.agents           object
    sdk.plugins          {type: "local", path: string}[]
    sdk.env              Record<string, string>
    sdk.setting_sources  string[]                                                     ["project"] if project: present, else []

    sandbox (OS-level isolation)

    Enabled by default. Restricts the agent's filesystem and network access. When the sandbox is enabled, $HOME is redirected to <projectDir>/.home so tools (npm, pip, cargo) write caches inside the sandbox rather than your real home directory.

    Field                                Type      Default
    sandbox.enabled                      boolean   true
    sandbox.network.allowed_domains      string[]  [] (no network access)
    sandbox.network.allow_local_binding  boolean   false
    sandbox.filesystem.deny_read         string[]  [~/.ssh, ~/.aws, ~/.config/gcloud]
    sandbox.filesystem.allow_write       string[]  [] (cwd and /tmp are always writable)
    sandbox.filesystem.deny_write        string[]  [.env]

    Config Merging

    Multiple YAML files are deep-merged (objects merge, arrays/scalars replace):

    scuttlerun base.yaml scenario-override.yaml
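
    The merge rule above (objects merge, arrays and scalars replace) can be illustrated with a minimal sketch. It assumes that later files take precedence over earlier ones, which the base/override naming suggests but the README does not state outright. The names deepMerge and mergeAll are hypothetical and operate on already-parsed config objects; this is not scuttlerun's implementation.

```typescript
type Plain = { [key: string]: unknown };

// True for plain key/value objects; arrays and null are treated as leaf values.
function isPlainObject(value: unknown): value is Plain {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Merge `override` onto `base`: nested objects merge key by key,
// while arrays and scalars from `override` replace the base value outright.
function deepMerge(base: Plain, override: Plain): Plain {
  const out: Plain = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = out[key];
    out[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? deepMerge(existing, value)
        : value;
  }
  return out;
}

// Fold any number of configs left to right (assumed: later files win).
function mergeAll(...configs: Plain[]): Plain {
  return configs.reduce((acc, next) => deepMerge(acc, next), {});
}
```

    With this rule, an override that sets tools: [Bash] replaces the base tools array entirely rather than extending it, while an override that sets only sandbox.network.allowed_domains leaves sandbox.enabled from the base file untouched.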

    Usage

    scuttlerun <session.yaml> [override.yaml...] [options]
    scuttlerun --version
    scuttlerun --help

    Option                  Description
    --model <model>         Override agent model
    --oracle-model <model>  Override synthetic user model
    --prompt <text>         Override prompt
    --max-turns <n>         Override max agent turns
    --max-budget-usd <usd>  Override max session cost in USD
    --tools <tools>         Override tools (comma-separated)
    --effort <level>        Override effort level
    --timeout <seconds>     Session timeout (default: 300)
    -v, --verbose           Verbose logging to stderr
    -n, --dry-run           Validate and display resolved config

    Exit Codes

    This table is the canonical reference for the scuttlerun/pincenez/craboodle exit-code taxonomy. Each tool emits a subset; pincenez and craboodle link here for the full set. Each tool's src/exit-codes.ts defines only the codes that tool itself emits — see also pincenez/src/exit-codes.ts and craboodle/src/exit-codes.ts.

    Code  Meaning                                                          Emitted by
    0     Success                                                          scuttlerun, pincenez, craboodle
    1     Configuration / input error                                      scuttlerun, pincenez, craboodle
    2     Runtime error (SDK failure, process crash, unhandled exception)  scuttlerun, pincenez, craboodle
    3     Threshold failure (min_pass_rate ratchet)                        craboodle
    4     Infrastructure / dependency error                                craboodle
    5     Budget exceeded                                                  scuttlerun, craboodle
    6     Timeout                                                          scuttlerun
    7     Max turns exceeded                                               scuttlerun
    130   Interrupted (SIGINT)                                             scuttlerun, pincenez, craboodle
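
    A wrapper script that invokes scuttlerun may want to turn these codes into readable messages. The sketch below encodes the table as a lookup; the names EXIT_CODES, SCUTTLERUN_CODES, and describeExitCode are hypothetical, and the real src/exit-codes.ts may be organized differently.

```typescript
// Meanings copied from the exit-code table above.
const EXIT_CODES: Record<number, string> = {
  0: "Success",
  1: "Configuration / input error",
  2: "Runtime error",
  3: "Threshold failure",
  4: "Infrastructure / dependency error",
  5: "Budget exceeded",
  6: "Timeout",
  7: "Max turns exceeded",
  130: "Interrupted (SIGINT)",
};

// Subset the table lists as emitted by scuttlerun itself.
const SCUTTLERUN_CODES: ReadonlySet<number> = new Set([0, 1, 2, 5, 6, 7, 130]);

// Hypothetical helper: map a process exit status to a human-readable label.
function describeExitCode(code: number): string {
  const meaning = EXIT_CODES[code];
  if (meaning === undefined) return `Unknown exit code ${code}`;
  const source = SCUTTLERUN_CODES.has(code) ? "" : " (not emitted by scuttlerun)";
  return meaning + source;
}
```

    Codes 3 and 4 are flagged because, per the table, only craboodle emits them; seeing one from a bare scuttlerun invocation would be unexpected.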

    How It Works

    scuttlerun wraps the Claude Agent SDK's query() with an async generator for multi-turn input. Two key mechanisms:

    1. canUseTool callback — Intercepts AskUserQuestion calls. An LLM oracle (Haiku by default) answers questions consistent with the configured persona.

    2. Turn policy — After each agent turn, the oracle decides whether the synthetic user should send a follow-up (reactive mode) or end the session (single mode).
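
    The two mechanisms above can be pictured with a small, self-contained model. Everything here is a hypothetical simplification: the Oracle interface, handleToolCall, and nextUserMessage are invented names, the oracle is a synchronous stub rather than an LLM call, the single-question input shape is not the SDK's real AskUserQuestion schema, and the follow-up cap is an assumed reading of user.max_turns. It does not mirror the Agent SDK's types or scuttlerun's internals.

```typescript
// Hypothetical stand-in for the LLM oracle; the real one calls a model.
interface Oracle {
  answer(question: string, persona: string): string;
  // Returns the next synthetic-user message, or null to end the session.
  followUp(lastAssistantText: string, persona: string): string | null;
}

type ToolDecision =
  | { kind: "answered"; answer: string } // AskUserQuestion handled by the oracle
  | { kind: "passthrough" };             // any other tool proceeds normally

// Mechanism 1: intercept AskUserQuestion and let every other tool through.
function handleToolCall(
  tool: string,
  input: { question?: string }, // simplified, assumed input shape
  oracle: Oracle,
  persona: string,
): ToolDecision {
  if (tool === "AskUserQuestion" && input.question !== undefined) {
    return { kind: "answered", answer: oracle.answer(input.question, persona) };
  }
  return { kind: "passthrough" };
}

// Mechanism 2: after an agent turn, decide whether the synthetic user speaks again.
// Assumption: once the follow-up cap is reached, the session ends without asking the oracle.
function nextUserMessage(
  lastAssistantText: string,
  followUpsSent: number,
  maxFollowUps: number,
  oracle: Oracle,
  persona: string,
): string | null {
  if (followUpsSent >= maxFollowUps) return null;
  return oracle.followUp(lastAssistantText, persona);
}
```

    Injecting the oracle as an interface keeps both decisions testable with a stub, which is the main point of the sketch; a real driver would make these calls asynchronously.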

    Output

    scuttlerun streams a YAML transcript to stdout as the session runs:

    session: a1b2c3d4-e5f6-7890-abcd-ef1234567890
    config: /path/to/session.yaml
    project: /tmp/scuttlerun-project-xK3f9m
    transcript: ~/.claude/projects/-tmp-.../a1b2c3.jsonl
    
    conversation:
      - user: |
          Write a haiku about the ocean and save it to ocean.txt
    
      - assistant: |
          I'll write a haiku about the ocean.
    
      - tool: Write
        path: ocean.txt
    
      - assistant: |
          Done! I saved the haiku to ocean.txt.
    
    turns: 2
    tool_calls: 1
    duration_s: 12.3

    The output is valid YAML and machine-parseable (e.g. with yq).
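
    For quick checks that need only the summary counters, a full YAML parser is not strictly required. The sketch below pulls turns, tool_calls, and duration_s out of a transcript string with a regular expression; readSummary is a hypothetical helper, it assumes those keys appear unindented at the top level as in the sample above, and anything beyond this should use a real YAML parser.

```typescript
interface Summary {
  turns?: number;
  tool_calls?: number;
  duration_s?: number;
}

// Extract the top-level numeric summary fields from a transcript.
// The pattern is anchored at column 0, so indented conversation text is ignored.
function readSummary(transcript: string): Summary {
  const summary: Summary = {};
  for (const line of transcript.split("\n")) {
    const match = /^(turns|tool_calls|duration_s):\s*([0-9.]+)\s*$/.exec(line);
    if (match) {
      summary[match[1] as keyof Summary] = Number(match[2]);
    }
  }
  return summary;
}
```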

    Project directory — Always created in $TMPDIR as scuttlerun-project-<id>/, preserved after the session ends so you can inspect agent-created files. On the next run, scuttlerun garbage-collects its own scuttlerun-project-* directories that are older than 7 days; nothing else in $TMPDIR is touched.
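
    The garbage-collection rule can be expressed as a pure selection over directory entries. This is a sketch under assumptions: selectStaleProjects and DirEntry are hypothetical names, and measuring age by modification time is a guess, since the README does not say which timestamp scuttlerun uses.

```typescript
interface DirEntry {
  name: string;
  mtimeMs: number; // assumed age signal: last-modified time in milliseconds
}

const PROJECT_PREFIX = "scuttlerun-project-";
const MAX_AGE_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

// Return the names of scuttlerun project directories older than 7 days.
// Entries without the prefix are never selected, whatever their age.
function selectStaleProjects(entries: DirEntry[], nowMs: number): string[] {
  return entries
    .filter((e) => e.name.startsWith(PROJECT_PREFIX) && nowMs - e.mtimeMs > MAX_AGE_MS)
    .map((e) => e.name);
}
```

    Keeping the prefix check inside the filter is what guarantees the "nothing else in $TMPDIR is touched" property in this model.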

    SDK session file — Full conversation record in Claude Code's native JSONL format at ~/.claude/projects/<encoded-cwd>/<session-id>.jsonl. Queryable with jq.

    Privacy

    scuttlerun is a thin client around Anthropic APIs. Be aware:

    • What is sent to Anthropic. Prompts, tool inputs and outputs, conversation history, your configured persona, and oracle decisions are sent to Anthropic via the Claude Agent SDK (agent turns) and the Messages API (synthetic-user oracle). This includes any file contents the agent reads or writes during a session. Anthropic's handling of that data is governed by their Usage Policy and Privacy Policy.
    • What scuttlerun itself collects. Nothing. scuttlerun has no telemetry, analytics, crash reporting, or "phone home". The only network calls it makes are to Anthropic.
    • What stays local. The YAML transcript on stdout, the project temp directory under $TMPDIR/scuttlerun-project-*, and the SDK session JSONL under ~/.claude/projects/... are all written to your machine only. Nothing in those locations is uploaded.
    • Secrets. Your ANTHROPIC_API_KEY is read from the environment and forwarded to the SDK; it never appears in transcripts. The default sandbox denies the agent read access to ~/.ssh, ~/.aws, and ~/.config/gcloud, and denies write access to .env.

    Examples

    See examples/ for complete session configs, and examples/README.md for an index with a feature-coverage table.

    Development

    npm install
    npm run build        # TypeScript compilation
    npm run typecheck    # Type-check without emit (faster than build)
    npm test             # Run all tests (vitest)
    npm run test:watch   # Watch mode
    npm run dev -- examples/simple.yaml   # Run via tsx

    Contributing

    See Also

    License

    MIT