JSPM

  • License Apache-2.0

Record tests with your AI agent. Replay them forever, for free. Two npx commands: `st4ck author <url> "<steps>"` + `st4ck run <file>`. No install required.

Package Exports

  • st4ck
  • st4ck/dist/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (st4ck) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

st4ck

Record tests with your AI agent. Replay them forever, for free.

Your AI coding agent already drives a browser when it tests features by hand. st4ck turns that walkthrough into a deterministic, replayable test — no LLM calls on replay, no SDK to learn, no service to sign up for.

npx st4ck@alpha author https://app.example.com "Sign in as alice and verify the dashboard loads"
npx st4ck@alpha run ./tests/sign-in-as-alice-and-verify-the-dashboard-loads.md

Two lines. Any agent CLI (Claude Code, Cursor, Windsurf, Zed, plain claude). No install. No account.


Three beats

1. Agent ergonomics. A 10-primitive vocabulary (click, fill, wait_until, evaluate, snapshot, press, select, check_box, hover, upload) plus line-delimited JSON over stdin/stdout. Your agent already knows how to drive a browser; this is the smallest surface that lets it.

2. Authoring-by-use. Your agent isn't writing a test — it's using the site. The recording is the test. Walk through the flow once; you get a markdown file you can rerun, version, and share.

3. Deterministic free replay. Recorded md files have zero LLM calls. Reruns are pure Playwright, 10–15× faster than the recording, $0 marginal cost. Run them in CI, on a cron, or whenever you push.


How it works

  st4ck author <url> "<instruction>"
           │
           ▼
  writes .st4ck/session.md (skill teaching your agent the IPC palette)
           │
           ▼
  your agent reads the skill, spawns the runner in record mode
           │
           ▼
  runner pauses, agent drives via JSON IPC (snapshot → click → fill → continue)
           │
           ▼
  runner writes ./tests/<slug>.md
           │
  ──────  weeks later  ──────
           ▼
  st4ck run ./tests/<slug>.md  →  deterministic replay, no LLM

Per ADR-007, the skill is just a markdown file. No magic. No vendored protocol. Inspect it. Edit it. Hand it to a teammate.


What you get free vs paid

Capability                                        st4ck (free, this package)   st4ck paid platform
10-primitive IPC vocab                            yes                          yes
Agent-driven recording                            yes                          yes
Deterministic markdown replay                     yes                          yes
Locator-priority ladder (Tier-1 self-heal)        yes                          yes
Auto-wait + strict locators (Playwright-grade)    yes                          yes
Block-format teaching methodology                 yes                          yes
LLM self-heal on selector drift (Tier-2)          no                           yes
Cross-project knowledge base                      no                           yes
TRIAD attestation + server-enforced review        no                           yes
12-point review checklist with cross-validation   no                           yes
Agent Teams authoring orchestration               no                           yes
PRD/spec/dev-task binding + impact analysis       no                           yes
Coverage reporting against intent sources         no                           yes
Security test generation pipeline (4-phase)       no                           yes
Multi-project + multi-environment orchestration   no                           yes

Per ADR-001, this is explicitly a lite tier — you're getting the ergonomic substrate for free. The paid platform layers on the enforcement engine, the team orchestration, and the dev-loop integration. If you're a solo dev recording a few tests for your own product, free is the answer. If you're a team that needs signed-means-passing test attestation tied to your PRD and dev tasks, the paid platform exists for that.

→ Full platform: st4ck.io


Commands

st4ck author <url> "<instruction>" [options]

Writes .st4ck/session.md, the skill that teaches your agent the IPC palette and carries your instruction. Prints a one-line summary on stdout and exits 0.

Options:

Flag               Default                    Description
--out <path.md>    tests/<slug>.md            Where the recorded test should land
--name <slug>      derived from instruction   Override the test name slug

The slug is the kebab-cased instruction, capped at 60 chars.
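
The slug rule can be sketched in a few lines. This is a minimal illustration, not st4ck's actual code: the helper name `slugify` and the handling of punctuation and trailing hyphens are assumptions; only kebab-casing and the 60-character cap come from the text above.

```typescript
// Hypothetical helper. The only documented rules are "kebab-cased" and
// "capped at 60 chars"; everything else here is an assumption.
const MAX_SLUG_LENGTH = 60;

function slugify(instruction: string): string {
  const kebab = instruction
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse each non-alphanumeric run into one hyphen
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
  // Cap the length, then drop any hyphen left dangling by the cut.
  return kebab.slice(0, MAX_SLUG_LENGTH).replace(/-+$/, "");
}
```

Under these assumptions, the instruction from the two-line demo maps to the file name used there: sign-in-as-alice-and-verify-the-dashboard-loads.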

st4ck run <file.md> [--headless]

Replays a recorded md file. Spawns the st4ck-runner process with --test-file <file> --no-mcp, inherits stdio, and mirrors the runner's exit code.

Exit codes:

  • 0 — all blocks passed
  • 1 — one or more blocks failed (or usage error)
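
If you wrap the replay in your own script, the argument list and the exit-code contract can be captured as below. This is a sketch: `buildRunArgs` and `interpretExitCode` are hypothetical names, and treating every non-zero code as a failure is an assumption, since only 0 and 1 are documented.

```typescript
type ReplayOutcome = "passed" | "failed-or-usage-error";

// Arguments for `npx`, following `st4ck run <file.md> [--headless]`.
// The `st4ck@alpha` package spec is taken from the two-line demo.
function buildRunArgs(testFile: string, headless = false): string[] {
  const args = ["st4ck@alpha", "run", testFile];
  if (headless) args.push("--headless");
  return args;
}

// 0 means all blocks passed; 1 means a failed block or a usage error.
// Assumption: any other non-zero code is treated as a failure too.
function interpretExitCode(code: number): ReplayOutcome {
  return code === 0 ? "passed" : "failed-or-usage-error";
}
```

The array can be handed to Node's `child_process.spawn("npx", args, { stdio: "inherit" })`, and the child's exit code passed to `interpretExitCode`.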

st4ck install [--update | --force] [--repo <url>]

Once the two-line demo has hooked you, this command bridges into the free Claude Code plugin without leaving the npx flow. It clones st4ck-lite into your Claude Code plugins directory:

  • macOS / Linux: ~/.claude/plugins/st4ck-lite/
  • Windows: %USERPROFILE%\.claude\plugins\st4ck-lite\

The plugin adds slash commands and skills (/qa-record-test, etc.) on top of the runner. Restart your Claude Code session after install to pick up the new commands.
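
To check where the plugin should have landed, the two documented locations can be computed as follows. A sketch only: `pluginDir` is a hypothetical helper, and how st4ck itself resolves the home directory is not stated here.

```typescript
// Hypothetical helper. `platform` takes the values of Node's process.platform
// ("win32" on Windows); `homeDir` is the user's home directory.
function pluginDir(platform: string, homeDir: string): string {
  const sep = platform === "win32" ? "\\" : "/";
  return [homeDir, ".claude", "plugins", "st4ck-lite"].join(sep);
}
```

In a Node script, `pluginDir(process.platform, os.homedir())` would give the directory for the current machine.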

Options:

Flag            Behavior
--update        git-pull an existing install (no re-clone)
--force         re-clone over an existing install (deletes first)
--repo <url>    override the source git URL (default: github.com/edo-ceder/st4ck-lite)

Marketplace alternative: once the lite plugin is approved into the Claude Code marketplace, you can also install it via /plugin install st4ck-lite from inside Claude Code.


Recording protocol

When your agent starts the runner in record mode, the runner prints one envelope on stdout:

{"type":"agentic_pause","brief":"Sign in as alice...","page_url":"...","block_index":0,"test_case_id":"record-mode","execution_id":null}

The agent then writes line-delimited JSON commands to the runner's stdin. Each command produces a result envelope on stdout; wait for it before sending the next command.

Op          Shape (locators are accessibility-first)                        Notes
click       {"op":"click","locator":{"by":"testid","value":"sign-in"}}
fill        {"op":"fill","locator":{...},"value":"alice@example.com"}
press       {"op":"press","key":"Enter","locator":{...}}
select      {"op":"select","locator":{...},"value":"opt-1"}
check_box   {"op":"check_box","locator":{...},"checked":true}
hover       {"op":"hover","locator":{...}}
upload      {"op":"upload","locator":{...},"files":["/abs/path"]}
wait_until  {"op":"wait_until","args":{"kind":"visible","locator":{...}}}
evaluate    {"op":"evaluate","js":"location.pathname"}
snapshot    {"op":"snapshot"}                                               observation-only, not recorded
url         {"op":"url"}                                                    observation-only, not recorded
continue    {"op":"continue"}                                               finalize and write the md file
abort       {"op":"abort","reason":"..."}                                   discard and exit 1

Locators come in seven shapes, in priority order. Six of them are shown here:

{"by":"testid","value":"email-input"}
{"by":"role","value":"button","options":{"name":"Sign in"}}
{"by":"label","value":"Email address"}
{"by":"placeholder","value":"you@example.com"}
{"by":"text","value":"Forgot password?"}
{"by":"css","value":"form > .submit"}

File layout

After st4ck author:

your-repo/
├── .st4ck/
│   └── session.md      ← skill teaching your agent the IPC palette
└── tests/
    └── <slug>.md       ← recorded test (created by the runner on `continue`)

tests/<slug>.md is plain markdown — frontmatter (name, base_url, created_at) plus a ## Blocks heading and a fenced JSON block carrying the primitive sequence. Human-editable, diffable, mergeable.
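
Because the file is plain markdown, it can be read with a few lines of code. The sketch below assumes the frontmatter sits between two `---` lines as `key: value` pairs and that the first fenced block under the ## Blocks heading holds the JSON; neither detail is specified above, and `parseRecordedTest` is a hypothetical helper.

```typescript
// Three backticks, built this way only to avoid a literal fence inside this example.
const FENCE = "`".repeat(3);

interface RecordedTest {
  frontmatter: Record<string, string>; // e.g. name, base_url, created_at
  blocks: unknown;                     // the primitive sequence; schema not specified here
}

function parseRecordedTest(markdown: string): RecordedTest {
  // Assumption: frontmatter is a run of `key: value` lines between two `---` lines.
  const frontmatter: Record<string, string> = {};
  const fmBody = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown)?.[1] ?? "";
  for (const line of fmBody.split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon > 0) {
      frontmatter[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
    }
  }

  // Assumption: the first fenced block after the "## Blocks" heading is the JSON.
  const blockPattern = new RegExp(
    "^## Blocks[\\s\\S]*?^" + FENCE + "[^\\n]*\\n([\\s\\S]*?)\\n" + FENCE,
    "m"
  );
  const json = blockPattern.exec(markdown)?.[1];
  if (json === undefined) {
    throw new Error("no fenced block found under '## Blocks'");
  }
  return { frontmatter, blocks: JSON.parse(json) };
}
```

A script like this is enough to list recorded tests by base_url or to lint the primitive sequence in CI without launching a browser.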


Requirements

  • Node 18+
  • An agent CLI that runs Bash (Claude Code, Cursor, Windsurf, Zed, plain claude, or just you typing JSON into a terminal — works either way)
  • Playwright's Chromium binary (downloaded automatically on first run)

No account. No API key. No service.


License

Apache 2.0. See LICENSE.


Status

0.1.0-alpha.0 — public alpha. Phase 3 of the LLM-native platform plan. The CLI surface and md format are stable; the underlying runner and skill content will evolve through 0.1.x based on alpha feedback.