Package Exports
- st4ck
- st4ck/dist/index.js
This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (st4ck) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
Readme
st4ck
Record tests with your AI agent. Replay them forever, for free.
Your AI coding agent already drives a browser when it tests features by hand. st4ck turns that walkthrough into a deterministic, replayable test — no LLM calls on replay, no SDK to learn, no service to sign up for.
npx st4ck@alpha author https://app.example.com "Sign in as alice and verify the dashboard loads"
npx st4ck@alpha run ./tests/sign-in-as-alice-and-verify-the-dashboard-loads.mdTwo lines. Any agent CLI (Claude Code, Cursor, Windsurf, Zed, plain claude). No install. No account.
Three beats
1. Agent ergonomics. A 10-primitive vocabulary (click, fill, wait_until, evaluate, snapshot, press, select, check_box, hover, upload) plus line-delimited JSON over stdin/stdout. Your agent already knows how to drive a browser; this is the smallest surface that lets it.
2. Authoring-by-use. Your agent isn't writing a test — it's using the site. The recording is the test. Walk through the flow once; you get a markdown file you can rerun, version, and share.
3. Deterministic free replay. Recorded md files have zero LLM calls. Reruns are pure Playwright, 10–15× faster than the recording, $0 marginal cost. Run them in CI, on a cron, or whenever you push.
How it works
st4ck author <url> "<instruction>"
│
▼
writes .st4ck/session.md (skill teaching your agent the IPC palette)
│
▼
your agent reads the skill, spawns the runner in record mode
│
▼
runner pauses, agent drives via JSON IPC (snapshot → click → fill → continue)
│
▼
runner writes ./tests/<slug>.md
│
────── weeks later ──────
▼
st4ck run ./tests/<slug>.md → deterministic replay, no LLMPer ADR-007, the skill is just a markdown file. No magic. No vendored protocol. Inspect it. Edit it. Hand it to a teammate.
What you get free vs paid
| Capability | st4ck (free, this package) |
st4ck paid platform |
|---|---|---|
| 10-primitive IPC vocab | ✓ | ✓ |
| Agent-driven recording | ✓ | ✓ |
| Deterministic markdown replay | ✓ | ✓ |
| Locator-priority ladder (Tier-1 self-heal) | ✓ | ✓ |
| Auto-wait + strict locators (Playwright-grade) | ✓ | ✓ |
| Block-format teaching methodology | ✓ | ✓ |
| LLM self-heal on selector drift (Tier-2) | — | ✓ |
| Cross-project knowledge base | — | ✓ |
| TRIAD attestation + server-enforced review | — | ✓ |
| 12-point review checklist with cross-validation | — | ✓ |
| Agent Teams authoring orchestration | — | ✓ |
| PRD/spec/dev-task binding + impact analysis | — | ✓ |
| Coverage reporting against intent sources | — | ✓ |
| Security test generation pipeline (4-phase) | — | ✓ |
| Multi-project + multi-environment orchestration | — | ✓ |
Per ADR-001, this is explicitly a lite tier — you're getting the ergonomic substrate for free. The paid platform layers on the enforcement engine, the team orchestration, and the dev-loop integration. If you're a solo dev recording a few tests for your own product, free is the answer. If you're a team that needs signed-means-passing test attestation tied to your PRD and dev tasks, the paid platform exists for that.
→ Full platform: st4ck.io
Commands
st4ck author <url> "<instruction>" [options]
Writes .st4ck/session.md with the skill teaching your agent the IPC palette and the user's intent. Prints a one-line summary on stdout. Exits 0.
Options:
| Flag | Default | Description |
|---|---|---|
--out <path.md> |
tests/<slug>.md |
Where the recorded test should land |
--name <slug> |
derived from instruction | Override the test name slug |
The slug is the kebab-cased instruction, capped at 60 chars.
st4ck run <file.md> [--headless]
Replays a recorded md file. Spawns the st4ck-runner process with --test-file <file> --no-mcp, inherits stdio, mirrors the runner's exit code.
Exit codes:
0— all blocks passed1— one or more blocks failed (or usage error)
st4ck install [--update | --force] [--repo <url>]
Once the two-line demo has hooked you, this command bridges into the free Claude Code plugin without leaving the npx flow. Clones st4ck-lite into the user's Claude Code plugins directory:
- macOS / Linux:
~/.claude/plugins/st4ck-lite/ - Windows:
%USERPROFILE%\.claude\plugins\st4ck-lite\
The plugin adds slash commands and skills (/qa-record-test, etc.) on top of the runner. Restart your Claude Code session after install to pick up the new commands.
Options:
| Flag | Behavior |
|---|---|
--update |
git-pull an existing install (no re-clone) |
--force |
re-clone over an existing install (deletes first) |
--repo <url> |
override the source git URL (default: github.com/edo-ceder/st4ck-lite) |
Marketplace alternative: when the lite plugin is approved into the Claude Code marketplace, you can also install via /plugin install st4ck-lite from inside CC.
Recording protocol
When your agent runs the runner in record mode, it gets one envelope on stdout:
{"type":"agentic_pause","brief":"Sign in as alice...","page_url":"...","block_index":0,"test_case_id":"record-mode","execution_id":null}Then it writes line-delimited JSON commands to stdin. Each emits a result envelope; wait for it before the next.
| Op | Shape (locators are accessibility-first) |
|---|---|
click |
{"op":"click","locator":{"by":"testid","value":"sign-in"}} |
fill |
{"op":"fill","locator":{...},"value":"alice@example.com"} |
press |
{"op":"press","key":"Enter","locator":{...}} |
select |
{"op":"select","locator":{...},"value":"opt-1"} |
check_box |
{"op":"check_box","locator":{...},"checked":true} |
hover |
{"op":"hover","locator":{...}} |
upload |
{"op":"upload","locator":{...},"files":["/abs/path"]} |
wait_until |
{"op":"wait_until","args":{"kind":"visible","locator":{...}}} |
evaluate |
{"op":"evaluate","js":"location.pathname"} |
snapshot |
{"op":"snapshot"} (observation-only, not recorded) |
url |
{"op":"url"} (observation-only, not recorded) |
continue |
{"op":"continue"} — finalize and write the md file |
abort |
{"op":"abort","reason":"..."} — discard and exit 1 |
Locators come in seven shapes, in priority order:
{"by":"testid","value":"email-input"}
{"by":"role","value":"button","options":{"name":"Sign in"}}
{"by":"label","value":"Email address"}
{"by":"placeholder","value":"you@example.com"}
{"by":"text","value":"Forgot password?"}
{"by":"css","value":"form > .submit"}File layout
After st4ck author:
your-repo/
├── .st4ck/
│ └── session.md ← skill teaching your agent the IPC palette
└── tests/
└── <slug>.md ← recorded test (created by the runner on `continue`)tests/<slug>.md is plain markdown — frontmatter (name, base_url, created_at) plus a ## Blocks heading and a fenced JSON block carrying the primitive sequence. Human-editable, diffable, mergeable.
Requirements
- Node 18+
- An agent CLI that runs Bash (Claude Code, Cursor, Windsurf, Zed, plain
claude, or just you typing JSON into a terminal — works either way) - Playwright's Chromium binary downloads on first run
No account. No API key. No service.
License
Apache 2.0. See LICENSE.
Status
0.1.0-alpha.0 — public alpha. Phase 3 of the LLM-native platform plan. The CLI surface and md format are stable; the underlying runner and skill content will evolve through 0.1.x based on alpha feedback.