Package Exports

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (@workos/skills) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

@workos/skills

WorkOS skills for AI coding agents. 39 skills covering AuthKit, SSO, Directory Sync, RBAC, Vault, Migrations, API references, and more.

Install

npx skills add workos/skills

Works with Claude Code, Cursor, Codex, Goose, and any agent that supports the skills.sh format.

Skills

AuthKit

Skill	Description
`workos-authkit-nextjs`	Next.js App Router integration
`workos-authkit-react`	React SPA integration
`workos-authkit-react-router`	React Router v6/v7 integration
`workos-authkit-tanstack-start`	TanStack Start integration
`workos-authkit-vanilla-js`	Vanilla JS integration
`workos-authkit-base`	Framework detection and routing

Features

Skill	Description
`workos-sso`	Single Sign-On with SAML/OIDC
`workos-directory-sync`	User directory sync from IdPs
`workos-rbac`	Role-based access control
`workos-vault`	Encrypted data storage
`workos-events`	Webhook event handling
`workos-audit-logs`	Compliance audit logging
`workos-admin-portal`	Self-service admin portal
`workos-mfa`	Multi-factor authentication
`workos-custom-domains`	Custom domain configuration
`workos-email`	Email delivery configuration
`workos-widgets`	Embeddable UI components

Routers

Skill	Description
`workos`	Identify which skill to load based on the user's task
`workos-integrations`	Provider lookup table for 60+ IdP integrations

Migrations

Skill	Description
`workos-migrate-auth0`	Migrate from Auth0
`workos-migrate-firebase`	Migrate from Firebase Auth
`workos-migrate-clerk`	Migrate from Clerk
`workos-migrate-aws-cognito`	Migrate from AWS Cognito
`workos-migrate-stytch`	Migrate from Stytch
`workos-migrate-supabase-auth`	Migrate from Supabase Auth
`workos-migrate-descope`	Migrate from Descope
`workos-migrate-better-auth`	Migrate from Better Auth
`workos-migrate-other-services`	Migrate from custom auth
`workos-migrate-the-standalone-sso-api`	Upgrade standalone SSO to AuthKit

API References

Skill	Description
`workos-api-sso`	SSO API endpoints
`workos-api-authkit`	AuthKit/User Management API endpoints
`workos-api-directory-sync`	Directory Sync API endpoints
`workos-api-audit-logs`	Audit Logs API endpoints
`workos-api-organization`	Organizations API endpoints
`workos-api-events`	Events/Webhooks API endpoints
`workos-api-vault`	Vault API endpoints
`workos-api-roles`	Roles & Permissions API endpoints
`workos-api-widgets`	Widgets API endpoints
`workos-api-admin-portal`	Admin Portal API endpoints

Development

Generate skills

# Generate only what's changed (skips skills with matching source hash)
pnpm generate

# Force regenerate all skills from scratch
pnpm generate -- --force

# Generate + AI refinement (requires ANTHROPIC_API_KEY)
pnpm generate -- --refine

# Force regenerate + refine all
pnpm generate -- --refine --force

# Refine a single skill (only writes that skill, safe for iteration)
pnpm generate -- --refine-only=workos-sso --force

Test

pnpm test

How it works

Fetch — downloads llms.txt (URL index) and llms-full.txt (full docs) from workos.com
Parse — splits docs into sections by ## Name {#anchor} boundaries
Split — applies per-section strategies (single, per-subsection, per-api-domain) to produce skill specs
Generate — transforms specs into summary + guide pairs (or API ref stubs) with source hash markers
Refine (optional) — calls Anthropic API to transform doc prose into procedural agent instructions. Runs at concurrency 5. API ref stubs skip refinement.
Quality gate — automated rubric scoring (summaries, guides, stubs, legacy) + LLM semantic check against domain feedback
Write — skips files with matching source hash (content-addressed locking)

Progressive disclosure

Each generated skill produces two files:

Summary (workos-sso.md) — lightweight routing doc (<1KB). When to Use, Key Vocabulary (entity names + ID prefixes), guide pointer, Related Skills. Agent loads this first.
Guide (workos-sso.guide.md) — full implementation (7-11KB). Fetch docs, decision trees, one language-agnostic code example, verification commands, error recovery. Agent reads this only when implementing.

API reference skills use a third format:

Stub (workos-api-sso.guide.md) — deterministic endpoint table + doc pointer (~1KB). No LLM refinement needed.

Content-addressed locking

Each skill has a marker embedding a SHA-256 hash of its source doc content:

<!-- generated:sha256:abc123def456 -->   (scaffold)
<!-- refined:sha256:abc123def456 -->     (after refinement)

generate skips files where the hash matches — only skills whose upstream docs changed get regenerated. Use --force to bypass.

Feedback system

Per-skill .feedback.md files encode domain expert corrections:

# Feedback for workos-directory-sync

## Corrections

- WorkOS supports both webhooks AND the Events API for directory sync.
  Do not claim webhooks are mandatory or that polling is not supported.

## Emphasis

- The dsync.deleted event does NOT trigger individual user/group delete
  events. This is a common trap — emphasize it.

Feedback is loaded by scripts/lib/feedback.ts and injected into refiner prompts so the LLM knows constraints upfront. ## Corrections are hard rules; ## Emphasis are soft guidance.

Hand-crafted vs generated

Hand-crafted (6 AuthKit skills) — never overwritten by the generator
Generated (33 skills) — produced by scripts/generate.ts, refined via Anthropic API, split into summary + guide pairs
Excluded (5 sections) — skipped during generation: FGA, magic-link, pipes, domain-verification, feature-flags

Eval framework

The eval framework measures whether skills actually improve agent-generated code. Each eval case runs the same prompt with and without the skill, scores both outputs against expected signals (methods, params, imports, flow steps, hallucinations), and reports the delta.

# Quick check — verify cases load without API calls
pnpm eval -- --dry-run

# Full run (42 cases, ~$1.70)
pnpm eval -- --no-cache

# Filter by product, language, or single case
pnpm eval -- --no-cache --product=sso
pnpm eval -- --no-cache --lang=python
pnpm eval -- --no-cache --case=sso-node-basic

# Run with regression gates (fails on negative product deltas)
pnpm eval -- --no-cache --fail-on-regression

# Measure variance — run each case N times, report mean ± stddev (~$1.70 × N)
pnpm eval -- --no-cache --samples=2
pnpm eval -- --no-cache --samples=2 --save-all-samples  # persist all sample outputs

Eval tooling

Review and label eval results without re-running:

# Side-by-side transcript diff with signal highlighting
pnpm eval:diff -- --case=sso-node-basic

# Add human judgment (ship/no-ship) for scorer calibration
pnpm eval:label -- --case=sso-node-basic --ship=yes --who=nick --reason="correct SSO flow"

# Compare scorer decisions against human labels
pnpm eval:calibrate

Interpreting results

Delta	Meaning
Positive (≥ +8% generated, ≥ +15% hand-crafted)	Skill helps
Zero	Skill adds no value (LLM already knows)
Negative	Skill hurts — investigate skill content

The triage report (printed after each run) surfaces the top 10 riskiest cases automatically — negative deltas, high variance, hallucination regressions.

Eval case files

Cases live in scripts/eval/cases/*.yaml. Each defines a prompt, expected signals, and known hallucinations. See existing cases for the format.

License

MIT