@blacklake-systems/surface-sdk

⚠️ Deprecated. This package is now part of the unified blacklake npm package. Install blacklake and import { govern } from 'blacklake'. See the migration doc for sed-style search-and-replace examples. This package will continue to ship as a thin re-export through the next two minor versions.

TypeScript SDK for BlackLake — AI control infrastructure and analytics.

Use this SDK when your code calls LLMs or tools directly (backend services, custom agents, batch jobs) and you want every consequential action on the same ledger as the MCP proxy, CI, shell, cloud audit ingest, and Depth workflows. bl.govern() returns the decision, bl.cost.record() attributes spend, and bl.decisions.verify() proves a receipt later.

Note: If you are routing tool calls through the MCP proxy, you do not need this SDK. The proxy handles governance automatically. Use the SDK when you want to call the governance API directly from your own code.

Install

npm install @blacklake-systems/surface-sdk

Quick Start

import { BlackLake } from '@blacklake-systems/surface-sdk';

const bl = new BlackLake({ apiKey: process.env.BLACKLAKE_API_KEY! });

const decision = await bl.govern({
  agent: 'my-bot',
  tool: 'send_email',
  action: { to: 'alice@example.com' },
});

switch (decision.decision) {
  case 'allow':
    // safe to proceed
    break;
  case 'approval_required':
    // wait for a human reviewer; decision.approval_id has the pending approval
    break;
  case 'deny':
  case 'default_deny':
    // not allowed. decision.reason explains why.
    throw new Error(`BlackLake denied: ${decision.reason}`);
}

default_deny is the fail-safe — it means no policy matched. Treat it the same as deny in your code; if you see it for a call you expected to allow, write a policy that matches the agent + tool selectors.

baseUrl defaults to https://api.blacklake.systems. No further configuration needed for the cloud product.

Local Surface

Run npx @blacklake-systems/surface-cli first to start Surface on your machine, then point the SDK at it:

import { BlackLake } from '@blacklake-systems/surface-sdk';

const bl = new BlackLake({
  baseUrl: 'http://localhost:3100',
  apiKey: process.env.BLACKLAKE_API_KEY!,
});

// Evaluate governance before executing a tool call
const result = await bl.govern({
  agent: 'expense-bot',
  tool: 'payments.send',
  action: { amount: 4200, vendor: 'Acme Corp' },
});

if (result.decision === 'allow') {
  // proceed with tool call
}

Or use the hosted Surface console at console.blacklake.systems when you need shared policies, approvals, budgets, exports, and team visibility.

Pairs with BlackLake Depth — the durable-execution runtime that survives crashes. Use Depth to run multi-step agent workflows; Surface evaluates each tool call inside them.

Response envelopes

Singular responses (GET /v1/agents/<id>, POST /v1/agents, etc.) are content-negotiated.

With Accept-Envelope: v2, the server returns { data: <resource>, ...metadata }. This applies to all singular routes that have been wired through singularEnvelope() server-side, not just agents.
With no header (legacy), the server returns the bare resource and emits the response header BlackLake-Singular-Envelope: deprecated; send "Accept-Envelope: v2" to opt in to the new shape. That header tells the caller to migrate.

The TS SDK ships Accept-Envelope: v2 on every request and peels .data automatically, so SDK callers always see the resource directly. Direct HTTP callers (curl, custom clients) opt in with the header.

List endpoints (GET /v1/agents, etc.) always return { data: T[], total, limit, offset, sort, order } regardless of the header.
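
For direct HTTP callers, the negotiation above might be handled with a few small helpers. This is a sketch rather than SDK code: `envelopeHeaders`, `peelSingular`, and `nextOffset` are hypothetical names, the shapes are copied from the description above, and typing `sort` and `order` as strings is an assumption.

```typescript
// Hypothetical helpers for direct HTTP callers; not part of the SDK.

// v2 singular response as described above: { data, ...metadata }.
interface SingularEnvelope<T> {
  data: T;
  [metadata: string]: unknown;
}

// List response as described above (sort/order types are assumed).
interface ListEnvelope<T> {
  data: T[];
  total: number;
  limit: number;
  offset: number;
  sort: string;
  order: string;
}

// Header that opts a request in to the v2 singular envelope.
function envelopeHeaders(): Record<string, string> {
  return { "Accept-Envelope": "v2" };
}

// Return the resource itself. Pass sentV2Header = false for legacy calls
// made without the header, where the server returns the bare resource.
function peelSingular<T>(body: unknown, sentV2Header: boolean): T {
  if (!sentV2Header) return body as T;
  return (body as SingularEnvelope<T>).data;
}

// Offset of the next page, or null when the list is exhausted.
function nextOffset(page: ListEnvelope<unknown>): number | null {
  const next = page.offset + page.limit;
  return next < page.total ? next : null;
}
```

SDK callers do not need any of this, since the SDK sends the header and peels .data itself.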

API Reference

`new BlackLake(config)`

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `apiKey` | `string` | (none) | Your BlackLake API key (required) |
| `baseUrl` | `string` | `https://api.blacklake.systems` | API base URL. Override only for local development (e.g. `http://localhost:3100`). |

`bl.govern(request)`

Evaluate whether an agent is allowed to invoke a tool.

const result = await bl.govern({
  agent: 'expense-bot',       // agent name
  tool: 'payments.send',      // tool name
  action: { amount: 4200 },   // optional: tool invocation payload
  context: { ip: '10.0.0.1' } // optional: request metadata
});

// result.decision: 'allow' | 'deny' | 'approval_required' | 'default_deny'
// result.evaluation_id: string
// result.policy_id: string | null
// result.reason: string
// result.evaluated_at: string (ISO 8601)
// result.approval_id: string | undefined (set when decision === 'approval_required')
// result.decision_token: string — HMAC-signed receipt binding (evaluation_id, decision).
//   Quote this when reporting a governance outcome to a downstream operator — they can
//   confirm via bl.decisions.verify(...) that the decision really came from this server,
//   not from an LLM hallucinating a denial.

Handle every decision explicitly. default_deny is returned when no policy matches and the agent has no binding for the tool — it is distinct from deny (an explicit deny policy matched). Treating it as a generic fallback is a footgun:

switch (result.decision) {
  case 'allow':              return await payments.send(payload);
  case 'deny':               throw new Error(`blocked: ${result.reason}`);
  case 'approval_required':  return awaitApproval(result.approval_id!);
  case 'default_deny':       throw new Error(
    `no matching policy or binding — register '${tool}' for '${agent}' or add an allow policy`,
  );
}
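
One way to keep every decision handled as the union evolves is a compile-time exhaustiveness check. The sketch below is an illustration, not SDK code: the `Decision` union mirrors the four documented values, while `NextStep`, `nextStep`, and `assertNever` are hypothetical names.

```typescript
// The four outcomes documented for result.decision.
type Decision = "allow" | "deny" | "approval_required" | "default_deny";

// Hypothetical next steps a caller might take; the labels are illustrative.
type NextStep = "execute" | "wait_for_approval" | "block" | "fix_policy";

// Accepts only `never`, so the call below stops compiling if a Decision
// variant is added and left unhandled.
function assertNever(value: never): never {
  throw new Error(`unhandled decision: ${String(value)}`);
}

function nextStep(decision: Decision): NextStep {
  switch (decision) {
    case "allow":
      return "execute";
    case "approval_required":
      return "wait_for_approval";
    case "deny":
      return "block"; // an explicit deny policy matched
    case "default_deny":
      return "fix_policy"; // no policy or binding matched
    default:
      return assertNever(decision);
  }
}
```

Both deny and default_deny still block the call here; the separate branch only changes what the operator is told to do next.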

`bl.agents`

await bl.agents.create({ name, environment, risk_classification, description?, approval_mode? });
await bl.agents.list({ environment?, status? });
await bl.agents.get(id);
await bl.agents.update(id, { name?, description?, environment?, risk_classification?, status?, approval_mode? });
await bl.agents.suspend(id);
await bl.agents.activate(id);
await bl.agents.bindTool(agentId, toolId);
await bl.agents.listTools(agentId);   // returns ToolBinding[] — each item has { binding_id, binding_created_at, tool: Tool }
await bl.agents.unbindTool(agentId, toolId);

`bl.tools`

await bl.tools.create({ name, risk_classification, description? });
await bl.tools.list();
await bl.tools.get(id);

`bl.policies`

await bl.policies.create({ name, priority, outcome, agent_selector?, tool_selector?, enabled? });
await bl.policies.list();
await bl.policies.get(id);
await bl.policies.update(id, { name?, priority?, outcome?, agent_selector?, tool_selector?, enabled? });
await bl.policies.delete(id);

`bl.evaluations`

await bl.evaluations.list({ agent_id?, tool_id?, outcome?, limit?, offset? });
await bl.evaluations.get(id);

Verifying decisions

LLM agents can fabricate text that looks like a denial — 'BlackLake denied this tool call' — without ever actually invoking the bridge. Decision tokens close that gap. Every honest govern() call returns an HMAC-signed token bound to (evaluation_id, decision); a hallucinated token fails verification. Use bl.decisions.verify(...) whenever you're acting on a governance outcome reported by an agent rather than the API directly.

import { BlackLake } from '@blacklake-systems/surface-sdk';
const bl = new BlackLake({ apiKey: process.env.BLACKLAKE_API_KEY! });

const decision = await bl.govern({
  agent: 'my-bot',
  tool: 'send_email',
  action: { to: 'alice@example.com' },
});

// Later (e.g. in an operator's audit tool, or a different process):
const verification = await bl.decisions.verify({
  evaluation_id: decision.evaluation_id,
  decision_token: decision.decision_token,
});

if (verification.valid) {
  console.log('Confirmed: this was a real BlackLake decision', verification.decision);
} else {
  console.warn('Token did not verify:', verification.reason);
}
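
When the outcome arrives second-hand from an agent, a guard like the following could sit in front of any code that acts on it. This is a sketch under assumptions: `ReportedOutcome` and `confirmReportedOutcome` are hypothetical names, the `Verify` type is a structural stand-in for bl.decisions.verify so the block runs without the SDK, and comparing the verified decision against the claimed one is an extra check added here, not documented SDK behaviour.

```typescript
// Structural stand-ins for the verify call shown above.
interface VerifyRequest {
  evaluation_id: string;
  decision_token: string;
}
interface VerifyResult {
  valid: boolean;
  decision?: string;
  reason?: string;
}
type Verify = (req: VerifyRequest) => Promise<VerifyResult>;

// What an agent claims happened (hypothetical shape).
interface ReportedOutcome {
  evaluation_id: string;
  decision_token: string;
  decision: string;
}

// Resolve to the decision only if the token verifies AND the verified
// decision matches what the agent reported; otherwise resolve to null.
async function confirmReportedOutcome(
  report: ReportedOutcome,
  verify: Verify,
): Promise<string | null> {
  const result = await verify({
    evaluation_id: report.evaluation_id,
    decision_token: report.decision_token,
  });
  if (!result.valid) return null;
  return result.decision === report.decision ? report.decision : null;
}
```

With the SDK, the second argument would be something like `(req) => bl.decisions.verify(req)`.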

`bl.organisation`

await bl.organisation.get();                                   // fetch the current organisation (derived from the API key)
await bl.organisation.delete(confirmation, reason?);           // permanently delete the organisation; pass the organisation's exact name as confirmation
await bl.organisation.reset(confirmation, reason?);            // wipe operational data without deleting the workspace; preserves users, API keys, sessions, billing; rate-limited 3/hour

reset() returns { reset_at, organisation_id, total_rows, counts, preserved, note } so tests can assert clean state. Same name-confirmation safeguard as delete().

`bl.apiKeys`

await bl.apiKeys.list();              // returns { keys: ApiKey[] } — each item has { id, name, key_suffix, created_at, revoked_at }
await bl.apiKeys.create('prod-key');  // returns { id, name, key, created_at, warning } — the raw key is shown ONCE; store it securely
await bl.apiKeys.revoke(id);          // sets revoked_at on the key; the API rejects revoking the key in use

`bl.approvals`

await bl.approvals.list({ status?, agent_id?, tool_id?, limit?, offset? });  // returns PaginatedResponse<Approval>
await bl.approvals.get(id);
await bl.approvals.status(id);                                               // returns ApprovalStatusResponse — lightweight poll target
await bl.approvals.approve(id, { decided_by, reason });
await bl.approvals.reject(id, { decided_by, reason });
await bl.approvals.breakGlass(id, { decided_by, reason });                  // emergency override; reason must be ≥ 40 chars and is surfaced in the audit trail
await bl.approvals.wait(id, { interval?, timeout? });                        // polls status until approved/rejected/expired; throws BlackLakeError on timeout

Both decided_by and reason are required and must be non-empty strings (the server enforces .min(1)). The reason lands in the receipt and is surfaced to webhook subscribers and the console audit trail.
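
A client can mirror these rules to fail fast before making the request. The sketch below is not part of the SDK: `validateDecisionInput` is a hypothetical helper, and it assumes the 40-character break-glass minimum is counted on the raw string.

```typescript
interface ApprovalDecisionInput {
  decided_by: string;
  reason: string;
}

type DecisionKind = "approve" | "reject" | "breakGlass";

// Documented minimum length of a break-glass reason.
const BREAK_GLASS_MIN_REASON = 40;

// Return a list of problems; an empty list means the input passes the
// documented checks (non-empty fields, longer reason for breakGlass).
function validateDecisionInput(
  input: ApprovalDecisionInput,
  kind: DecisionKind,
): string[] {
  const problems: string[] = [];
  if (input.decided_by.length < 1) problems.push("decided_by must be non-empty");
  if (input.reason.length < 1) problems.push("reason must be non-empty");
  if (kind === "breakGlass" && input.reason.length < BREAK_GLASS_MIN_REASON) {
    problems.push(`breakGlass reason must be at least ${BREAK_GLASS_MIN_REASON} characters`);
  }
  return problems;
}
```

The server remains the source of truth; a local check like this only saves a round trip.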

wait() defaults to polling every 2 000 ms with a 5-minute total timeout, then throws BlackLakeError with code APPROVAL_WAIT_TIMEOUT and HTTP status 408. These defaults are sensible for an interactive approval queue; for high-frequency agents set a shorter timeout (ms) and handle the throw.

try {
  const resolved = await bl.approvals.wait(result.approval_id!, { timeout: 30_000 });
  if (resolved.status === 'approved') { /* proceed */ }
  else                                 { /* rejected or expired — do NOT proceed */ }
} catch (err) {
  if (err instanceof BlackLakeError && err.code === 'APPROVAL_WAIT_TIMEOUT') {
    // queue for later, page a human, or reject the original request
  }
  else throw err;
}

wait() returns the fully populated Approval once the status leaves 'pending'. Always branch on resolved.status: wait() does not throw for rejected or expired, so the caller must inspect the resolved record.

`bl.webhooks`

await bl.webhooks.list({ limit?, offset?, sortBy?, order? });               // returns ListResult<Webhook>
await bl.webhooks.create({ url, events, enabled? });                         // returns CreatedWebhook — the raw signing secret is shown ONCE; store it securely
await bl.webhooks.get(id);
await bl.webhooks.update(id, { url?, events?, enabled? });
await bl.webhooks.delete(id);
await bl.webhooks.listDeliveries(id, { limit?, offset? });                   // returns ListResult<WebhookDelivery>
await bl.webhooks.test(id);                                                  // fire a synthetic delivery to validate routing + signing
await bl.webhooks.resendDelivery(id, deliveryId);                            // re-fire one prior delivery (re-signed with current secret + fresh timestamp)
await bl.webhooks.resendFailedDeliveries(id);                                // bulk replay every failed delivery (capped at 100)
await bl.webhooks.health(id);                                                // success rate, p50/p95 latency, last error, consecutive-failure count
await bl.webhooks.deadLetter(id, { limit?, offset? });                       // deliveries explicitly tagged status='dead'

The full event catalogue is:

type WebhookEvent =
  | 'approval.created'
  | 'approval.approved'
  | 'approval.rejected'
  | 'budget.threshold_crossed'
  | 'budget.limit_exceeded'
  | 'evaluation.created'
  | 'evaluation.denied'
  | 'evaluation.approval_required'
  | 'cost.recorded'
  | 'upstream.unhealthy'
  | 'upstream.recovered';

Each request is signed with HMAC-SHA256 over "<timestamp>.<raw_body>"; the signature is sent in the X-BlackLake-Signature header (format: sha256=<hex>) and the millisecond timestamp in X-BlackLake-Timestamp.

Verifying webhook signatures

Always verify the signature before trusting a webhook payload. The SDK ships a constant-time helper that uses the Web Crypto API (no Node crypto dependency, so it works in Cloudflare Workers, Deno, and browsers):

import express from 'express';
import { BlackLake, BlackLakeError } from '@blacklake-systems/surface-sdk';

const app = express();

// In your webhook handler (Express example):
app.post('/webhooks/blacklake', express.raw({ type: 'application/json' }), async (req, res) => {
  try {
    await BlackLake.verifyWebhookSignature({
      secret: process.env.BLACKLAKE_WEBHOOK_SECRET!,
      rawBody: req.body.toString('utf8'),          // the raw bytes, not JSON.stringify(req.body)
      signature: req.header('x-blacklake-signature')!,
      timestamp: req.header('x-blacklake-timestamp')!,
    });
  } catch (err) {
    if (err instanceof BlackLakeError && err.code === 'WEBHOOK_SIGNATURE_INVALID') {
      return res.status(401).end();
    }
    throw err;
  }
  // signature verified — safe to parse the body and act on it
  const event = JSON.parse(req.body.toString('utf8'));
  res.status(204).end();
});

The helper rejects on length mismatch, wrong prefix, or signature mismatch. It compares in constant time to avoid timing side channels.
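
For receivers that cannot use the SDK helper, the scheme described above can be sketched with Node's built-in crypto module. This is not the SDK's implementation (which uses Web Crypto); `signPayload` and `isValidSignature` are hypothetical names, and the sketch assumes the secret is used as a raw UTF-8 HMAC key.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Header value for a payload: "sha256=<hex>" over "<timestamp>.<raw_body>".
// Assumption: the secret is used as-is (UTF-8), not hex- or base64-decoded.
function signPayload(secret: string, timestamp: string, rawBody: string): string {
  const hex = createHmac("sha256", secret)
    .update(`${timestamp}.${rawBody}`)
    .digest("hex");
  return `sha256=${hex}`;
}

// True only when the header matches the recomputed signature.
function isValidSignature(
  secret: string,
  timestamp: string,
  rawBody: string,
  signature: string,
): boolean {
  if (!signature.startsWith("sha256=")) return false; // wrong prefix
  const expected = Buffer.from(signPayload(secret, timestamp, rawBody), "utf8");
  const received = Buffer.from(signature, "utf8");
  // timingSafeEqual throws on unequal lengths, so check length first.
  if (expected.length !== received.length) return false;
  return timingSafeEqual(expected, received);
}
```

As in the Express example, `rawBody` must be the exact bytes received; re-serialising parsed JSON can change the bytes and break the signature.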

`bl.cost`

Cost attribution, summaries, and pricing-catalogue inspection. Use record() whenever a call ran outside the proxy paths so spend stays on the same ledger as governance.

await bl.cost.record({                                                       // attribute one LLM call's spend back to an evaluation
  evaluation_id,
  agent, tool,
  provider, model,
  input_tokens, output_tokens,
  capture_path,                                                              // CapturePath enum, see below
});
await bl.cost.estimate({ provider, model, input_tokens, output_ceiling_tokens });   // pre-call estimate; feed into bl.govern({ estimate: ... })
await bl.cost.summary(period);                                               // 'day' | '7d' | '30d' | '90d' (default '30d')
await bl.cost.timeseries(period);                                            // one row per day for the spend chart
await bl.cost.decomposition(period);                                         // agent → tool → model → cost-component tree
await bl.cost.byEvaluation(evaluationId);                                    // every cost record bound to one evaluation + v2 decision token
await bl.cost.export('csv' | 'ndjson', period);                              // streaming export — pipe into jq, BigQuery, S3
await bl.cost.pricing(version?);                                             // priced-model catalogue with exact/prefix match tags
await bl.cost.orphans({ limit?, offset?, since? });                          // cost records not linked to a governed evaluation — coverage gaps

CapturePath enum:

type CapturePath =
  | 'manual' | 'mcp' | 'sdk' | 'ci' | 'shell'
  | 'cloud_audit' | 'existing_workflow_engine' | 'depth'
  | 'proxy';                                                                  // @deprecated alias for 'mcp'
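
// The helpers below are a hypothetical sketch, not SDK code. They show one way
// to assemble a record() payload and to map the deprecated 'proxy' alias to
// 'mcp'. Field names follow the record() call above; numeric token counts and
// the 'sdk' default are assumptions.

```typescript
type CapturePath =
  | "manual" | "mcp" | "sdk" | "ci" | "shell"
  | "cloud_audit" | "existing_workflow_engine" | "depth"
  | "proxy"; // deprecated alias for 'mcp'

interface CostRecordInput {
  evaluation_id: string;
  agent: string;
  tool: string;
  provider: string;
  model: string;
  input_tokens: number;
  output_tokens: number;
  capture_path: Exclude<CapturePath, "proxy">;
}

// Map the deprecated alias onto its replacement.
function normalizeCapturePath(path: CapturePath): Exclude<CapturePath, "proxy"> {
  return path === "proxy" ? "mcp" : path;
}

// Assemble the payload for one call; capture path defaults to 'sdk'
// (an assumption for calls made directly from application code).
function buildCostRecord(
  evaluationId: string,
  call: { agent: string; tool: string; provider: string; model: string },
  usage: { input_tokens: number; output_tokens: number },
  capturePath: CapturePath = "sdk",
): CostRecordInput {
  return {
    evaluation_id: evaluationId,
    ...call,
    ...usage,
    capture_path: normalizeCapturePath(capturePath),
  };
}
```

The result would be passed straight to bl.cost.record(...).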

`bl.budgets`

First-class budget primitive. The check runs at govern() time — exceeding a hard limit returns decision: 'deny' with denial_reason: 'budget'.

await bl.budgets.list();                                                     // returns Budget[] (peeled from the list envelope)
await bl.budgets.create({ name, scope_type, scope_id?, period, soft_limit_usd?, hard_limit_usd, enabled? });
await bl.budgets.get(id);
await bl.budgets.status(id);                                                 // current spend + projected hit dates for soft/hard
await bl.budgets.workspaceStatus();                                          // status for every enabled budget plus the "tightest" budget (least headroom)
await bl.budgets.update(id, patch);
await bl.budgets.delete(id);

`bl.insights`

Coverage, risk, drift, anomalies, baselines, and counterfactual reports. These power the console dashboards but are also fine to call directly from automation.

await bl.insights.coverage();                                                // actors + tools + capture-path attribution
await bl.insights.risk();                                                    // decision breakdown, top deniers, high-risk tools
await bl.insights.healthSnapshot();                                          // 7-day workspace digest
await bl.insights.drift();                                                   // cost change vs prior window + hypothesis hints
await bl.insights.coverageTrend(windowDays?);                                // densified per-day governed-vs-uncovered series (default 30)

await bl.insights.anomalies({ includeDismissed?, limit? });                  // active anomalies for the workspace
await bl.insights.recomputeAnomalies(windowDays?);                           // re-detect over the given window
await bl.insights.dismissAnomaly(id);                                        // stop surfacing one anomaly

await bl.insights.observations({ kind?, limit? });                           // workspace observation feed — anomalies, drift, hints, gaps
await bl.insights.baselines(windowDays?);                                    // per-(agent, tool) token + cost percentiles
await bl.insights.recomputeBaselines(windowDays?);

await bl.insights.modelChoice(windowDays?);                                  // per-(agent, tool) model usage comparison (≥ 2 models)
await bl.insights.modelSubstitution({ from, to, windowDays? });              // counterfactual: cost if every `from` call had used `to`

await bl.insights.explain(evaluationId);                                     // counterfactual + policies-considered for one evaluation

`bl.audit`

Export the audit ledger, ingest external events, and inspect coverage gaps.

// Export hot (Postgres) rows only — default, backward-compatible
const ndjson = await bl.audit.export({
  from: new Date(Date.now() - 30 * 86400_000),
  to: new Date(),
  kinds: ['evaluation', 'approval', 'action_result'],
});

// Include archived (GCS cold-storage) rows — BL-OPS-4b
// Use when your window predates the retention cutoff (default 90 days).
// Archived rows are prepended to live rows. At the hot/cold boundary global
// sort order is not guaranteed — re-sort client-side if strict ordering is
// required.
const fullNdjson = await bl.audit.export({
  from: new Date(Date.now() - 200 * 86400_000),
  to: new Date(),
  kinds: ['evaluation'],
  includeArchived: true,
});

// Ingest an external event for reconciliation (e.g. a GitHub Actions run)
const event = await bl.audit.ingest({
  source: 'github',
  source_event_id: 'run-123',
  event_type: 'workflow_run',
  resource: 'my-org/my-repo',
  occurred_at: new Date().toISOString(),
  payload: { conclusion: 'success' },
});

await bl.audit.listEvents({ source: 'github', limit: 50 });  // paginated list
await bl.audit.listUncovered({ limit: 25 });                  // events with no matched evaluation

Each line of the exported NDJSON has shape { type: 'evaluation' | 'approval' | 'action_result', data: { ... } }. The window is capped at 365 days server-side; for longer ranges call repeatedly with non-overlapping windows.
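
For ranges longer than the cap, the window splitting and line parsing might look like the sketch below. It is illustrative only: `splitWindows` and `parseNdjson` are hypothetical helpers, the export is assumed to resolve to a single NDJSON string, and adjacent windows share a boundary instant, so de-duplicate if the server treats `to` as inclusive.

```typescript
const DAY_MS = 86_400_000;

interface ExportWindow {
  from: Date;
  to: Date;
}

// Split [from, to) into consecutive windows of at most maxDays each.
function splitWindows(from: Date, to: Date, maxDays = 365): ExportWindow[] {
  const windows: ExportWindow[] = [];
  const end = to.getTime();
  let start = from.getTime();
  while (start < end) {
    const stop = Math.min(start + maxDays * DAY_MS, end);
    windows.push({ from: new Date(start), to: new Date(stop) });
    start = stop;
  }
  return windows;
}

type AuditKind = "evaluation" | "approval" | "action_result";

// One exported line, per the shape described above.
interface AuditLine {
  type: AuditKind;
  data: Record<string, unknown>;
}

// Parse NDJSON text into records, skipping blank lines.
function parseNdjson(text: string): AuditLine[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as AuditLine);
}
```

Each window would then be passed to bl.audit.export({ from, to, kinds }) in turn and the parsed lines concatenated.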

`bl.system`

await bl.system.mode();    // { mode: 'local' | 'cloud', api_key?: string }    — unauthenticated
await bl.system.health();  // { status: 'ok' }                                  — unauthenticated
await bl.system.me();      // { auth_mode, user?, api_key?, organisation }     — identify the calling actor
await bl.system.quota();   // { plan, used, limit, remaining, ... }            — plan + usage; limit is a number on the free tier and null on paid tiers

Use bl.system.mode() to detect whether you're talking to a local CLI-hosted Surface or the cloud one. me() resolves the actor for both session-cookie and API-key callers — the response includes the resolved organisation either way.

`bl.mcp`

bl.mcp.list/reconnect/rotate operate on the local-mode in-memory registry. bl.mcp.upstreams.* operates on the persistent org-scoped catalogue the cloud proxy reads from — that's where new upstreams live in production.

// Local registry
await bl.mcp.list();                                                          // { servers: McpServerStatus[], config_path: string }
await bl.mcp.reconnect(serverName);                                           // { connected, tools, error? }

// Org-scoped upstream catalogue
await bl.mcp.upstreams.list();                                                // { upstreams: McpUpstream[] }
await bl.mcp.upstreams.get(id);
await bl.mcp.upstreams.test(id);                                              // one-shot connection probe
await bl.mcp.upstreams.health(id);                                            // health sparkline + uptime estimate

// Rotate credentials for an upstream without losing its row or bindings.
// static_headers: pass new headers; the stored values are replaced in-place.
const rotated = await bl.mcp.rotate(upstreamId, { headers: { Authorization: 'Bearer new-key' } });
// → { rotation: 'headers_rotated', upstream_id, message, upstream }

// oauth2: clears the user's stored token and returns a fresh authorization URL.
// The caller must redirect the user to authorization_url to complete re-auth.
const reauth = await bl.mcp.rotate(oauthUpstreamId);
// → { rotation: 'oauth_reauth_required', authorization_url, state, expires_at }

Manage MCP upstream servers programmatically (status, forced reconnect, credential rotation). Same endpoints the console MCP Servers page uses.

bl.mcp.rotate() keeps the upstream row and all its agent/tool/policy bindings intact — it is the correct way to swap an API key that changed, or to force a user through OAuth consent again without deleting and recreating the upstream. For static_headers upstreams, include { headers: { ... } } in the options; for oauth2 upstreams the options argument is ignored and the response contains an authorization_url the user must visit. OAuth rotation requires a session-authenticated caller — org-scoped API keys will receive a 401 USER_AUTH_REQUIRED.
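
Because the two rotation outcomes carry different fields, callers may find it convenient to model the result as a discriminated union on `rotation`. The sketch below is an assumption-laden illustration: the field names come from the responses shown above, but their types (all strings, `upstream` left as unknown) and the `describeRotation` helper are hypothetical.

```typescript
interface HeadersRotated {
  rotation: "headers_rotated";
  upstream_id: string;
  message: string;
  upstream: unknown; // upstream record; shape not specified here
}

interface OauthReauthRequired {
  rotation: "oauth_reauth_required";
  authorization_url: string;
  state: string;
  expires_at: string;
}

type RotateResult = HeadersRotated | OauthReauthRequired;

// Narrow on the `rotation` tag and say what the caller should do next.
function describeRotation(result: RotateResult): string {
  switch (result.rotation) {
    case "headers_rotated":
      return `headers rotated for upstream ${result.upstream_id}`;
    case "oauth_reauth_required":
      return `redirect the user to ${result.authorization_url}`;
  }
}
```

In a web handler, the second branch would typically issue the redirect itself rather than return a string.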

Admin endpoints

Enterprise-audit surfaces exposed at the HTTP layer. There is no typed SDK method for these; call them directly over HTTP (for example with curl) using your usual BlackLake credentials.

GET /v1/admin/audit/events?action=&actor_user_id=&actor_api_key_id=&from=&to=&limit=&offset=

Privileged-action log: every operator action against the control plane (key creation, webhook edits, membership changes, etc.). Read-only; the recording side is wired into the mutation routes via recordAdminAction. Filters are AND'd. limit caps at 500; offset for pagination. Returns { events: AdminAuditEvent[], total, limit, offset }.
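
Since there is no typed method, a caller has to build the query string by hand. The sketch below shows one way to do that while respecting the documented limit cap. `adminAuditEventsUrl` is a hypothetical helper, the parameter names come from the route above, and treating from/to as strings is an assumption; authentication is left out because the required header is not specified here.

```typescript
interface AdminAuditFilters {
  action?: string;
  actor_user_id?: string;
  actor_api_key_id?: string;
  from?: string;
  to?: string;
  limit?: number;
  offset?: number;
}

// Documented server-side cap on `limit`.
const MAX_LIMIT = 500;

// Build the request URL, skipping unset filters and clamping limit to the cap.
function adminAuditEventsUrl(baseUrl: string, filters: AdminAuditFilters = {}): string {
  const url = new URL("/v1/admin/audit/events", baseUrl);
  for (const [key, value] of Object.entries(filters)) {
    if (value === undefined) continue;
    const normalized = key === "limit" ? Math.min(Number(value), MAX_LIMIT) : value;
    url.searchParams.set(key, String(normalized));
  }
  return url.toString();
}
```

Paging then amounts to calling the helper again with offset advanced by limit until offset reaches the returned total.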

GET /v1/admin/access-review

Point-in-time inventory of every active actor + credential in the workspace, the set an auditor attests to. Returns { organisation, members, api_keys, webhooks, mcp_upstreams, github_installations }. API key entries include the suffix only — never the raw key. Webhooks include URL + event subscription; MCP upstreams include auth_type, URL, and last_pinged_at.

Demo endpoints

Used by the console "Try it" buttons and integration test harnesses to populate / wipe a workspace with sentinel demo data (owner = 'demo'). Safe to call repeatedly; idempotent on the seed side.

POST /v1/demo/seed

If demo records already exist for this org, returns { status: 'already_seeded' } without re-inserting. On a fresh seed, creates a representative set of agents, tools, policies, bindings, evaluations, cost records, and budgets, and returns { agents, tools, policies, bindings, evaluations, cost_records, budgets } counts.

POST /v1/demo/clear

Removes every record owned by owner = 'demo' for the workspace (agents, tools, bindings, evaluations, cost records, budgets). Safe to call with no demo data; returns the counts of what was removed. Use this between integration test runs to keep a workspace clean without dropping the org.

Error Handling

import { BlackLake, BlackLakeError } from '@blacklake-systems/surface-sdk';

const bl = new BlackLake({ apiKey: process.env.BLACKLAKE_API_KEY! });

try {
  await bl.govern({ agent: 'unknown', tool: 'unknown' });
} catch (err) {
  if (err instanceof BlackLakeError) {
    console.error(err.status, err.code, err.message);
    if (err.isRetriable()) {
      // 5xx / 408 / 429 — safe to back off and retry
    }
  }
}

BlackLakeError.isRetriable() returns true for HTTP 5xx, 408 Request Timeout, and 429 Too Many Requests. 4xx client errors are not retriable — fix the request instead.
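
The retriable set above can be expressed as a plain predicate, paired with a backoff schedule for spacing out retries. This is a sketch, not the SDK's internals: `isRetriableStatus` and `backoffDelayMs` are hypothetical names, and the base delay and cap are illustrative values.

```typescript
// Mirrors the documented rule: 5xx, 408, and 429 are retriable.
function isRetriableStatus(status: number): boolean {
  return (status >= 500 && status <= 599) || status === 408 || status === 429;
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at capMs.
// `attempt` is zero-based. The default values are assumptions.
function backoffDelayMs(attempt: number, baseMs = 250, capMs = 8_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

A retry loop would call err.isRetriable() on the caught BlackLakeError, sleep for backoffDelayMs(attempt), and give up after a fixed number of attempts.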

Documentation

Full documentation at blacklake.systems/docs.