JSPM

  • ESM via JSPM
  • ES Module Entrypoint
  • Export Map
  • Keywords
  • License
  • Repository URL
  • TypeScript Types
  • README
  • Created
  • Published
  • Downloads 112
  • Score
    100M100P100Q91642F
  • License CC0-1.0

OSS spec and reference implementation for peek.json

Package Exports

  • @peekthenpay/peek-json-spec
  • @peekthenpay/peek-json-spec/peek-manifest-factory
  • @peekthenpay/peek-json-spec/pricing-schema
  • @peekthenpay/peek-json-spec/pricing-schema-factory
  • @peekthenpay/peek-json-spec/schema

Readme

Peek-Then-Pay (peek.json Specification)

Usage-based pricing and bilateral reporting for AI-era content licensing

πŸ“‹ Specification Status: This document provides an INFORMATIVE overview of the Peek-Then-Pay system. For NORMATIVE implementation requirements, see the document index below.

πŸ“š Document Index

Document Status Purpose
README.md πŸ”΅ Informative Overview, architecture, and rationale
Normative Intent Definitions πŸ”΄ Normative REQUIRED: Core intent categories, JWT security, usage contexts
peek.json Field Reference πŸ”΄ Normative REQUIRED: Manifest field definitions and schema compliance
Edge Enforcement Guide 🟑 Recommended BEST PRACTICES: Implementation patterns and architecture
Tool Service API 🟑 Recommended GUIDELINES: Service integration patterns
License API πŸ”΅ Informative API examples and usage patterns
Usage Context Guide πŸ”΅ Informative Context explanations and examples

Legend:

  • πŸ”΄ Normative = MUST implement for compliance (uses RFC 2119 keywords)
  • 🟑 Recommended = SHOULD implement for consistency
  • πŸ”΅ Informative = MAY reference for guidance

The Problem & Solution

Current AI-content relationships are binary: publishers either allow unlimited crawling or block AI agents entirely with paywalls. This binary approach creates problems:

  • Publishers lose AI visibility: Blocked content doesn't appear in AI-powered search and recommendations
  • No access control granularity: Publishers can't differentiate between free discovery content and premium monetized content
  • Agents can't make informed decisions: No way to preview content value before committing to licensing costs

Peek-Then-Pay provides the missing "movie preview" model: when AI agents encounter license-gated content, they receive a preview/peek of the content along with clear licensing terms, enabling informed access decisions.

This allows publishers to:

  1. Control monetization boundaries - decide what content should be freely discoverable vs. license-gated
  2. Maintain AI discoverability - provide previews so content still appears in AI search and recommendations
  3. Enable informed licensing - agents can evaluate content value before paying for full access

Core Innovation: Intent-Based Pricing for Pre-Transformed Content

Raw content pricing is difficult - what's a webpage "worth" and for what purpose? Peek-Then-Pay solves this by combining intent-specific transformations with usage-based pricing:

Usage Context Frequency What Agents Get Value Proposition
immediate ~60% Clean summaries, translations Clear, actionable results vs. raw HTML
session ~25% Structured Q&A, embeddings Ready-to-use context for multi-turn chat
index ~10% Publisher embeddings, metadata Pre-computed vectors vs. DIY processing
train <3% Training-ready datasets Curated, clean data for fine-tuning
distill <2% Knowledge graphs, structured data Semantic understanding vs. raw text
audit <1% Provenance, attribution data Compliance-ready content access

Economic Benefits for Both Sides

For AI Systems:

  • Clear value pricing - pay for specific transformations (summarization, embeddings) rather than ambiguous "content access"
  • CPU/time savings - receive pre-processed, clean data instead of raw HTML parsing and transformation
  • Access to publisher investments - leverage embeddings and preprocessing publishers already create for their own AI features

For Publishers:

  • Monetize existing AI investments - publishers already create embeddings for on-site search/chat; licensed access distributes costs across multiple AI systems
  • Shared infrastructure costs - one embedding computation serves multiple licensed AI agents vs. each agent computing separately
  • Value-aligned pricing - charge based on what agents actually receive (structured data, embeddings) rather than arbitrary "page access"

Architecture Overview

AI Agent β†’ Bot Detection β†’ Edge Enforcer β†’ [Tool Service] β†’ Content + tracking_id
   ↓              ↓             ↓              ↓                    ↓
License JWT   Classify      Validate       Transform           Bilateral Usage
              Traffic       Budget         (optional)          Reporting

Key Components:

  • Edge Enforcement: Publishers validate licenses and manage budgets at CDN/edge layer
  • Bilateral Reporting: Both enforcer and agent report usage for accuracy and dispute resolution
  • Composable Tooling: Optional content transformation via REST or MCP protocols
  • Usage Context Pricing: Different retention policies enable fair, nuanced pricing models

The "Peek" Mechanism: When AI agents request license-gated content, they receive:

  • Content preview/snippet - enough to understand value proposition
  • peek.json manifest - available licensing terms and pricing
  • Informed choice - agents can decide whether full content access justifies the licensing cost

This ensures publishers maintain AI discoverability while enabling fair monetization of premium content access.

The specification provides standardized contracts across discrete boundaries to maintain and

Specification Components

Standard Intents: read, quote, summarize, embed, translate, analyze, qa, search, rag_ingest

For historical context, see From robots.txt to peek.json.


How It Works

  1. Content Discovery β€” Publishers serve /.well-known/peek.json manifests defining licensing terms
  2. Peek Response β€” License-gated content returns 402 Payment Required + content preview + licensing options
  3. Informed Licensing β€” Agents evaluate preview, choose appropriate usage context, acquire JWT license
  4. Edge Enforcement β€” Publishers validate licenses and manage usage-based budgets at CDN/edge layer
  5. Bilateral Reporting β€” Both enforcers and agents report usage for billing accuracy and dispute resolution
  6. Composable Tooling β€” Optional content transformation via publisher or third-party services

For Publishers

  • Stay in Control β€” Enforce access policies directly at your domain edge (via Workers/CDNs), without ceding content to third-party proxies.
  • Simple Monetization β€” Define pricing once, and rely on a central License Server to manage payments and operator accounts.
  • AI-Ready by Default β€” Provide optional transforms (summarization, search, ingestion) so your content is consistently represented in AI systems.
  • Extend Your Reach β€” Smaller publishers gain visibility in a shared marketplace, surfacing in AI discovery where they might otherwise be missed.
  • Brand Integrity β€” Ensure that when your content is summarized, ingested, or used in AI contexts, it reflects your voice and standards.

For LLMs & Agents

  • Unified Access β€” Discover participating publishers automatically through peek.json manifests.
  • One Integration, Many Publishers β€” Acquire licenses and handle payments centrally, without negotiating with thousands of sites individually.
  • Lower Compute Costs β€” Use publisher-provided search, summarization, and transforms to avoid expensive, repeated crawling and context building.
  • Structured Contracts β€” Operate within a clear legal and technical framework, reducing risk and improving compliance.
  • Extensible Tooling β€” Access publisher-defined tools (via REST or MCP) for specialized use cases (training ingestion, semantic search, etc.).

Key Components

  • Publisher: Hosts peek.json, implements edge enforcement, provides optional tooling services
  • License Server: Centralized JWT licensing, usage-based pricing, bilateral usage reporting
  • Edge Enforcer: Publisher-hosted CDN/worker that validates licenses and manages local budgets
  • Bot Detection: Professional services (Cloudflare Enterprise, etc.) for AI traffic classification
  • Tool Services: Configurable content transformation via REST or MCP protocols

Implementation Flexibility: Publishers choose build vs. buy for each component while maintaining interoperability through standardized APIs and JWT licensing.

For technical implementation details, see Validation Utilities.


A Starting Point

Peek-Then-Pay is not a finished solution to every challengeβ€”it is a starting point. By defining standards for discovery, licensing, enforcement, and tooling, it creates the groundwork for a fair and extensible ecosystem.

This project is open source and community-driven. By working together, publishers, operators, and developers can evolve it into a standard that ensures the web remains both sustainable for creators and usable for AI systems.


Contributing

Contributions are welcome:

  • Propose changes to the specification
  • Improve documentation and examples
  • Build reference implementations
  • Develop publisher or operator tooling

Together, we can make Peek-Then-Pay the contract of trust between publishers and the AI systems that depend on their content.