Package Exports
- @peekthenpay/peek-json-spec
- @peekthenpay/peek-json-spec/peek-manifest-factory
- @peekthenpay/peek-json-spec/pricing-schema
- @peekthenpay/peek-json-spec/pricing-schema-factory
- @peekthenpay/peek-json-spec/schema
Peek-Then-Pay (peek.json Specification)
Usage-based pricing and bilateral reporting for AI-era content licensing
Specification Status: This document provides an INFORMATIVE overview of the Peek-Then-Pay system. For NORMATIVE implementation requirements, see the document index below.
Document Index
| Document | Status | Purpose |
|---|---|---|
| README.md | 🔵 Informative | Overview, architecture, and rationale |
| Normative Intent Definitions | 🔴 Normative | REQUIRED: Core intent categories, JWT security, usage contexts |
| peek.json Field Reference | 🔴 Normative | REQUIRED: Manifest field definitions and schema compliance |
| Edge Enforcement Guide | 🟡 Recommended | BEST PRACTICES: Implementation patterns and architecture |
| Tool Service API | 🟡 Recommended | GUIDELINES: Service integration patterns |
| License API | 🔵 Informative | API examples and usage patterns |
| Usage Context Guide | 🔵 Informative | Context explanations and examples |
Legend:
- 🔴 Normative = MUST implement for compliance (uses RFC 2119 keywords)
- 🟡 Recommended = SHOULD implement for consistency
- 🔵 Informative = MAY reference for guidance
The Problem & Solution
Current AI-content relationships are binary: publishers either allow unlimited crawling or block AI agents entirely with paywalls. This binary approach creates problems:
- Publishers lose AI visibility: Blocked content doesn't appear in AI-powered search and recommendations
- No access control granularity: Publishers can't differentiate between free discovery content and premium monetized content
- Agents can't make informed decisions: No way to preview content value before committing to licensing costs
Peek-Then-Pay provides the missing "movie preview" model: when AI agents encounter license-gated content, they receive a preview/peek of the content along with clear licensing terms, enabling informed access decisions.
This allows publishers to:
- Control monetization boundaries - decide what content should be freely discoverable vs. license-gated
- Maintain AI discoverability - provide previews so content still appears in AI search and recommendations
- Enable informed licensing - agents can evaluate content value before paying for full access
Core Innovation: Intent-Based Pricing for Pre-Transformed Content
Raw content pricing is difficult - what's a webpage "worth" and for what purpose? Peek-Then-Pay solves this by combining intent-specific transformations with usage-based pricing:
| Usage Context | Frequency | What Agents Get | Value Proposition |
|---|---|---|---|
| `immediate` | ~60% | Clean summaries, translations | Clear, actionable results vs. raw HTML |
| `session` | ~25% | Structured Q&A, embeddings | Ready-to-use context for multi-turn chat |
| `index` | ~10% | Publisher embeddings, metadata | Pre-computed vectors vs. DIY processing |
| `train` | <3% | Training-ready datasets | Curated, clean data for fine-tuning |
| `distill` | <2% | Knowledge graphs, structured data | Semantic understanding vs. raw text |
| `audit` | <1% | Provenance, attribution data | Compliance-ready content access |
Economic Benefits for Both Sides
For AI Systems:
- Clear value pricing - pay for specific transformations (summarization, embeddings) rather than ambiguous "content access"
- CPU/time savings - receive pre-processed, clean data instead of raw HTML parsing and transformation
- Access to publisher investments - leverage embeddings and preprocessing publishers already create for their own AI features
For Publishers:
- Monetize existing AI investments - publishers already create embeddings for on-site search/chat; licensed access distributes costs across multiple AI systems
- Shared infrastructure costs - one embedding computation serves multiple licensed AI agents vs. each agent computing separately
- Value-aligned pricing - charge based on what agents actually receive (structured data, embeddings) rather than arbitrary "page access"
Architecture Overview
    AI Agent → Bot Detection → Edge Enforcer → [Tool Service] → Content + tracking_id
       ↓            ↓               ↓                ↓                   ↓
    License JWT  Classify        Validate        Transform        Bilateral Usage
                 Traffic         Budget          (optional)       Reporting

Key Components:
- Edge Enforcement: Publishers validate licenses and manage budgets at CDN/edge layer
- Bilateral Reporting: Both enforcer and agent report usage for accuracy and dispute resolution
- Composable Tooling: Optional content transformation via REST or MCP protocols
- Usage Context Pricing: Different retention policies enable fair, nuanced pricing models
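As a rough illustration of bilateral reporting, the sketch below compares the enforcer's and the agent's usage reports and flags entries that disagree. Only `tracking_id` comes from the architecture diagram above; the `UsageReport` shape, the `units` field, and the `reconcile` function are hypothetical names chosen for this example, and the sketch assumes each `tracking_id` appears at most once per side.

```typescript
// Hypothetical report shape: only tracking_id appears in this README.
interface UsageReport {
  tracking_id: string;
  units: number; // assumed unit of usage (e.g. requests or tokens)
}

type Discrepancy =
  | { tracking_id: string; kind: "missing_from_agent"; enforcerUnits: number }
  | { tracking_id: string; kind: "missing_from_enforcer"; agentUnits: number }
  | { tracking_id: string; kind: "mismatch"; enforcerUnits: number; agentUnits: number };

// Compare both sides' reports and return every entry that does not agree.
function reconcile(enforcer: UsageReport[], agent: UsageReport[]): Discrepancy[] {
  const agentById = new Map(agent.map((r) => [r.tracking_id, r.units] as const));
  const seen = new Set<string>();
  const out: Discrepancy[] = [];

  for (const r of enforcer) {
    seen.add(r.tracking_id);
    const agentUnits = agentById.get(r.tracking_id);
    if (agentUnits === undefined) {
      out.push({ tracking_id: r.tracking_id, kind: "missing_from_agent", enforcerUnits: r.units });
    } else if (agentUnits !== r.units) {
      out.push({ tracking_id: r.tracking_id, kind: "mismatch", enforcerUnits: r.units, agentUnits });
    }
  }

  // Anything the agent reported that the enforcer never saw.
  for (const r of agent) {
    if (!seen.has(r.tracking_id)) {
      out.push({ tracking_id: r.tracking_id, kind: "missing_from_enforcer", agentUnits: r.units });
    }
  }
  return out;
}
```

An empty result means both sides agree; any returned entry is a candidate for the dispute-resolution process described in the License API document.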
The "Peek" Mechanism: When AI agents request license-gated content, they receive:
- Content preview/snippet - enough to understand value proposition
- `peek.json` manifest - available licensing terms and pricing
- Informed choice - agents can decide whether full content access justifies the licensing cost
This ensures publishers maintain AI discoverability while enabling fair monetization of premium content access.
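The agent side of the peek mechanism might look roughly like the sketch below. It takes an already-parsed response and decides whether to use the content, acquire a license, or skip. The `PeekResponse` and `LicenseOption` shapes, the `priceCents` field, and the `decideOnPeek` function are all assumptions made for illustration; the normative field names live in the peek.json Field Reference.

```typescript
// Illustrative shapes only; not the normative peek.json schema.
interface LicenseOption {
  usageContext: string; // e.g. "immediate" or "session"
  priceCents: number;   // hypothetical flat price for this context
}

interface PeekResponse {
  status: number;       // 402 signals license-gated content
  preview: string;      // snippet returned alongside the 402
  options: LicenseOption[];
}

type Decision =
  | { action: "use_content" }
  | { action: "acquire_license"; option: LicenseOption }
  | { action: "skip"; reason: string };

// Decide what to do with a response, given a price ceiling and a relevance check.
function decideOnPeek(
  res: PeekResponse,
  wantedContext: string,
  maxPriceCents: number,
  isRelevant: (preview: string) => boolean,
): Decision {
  if (res.status === 200) return { action: "use_content" };
  if (res.status !== 402) return { action: "skip", reason: `unexpected status ${res.status}` };
  if (!isRelevant(res.preview)) return { action: "skip", reason: "preview not relevant" };

  const option = res.options.find((o) => o.usageContext === wantedContext);
  if (!option) return { action: "skip", reason: "no terms for requested usage context" };
  if (option.priceCents > maxPriceCents) return { action: "skip", reason: "price exceeds budget" };
  return { action: "acquire_license", option };
}
```

Passing the relevance check in as a callback keeps the decision logic separate from however the agent scores previews.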
The specification provides standardized contracts across discrete boundaries to maintain and
Specification Components
- `peek.json` manifest: Publisher content discovery and terms
- `pricing.schema.json`: Usage-based pricing configuration
- License API: JWT licensing with bilateral usage reporting
- Edge Enforcement Guide: Publisher implementation patterns
- Tool Service API: Content transformation services (REST/MCP)
- Usage Context Guide: Retention policies and pricing implications
- Normative Intent Definitions: Standard AI interaction patterns
Standard Intents: read, quote, summarize, embed, translate, analyze, qa, search, rag_ingest
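To make the pairing of intents and usage contexts concrete, here is a minimal sketch that validates both values and looks up a price. The intent and usage-context names are the ones listed in this README; the `PriceTable` shape, the `quoteCents` function, and the example prices are hypothetical and do not reflect `pricing.schema.json`.

```typescript
const STANDARD_INTENTS = [
  "read", "quote", "summarize", "embed", "translate",
  "analyze", "qa", "search", "rag_ingest",
] as const;
type Intent = (typeof STANDARD_INTENTS)[number];

const USAGE_CONTEXTS = ["immediate", "session", "index", "train", "distill", "audit"] as const;
type UsageContext = (typeof USAGE_CONTEXTS)[number];

function isIntent(value: string): value is Intent {
  return (STANDARD_INTENTS as readonly string[]).indexOf(value) !== -1;
}

function isUsageContext(value: string): value is UsageContext {
  return (USAGE_CONTEXTS as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical price table keyed by intent, then usage context (prices in cents).
type PriceTable = Partial<Record<Intent, Partial<Record<UsageContext, number>>>>;

// Returns the price in cents, or undefined when the pair is unknown or unpriced.
function quoteCents(table: PriceTable, intent: string, context: string): number | undefined {
  if (!isIntent(intent) || !isUsageContext(context)) return undefined;
  return table[intent]?.[context];
}

// Illustrative numbers only.
const examplePrices: PriceTable = {
  summarize: { immediate: 1, session: 3 },
  embed: { index: 10 },
};
```

Returning `undefined` rather than throwing lets a caller treat "not offered" and "not recognized" the same way: no license is requested.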
For historical context, see From robots.txt to peek.json.
How It Works
- Content Discovery: Publishers serve `/.well-known/peek.json` manifests defining licensing terms
- Peek Response: License-gated content returns 402 Payment Required + content preview + licensing options
- Informed Licensing: Agents evaluate preview, choose appropriate usage context, acquire JWT license
- Edge Enforcement: Publishers validate licenses and manage usage-based budgets at CDN/edge layer
- Bilateral Reporting: Both enforcers and agents report usage for billing accuracy and dispute resolution
- Composable Tooling: Optional content transformation via publisher or third-party services
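The edge enforcement step above can be sketched as a pure decision function. This is a simplified illustration, not the Edge Enforcement Guide's design: it assumes the JWT signature has already been verified elsewhere, and the `intents` and `budgetCents` claims, the `enforce` function, and the choice to answer every denial with 402 are assumptions for this example. Only `exp` is a standard registered JWT claim.

```typescript
// Claims assumed to come from an already signature-verified JWT.
interface LicenseClaims {
  exp: number;         // standard JWT expiry, in seconds since the epoch
  intents: string[];   // hypothetical claim: intents this license covers
  budgetCents: number; // hypothetical claim: total spend allowed
}

interface EnforcementResult {
  status: 200 | 402;
  reason: string;
  remainingCents?: number;
}

// Decide whether a request may proceed and how much budget would remain.
function enforce(
  claims: LicenseClaims | null,
  intent: string,
  costCents: number,
  spentCents: number,
  nowSeconds: number,
): EnforcementResult {
  if (claims === null) return { status: 402, reason: "no license presented" };
  if (claims.exp <= nowSeconds) return { status: 402, reason: "license expired" };
  if (claims.intents.indexOf(intent) === -1) return { status: 402, reason: "intent not licensed" };

  const remaining = claims.budgetCents - spentCents - costCents;
  if (remaining < 0) return { status: 402, reason: "budget exhausted" };
  return { status: 200, reason: "ok", remainingCents: remaining };
}
```

Keeping the function free of I/O means the same logic can run in a CDN worker or a unit test; the caller supplies the clock and the running spend.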
For Publishers
- Stay in Control: Enforce access policies directly at your domain edge (via Workers/CDNs), without ceding content to third-party proxies.
- Simple Monetization: Define pricing once, and rely on a central License Server to manage payments and operator accounts.
- AI-Ready by Default: Provide optional transforms (summarization, search, ingestion) so your content is consistently represented in AI systems.
- Extend Your Reach: Smaller publishers gain visibility in a shared marketplace, surfacing in AI discovery where they might otherwise be missed.
- Brand Integrity: Ensure that when your content is summarized, ingested, or used in AI contexts, it reflects your voice and standards.
For LLMs & Agents
- Unified Access: Discover participating publishers automatically through `peek.json` manifests.
- One Integration, Many Publishers: Acquire licenses and handle payments centrally, without negotiating with thousands of sites individually.
- Lower Compute Costs: Use publisher-provided search, summarization, and transforms to avoid expensive, repeated crawling and context building.
- Structured Contracts: Operate within a clear legal and technical framework, reducing risk and improving compliance.
- Extensible Tooling: Access publisher-defined tools (via REST or MCP) for specialized use cases (training ingestion, semantic search, etc.).
Key Components
- Publisher: Hosts `peek.json`, implements edge enforcement, provides optional tooling services
- License Server: Centralized JWT licensing, usage-based pricing, bilateral usage reporting
- Edge Enforcer: Publisher-hosted CDN/worker that validates licenses and manages local budgets
- Bot Detection: Professional services (Cloudflare Enterprise, etc.) for AI traffic classification
- Tool Services: Configurable content transformation via REST or MCP protocols
Implementation Flexibility: Publishers choose build vs. buy for each component while maintaining interoperability through standardized APIs and JWT licensing.
For technical implementation details, see Validation Utilities.
A Starting Point
Peek-Then-Pay is not a finished solution to every challenge; it is a starting point. By defining standards for discovery, licensing, enforcement, and tooling, it creates the groundwork for a fair and extensible ecosystem.
This project is open source and community-driven. By working together, publishers, operators, and developers can evolve it into a standard that ensures the web remains both sustainable for creators and usable for AI systems.
Contributing
Contributions are welcome:
- Propose changes to the specification
- Improve documentation and examples
- Build reference implementations
- Develop publisher or operator tooling
Together, we can make Peek-Then-Pay the contract of trust between publishers and the AI systems that depend on their content.