Package Exports
- @signalwire/docusaurus-plugin-llms-txt
- @signalwire/docusaurus-plugin-llms-txt/public
Readme
@signalwire/docusaurus-plugin-llms-txt
📣 Version 2.0 Documentation This documentation is for version 2.0, which includes breaking API changes. If you're using version 1.x, please refer to the v1.2.2 documentation on npm.
A Docusaurus plugin that transforms your documentation into AI-friendly formats. It automatically
converts your site's rendered HTML pages into clean markdown files and generates an llms.txt index
file, making your documentation easily consumable by Large Language Models while preserving the
human-readable experience.
Perfect for: API documentation, internal knowledge bases, developer resources, and any documentation that you want to make accessible to AI assistants, chatbots, or LLM-powered tools.
How It Works
This plugin processes your final HTML output after Docusaurus builds your site, not your source MDX/MD files. This approach captures fully rendered components, resolved data, and processed content that only exists after build time. The HTML is then converted through a sophisticated pipeline that extracts clean content, processes it through rehype/remark transformations, and outputs pristine markdown optimized for AI consumption.
Features
- 🔄 HTML to Markdown Conversion: Automatically converts your Docusaurus HTML pages to clean markdown files
- 📝 llms.txt Generation: Creates a comprehensive index file with links to all your documentation
- 🗂️ Section-Based Organization: Intuitive section-based organization with route precedence logic
- ⚡ Smart Caching: Efficient caching system for fast incremental builds
- 🎯 Content Filtering: Flexible filtering by content type (docs, blog, pages) and custom patterns
- 📎 File Attachments: Include local files (OpenAPI specs, schemas, guides) with YAML/JSON formatting preservation
- 🔗 External Links: Organize external URLs within sections or optional sections
- 💻 CLI Commands: Standalone CLI for generation and cleanup operations
- 🎨 Customizable Content Extraction: Configurable CSS selectors for precise content extraction
- 🔗 Link Management: Smart internal link processing with relative/absolute path options
Core Concepts
Output Files
llms.txt- Hierarchical index file with links to all your documentation, organized by sections- Individual markdown files - Clean .md versions of each page, mirroring your route structure
llms-full.txt- Optional single file containing all content, useful for complete exports
Section Organization
Content is organized into logical sections that help AI systems understand documentation structure. You can define sections in two ways:
Manual Sections
Define sections explicitly with custom names, descriptions, and route patterns:
sections: [
{
id: 'api-docs',
name: 'API Reference',
routes: [{ route: '/api/**' }],
},
];Auto-Generated Sections
For routes not matching any manual section, the plugin auto-generates sections based on URL path
segments. Use autoSectionDepth to control which path level becomes top-level sections:
With autoSectionDepth: 1 (group by first segment):
/blog/post-1.md→ "Blog" section/blog/post-2.md→ "Blog" section/docs/intro.md→ "Docs" section
With autoSectionDepth: 2 (group by second segment):
/docs/advanced/plugin.md→ "Advanced" section/docs/tutorial-basics/intro.md→ "Tutorial Basics" section/blog/post-1.md→ "Post 1" section (falls back to depth 1)
Routes shallower than autoSectionDepth automatically fall back to their actual depth, ensuring all
content is included.
Documents within each section are sorted by path hierarchy (depth-first, then lexicographic),
ensuring related content stays grouped together (e.g., /api/methods/* before /api/guides/*).
Content Processing Pipeline
HTML → Content Extraction (CSS selectors) → HTML Processing (rehype) → Markdown Conversion (remark) → Clean Output
Route Patterns
Use glob patterns like /docs/** or /api/* to filter and organize content. Routes determine both
what gets processed and how it's organized in sections.
Default Excluded Routes
The plugin automatically excludes common Docusaurus-generated pages from processing. These defaults
apply to all three excludeRoutes options (markdown, llmsTxt, and
ui.copyPageContent.display):
/search- Search page/404.html- 404 error page/tags- Global tags index/tags/**- Individual tag pages/blog/tags- Blog tags index/blog/tags/**- Individual blog tag pages/blog/archive- Blog archive page/blog/authors- Blog authors index/blog/authors/**- Individual author pages
You can add your own patterns to any excludeRoutes array, which will be merged with these
defaults:
{
markdown: {
excludeRoutes: ['/admin/**', '/internal/**'], // Merged with defaults
},
}Installation
npm install @signalwire/docusaurus-plugin-llms-txt
# or
yarn add @signalwire/docusaurus-plugin-llms-txtQuick Start
Basic Setup
Add the plugin to your docusaurus.config.ts:
import type { Config } from '@docusaurus/types';
import type { PluginOptions } from '@signalwire/docusaurus-plugin-llms-txt/public';
const config: Config = {
plugins: [
[
'@signalwire/docusaurus-plugin-llms-txt',
{
// Enable with defaults
generate: {
enableMarkdownFiles: true,
enableLlmsFullTxt: false,
},
include: {
includeBlog: false,
includePages: false,
includeDocs: true,
},
} satisfies PluginOptions,
],
],
};
export default config;Build your site and the plugin will automatically generate:
build/llms.txt- Hierarchical index of your documentationbuild/**/*.md- Individual markdown files for each page (mirrors your route structure)
Advanced Configuration
import type { Config } from '@docusaurus/types';
import type { PluginOptions } from '@signalwire/docusaurus-plugin-llms-txt/public';
const config: Config = {
plugins: [
[
'@signalwire/docusaurus-plugin-llms-txt',
{
// Markdown file generation options
markdown: {
enableFiles: true,
relativePaths: true,
includeBlog: true,
includePages: true,
includeDocs: true,
includeVersionedDocs: true,
excludeRoutes: ['/admin/**', '/internal/**'],
},
// llms.txt index file configuration
llmsTxt: {
enableLlmsFullTxt: true,
includeBlog: true,
includePages: true,
includeDocs: true,
excludeRoutes: ['/admin/**'],
// Site metadata
siteTitle: 'My Documentation',
siteDescription: 'Comprehensive documentation for developers',
// Auto-section organization
autoSectionDepth: 2, // Group by 2nd path segment (/docs/api/* → "Api" section)
autoSectionPosition: 10, // Auto-sections appear after positioned manual sections
// Manual section organization
sections: [
{
id: 'getting-started',
name: 'Getting Started',
description: 'Quick start guides and tutorials',
position: 1,
routes: [{ route: '/docs/intro/**' }],
},
{
id: 'api-reference',
name: 'API Reference',
description: 'Complete API documentation',
position: 2,
routes: [{ route: '/docs/api/**' }],
attachments: [
{
source: './api/openapi.yaml',
title: 'OpenAPI Specification',
description: 'Complete API specification in OpenAPI 3.0 format',
},
],
},
],
},
// UI features (requires theme package)
ui: {
copyPageContent: {
buttonLabel: 'Copy Page',
display: {
docs: true,
excludeRoutes: ['/admin/**'],
},
},
},
} satisfies PluginOptions,
],
],
};
export default config;API Reference
The plugin configuration is organized into three main areas: markdown file generation, llms.txt index creation, and UI features.
Top-Level Options
These options control plugin behavior and error handling.
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
markdown |
MarkdownOptions |
❌ | {} |
Generate individual .md files for each page. See MarkdownOptions below. |
llmsTxt |
LlmsTxtOptions |
❌ | {} |
Generate llms.txt index file with organized content. See LlmsTxtOptions below. |
ui |
UiOptions |
❌ | {} |
Enable UI features like copy buttons. See UiOptions below. |
runOnPostBuild |
boolean |
❌ | true |
Automatically run during build. Set to false to manually trigger via CLI. |
onSectionError |
'ignore' | 'log' | 'warn' | 'throw' |
❌ | 'warn' |
How to handle section configuration errors (invalid IDs, route conflicts). |
onRouteError |
'ignore' | 'log' | 'warn' | 'throw' |
❌ | 'warn' |
How to handle page processing errors (HTML parsing failures). 'warn' skips failed pages and continues. |
logLevel |
0 | 1 | 2 | 3 |
❌ | 1 |
Console output verbosity. 0=silent, 1=normal, 2=verbose, 3=debug. |
MarkdownOptions
Generate individual .md files for each page.
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
enableFiles |
boolean |
❌ | true |
Generate .md files. Disable to skip file generation entirely. |
relativePaths |
boolean |
❌ | true |
Use relative paths (./docs/intro.md) vs absolute URLs (https://site.com/docs/intro). |
includeDocs |
boolean |
❌ | true |
Include documentation pages. |
includeVersionedDocs |
boolean |
❌ | true |
Include older doc versions. Disable to only process current version. |
includeBlog |
boolean |
❌ | false |
Include blog posts. |
includePages |
boolean |
❌ | false |
Include standalone pages from src/pages/. |
includeGeneratedIndex |
boolean |
❌ | true |
Include auto-generated category index pages. |
excludeRoutes |
string[] |
❌ | See default excludes | Glob patterns to exclude routes from markdown generation. Defaults include common Docusaurus pages like /search, /blog/tags/**, /blog/archive, etc. Add your own patterns like ['/admin/**', '/internal/**']. |
contentSelectors |
string[] |
❌ | ['.theme-doc-markdown', 'main .container .col', 'main .theme-doc-wrapper', 'article', 'main'] |
CSS selectors to find main content. First match wins. |
routeRules |
RouteRule[] |
❌ | [] |
Override selectors for specific routes. See RouteRule. |
remarkStringify |
object |
❌ | {} |
Markdown formatting options. See remark-stringify. |
remarkGfm |
boolean | object |
❌ | true |
Enable GitHub Flavored Markdown (tables, strikethrough, task lists). |
rehypeProcessTables |
boolean |
❌ | true |
Convert HTML tables to markdown. Disable for complex tables. |
beforeDefaultRehypePlugins |
PluginInput[] |
❌ | [] |
Custom rehype plugins to run BEFORE defaults. |
rehypePlugins |
PluginInput[] |
❌ | [] |
Custom rehype plugins that REPLACE defaults. Use with caution. |
beforeDefaultRemarkPlugins |
PluginInput[] |
❌ | [] |
Custom remark plugins to run BEFORE defaults. |
remarkPlugins |
PluginInput[] |
❌ | [] |
Custom remark plugins that REPLACE defaults. Use with caution. |
LlmsTxtOptions
Generate and configure the llms.txt index file.
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
enableLlmsFullTxt |
boolean |
❌ | false |
Generate llms-full.txt with complete page content (not just links). |
includeDocs |
boolean |
❌ | true |
Include documentation pages. |
includeVersionedDocs |
boolean |
❌ | false |
⚠️ Include older doc versions. Default is false (different from markdown). |
includeBlog |
boolean |
❌ | false |
Include blog posts. |
includePages |
boolean |
❌ | false |
Include standalone pages from src/pages/. |
includeGeneratedIndex |
boolean |
❌ | true |
Include auto-generated category index pages. |
excludeRoutes |
string[] |
❌ | See default excludes | Glob patterns to exclude routes from llms.txt. Defaults include /search, /blog/tags/**, etc. Add your own like ['/admin/**']. |
sections |
SectionDefinition[] |
❌ | [] |
Organize content into named sections. See SectionDefinition. |
autoSectionDepth |
1 | 2 | 3 | 4 | 5 | 6 |
❌ | 1 |
Path depth for auto-generated sections. 1=group by first segment (/blog/* → "Blog"), 2=group by second segment (/docs/advanced/* → "Advanced"). Routes shallower than this depth fall back to their actual depth. Only affects auto-generated sections; manual sections are unaffected. |
autoSectionPosition |
number |
❌ | undefined |
Position for auto-generated sections. undefined=after positioned sections, number=sort with positioned sections. |
siteTitle |
string |
❌ | '' |
Title for llms.txt header. Falls back to Docusaurus config if not set. |
siteDescription |
string |
❌ | '' |
Description for llms.txt header. |
enableDescriptions |
boolean |
❌ | true |
Include page and section descriptions. Disable for a more compact index. |
attachments |
AttachmentFile[] |
❌ | [] |
Include files like OpenAPI specs, schemas. Appear in 'Attachments' section. |
optionalLinks |
OptionalLink[] |
❌ | [] |
External links (APIs, forums). Appear in 'Optional' section. |
UiOptions
Enable UI features on your documentation pages.
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
copyPageContent |
boolean | CopyPageContentOptions |
❌ | false |
Add copy button to doc pages. Use true for defaults or object for customization. Requires theme package. |
Complex Types Reference
These types are used in the configuration options above.
SectionDefinition
Organize content into logical sections in llms.txt.
{
id: 'api-docs', // Unique kebab-case ID
name: 'API Documentation', // Display name
description: 'Complete API docs', // Optional context
position: 1, // Sort order (lower = earlier)
routes: [{ route: '/api/**' }], // Which pages belong here
subsections: [], // Nested sections
attachments: [], // Section-specific files
optionalLinks: [] // Section-specific external links
}| Property | Type | Required | Description |
|---|---|---|---|
id |
string |
✅ | Unique identifier (lowercase, numbers, hyphens only). Must be unique across all sections. |
name |
string |
✅ | Display name shown in llms.txt. |
description |
string |
❌ | Optional description shown under heading. |
position |
number |
❌ | Sort order. Lower numbers appear first. |
routes |
SectionRoute[] |
❌ | Glob patterns to match pages to this section. |
subsections |
SectionDefinition[] |
❌ | Nested sections (max 3 levels recommended). |
attachments |
AttachmentFile[] |
❌ | Files specific to this section. |
optionalLinks |
OptionalLink[] |
❌ | External links specific to this section. |
SectionRoute
Assign routes to sections using glob patterns.
{
route: '/api/**', // Match all /api/* routes
contentSelectors: ['.api-content'] // Optional: custom selectors for these pages
}| Property | Type | Required | Description |
|---|---|---|---|
route |
string |
✅ | Glob pattern (* = single level, ** = multiple levels). |
contentSelectors |
string[] |
❌ | Override content extraction for these routes. |
RouteRule
Customize content extraction for specific routes (separate from section assignment).
{
route: '/api/**',
contentSelectors: ['.api-content', 'article']
}| Property | Type | Required | Description |
|---|---|---|---|
route |
string |
✅ | Glob pattern matching routes. |
contentSelectors |
string[] |
❌ | CSS selectors for content extraction. |
Note: Use RouteRule (in markdown.routeRules) for processing customization. Use
SectionRoute (in sections[].routes) for section assignment.
AttachmentFile
Include external text files in your output.
{
source: './specs/openapi.yaml',
title: 'API Specification',
description: 'Complete OpenAPI 3.0 spec',
fileName: 'api-spec', // Custom output filename (prevents collisions)
includeInFullTxt: true
}| Property | Type | Default | Description |
|---|---|---|---|
source |
string |
- | File path relative to site root. |
title |
string |
- | Display name in llms.txt. |
description |
string |
- | Optional context about the file. |
fileName |
string |
- | Custom output filename (without extension). If not provided, uses source filename. Auto-numbered if collision detected. |
includeInFullTxt |
boolean |
true |
Include full content in llms-full.txt. |
OptionalLink
Link to external resources.
{
title: 'API Status Page',
url: 'https://status.example.com',
description: 'Real-time API status'
}| Property | Type | Required | Description |
|---|---|---|---|
title |
string |
✅ | Link text shown in llms.txt. |
url |
string |
✅ | External URL (must be HTTP/HTTPS). |
description |
string |
❌ | Optional context about the link. |
CopyPageContentOptions
Configure the copy button feature (requires theme package).
{
buttonLabel: 'Copy Page',
display: {
docs: true,
excludeRoutes: ['/admin/**']
},
contentStrategy: 'prefer-markdown',
actions: {
viewMarkdown: true,
ai: {
chatGPT: true,
claude: { prompt: 'Help me understand this:' }
}
}
}| Property | Type | Required | Default | Description |
|---|---|---|---|---|
buttonLabel |
string |
❌ | 'Copy Page' |
Button text. |
display |
object |
❌ | {} |
Control where button appears. |
display.docs |
boolean |
❌ | true |
Show on docs pages. |
display.excludeRoutes |
string[] |
❌ | See default excludes | Hide copy button on specific routes. Defaults include /search, /blog/tags/**, etc. |
contentStrategy |
'prefer-markdown' | 'html-only' |
❌ | 'prefer-markdown' |
Controls what content is copied. 'prefer-markdown' copies markdown if available, falls back to HTML. 'html-only' always copies HTML. Dropdown menu item shows "Copy Raw Markdown" or "Copy Raw HTML" accordingly. |
actions |
object |
❌ | {} |
Available actions in dropdown. |
actions.viewMarkdown |
boolean |
❌ | true |
Show "View Markdown" option in dropdown when markdown file exists. Independent of contentStrategy. |
actions.ai |
object |
❌ | {} |
AI integration options. |
actions.ai.chatGPT |
boolean | { prompt?: string } |
❌ | true |
ChatGPT integration. Default prompt: "Analyze this documentation:" |
actions.ai.claude |
boolean | { prompt?: string } |
❌ | true |
Claude integration. Default prompt: "Analyze this documentation:" |
MIT © SignalWire