Compiler for legal and civic texts. Converts disparate statutory data into structured formats optimized for AI, RAG, and semantic search.

@lexbuild/cli

Download and convert U.S. legal XML into structured Markdown optimized for AI, RAG pipelines, and semantic search. Supports the U.S. Code (54 titles, 60,000+ sections) and the Electronic Code of Federal Regulations, or eCFR (50 titles, 200,000+ sections).

Install

# Global install
npm install -g @lexbuild/cli

# Or run directly
npx @lexbuild/cli --help

Quick Start

# U.S. Code — download and convert all 54 titles
lexbuild download-usc --all
lexbuild convert-usc --all

# eCFR — download and convert all 50 titles
lexbuild download-ecfr --all
lexbuild convert-ecfr --all

# Start small — a single title
lexbuild download-usc --titles 1 && lexbuild convert-usc --titles 1
lexbuild download-ecfr --titles 17 && lexbuild convert-ecfr --titles 17

Commands

download-usc

Download U.S. Code XML from the Office of the Law Revision Counsel (OLRC). Auto-detects the latest release point.

lexbuild download-usc --all                                  # All 54 titles
lexbuild download-usc --titles 1-5,8,11                      # Specific titles
lexbuild download-usc --all --release-point 119-73not60      # Pin a release

| Option | Default | Description |
| --- | --- | --- |
| --titles <spec> | | Title(s): 1, 1-5, 1-5,8,11 |
| --all | | Download all 54 titles (single bulk zip) |
| -o, --output <dir> | ./downloads/usc/xml | Output directory |
| --release-point <id> | auto-detected | Pin a specific OLRC release point |
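
The --titles spec combines single numbers, ranges, and comma-separated lists. As a rough illustration of how a spec like 1-5,8,11 expands, here is a minimal POSIX shell sketch. The expand_titles helper is hypothetical and is not lexbuild's own parser; it only mirrors the spec format documented above.

```shell
# Hypothetical helper (not part of lexbuild): expand a title spec such as
# "1-5,8,11" into one title number per line.
expand_titles() {
  spec=$1
  old_ifs=$IFS
  IFS=','
  set -- $spec            # split the spec on commas
  IFS=$old_ifs
  for part in "$@"; do
    case $part in
      ''|*[!0-9-]*|-*|*-|*-*-*)
        echo "invalid title spec: $part" >&2
        return 1 ;;
      *-*)                # a range such as 1-5
        i=${part%-*}
        end=${part#*-}
        while [ "$i" -le "$end" ]; do
          echo "$i"
          i=$((i + 1))
        done ;;
      *)                  # a single title
        echo "$part" ;;
    esac
  done
}

expand_titles "1-5,8,11"   # prints 1 2 3 4 5 8 11, one per line
```

The sketch rejects anything other than digits, commas, and a single dash per part, which is stricter than the real CLI may be.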

convert-usc

Convert downloaded USC XML to Markdown.

lexbuild convert-usc --all                                   # All downloaded titles
lexbuild convert-usc --titles 1 -g chapter                   # Chapter-level output
lexbuild convert-usc --titles 26 --dry-run                   # Preview without writing
lexbuild convert-usc ./downloads/usc/xml/usc01.xml           # Direct file path

| Option | Default | Description |
| --- | --- | --- |
| --titles <spec> | | Title(s) to convert |
| --all | | Convert all titles in input directory |
| -i, --input-dir <dir> | ./downloads/usc/xml | Input XML directory |
| -o, --output <dir> | ./output | Output directory |
| -g, --granularity | section | section, chapter, or title |
| --link-style | plaintext | plaintext, canonical, or relative |
| --no-include-source-credits | | Exclude source credits |
| --no-include-notes | | Exclude all notes |
| --include-editorial-notes | | Include editorial notes only |
| --include-statutory-notes | | Include statutory notes only |
| --include-amendments | | Include amendment notes only |
| --dry-run | | Parse and report without writing |
| -v, --verbose | | Verbose file output |

list-release-points

List available OLRC release points for the U.S. Code. Shows the latest release point and a table of prior releases with dates and affected titles.

lexbuild list-release-points                     # 20 most recent
lexbuild list-release-points -n 5                # 5 most recent
lexbuild list-release-points -n 0                # All available

| Option | Default | Description |
| --- | --- | --- |
| -n, --limit <count> | 20 | Max release points to show (0 = all) |

Use the release point ID with download-usc --release-point <id> to download a specific version.

download-ecfr

Download eCFR XML. Defaults to the ecfr.gov API, which is updated daily; govinfo bulk data is available as a fallback.

lexbuild download-ecfr --all                                 # All 50 titles (eCFR API)
lexbuild download-ecfr --titles 1-5,17                       # Specific titles
lexbuild download-ecfr --all --date 2026-01-01               # Point-in-time download
lexbuild download-ecfr --all --source govinfo                # Govinfo bulk fallback

| Option | Default | Description |
| --- | --- | --- |
| --titles <spec> | | Title(s): 1, 1-5, 1-5,17 |
| --all | | Download all 50 titles |
| -o, --output <dir> | ./downloads/ecfr/xml | Output directory |
| --source | ecfr-api | ecfr-api (daily) or govinfo (bulk) |
| --date <YYYY-MM-DD> | current | Point-in-time date (ecfr-api only) |
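
One way to use --date is to give each point-in-time download its own directory and convert from there, so that snapshots stay separate. The sketch below uses only the flags listed for download-ecfr and convert-ecfr. The snapshot_ecfr function, the LEXBUILD variable, and the dated directory layout are assumptions made for illustration, not lexbuild conventions.

```shell
# LEXBUILD is a hypothetical override hook, e.g. LEXBUILD="npx @lexbuild/cli".
LEXBUILD=${LEXBUILD:-lexbuild}

# Hypothetical wrapper: download one eCFR title as of a given date, then
# convert it, keeping XML and Markdown under date-specific directories.
snapshot_ecfr() {
  title=$1
  snapshot_date=$2                          # YYYY-MM-DD
  xml_dir="./downloads/ecfr/$snapshot_date/xml"
  out_dir="./output/$snapshot_date"

  $LEXBUILD download-ecfr --titles "$title" --date "$snapshot_date" -o "$xml_dir" &&
    $LEXBUILD convert-ecfr --titles "$title" -i "$xml_dir" -o "$out_dir"
}

# Usage: snapshot_ecfr 17 2026-01-01
```

The conversion step runs only if the download succeeds, so a failed download does not leave a partial Markdown tree behind.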

convert-ecfr

Convert downloaded eCFR XML to Markdown.

lexbuild convert-ecfr --all                                  # All downloaded titles
lexbuild convert-ecfr --titles 17 -g part                    # Part-level output
lexbuild convert-ecfr --all --dry-run                        # Preview without writing
lexbuild convert-ecfr ./downloads/ecfr/xml/ECFR-title17.xml  # Direct file path

| Option | Default | Description |
| --- | --- | --- |
| --titles <spec> | | Title(s) to convert |
| --all | | Convert all titles in input directory |
| -i, --input-dir <dir> | ./downloads/ecfr/xml | Input XML directory |
| -o, --output <dir> | ./output | Output directory |
| -g, --granularity | section | section, part, chapter, or title |
| --link-style | plaintext | plaintext, canonical, or relative |
| --no-include-source-credits | | Exclude source credits |
| --no-include-notes | | Exclude all notes |
| --include-editorial-notes | | Include editorial/regulatory notes only |
| --include-statutory-notes | | Include statutory notes only |
| --include-amendments | | Include amendment notes only |
| --dry-run | | Parse and report without writing |
| -v, --verbose | | Verbose file output |

Output Structure

U.S. Code

| Granularity | Example Path |
| --- | --- |
| section (default) | output/usc/title-01/chapter-01/section-1.md |
| chapter | output/usc/title-01/chapter-01/chapter-01.md |
| title | output/usc/title-01.md |
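
The section-level example path suggests a predictable layout. Assuming that title and chapter numbers are zero-padded to two digits and the section number is used as-is (an inference from the single example above, not a documented rule), a path could be built like this. usc_section_path is a hypothetical helper.

```shell
# Hypothetical helper: build the expected Markdown path for a U.S. Code section.
# Assumes two-digit zero padding for title and chapter, inferred from the
# example path above; pass plain decimal numbers (1, not 01).
usc_section_path() {
  printf 'output/usc/title-%02d/chapter-%02d/section-%s.md\n' "$1" "$2" "$3"
}

usc_section_path 1 1 1   # prints output/usc/title-01/chapter-01/section-1.md
```

For real lookups, listing the output directory is likely more reliable than constructing paths, since chapter identifiers may not always be plain numbers.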

eCFR

| Granularity | Example Path |
| --- | --- |
| section (default) | output/ecfr/title-17/chapter-IV/part-240/section-240.10b-5.md |
| part | output/ecfr/title-17/chapter-IV/part-240.md |
| chapter | output/ecfr/title-17/chapter-IV/chapter-IV.md |
| title | output/ecfr/title-17.md |

Every file includes YAML frontmatter with source metadata (source, legal_status, identifier, hierarchy context) followed by the legal text in Markdown. Section and chapter/part granularities generate _meta.json sidecar files and README.md summaries per title.
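
Because each file starts with a YAML frontmatter block, metadata can be read without a Markdown parser. The sketch below assumes the conventional layout in which the frontmatter sits between two --- lines at the top of the file, and that the field of interest is a simple key: value scalar. The frontmatter and frontmatter_field helpers are hypothetical; the field names come from the list above.

```shell
# Print the frontmatter block (the lines between the opening and closing ---).
frontmatter() {
  awk 'NR == 1 && $0 == "---" { inside = 1; next }
       inside && $0 == "---"  { exit }
       inside                 { print }' "$1"
}

# Print the value of one top-level scalar field, e.g. identifier.
# Quoted or multi-line YAML values would need a real YAML parser.
frontmatter_field() {
  frontmatter "$1" | sed -n "s/^$2: *//p" | head -n 1
}

# Usage: frontmatter_field output/usc/title-01/chapter-01/section-1.md identifier
```

Lines in the Markdown body that happen to look like key: value pairs are ignored, because awk stops at the closing --- line.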

Performance

The full U.S. Code (all 54 titles, 60,000+ sections, an estimated 85 million tokens) converts in about 20–30 seconds on modern hardware. SAX streaming keeps memory bounded even for the largest titles (100MB+ XML).
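
As a back-of-the-envelope reading of those figures, using the lower-bound counts of 60,000 sections and 85 million tokens (shell integer arithmetic, so results are rounded down):

```shell
sections=60000
tokens=85000000

echo "sections per second at 30 s: $((sections / 30))"       # 2000
echo "sections per second at 20 s: $((sections / 20))"       # 3000
echo "average tokens per section:  $((tokens / sections))"   # 1416
```

These are rough estimates derived from the stated totals, not separately measured benchmarks.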

Compatibility

  • Node.js >= 22
  • ESM only — no CommonJS build
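
A script that wraps the CLI may want to fail early on older runtimes. A minimal sketch, assuming the version string has the v<major>.<minor>.<patch> form printed by node --version; node_ok is a hypothetical helper.

```shell
# Hypothetical helper: succeed if a Node.js version string (e.g. "v22.3.0")
# has a major version of at least 22.
node_ok() {
  major=${1#v}          # drop the leading "v"
  major=${major%%.*}    # keep everything before the first dot
  [ "$major" -ge 22 ]
}

# Usage:
#   node_ok "$(node --version)" || { echo "Node.js >= 22 is required" >&2; exit 1; }
```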

Monorepo Context

This is the published CLI for the LexBuild monorepo. It depends on @lexbuild/core, @lexbuild/usc, and @lexbuild/ecfr for all conversion and download logic.

pnpm turbo build --filter=@lexbuild/cli
pnpm turbo typecheck --filter=@lexbuild/cli

| Package | Description |
| --- | --- |
| @lexbuild/core | Shared parsing, AST, and rendering infrastructure |
| @lexbuild/usc | U.S. Code converter (programmatic API) |
| @lexbuild/ecfr | eCFR converter (programmatic API) |

License

MIT