A TypeScript library for extracting structured blocks and media items from markdown content. Optimized for React and Vite.

Package Exports

  • @thds/markdown-block-extractor
  • @thds/markdown-block-extractor/package.json

Markdown Block Extractor

A TypeScript/Deno library for extracting structured blocks and media items from markdown content. This library processes markdown with custom block markers and extracts both regular content blocks and media items with detailed metadata.

Features

  • Block Extraction: Extract content blocks marked with HTML comments
  • Media Detection: Automatically detect images and videos in both markdown and HTML syntax
  • Rich Metadata: Generate detailed metadata for each block including word count, line count, and content features
  • TypeScript Support: Full TypeScript definitions included
  • React/Vite Optimized: Built with Vite for optimal bundling in modern React applications
  • Tree Shakeable: ES modules with proper exports for efficient bundling
  • Deno Compatible: Works with the Deno runtime and can be published to JSR

Installation

NPM (React/Vite/Node.js)

npm install @thds/markdown-block-extractor

import { parse } from "@thds/markdown-block-extractor";

React Usage

import React, { useEffect, useState } from 'react';
import { parse, type ParseResult } from '@thds/markdown-block-extractor';

function MarkdownProcessor() {
  const [result, setResult] = useState<ParseResult | null>(null);
  
  useEffect(() => {
    const markdown = `<!-- block:id=1 -->
# My Block
![Image](https://example.com/image.jpg)
Some content here.
<!-- end-block:id=1 -->`;
    
    const parsed = parse(markdown);
    setResult(parsed);
  }, []);
  
  return (
    <div>
      {result?.blockExtracts.map(block => (
        <div key={block.id}>
          <h3>Block {block.id}</h3>
          <p>Word count: {block.metadata.wordCount}</p>
          <p>Has images: {block.metadata.hasImages ? 'Yes' : 'No'}</p>
        </div>
      ))}
    </div>
  );
}

Deno/JSR

import { parse } from "jsr:@your-username/markdown-block-extractor";

Usage

import { parse } from '@thds/markdown-block-extractor';

const markdown = `<!-- block:id=1 -->
# My Block
![Image](https://example.com/image.jpg)
Some content here.
<!-- end-block:id=1 -->`;

const result = parse(markdown);

console.log(result.blockExtracts);
// [
//   {
//     id: "1",
//     type: "block",
//     markdown: "# My Block\n![Image](https://example.com/image.jpg)\nSome content here.",
//     mediaItems: [
//       {
//         type: "image",
//         url: "https://example.com/image.jpg",
//         syntax: "markdown"
//       }
//     ],
//     metadata: {
//       lineCount: 3,
//       hasImages: true,
//       hasVideos: false,
//       hasCodeBlocks: false,
//       hasTables: false,
//       hasLists: false,
//       hasLinks: false,
//       wordCount: 4,
//       characterCount: 50
//     }
//   }
// ]

Block Syntax

The library recognizes two types of blocks:

Regular Blocks

<!-- block:id=1 -->
Your content here
<!-- end-block:id=1 -->

Custom Blocks

<!-- custom-block:id=2 -->
Your content here
<!-- end-custom-block:id=2 -->
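
To make the pairing rule concrete, here is a minimal standalone sketch of how opening and closing markers with the same kind and id can be matched. It is an illustration only: the library itself is built on remark plugins (see Project Structure), and the names `findMarkedBlocks`, `RawBlock`, and `MarkerKind` are hypothetical, not part of the package.

```typescript
// Illustrative only: NOT the library's implementation, which uses remark plugins.
type MarkerKind = "block" | "custom-block";

interface RawBlock {
  kind: MarkerKind;
  id: string;
  body: string;
}

function findMarkedBlocks(markdown: string): RawBlock[] {
  // \1 and \2 force the closing marker to repeat the opening marker's kind and id.
  // The literal is created per call, so `lastIndex` never leaks between calls.
  const pattern =
    /<!--\s*(block|custom-block):id=(\S+?)\s*-->\r?\n?([\s\S]*?)\r?\n?<!--\s*end-\1:id=\2\s*-->/g;
  const blocks: RawBlock[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    blocks.push({ kind: match[1] as MarkerKind, id: match[2], body: match[3] });
  }
  return blocks;
}

const sample = [
  "<!-- block:id=1 -->",
  "Your content here",
  "<!-- end-block:id=1 -->",
  "<!-- custom-block:id=2 -->",
  "Other content",
  "<!-- end-custom-block:id=2 -->",
  // Mismatched ids: this pair is skipped by the sketch.
  "<!-- block:id=3 -->",
  "Orphan",
  "<!-- end-block:id=4 -->",
].join("\n");

const found = findMarkedBlocks(sample);
```

In this sketch a block whose closing marker carries a different id is simply ignored. How the library treats such input is not documented here, so test against real output before relying on either behavior.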

API Reference

parse(markdown: string): ParseResult

Parses markdown content and returns extracted blocks and media items.

Parameters:

  • markdown (string): The markdown content to parse

Returns:

  • ParseResult: Object containing:
    • blockExtracts: Array of extracted blocks
    • mediaItems: Array of all media items found
    • ast: The parsed abstract syntax tree (AST)
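
Because `mediaItems` is a flat list across the whole document, a common follow-up step is grouping it by block. The sketch below shows one way to do that. `groupMediaByBlock` is a hypothetical helper, the sample array is hand-written rather than real `parse` output, and treating a missing `blockId` as "outside any block" is an assumption based only on the field being optional.

```typescript
// Local copy of the documented MediaItem shape, with `position` omitted.
interface MediaItem {
  type: "image" | "video";
  url: string;
  alt?: string;
  title?: string;
  blockId?: string;
  blockType?: string;
  syntax: "markdown" | "html";
}

// Hypothetical helper, not part of the library.
function groupMediaByBlock(items: MediaItem[]): Map<string, MediaItem[]> {
  const groups = new Map<string, MediaItem[]>();
  for (const item of items) {
    // Assumption: an item without blockId was found outside any marked block.
    const key = item.blockId ?? "(no block)";
    const bucket = groups.get(key) ?? [];
    bucket.push(item);
    groups.set(key, bucket);
  }
  return groups;
}

// Hand-written stand-in for `parse(markdown).mediaItems`.
const sampleMedia: MediaItem[] = [
  { type: "image", url: "https://example.com/a.jpg", blockId: "1", syntax: "markdown" },
  { type: "video", url: "https://example.com/b.mp4", blockId: "1", syntax: "html" },
  { type: "image", url: "https://example.com/c.jpg", syntax: "markdown" },
];

const grouped = groupMediaByBlock(sampleMedia);
```

With real data, pass `result.mediaItems` in place of `sampleMedia`.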

Types

BlockExtract

interface BlockExtract {
  id: string;
  type: 'block' | 'customBlock';
  markdown: string;
  mediaItems: MediaItem[];
  metadata: BlockMetadata;
  position?: Position;
}
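
The `type` field makes it straightforward to handle regular and custom blocks separately. The following sketch partitions blocks by that field. `partitionByType` and `BlockLike` are hypothetical names, and the assumption that `custom-block` markers yield `type: 'customBlock'` is inferred from the interface above, not confirmed by the source.

```typescript
// Only the fields this sketch needs from the documented BlockExtract interface.
interface BlockLike {
  id: string;
  type: "block" | "customBlock";
  markdown: string;
}

interface Partitioned {
  regular: BlockLike[];
  custom: BlockLike[];
}

// Hypothetical helper, not part of the library.
function partitionByType(blocks: BlockLike[]): Partitioned {
  const out: Partitioned = { regular: [], custom: [] };
  for (const block of blocks) {
    const kind = block.type;
    switch (kind) {
      case "block":
        out.regular.push(block);
        break;
      case "customBlock":
        out.custom.push(block);
        break;
      default: {
        // Exhaustiveness check: stops compiling if the union gains a member.
        const unreachable: never = kind;
        throw new Error(`Unknown block type: ${String(unreachable)}`);
      }
    }
  }
  return out;
}

// Hand-written sample data, not real parse output.
const sampleBlocks: BlockLike[] = [
  { id: "1", type: "block", markdown: "Your content here" },
  { id: "2", type: "customBlock", markdown: "Your content here" },
  { id: "3", type: "block", markdown: "More content" },
];

const parts = partitionByType(sampleBlocks);
```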

MediaItem

interface MediaItem {
  type: 'image' | 'video';
  url: string;
  alt?: string;
  title?: string;
  blockId?: string;
  blockType?: string;
  position?: Position;
  syntax: 'markdown' | 'html';
}
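
Since `alt` is optional, one practical use of this shape is a quick accessibility check. The sketch below lists images that have no usable alt text. `imagesMissingAlt` is a hypothetical helper and the sample items are hand-written. Whether the library reports an empty alt as `undefined` or as an empty string is not stated, so the sketch handles both.

```typescript
// Local copy of the documented MediaItem shape, with `position` omitted.
interface MediaItem {
  type: "image" | "video";
  url: string;
  alt?: string;
  title?: string;
  blockId?: string;
  blockType?: string;
  syntax: "markdown" | "html";
}

// Hypothetical helper: images whose alt text is absent or whitespace-only.
function imagesMissingAlt(items: MediaItem[]): MediaItem[] {
  return items.filter(
    (item) => item.type === "image" && (item.alt === undefined || item.alt.trim() === ""),
  );
}

// Hand-written sample data, not real parse output.
const mediaSample: MediaItem[] = [
  { type: "image", url: "https://example.com/ok.jpg", alt: "A diagram", syntax: "markdown" },
  { type: "image", url: "https://example.com/none.jpg", syntax: "html" },
  { type: "image", url: "https://example.com/blank.jpg", alt: "  ", syntax: "markdown" },
  { type: "video", url: "https://example.com/clip.mp4", syntax: "html" },
];

const missing = imagesMissingAlt(mediaSample);
```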

BlockMetadata

interface BlockMetadata {
  lineCount: number;
  hasImages: boolean;
  hasVideos: boolean;
  hasCodeBlocks: boolean;
  hasTables: boolean;
  hasLists: boolean;
  hasLinks: boolean;
  wordCount: number;
  characterCount: number;
}
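
Per-block metadata can be rolled up into document-level totals. Here is a small sketch of that aggregation. `totalMetadata` and `MetadataTotals` are hypothetical names. The first sample object reuses the values from the Usage example, and the second is invented for illustration.

```typescript
// Local copy of the documented BlockMetadata interface.
interface BlockMetadata {
  lineCount: number;
  hasImages: boolean;
  hasVideos: boolean;
  hasCodeBlocks: boolean;
  hasTables: boolean;
  hasLists: boolean;
  hasLinks: boolean;
  wordCount: number;
  characterCount: number;
}

interface MetadataTotals {
  blocks: number;
  lines: number;
  words: number;
  characters: number;
  blocksWithImages: number;
}

// Hypothetical helper, not part of the library.
function totalMetadata(all: BlockMetadata[]): MetadataTotals {
  const totals: MetadataTotals = { blocks: 0, lines: 0, words: 0, characters: 0, blocksWithImages: 0 };
  for (const meta of all) {
    totals.blocks += 1;
    totals.lines += meta.lineCount;
    totals.words += meta.wordCount;
    totals.characters += meta.characterCount;
    if (meta.hasImages) totals.blocksWithImages += 1;
  }
  return totals;
}

// Values copied from the Usage example output.
const first: BlockMetadata = {
  lineCount: 3, hasImages: true, hasVideos: false, hasCodeBlocks: false,
  hasTables: false, hasLists: false, hasLinks: false, wordCount: 4, characterCount: 50,
};
// Invented values, for illustration only.
const second: BlockMetadata = {
  lineCount: 2, hasImages: false, hasVideos: false, hasCodeBlocks: false,
  hasTables: false, hasLists: true, hasLinks: false, wordCount: 6, characterCount: 31,
};

const totals = totalMetadata([first, second]);
```

With real data, the input would be `result.blockExtracts.map((block) => block.metadata)`.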

Development

Running the Example

deno task example
# or
deno run --allow-read examples/example.ts

Running Tests

deno task test
# or
deno test --allow-read --allow-write tests/

Building

deno task build
# or
deno check index.ts

Development Mode

deno task dev
# Runs the example in watch mode

Project Structure

markdown-block-extractor/
├── src/                    # Source code
│   ├── index.ts           # Main library entry point
│   ├── types/             # TypeScript type definitions
│   │   └── index.ts
│   ├── utils.ts           # Shared utility functions
│   └── plugins/           # Remark plugins
│       ├── remark-block-extractor.ts
│       ├── remark-custom-blocks.ts
│       ├── remark-media-extractor.ts
│       └── remark-orphan-content-wrapper.ts
├── tests/                 # Test files
│   └── test.ts
├── examples/              # Example usage
│   └── example.ts
├── index.ts              # Library entry point (re-exports from src/)
├── deno.json             # Deno configuration
└── README.md

License

This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC-BY-NC-4.0).

Important: This license prohibits commercial use. You may use this library for personal, educational, or other non-commercial projects, but commercial use requires explicit permission from the copyright holder.

For commercial licensing inquiries, please contact the author.