JSPM

@ignidor/web-search-mcp

1.3.0

Local, unlimited web-search MCP server with BM25 ranking, Playwright crawling, and smart YouTube transcript extraction. DISCOVERY MODE: Get chapter outlines first, then extract specific sections. Perfect for long videos & bug fix workflows. No Docker, no API keys, no rate limits.

Package Exports

  • @ignidor/web-search-mcp
  • @ignidor/web-search-mcp/dist/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (@ignidor/web-search-mcp) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

@ignidor/web-search-mcp

Local, unlimited web-search MCP server with BM25 ranking, Playwright crawling, and YouTube transcripts.

  • 🔍 No API keys - Uses free DuckDuckGo HTML search
  • 🚀 No rate limits - Unlimited searches, 24/7
  • 🐳 No Docker - Direct Playwright integration (optional)
  • 📊 Smart ranking - BM25 + hybrid scoring with freshness
  • 📄 Full extraction - 1000+ words per page (not 200-word snippets)
  • 🎬 YouTube transcripts - Fast, robust extraction with yt-dlp
  • 💰 100% Free - Outperforms Brave Search, Tavily, and other commercial alternatives

Features

Tool                    Description
search                  Fast web search with BM25 ranking (DuckDuckGo)
crawl_and_extract       Extract full content from URLs using Playwright
search_and_crawl        Search + extract top results (one-stop research)
get_youtube_transcript  Get YouTube video transcript (yt-dlp, 1-5 sec) ⭐ NEW
capture_screenshot      Screenshot any webpage (base64 PNG)
generate_pdf            Convert webpage to PDF (base64)
extract_structured      CSS selector-based data extraction
execute_js              Run custom JavaScript on webpages
extract_regex           Extract emails, phones, URLs, dates (21 patterns)

Quick Start

Installation (via npx)

npx @ignidor/web-search-mcp

Claude Desktop / Cursor / Windsurf Config

For npx usage (recommended):

{
  "mcpServers": {
    "web-search": {
      "command": "npx",
      "args": ["-y", "@ignidor/web-search-mcp"]
    }
  }
}

For local/SSH usage:

{
  "mcpServers": {
    "web-search": {
      "command": "node",
      "args": ["/path/to/dist/index.js"]
    }
  }
}

Tool Examples

1. Search with BM25 Ranking

// Search for anything - unlimited queries, no API key
{
  "name": "search",
  "arguments": {
    "query": "Rust programming language tutorial",
    "limit": 10,
    "rankingMode": "hybrid"  // 'bm25' or 'hybrid'
  }
}
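
To make the 'bm25' ranking mode more concrete, here is a minimal sketch of textbook BM25 scoring over a handful of result snippets. It is not the package's implementation (the architecture notes say scoring is delegated to the fast-bm25 package); the names tokenize and bm25Rank, the tokenizer, and the k1 and b defaults are assumptions chosen for illustration.

```typescript
// Textbook BM25 sketch. k1 and b are common defaults, assumed here,
// not values taken from @ignidor/web-search-mcp.
type Doc = { id: string; text: string };

function tokenize(text: string): string[] {
  // Lowercase and keep alphanumeric runs; a real tokenizer may do more.
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

function bm25Rank(query: string, docs: Doc[], k1 = 1.2, b = 0.75): { id: string; score: number }[] {
  const items = docs.map((d) => ({ id: d.id, tokens: tokenize(d.text) }));
  const totalLen = items.reduce((sum, it) => sum + it.tokens.length, 0);
  const avgLen = totalLen / Math.max(items.length, 1) || 1;
  // Unique query terms, preserving order.
  const terms = tokenize(query).filter((t, i, all) => all.indexOf(t) === i);

  return items
    .map((it) => {
      let score = 0;
      for (const term of terms) {
        const tf = it.tokens.filter((t) => t === term).length;
        if (tf === 0) continue;
        // Document frequency: how many documents contain the term.
        const n = items.filter((other) => other.tokens.some((t) => t === term)).length;
        // Lucene-style IDF; the +1 inside the log keeps it non-negative.
        const idf = Math.log(1 + (items.length - n + 0.5) / (n + 0.5));
        const lengthNorm = 1 - b + (b * it.tokens.length) / avgLen;
        score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
      }
      return { id: it.id, score };
    })
    .sort((x, y) => y.score - x.score);
}
```

Rare query terms contribute more than common ones, and long documents are penalized relative to the average length, which is why a short page that merely repeats one keyword does not automatically win.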

2. Search + Extract Full Content

// Best for deep research - gets full articles, not snippets
{
  "name": "search_and_crawl",
  "arguments": {
    "query": "AWS DynamoDB batchWrite bug fix",
    "extractTopN": 5,
    "rerankAfterExtract": true
  }
}

Result: 8,000+ words of detailed content including:

  • Root cause analysis
  • Step-by-step fixes
  • Complete code examples
  • Common pitfalls
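
The tool examples in this README show only the name and arguments of a call. MCP clients such as Claude Desktop send these inside a JSON-RPC 2.0 request with the method tools/call, and over the stdio transport each message is a single line of JSON. The sketch below shows that wrapping; toJsonRpcRequest is a hypothetical helper for illustration and is not part of this package.

```typescript
// Wrap a tool call in the JSON-RPC 2.0 envelope used by MCP.
// toJsonRpcRequest is an illustrative helper, not an API of this package.
type ToolCall = { name: string; arguments: Record<string, unknown> };

function toJsonRpcRequest(id: number, call: ToolCall) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: call.name, arguments: call.arguments },
  };
}

const request = toJsonRpcRequest(1, {
  name: "search_and_crawl",
  arguments: {
    query: "AWS DynamoDB batchWrite bug fix",
    extractTopN: 5,
    rerankAfterExtract: true,
  },
});

// stdio transport: one JSON message per line, no embedded newlines.
const line = JSON.stringify(request);
```

In normal use the MCP client builds this envelope itself; writing it by hand is mainly useful when testing the server from a script.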

3. Get YouTube Transcript ⭐ NEW

// Fast, reliable transcript extraction (1-5 seconds)
{
  "name": "get_youtube_transcript",
  "arguments": {
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "language": "en",
    "includeTimestamps": false,
    "includeMetadata": true
  }
}

Features:

  • Works with any video length (1 min or 10 hours - same speed!)
  • Fetches existing captions (no audio processing)
  • Multiple language support (en, es, fr, de, ja, ko, etc.)
  • Optional timestamps: [00:15] Text here
  • Metadata: title, duration, word count
  • Uses yt-dlp (gold standard, 85k+ GitHub stars)
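
As an illustration of the optional timestamp format, the sketch below renders caption cues as "[00:15] Text here" lines. The Cue shape, the function names, and the switch to [HH:MM:SS] for videos longer than an hour are assumptions; the README only documents the [MM:SS] form.

```typescript
// Hypothetical cue shape; the package's internal representation is not documented.
type Cue = { startSeconds: number; text: string };

function pad2(n: number): string {
  return (n < 10 ? "0" : "") + n;
}

function formatTimestamp(totalSeconds: number): string {
  const s = Math.max(0, Math.floor(totalSeconds));
  const hours = Math.floor(s / 3600);
  const minutes = Math.floor((s % 3600) / 60);
  const seconds = s % 60;
  // Assumption: add an hours field only when it is needed.
  return hours > 0
    ? `[${pad2(hours)}:${pad2(minutes)}:${pad2(seconds)}]`
    : `[${pad2(minutes)}:${pad2(seconds)}]`;
}

function renderTranscript(cues: Cue[], includeTimestamps: boolean): string {
  return includeTimestamps
    ? cues.map((c) => `${formatTimestamp(c.startSeconds)} ${c.text}`).join("\n")
    : cues.map((c) => c.text).join(" ");
}
```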

Requirements:

  • Install yt-dlp: brew install yt-dlp (macOS) or pip install yt-dlp

Supported URL formats:

  • Full URL: https://www.youtube.com/watch?v=VIDEO_ID
  • Short URL: https://youtu.be/VIDEO_ID
  • Shorts: https://www.youtube.com/shorts/VIDEO_ID
  • Video ID only: VIDEO_ID
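
One way to normalize the four accepted input forms to a bare video ID is sketched below. This is not the package's parser: extractVideoId is a hypothetical name, and the patterns assume IDs are 11 characters drawn from letters, digits, underscore, and hyphen.

```typescript
// Assumption: video IDs are 11 characters from [A-Za-z0-9_-].
const ID_PATTERN = "([A-Za-z0-9_-]{11})";

const URL_PATTERNS: RegExp[] = [
  // Video ID only
  new RegExp(`^${ID_PATTERN}$`),
  // Full URL: https://www.youtube.com/watch?v=VIDEO_ID (v may follow other params)
  new RegExp(`^https?://(?:www\\.)?youtube\\.com/watch\\?(?:.*&)?v=${ID_PATTERN}(?:[&#].*)?$`),
  // Short URL: https://youtu.be/VIDEO_ID
  new RegExp(`^https?://youtu\\.be/${ID_PATTERN}(?:[?#].*)?$`),
  // Shorts: https://www.youtube.com/shorts/VIDEO_ID
  new RegExp(`^https?://(?:www\\.)?youtube\\.com/shorts/${ID_PATTERN}(?:[?#/].*)?$`),
];

function extractVideoId(input: string): string | null {
  const trimmed = input.trim();
  for (const pattern of URL_PATTERNS) {
    const match = pattern.exec(trimmed);
    if (match) return match[1] ?? null;
  }
  return null; // not one of the supported forms
}
```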

4. Extract Structured Data

// Scrape product listings, articles, etc.
{
  "name": "extract_structured",
  "arguments": {
    "url": "https://example.com/products",
    "baseSelector": ".product",
    "fields": [
      { "name": "title", "selector": "h2", "type": "text" },
      { "name": "price", "selector": ".price", "type": "text" },
      { "name": "link", "selector": "a", "type": "attribute", "attribute": "href" }
    ]
  }
}
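
Because a field of type "attribute" also needs an "attribute" key, it can help to check the fields array before sending the request. The following client-side check is a sketch only; validateFields is a hypothetical helper, and it covers just the two field types shown in this README ("text" and "attribute").

```typescript
// Loose input shape so that malformed specs can be reported rather than rejected by the compiler.
type RawField = { name?: unknown; selector?: unknown; type?: unknown; attribute?: unknown };

function validateFields(fields: RawField[]): string[] {
  const errors: string[] = [];
  const seenNames: string[] = [];
  fields.forEach((field, i) => {
    if (typeof field.name !== "string" || field.name === "") {
      errors.push(`fields[${i}]: missing name`);
    } else if (seenNames.indexOf(field.name) !== -1) {
      errors.push(`fields[${i}]: duplicate name "${field.name}"`);
    } else {
      seenNames.push(field.name);
    }
    if (typeof field.selector !== "string" || field.selector === "") {
      errors.push(`fields[${i}]: missing selector`);
    }
    if (field.type === "attribute" && typeof field.attribute !== "string") {
      errors.push(`fields[${i}]: type "attribute" needs an "attribute" key`);
    }
  });
  return errors; // an empty array means the spec passed these checks
}
```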

5. Execute JavaScript

// Great for dynamic content, debugging
{
  "name": "execute_js",
  "arguments": {
    "url": "https://example.com",
    "scripts": [
      "return document.title",
      "return document.links.length",
      "return document.URL"
    ]
  }
}

6. Screenshot

{
  "name": "capture_screenshot",
  "arguments": {
    "url": "https://example.com",
    "waitFor": 2  // seconds
  }
}

7. Regex Extraction

// Extract emails, phones, URLs, etc.
{
  "name": "extract_regex",
  "arguments": {
    "url": "https://example.com/contact",
    "patterns": ["email", "phone_intl", "url"]
  }
}

21 built-in patterns: email, phone_intl, phone_us, url, ipv4, ipv6, uuid, currency, percentage, number, date_iso, date_us, time_24h, postal_us, postal_uk, hex_color, twitter_handle, hashtag, mac_addr, iban, credit_card. The name all is also accepted, presumably as a shortcut for every pattern.
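
The idea behind named patterns can be sketched as a lookup table from pattern name to regular expression. The expressions below are illustrative stand-ins for four of the names; the package's actual expressions are not shown in this README and may differ, and extractPatterns is a hypothetical function.

```typescript
// Illustrative expressions only; not copied from the package.
const PATTERNS: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  ipv4: /\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b/g,
  hex_color: /#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b/g,
  date_iso: /\b\d{4}-\d{2}-\d{2}\b/g,
};

function extractPatterns(text: string, names: string[]): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const patternName of names) {
    const pattern = PATTERNS[patternName];
    if (!pattern) continue; // unknown names are skipped in this sketch
    const matches: string[] = text.match(pattern) ?? [];
    // De-duplicate while keeping first-seen order.
    result[patternName] = matches.filter((m, i, all) => all.indexOf(m) === i);
  }
  return result;
}
```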


For full functionality (crawling, screenshots, PDFs, JS execution), install Playwright browsers:

npx playwright install chromium

Without Playwright: Only the search tool works (DuckDuckGo results only).

With Playwright: All 11 tools work with full content extraction.


Feature         Brave Free            This MCP
Cost            Free tier only        100% Free
Rate Limits     2,000 requests/month  Unlimited
Content Depth   ~200 words snippet    1,000+ words
Ranking         Black-box             Transparent BM25
Infrastructure  Cloud API             Local control
API Key         Required              Not needed

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    Claude Desktop / Cursor                      │
└───────────────────────────────┬─────────────────────────────────┘
                                │ MCP (stdio)
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                   @ignidor/web-search-mcp                       │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  Tool Router                                              │  │
│  │  • search              → DuckDuckGo + BM25 ranking        │  │
│  │  • crawl_and_extract   → Playwright → Markdown            │  │
│  │  • search_and_crawl    → Combined (search + extract)      │  │
│  │  • capture_screenshot  → Playwright → base64 PNG          │  │
│  │  • generate_pdf        → Playwright → base64 PDF          │  │
│  │  • extract_structured  → Playwright → CSS extraction      │  │
│  │  • execute_js          → Playwright → JS results          │  │
│  │  • extract_regex       → Playwright → 21 patterns         │  │
│  └───────────────────────────┬───────────────────────────────┘  │
│                              │                                  │
│  ┌───────────────────────────▼───────────────────────────────┐  │
│  │              Ranking Engine (BM25 + Hybrid)               │  │
│  │  • fast-bm25 package for scoring                          │  │
│  │  • Freshness scoring (exponential decay)                  │  │
│  │  • Domain authority heuristics                            │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                      Playwright (optional)                      │
│  • Chromium browser for dynamic content                         │
│  • Screenshot, PDF generation                                   │
│  • JavaScript execution                                         │
└─────────────────────────────────────────────────────────────────┘
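
The ranking engine box names three signals: BM25 relevance, freshness with exponential decay, and domain authority heuristics. A rough sketch of how such signals could be combined is shown below. The weights, the 180-day half-life, the trusted-suffix list, and all function names are assumptions made for illustration; the package's actual formula is not documented here.

```typescript
type Candidate = { url: string; bm25: number; ageDays: number };

// All constants below are hypothetical, chosen only to make the sketch concrete.
const WEIGHTS = { relevance: 0.7, freshness: 0.2, authority: 0.1 };
const HALF_LIFE_DAYS = 180;
const TRUSTED_SUFFIXES = [".gov", ".edu"];

// Exponential decay: 1.0 for a brand-new page, 0.5 after one half-life.
function freshness(ageDays: number): number {
  return Math.exp((-Math.LN2 * Math.max(ageDays, 0)) / HALF_LIFE_DAYS);
}

function hostnameOf(url: string): string {
  const match = /^https?:\/\/([^/:?#]+)/i.exec(url);
  return match ? (match[1] ?? "").toLowerCase() : "";
}

// Placeholder heuristic: 1 for hosts with a trusted suffix, else 0.
function authority(url: string): number {
  const host = hostnameOf(url);
  return TRUSTED_SUFFIXES.some((suffix) => host.endsWith(suffix)) ? 1 : 0;
}

function hybridRank(candidates: Candidate[]): { url: string; score: number }[] {
  // Normalize BM25 into [0, 1] so the three signals share a scale.
  const maxBm25 = Math.max(...candidates.map((c) => c.bm25), 0) || 1;
  return candidates
    .map((c) => ({
      url: c.url,
      score:
        WEIGHTS.relevance * (c.bm25 / maxBm25) +
        WEIGHTS.freshness * freshness(c.ageDays) +
        WEIGHTS.authority * authority(c.url),
    }))
    .sort((x, y) => y.score - x.score);
}
```

With these example weights, a recent page can overtake a slightly more relevant but years-old one, which is the practical effect of adding a freshness term on top of BM25.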

Development

# Clone repo
git clone https://github.com/JayaBigDataIsCool/ignidor-web-search-mcp.git
cd ignidor-web-search-mcp

# Install dependencies
npm install

# Install Playwright (optional but recommended)
npx playwright install chromium

# Build
npm run build

# Run locally
npm start

License

MIT © Ignidor Team