@ignidor/web-search-mcp
Local, unlimited web-search MCP server with BM25 ranking, Playwright crawling, and YouTube transcripts.
- 🔍 No API keys - Uses free DuckDuckGo HTML search
- 🚀 No rate limits - Unlimited searches, 24/7
- 🐳 No Docker - Direct Playwright integration (optional)
- 📊 Smart ranking - BM25 + hybrid scoring with freshness
- 📄 Full extraction - 1000+ words per page (not 200-word snippets)
- 🎬 YouTube transcripts - Fast, robust extraction with yt-dlp
- 💰 100% Free - Outperforms Brave Search, Tavily, and other commercial alternatives
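To make the "BM25" in the ranking bullet concrete, here is a minimal sketch of textbook Okapi BM25 scoring. It is not the package's implementation (the architecture section names the fast-bm25 package for that); the `tokenize` and `bm25Scores` helpers and the `K1`/`B` values are assumptions chosen for illustration.

```typescript
// Textbook Okapi BM25, for illustration only. K1 and B are common
// defaults, not values confirmed by this package.
const K1 = 1.2; // term-frequency saturation
const B = 0.75; // document-length normalization

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

function bm25Scores(query: string, docs: string[]): number[] {
  const tokenized = docs.map(tokenize);
  const n = tokenized.length;
  const avgLen = tokenized.reduce((sum, d) => sum + d.length, 0) / Math.max(n, 1);

  // Document frequency: in how many documents does each term occur?
  const df = new Map<string, number>();
  for (const doc of tokenized) {
    new Set(doc).forEach((term) => df.set(term, (df.get(term) ?? 0) + 1));
  }

  const queryTerms = tokenize(query);
  return tokenized.map((doc) => {
    // Term frequency within this document.
    const tf = new Map<string, number>();
    for (const term of doc) tf.set(term, (tf.get(term) ?? 0) + 1);

    let score = 0;
    for (const term of queryTerms) {
      const f = tf.get(term) ?? 0;
      if (f === 0) continue;
      const nq = df.get(term) ?? 0;
      const idf = Math.log(1 + (n - nq + 0.5) / (nq + 0.5));
      const norm = f + K1 * (1 - B + (B * doc.length) / avgLen);
      score += (idf * f * (K1 + 1)) / norm;
    }
    return score;
  });
}

const scores = bm25Scores("rust tutorial", [
  "rust programming language tutorial for beginners",
  "python web scraping guide",
  "rust rust rust",
]);
console.log(scores);
```

Documents that match more distinct query terms score higher, and repeating a single term gives diminishing returns because of the `K1` saturation.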
Features
| Tool | Description |
|---|---|
| search | Fast web search with BM25 ranking (DuckDuckGo) |
| crawl_and_extract | Extract full content from URLs using Playwright |
| search_and_crawl | Search + extract top results (one-stop research) |
| get_youtube_transcript | Get YouTube video transcript (yt-dlp, 1-5 sec) ⭐ NEW |
| capture_screenshot | Screenshot any webpage (base64 PNG) |
| generate_pdf | Convert webpage to PDF (base64) |
| extract_structured | CSS selector-based data extraction |
| execute_js | Run custom JavaScript on webpages |
| extract_regex | Extract emails, phones, URLs, dates (21 patterns) |
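Two of the tools return binary files as base64 text (PNG screenshots and PDFs). The sketch below shows one way a client might decode such a payload and sanity-check its file signature before saving it. It assumes the base64 string has already been pulled out of the tool result, whose exact shape is not documented here; `decodeBase64`, `startsWith`, `isPng`, and `isPdf` are hypothetical helper names.

```typescript
// Decode a base64 payload and check its leading "magic" bytes.
// Helper names are illustrative; they are not part of this package.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
const PDF_SIGNATURE = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"

function decodeBase64(b64: string): Uint8Array {
  const binary = atob(b64); // global in browsers and in Node.js 16+
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

function startsWith(bytes: Uint8Array, signature: number[]): boolean {
  return bytes.length >= signature.length && signature.every((b, i) => bytes[i] === b);
}

const isPng = (bytes: Uint8Array): boolean => startsWith(bytes, PNG_SIGNATURE);
const isPdf = (bytes: Uint8Array): boolean => startsWith(bytes, PDF_SIGNATURE);

// "iVBORw0KGgo=" is the base64 encoding of the 8-byte PNG signature alone.
console.log(isPng(decodeBase64("iVBORw0KGgo=")));
```

Checking the signature first is a cheap way to catch an error message or truncated payload before writing it to disk as an image.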
Quick Start
Installation (via npx)
npx @ignidor/web-search-mcp
Claude Desktop / Cursor / Windsurf Config
For npx usage (recommended):
{
  "mcpServers": {
    "web-search": {
      "command": "npx",
      "args": ["-y", "@ignidor/web-search-mcp"]
    }
  }
}
For local/SSH usage:
{
  "mcpServers": {
    "web-search": {
      "command": "node",
      "args": ["/path/to/dist/index.js"]
    }
  }
}
Tool Examples
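Each example in this section shows only a tool's `name` and `arguments`. MCP clients such as Claude Desktop wrap that pair in a JSON-RPC 2.0 `tools/call` request for you. If you are scripting against the server directly, the envelope looks roughly like the sketch below; `toToolsCallRequest` is a hypothetical helper, not something this package exports.

```typescript
// Wrap a tool name + arguments pair in an MCP "tools/call" request.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: ToolCall;
}

let nextId = 1;

function toToolsCallRequest(call: ToolCall): JsonRpcRequest {
  return { jsonrpc: "2.0", id: nextId++, method: "tools/call", params: call };
}

const request = toToolsCallRequest({
  name: "search",
  arguments: { query: "Rust programming language tutorial", limit: 10, rankingMode: "hybrid" },
});

// On the stdio transport, each message is one line of JSON.
const wireMessage = JSON.stringify(request) + "\n";
console.log(wireMessage);
```

A real client also has to perform the MCP `initialize` handshake before calling tools, which this sketch leaves out.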
1. Search with BM25 Ranking
// Search for anything - unlimited queries, no API key
// rankingMode accepts "bm25" or "hybrid"
{
  "name": "search",
  "arguments": {
    "query": "Rust programming language tutorial",
    "limit": 10,
    "rankingMode": "hybrid"
  }
}
2. Search + Extract Full Content
// Best for deep research - gets full articles, not snippets
{
  "name": "search_and_crawl",
  "arguments": {
    "query": "AWS DynamoDB batchWrite bug fix",
    "extractTopN": 5,
    "rerankAfterExtract": true
  }
}
Result: 8,000+ words of detailed content including:
- Root cause analysis
- Step-by-step fixes
- Complete code examples
- Common pitfalls
3. Get YouTube Transcript ⭐ NEW
// Fast, reliable transcript extraction (1-5 seconds)
{
  "name": "get_youtube_transcript",
  "arguments": {
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "language": "en",
    "includeTimestamps": false,
    "includeMetadata": true
  }
}
Features:
- Works with any video length (1 min or 10 hours - same speed!)
- Fetches existing captions (no audio processing)
- Multiple language support (en, es, fr, de, ja, ko, etc.)
- Optional timestamps: [00:15] Text here
- Metadata: title, duration, word count
- Uses yt-dlp (gold standard, 85k+ GitHub stars)
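As an illustration of the `[00:15] Text here` timestamp style, the sketch below renders caption cues with or without timestamps. The `Cue` shape and both function names are assumptions, and the `h:mm:ss` form used for videos longer than an hour is a guess; the README only shows the `mm:ss` case.

```typescript
// Render caption cues in the "[mm:ss] text" style shown in this README.
// The Cue shape and the hour format are assumptions for illustration.
interface Cue {
  startSeconds: number;
  text: string;
}

function formatTimestamp(totalSeconds: number): string {
  const s = Math.max(0, Math.floor(totalSeconds));
  const hours = Math.floor(s / 3600);
  const minutes = Math.floor((s % 3600) / 60);
  const seconds = s % 60;
  const pad = (n: number): string => String(n).padStart(2, "0");
  return hours > 0
    ? `${hours}:${pad(minutes)}:${pad(seconds)}`
    : `${pad(minutes)}:${pad(seconds)}`;
}

function renderTranscript(cues: Cue[], includeTimestamps: boolean): string {
  if (!includeTimestamps) return cues.map((c) => c.text).join(" ");
  return cues.map((c) => `[${formatTimestamp(c.startSeconds)}] ${c.text}`).join("\n");
}

const cues: Cue[] = [
  { startSeconds: 15, text: "Text here" },
  { startSeconds: 75.4, text: "More text" },
];
console.log(renderTranscript(cues, true));
```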
Requirements:
- Install yt-dlp: brew install yt-dlp (macOS) or pip install yt-dlp
Supported URL formats:
- Full URL: https://www.youtube.com/watch?v=VIDEO_ID
- Short URL: https://youtu.be/VIDEO_ID
- Shorts: https://www.youtube.com/shorts/VIDEO_ID
- Video ID only: VIDEO_ID
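The four input formats above can all be reduced to a bare video ID. The following is a rough sketch of how that normalization might look, not the package's actual parser. It assumes IDs are 11 characters drawn from letters, digits, `_`, and `-`, which matches current YouTube IDs but is not a documented guarantee; `extractVideoId` is a hypothetical name.

```typescript
// Normalize the four supported input formats to a bare video ID.
// The 11-character ID pattern is an assumption, not a documented rule.
const VIDEO_ID = /^[A-Za-z0-9_-]{11}$/;

function extractVideoId(input: string): string | null {
  const trimmed = input.trim();
  if (VIDEO_ID.test(trimmed)) return trimmed; // already a bare ID

  let url: URL;
  try {
    url = new URL(trimmed);
  } catch {
    return null; // neither a bare ID nor a parseable URL
  }

  const host = url.hostname.replace(/^www\./, "");
  let candidate: string | null = null;
  if (host === "youtu.be") {
    candidate = url.pathname.slice(1);
  } else if (host === "youtube.com" || host === "m.youtube.com") {
    if (url.pathname === "/watch") {
      candidate = url.searchParams.get("v");
    } else if (url.pathname.startsWith("/shorts/")) {
      candidate = url.pathname.split("/")[2] ?? null;
    }
  }
  return candidate !== null && VIDEO_ID.test(candidate) ? candidate : null;
}

console.log(extractVideoId("https://youtu.be/dQw4w9WgXcQ"));
```

Validating the host before reading the path avoids treating an unrelated site's `/watch?v=...` URL as a YouTube link.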
4. Extract Structured Data
// Scrape product listings, articles, etc.
{
  "name": "extract_structured",
  "arguments": {
    "url": "https://example.com/products",
    "baseSelector": ".product",
    "fields": [
      { "name": "title", "selector": "h2", "type": "text" },
      { "name": "price", "selector": ".price", "type": "text" },
      { "name": "link", "selector": "a", "type": "attribute", "attribute": "href" }
    ]
  }
}
5. Execute JavaScript
// Great for dynamic content, debugging
{
  "name": "execute_js",
  "arguments": {
    "url": "https://example.com",
    "scripts": [
      "return document.title",
      "return document.links.length",
      "return document.URL"
    ]
  }
}
6. Screenshot
// waitFor is given in seconds
{
  "name": "capture_screenshot",
  "arguments": {
    "url": "https://example.com",
    "waitFor": 2
  }
}
7. Regex Extraction
// Extract emails, phones, URLs, etc.
{
  "name": "extract_regex",
  "arguments": {
    "url": "https://example.com/contact",
    "patterns": ["email", "phone_intl", "url"]
  }
}
21 built-in patterns: email, phone_intl, phone_us, url, ipv4, ipv6, uuid, currency, percentage, number, date_iso, date_us, time_24h, postal_us, postal_uk, hex_color, twitter_handle, hashtag, mac_addr, iban, credit_card, all
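For a sense of what name-based pattern extraction involves, here is a small sketch that applies a few named regular expressions to page text. The pattern names are borrowed from the list above, but the expressions themselves are assumptions written for this example; the package's real patterns may be stricter or looser.

```typescript
// Name-to-regex lookup with de-duplicated matches. The regexes are
// illustrative stand-ins, not the package's built-in patterns.
const PATTERNS: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  ipv4: /\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b/g,
  date_iso: /\b\d{4}-\d{2}-\d{2}\b/g,
  hex_color: /#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b/g,
};

function extractPatterns(text: string, names: string[]): Record<string, string[]> {
  const out: Record<string, string[]> = {};
  for (const name of names) {
    const re = PATTERNS[name];
    if (!re) throw new Error(`Unknown pattern: ${name}`);
    // String.prototype.match with a global regex returns every match (or null).
    out[name] = Array.from(new Set(text.match(re) ?? []));
  }
  return out;
}

const sample =
  "Contact sales@example.com or support@example.com (sales@example.com), " +
  "server 192.168.1.10, updated 2024-03-15, brand color #ff6600.";
const found = extractPatterns(sample, ["email", "ipv4", "date_iso", "hex_color"]);
console.log(found);
```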
Playwright Setup (Optional but Recommended)
For full functionality (crawling, screenshots, PDFs, JS execution), install Playwright browsers:
npx playwright install chromium
Without Playwright: only the search tool works (DuckDuckGo results only).
With Playwright: all tools work, with full content extraction.
Why This Over Brave Search?
| Feature | Brave Free | This MCP |
|---|---|---|
| Cost | Free tier only | 100% Free |
| Rate Limits | 2,000 requests/month | Unlimited |
| Content Depth | ~200 words snippet | 1,000+ words |
| Ranking | Black-box | Transparent BM25 |
| Infrastructure | Cloud API | Local control |
| API Key | Required | Not needed |
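Transparent ranking means the score can be written down and inspected. As a hedged illustration, the sketch below blends a normalized BM25 relevance score with an exponentially decaying freshness term. The half-life, the blend weight, and the `RankedResult` shape are all assumptions made for this example, and the domain-authority heuristics mentioned in the architecture section are left out.

```typescript
// Hybrid score = weighted blend of normalized BM25 and freshness.
// HALF_LIFE_DAYS and FRESHNESS_WEIGHT are assumed values, not the package's.
interface RankedResult {
  url: string;
  bm25: number; // raw BM25 score
  ageDays: number; // days since publication
}

const HALF_LIFE_DAYS = 180;
const FRESHNESS_WEIGHT = 0.3;

// Exponential decay: 1.0 for brand-new content, 0.5 after one half-life.
function freshness(ageDays: number): number {
  return Math.pow(0.5, Math.max(ageDays, 0) / HALF_LIFE_DAYS);
}

function hybridRank(results: RankedResult[]): Array<RankedResult & { score: number }> {
  const maxBm25 = Math.max(...results.map((r) => r.bm25), 0);
  return results
    .map((r) => {
      const relevance = maxBm25 > 0 ? r.bm25 / maxBm25 : 0;
      const score = (1 - FRESHNESS_WEIGHT) * relevance + FRESHNESS_WEIGHT * freshness(r.ageDays);
      return { ...r, score };
    })
    .sort((a, b) => b.score - a.score);
}

const ranked = hybridRank([
  { url: "https://example.com/old-but-relevant", bm25: 2.0, ageDays: 720 },
  { url: "https://example.com/fresh", bm25: 1.8, ageDays: 0 },
]);
console.log(ranked.map((r) => `${r.url} ${r.score.toFixed(3)}`));
```

With these assumed weights, a fresh page that is slightly less relevant can outrank a much older one, which is the trade-off a hybrid mode makes compared with pure BM25.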
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Claude Desktop / Cursor │
└───────────────────────────────┬─────────────────────────────────┘
│ MCP (stdio)
▼
┌─────────────────────────────────────────────────────────────────┐
│ @ignidor/web-search-mcp │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ Tool Router │ │
│ │ • search → DuckDuckGo + BM25 ranking │ │
│ │ • crawl_and_extract → Playwright → Markdown │ │
│ │ • search_and_crawl → Combined (search + extract) │ │
│ │ • capture_screenshot → Playwright → base64 PNG │ │
│ │ • generate_pdf → Playwright → base64 PDF │ │
│ │ • extract_structured → Playwright → CSS extraction │ │
│ │ • execute_js → Playwright → JS results │ │
│ │ • extract_regex → Playwright → 21 patterns │ │
│ └───────────────────────────┬───────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────▼───────────────────────────────┐ │
│ │ Ranking Engine (BM25 + Hybrid) │ │
│ │ • fast-bm25 package for scoring │ │
│ │ • Freshness scoring (exponential decay) │ │
│ │ • Domain authority heuristics │ │
│ └───────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Playwright (optional) │
│ • Chromium browser for dynamic content │
│ • Screenshot, PDF generation │
│ • JavaScript execution │
└─────────────────────────────────────────────────────────────────┘
Development
# Clone repo
git clone https://github.com/JayaBigDataIsCool/ignidor-web-search-mcp.git
cd ignidor-web-search-mcp
# Install dependencies
npm install
# Install Playwright (optional but recommended)
npx playwright install chromium
# Build
npm run build
# Run locally
npm start
License
MIT © Ignidor Team