firecrawl-mcp
MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, search, batch processing, structured data extraction, and LLM-powered content analysis.
Found 98 results for content-extraction
MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, search, batch processing, structured data extraction, and LLM-powered content analysis.
Extract article content and metadata from web pages.
MCP server for fetching web content using Playwright browser
Hyperbrowser Model Context Protocol Server
Search for AIs - DeepSearch and Content API.
🚀 MARIA v4.4.8 - Enterprise AI Development Platform with identity system and character voice implementation. Features 74 production-ready commands with comprehensive fallback implementation, local LLM support, and zero external dependencies. Includes nat
Web content extraction and automation via Playwright MCP
n8n node to extract main content from webpages using Defuddle library
Markdown Content Preprocessor - Fetch web pages, extract content, convert to clean Markdown
Docusaurus plugin that adds a copy page button to extract documentation content as markdown for AI tools like ChatGPT and Claude
Fast, token-efficient web content extraction - fetch web pages and convert to clean Markdown
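MCP server for scraping WeChat Official Account articles - supports automatic image download, content cleanup, and smart scraping, and can generate complete local Markdown documents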
Model Context Protocol (MCP) server that integrates AgentQL data extraction capabilities.
n8n community node for Olyptik web crawling and content extraction API
LLM-ready HTML to Markdown pipeline with Readability, htmlparser2, and post-processing utilities.
Elegant and powerful Instagram video downloader for seamless content extraction
Harvester is a lightweight and highly optimized JavaScript library for extracting data from the DOM tree. It supports extraction of tag texts with specified types and attributes. It is tiny, has no dependencies, and also works with Puppeteer.
A tool for extracting structured content from web pages with customizable selectors and crawling options
Clado Model Context Protocol Server
MCP server for JinaAI reader
MCP server for web search and content extraction with multiple URL support and memory optimizations
JavaScript/Node.js port of NewPipeExtractor
A powerful web crawler designed specifically for LLM applications, capable of extracting clean, readable content from various web pages and converting it to Markdown format.
MCP server for JinaAI search
MCP server for JinaAI grounding
Linkd Model Context Protocol Server
Universal document-to-markdown and section splitter for HTML, URLs, and PDFs.
An efficient React Native file reader library designed for comprehensive document handling with support for multiple file types and advanced content extraction capabilities
Model Context Protocol server for WebScraping.AI API. Provides LLM-powered web scraping tools with Chromium JavaScript rendering, rotating proxies, and HTML parsing.
MCP server for fetching web content using Playwright browser
Smart web scraper node for n8n with automatic failover and content extraction
Content extraction and metadata processing SDK for Evermark Protocol
MCP server for JinaAI search
Markdown Content Preprocessor - Fetch web pages, extract content, convert to clean Markdown
Powerful web scraping SDK for extracting blog articles and content. No LLM required.
Linkd Model Context Protocol Server
A powerful PDF text and image extraction library with universal browser and Node.js support (Dual Licensed: Free for non-commercial, Paid for commercial use)
Powerful web scraping SDK for extracting blog articles and content. No LLM required.
Powerful web scraping SDK for extracting blog articles and content. No LLM required.
Web crawler and API for aggregating and serving digital rights organizations' publications.
MCP server for Supadata video & web scraping integration. Features include YouTube, TikTok, Instagram, Twitter, and file video transcription, web scraping, batch processing and structured data extraction.
MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, batch processing, structured data extraction, and LLM-powered content analysis.
Provide up-to-date context about any library, built by askbudi.ai
Browser Native client SDK for web scraping and content extraction API
Clado Model Context Protocol Server
A TypeScript library that fetches URLs and converts them to structured JSON and Markdown format.
MCP server for web content fetching, summarizing, comparing, and extracting information
A powerful web crawler that extracts content from web pages and converts them to clean Markdown format, with support for code blocks and GitHub Flavored Markdown
A Model Context Protocol (MCP) server that provides intelligent web reading capabilities using the Jina AI Reader API. It extracts clean, LLM-ready content from any URL.
A TypeScript library for extracting threaded content from discussion platforms like Reddit, Twitter, and Hacker News
A Bun-based tool for archiving web content as LLM context using Pure.md API
Docusaurus plugin that adds a copy page button to extract documentation content as markdown for AI tools like ChatGPT and Claude
CLI tool for converting web pages to clean, LLM-friendly markdown. Fetches content from URLs and converts HTML to optimized markdown format perfect for LLM training, RAG systems, and AI applications.
Clado Model Context Protocol Server
Crawl-to-markdown is a powerful TypeScript package designed to query search engines for a given keyword, crawl the resulting websites, and deliver the content in clean, readable Markdown format. Additionally, it can directly crawl specified websites for
A CLI tool to extract text from a static Next.js export and generate llm.txt for LLM ingestion.
TypeScript version of Graby content extraction library
A tool that generates content files from website routes in multiple formats (text, JSON, markdown)
MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, batch processing, structured data extraction, and LLM-powered content analysis.
MCP server for JinaAI reader
MCP server for extracting content from web pages
MCP server for extracting content from web pages
A professional library for processing, cleaning, filtering, and converting HTML content to Markdown. Features advanced customization options, presets, plugin support, fluent API, and TypeScript integration for reliable content extraction.
Professional web scraper with Puppeteer & Mozilla Readability. Extract clean content from any website with full TypeScript support.

A powerful web crawler designed specifically for LLM applications, capable of extracting clean, readable content from various web pages and converting it to Markdown format.
Model Context Protocol (MCP) server that integrates AgentQL data extraction capabilities.
A powerful web content extractor that converts articles to clean markdown
A lightweight MCP server for extracting clean web content with intelligent content filtering and Markdown conversion
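MCP server for scraping web page content, with support for HTML, Markdown, plain text, and JSON formats, and specifically optimized for scraping WeChat Official Account articles and academic papers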
Hyperbrowser Model Context Protocol Server
MCP server for web browsing and content extraction
Generate LLM-friendly text files from Next.js applications by crawling sitemaps and extracting content
Site configuration loader for Graby-TS with dynamic imports
MCP server for extracting YouTube video content with transcript processing.
MCP server for scraping images and text from websites with comprehensive web content extraction capabilities
🔍 MCP server that lets you search and access the Svelte documentation with built-in caching.
LLM-optimized MCP server for fetching and processing Medium articles
curl but in markdown - fetches content from URLs and converts to markdown
A lightweight alternative to Mozilla's Readability library for extracting readable content from web pages
A Model Context Protocol server for web search with content extraction
MCP server for fetching web content using Playwright browser
Model Context Protocol (MCP) server that integrates AgentQL data extraction capabilities.
IFTP Service JavaScript client library for browser integration
A command-line interface for extracting main content from web pages and articles
A utility for cataloguing the metadata for a URL
A low-level Node.js web page content extractor based on `parse5`.
MCP server for extracting web content using web-content-extract library
MCP server for JinaAI search
MCP Server for fetching Confluence page content with authentication
MCP server for JinaAI reader
MCP server for fetching web content using Playwright browser
Model Context Protocol server for WebScraping.AI API. Provides LLM-powered web scraping tools with Chromium JavaScript rendering, rotating proxies, and HTML parsing.
MCP server for fetching web content using Playwright browser
Hyperbrowser Model Context Protocol Server
Oblien Search SDK - AI-powered web search, content extraction, and website crawling. Full documentation at https://oblien.com/docs/search-api
MCP server for Svelte docs
Enhanced MCP Server for intelligent search with real-time data extraction, AI integration, and Vietnamese financial content support