rebrowser-puppeteer-core
A drop-in replacement for puppeteer-core patched with rebrowser-patches. It allows you to pass modern automation detection tests.
Found 379 results for web-scraping
A drop-in replacement for puppeteer-core patched with rebrowser-patches. It allows you to pass modern automation detection tests.
Extract article content and metadata from web pages.
MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, search, batch processing, structured data extraction, and LLM-powered content analysis.
A drop-in replacement for puppeteer patched with rebrowser-patches. It allows you to pass modern automation detection tests.
MCP server for web research
A simple yet powerful module to retrieve organic search results and much more from Google.
MCP server for fetching web content using Playwright browser
A fully typed Brave Search API wrapper, providing easy access to web search, local POI search, and automatic polling for web search summary feature.
A drop-in replacement for playwright-core patched with rebrowser-patches. It allows you to pass modern automation detection tests.
A drop-in replacement for playwright patched with rebrowser-patches. It allows you to pass modern automation detection tests.
MCP server for web scraping using Scrape.do API
A simple Telegram channel scraper
Promptbook: Turn your company's scattered knowledge into AI ready books
MCP server for brave-real-browser
A scraper library for WhatsApp bots or RESTful APIs
🚀 MARIA v4.4.1 - Enterprise AI Development Platform with identity system and character voice implementation. Features 74 production-ready commands with comprehensive fallback implementation, local LLM support, and zero external dependencies. Includes nat
Model Context Protocol (MCP) server for fetching data from the web
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.
The official TypeScript library for the Anchorbrowser API
A Puppeteer-based network scanner for analyzing web traffic, generating adblock filter rules, and identifying third-party requests. Features include fingerprint spoofing, Cloudflare bypass, content analysis with curl/grep, and multiple output formats.
Data ingestion nodes for Gravity workflow system
Kalpana (कल्पना) - AI development assistant with multi-runtime containerized execution, web automation, multi-modal analysis, error checking, and intelligent context management
n8n community node for HeadlessX API integration - web scraping, screenshots, and PDF generation
JavaScript/TypeScript SDK for Deepcrawl API - A powerful web scraping and crawling service
Node.js native bindings for libcurl-impersonate. Impersonate Chrome, Edge, Firefox and Safari TLS fingerprints.
Agentic DOM Intelligence - A lightweight TypeScript library for DOM analysis and manipulation, designed for web automation and AI agents
Hyperbrowser Model Context Protocol Server
LLM-ready HTML to Markdown pipeline with Readability, htmlparser2, and post-processing utilities.
A TypeScript library for performing Google searches with support for proxy, pagination, and customization
TypeScript SDK for Crawl4AI REST API - Bun & Node.js compatible
A web scraping framework for various websites using Playwright.
n8n node for integrating Stagehand browser automation with Browserless support
A custom n8n node for integrating with ScrapingDog to perform web scraping tasks.
A module for downloading TikTok videos by the URL
Toolkit for extracting email addresses from HTML and remote websites
JavaScript client for Raggle API
A TypeScript service to interact with the SearXNG search engine API, enabling customizable searches and result retrieval.
n8n node to control browser-use AI-powered browser automation with Nodes-as-Tools support
MCP server for web search and content extraction with multiple URL support and memory optimizations
Package for Apify/Crawlee that allows storing encrypted text values in the Storages
AI-powered QA agent using LLM models for automated testing and web interaction
CLI tool for summarizing web articles in Japanese using Anthropic Claude API. Fetches content from URLs and generates both 3-line summaries and full translations in polite Japanese.
An MCP protocol-based web content fetching tool that supports multiple modes and formats, can be integrated with AI assistants like Claude
UmbrellaMode shared library
Web scraping for Claude Desktop, Codex, and Gemini using Scrapedo API. Simple setup with npx.
Intelligent web scraping with AI Q&A, PDF support and multi-level fallback system - 11x faster than traditional scrapers
n8n node to extract main content from webpages using Defuddle library
Clean, cached web content for agents—Markdown + citations
A TypeScript wrapper around cURL-impersonate.
MCP server for browser inspection with Puppeteer - network monitoring and console error tracking
MCP (Model Context Protocol) server for Crawl4AI - Universal web crawling and data extraction. Supports STDIO, SSE, and HTTP transports.
A simple yet powerful module to retrieve organic search results and much more from Google.
A powerful NestJS HTML parsing service with XPath and CSS selector support, proxy configuration, random user agents, and rich response metadata including headers and status codes
A complete Retrieval-Augmented Generation system using pgvector, LangChain, and LangGraph for Node.js applications with dynamic embedding and model providers - supports OpenAI, Anthropic, HuggingFace, Azure, Google AI, and more
AnyCrawl MCP Server - Adds powerful web scraping and crawling to Cursor, Claude and any other LLM clients
CrawlForge MCP Server - Professional Model Context Protocol server with 19 comprehensive web scraping, crawling, and content processing tools.
Efficient response caching for Playwright automation scripts.
Web crawler and API for aggregating and serving digital rights organizations' publications.
Smart web scraper node for n8n with automatic failover and content extraction
Comprehensive Selenium MCP Server with full WebDriver functionality for browser automation and testing
n8n node for Anchor Browser API - browser automation and control
MCP Document Converter Server — A Model Context Protocol server for seamless document format conversion and processing
Modern TypeScript library for collecting public Instagram content with smart delays, mobile-first approach, and media support
MCP server for web search and semantic page content retrieval with local caching
Model Context Protocol (MCP) server that integrates AgentQL data extraction capabilities.
🤖 AI-friendly live session automation with REAL screenshot backgrounds (no transparency issues!) - control your EXISTING browser with visual debug panel. Perfect for AI agents!
Puppeteer MCP Server for browser automation via Model Context Protocol
Clado Model Context Protocol Server
A Model Context Protocol (MCP) server that provides web development tools for AI assistants. Enables browser automation, DOM inspection, network monitoring, and console analysis through Playwright.
A clean and powerful Twitter/X scraping library with CycleTLS support, proxy configuration, and full TypeScript type definitions
Xnxx Search and information scraper
TypeScript SDK for Scraper Microservice - Server-side only
A modern, fast Node.js CLI powered by arasadrahman
MCP server and client for web search and page viewing tools - DuckDuckGo search and web scraping
Plugin for browser actions and web scraping
Analyze HTML content visibility for AI crawlers and citations - compare static HTML vs fully rendered content
A non-official API to interact with UniRV's SEI system, focused on student functionalities.
n8n node for converting URLs to HTML using pdfmunk API
Node.js libcurl bindings using koffi with browser fingerprint capabilities
n8n community node for ScrapeOps Proxy, Parser, and Data APIs for web scraping and data extraction
Use LLMs to robustly extract and enrich structured data from HTML and markdown
A scraper module made by Sxyz and SuzakuTeam to make it easier to use scrapers in both ESM and CJS projects.
MCP server for web content fetching, summarizing, comparing, and extracting information
Official Aluvia proxy management SDK for Node.js and modern JavaScript environments
Library to oEmbed a resource
n8n node for web automation with Playwright, captcha solving with 2captcha, and proxy support
An AI web browsing framework focused on simplicity and extensibility.
Harvester is a lightweight and highly optimized JavaScript library for extracting data from the DOM tree. It supports extraction of tag texts with specified types and attributes. It is tiny, has no dependencies, and also works with Puppeteer.
Autonomous AI research agent that conducts comprehensive research on any topic and generates detailed reports with citations
Utility for web scraping that fetches the HTML from a URL or uses Puppeteer to interact with the page. getHtml uses various strategies in a 'waterfall' approach to get the content of the URL, depending on priorities such as stealth, speed, and freshness.
A Model Context Protocol server providing tools for HTTP requests, GraphQL queries, WebSocket connections, and browser automation
Model Context Protocol (MCP) integration for Scraper.is - A web scraping tool for AI assistants
Complete n8n Playwright node with all Microsoft Playwright MCP tools and AI assistant support for advanced browser automation
A Node.js TypeScript API for scraping Panini Brasil product information following Clean Architecture principles
A powerful HTTP client for Node.js based on libcurl with browser fingerprinting capabilities.
A wrapper around cURL-impersonate, a binary which can be used to bypass TLS fingerprinting.
n8n node for Browser Use Cloud API - Automate web tasks with AI agents
MCP server for Deep Research. Provides specialized AI-powered deep research capabilities with no rate limits - faster than ChatGPT Deep Research, more thorough than Grok DeepSearch or Perplexity Deep Research.
A sophisticated website comparison tool with intelligent content analysis and offset-aware difference detection
TypeScript MCP server for Crawl4AI - web crawling and content extraction
Unofficial high performance API for SIGAA IFSC using web scraping.
🚀 MCP SERVER FIXED v3.7.9! Resolved import errors, middleware conflicts, type hints - NOW WORKING PERFECTLY!
A simple yet powerful module to retrieve organic search results and much more from Google.
Work with the internet as if it were your own API. Automate web interactions across popular internet platforms.
Model Context Protocol (MCP) server for pure.md, the markdown delivery network for LLMs
A scraper library for WhatsApp bots or RESTful APIs
Webagent n8n nodes package
Lightweight, runtime-safe crawling → clean Markdown
Olostep MCP server for web scraping, google search and website urls search.
MCP server for browser automation with custom scripts
movietorrent scraper for extracting movie news from all pages.
Odysseus is a web scraping library built on top of Playwright, designed to handle dynamic web pages and CAPTCHA challenges with ease.
HunterBot Actor SDK - Official SDK for building web scraping actors on HunterBot platform
A TypeScript wrapper around cURL-impersonate.
A tool for extracting structured content from web pages with customizable selectors and crawling options
A Model Context Protocol (MCP) server for Playwright browser automation with dynamic CDP endpoint support
A Model Context Protocol (MCP) server that provides access to FetchSERP API for SEO analysis, SERP data, web scraping, and keyword research. Supports both stdio and HTTP transport modes.
Scraper for adult content from the xgrovy site.
MCP server for JinaAI grounding
Enterprise-grade Fastify TypeScript API for Syosetu.com data extraction using official API and web scraping. Run instantly with 'npx @tomisakae/syosetu-api'
A Model Context Protocol (MCP) server that provides access to FetchSERP API for SEO analysis, SERP data, web scraping, and keyword research. Supports both stdio and HTTP transport modes.
A standards-compliant generator for producing robots.txt files
MCP server for browser automation using McpXbridge
DeepSearch MCP Server with Brave Search API and Puppeteer content extraction
MCP server for fetching web content using Playwright browser
Browser Native client SDK for web scraping and content extraction API
n8n node for Firecrawl v2 API - Web scraping, crawling, and data extraction tool for workflows and AI agents
Node.js wrapper for undetected-chromedriver with automatic setup and cross-platform support
MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, batch processing, structured data extraction, and LLM-powered content analysis.
Model Context Protocol (MCP) server that integrates AgentQL data extraction capabilities.
MCP server for Crawlbase API - enables web scraping through Model Context Protocol
MCP server for JinaAI search
n8n MCP server with Playwright browser automation capabilities
MCP server for extracting and categorizing images from web pages with intelligent classification
Model Context Protocol (MCP) server for web scraping with Micrawl - exposes scraping capabilities to AI assistants
Electron web scraper for Etherscan transactions - External and Internal transaction hash extractor
Linkd Model Context Protocol Server
A Cheerio-based utility library for HTML parsing and data extraction
API for the CEFETMG-SIGAA platform, forked from the sigaa-api project.
A tool to scrape the clearance certificate status from the WSIB Online Services website.
Core scraping engine for Micrawl - supports Playwright and HTTP drivers with multi-format output
Local Browser MCP Server for web automation with Playwright integration
A powerful web crawler designed specifically for LLM applications, capable of extracting clean, readable content from various web pages and converting it to Markdown format.
MCP (Model Context Protocol) server for working with the browser via Puppeteer
linktree-parser is a TypeScript library for scraping and extracting account, links, banners, and metadata from Linktree profiles.
MCP server for JinaAI search
MCP server for JinaAI reader
Papercut is a scraping/crawling library for Node.js, written in TypeScript.
IMDb scraper for extracting movie reviews from IMDb pages.
Model Context Protocol server for WebScraping.AI API. Provides LLM-powered web scraping tools with Chromium JavaScript rendering, rotating proxies, and HTML parsing.
A Model Context Protocol (MCP) server for WaterCrawl, enabling AI systems to perform web crawling and search operations
A simple yet powerful module to retrieve organic search results and much more from Google.
MCP client for MCPRelay proxy service - provides web access for AI agents
MCP server for Supadata video & web scraping integration. Features include YouTube, TikTok, Instagram, Twitter, and file video transcription, web scraping, batch processing and structured data extraction.
Gather information of an osu!droid user via web scraping.
A high-performance web crawler powered by Bun that downloads pages and converts them to Markdown
Professional multi-dictionary scraper supporting WordReference and Linguee with unified API, TypeScript definitions, and comprehensive language coverage for 1000+ language pairs.
A fork of googlethis to get specific data for different needs.
A powerful and flexible web scraping library with concurrent processing and DOM hierarchy awareness
Clado Model Context Protocol Server
Made for scraping novels with Puppeteer
1:1 (sorta) replacement for Puppeteer, but undetected
OLX MCP server that enables Claude Desktop to browse and search OLX listings across multiple domains (PT, PL, BG, RO, UA)
A command-line tool for monitoring financial data and market trends in real-time directly from your terminal.
Automatically detect and solve various captcha types in Playwright & Puppeteer with 2Captcha/CapMonster Cloud integration
A simple Node.js library to fetch Google Maps reviews
A powerful website link crawling tool with support for deep crawling, authentication, and page analysis
A TypeScript wrapper around cURL-impersonate.
LangChain tools for Decodo's Scraper API
An intuitive DSL for Puppeteer, simplifying web automation and testing. Currently in alpha, subject to changes.
Tool for automatically analyzing and summarizing library documentation for use with LLMs
Linkd Model Context Protocol Server
MCP server for converting URLs to text using urltoany.com
A minimal TypeScript library for fetching and parsing Google Scholar pages.
A comprehensive TypeScript toolkit for building robust web scrapers with Crawlee, featuring maximum configurability and CLI generator
MCP server for advanced web scraping with Crawl4AI - supports authentication, dynamic content, and AI extraction
Cheerio-based crawler for server-side HTML parsing and extraction
TypeScript job scraper for LinkedIn, Indeed, Glassdoor, ZipRecruiter & more - rewritten from python-jobspy
Universal document-to-markdown and section splitter for HTML, URLs, and PDFs.
Google parser is a lightweight yet powerful HTTP-client-based Google Search result scraper/parser that sends browser-like requests out of the box, which is essential in web scraping for blending in with regular website traffic.
Web crawler in Node.js
Custom MCP server for browser automation using Puppeteer
Puppeteer-based crawler for Chrome automation and dynamic content scraping
Utility functions for web crawling - sitemap processing, link extraction, system info
API crawler for REST and GraphQL endpoint crawling with auto-detection
n8n community node for Dumpling AI integration
`fdy-scraping` is a versatile HTTP client designed for making API requests with support for proxy configuration, debugging, and detailed error handling. It utilizes the [`got-scraping`](https://github.com/apify/got-scraping) library for HTTP operations.
TypeScript SDK for the Structured Scraper API - BitBuffet
Browser fingerprint bypass library using Rust for TLS/HTTP2 impersonation
HTTP crawler for basic web scraping without JavaScript execution
A powerful TypeScript library for downloading videos from web pages, including M3U8/HLS streams, with browser automation and intelligent stream detection
Complete Node.js toolkit for extracting anime data, metadata, and thumbnails from Crunchyroll with 2024/2025 anti-detection techniques
Professional web scraper with Puppeteer & Mozilla Readability. Extract clean content from any website with full TypeScript support.
The next generation web scraping framework
Playwright-based crawler for full browser automation and JavaScript rendering
A module using Puppeteer to scrape several search engines such as Google, Bing, and DuckDuckGo
Core crawler framework functionality - TypeScript web crawling library
A TypeScript library that fetches URLs and converts them to structured JSON and Markdown format.
A CLI tool to extract text from a static Next.js export and generate llm.txt for LLM ingestion.
A powerful web crawler that extracts content from web pages and converts them to clean Markdown format, with support for code blocks and GitHub Flavored Markdown
Professional website mirroring tool with intelligent framework preservation, AI-powered analysis, and comprehensive asset optimization
n8n node for Exa Websets API - Create, manage, and query structured datasets from web sources
AI-powered hexagonal framework with OpenAI integration, database adapters, web scraping, and Next.js demo application
A simple yet powerful module to retrieve organic search results and much more from Google.
MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, batch processing, structured data extraction, and LLM-powered content analysis.
TypeScript SDK for Bright Data APIs - Web Unlocker, SERP, and Scraper APIs
⚡ Lightning-fast MCP browser dev tool. Navigate → Get instant structured data. No screenshots needed! Puppeteer: 📸 → CSS selectors → JS eval. Supapup: semantic IDs ready to use. 10x faster, 90% fewer tokens.
MCP server for web scraping with Cheerio
Command-line interface for creating and managing crawler projects
Providers are the core of applications, where the subtitles are collected. Each provider exports a unique strategy for gathering data. From legendastv's web scraping to opensubtitle API usage, you can collect subtitles from your favorite tv shows and mo
Scrape the web asynchronously and live, reusing Node.js, all in one file, with a few lines!
n8n community node for competitive intelligence analysis using LLM-powered web scraping
CLI tool for converting web pages to clean, LLM-friendly markdown. Fetches content from URLs and converts HTML to optimized markdown format perfect for LLM training, RAG systems, and AI applications.
A Node.js package to generate link previews from URLs
A TypeScript wrapper around cURL-impersonate.
A powerful Node.js tool for searching and downloading books from Anna's Archive with Cloudflare bypass
A scraper module made by Sxyz and SuzakuTeam to make it easier to use scrapers in both ESM and CJS projects.
A module using Puppeteer to scrape several search engines such as Google and Bing
n8n node to interact with a Browserless instance for web scraping
Pure MCP Server for Jina.AI Advanced web scraping
A flexible and powerful library designed to extract and transform data from HTML documents using user-defined schemas
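Package for Crawlee that should allow importing and using packages that rely on an older version of the Apify SDK.
MCP server for scraping web page content, supporting HTML, Markdown, plain text, and JSON formats, specifically optimized for scraping WeChat Official Account articles and academic papers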
A powerful web scraping library built with Playwright
A scraper module made by Sxyz and SuzakuTeam to make it easier to use scrapers in both ESM and CJS projects.
Easily generate unique and optimized CSS or XPath selectors for any DOM element.
A low level stock data aggregation tool, a boring lib for others to build upon
A powerful Node.js package to scrape news from popular Nepali news portals including Kathmandu Post and Kantipur
An MCP server for intelligent logo extraction and processing that automatically identifies and extracts logo icons from website URLs
High-performance, configurable, batch-generating User-Agent spoofing library. Supports multiple browsers, devices, and returns detailed meta information. Perfect for web scraping, automated testing, proxy pools and more.
A powerful web crawler and knowledge processing toolkit for extracting and managing web content
Convert any webpage to markdown using headless Chrome
Crawl websites and convert them to JSON with ease
A Node.js application that fetches examination results and other notices from IOE's and IOM's websites and sends desktop notifications.
Powerful async task runner for Node.js with concurrency control, smart retries, timeouts & comprehensive reporting. Perfect for web scraping, API processing, file operations & bulk async operations.
A professional library for processing, cleaning, filtering, and converting HTML content to Markdown. Features advanced customization options, presets, plugin support, fluent API, and TypeScript integration for reliable content extraction.
A simple all-in-one scraper module to meet your data collection needs from 50+ websites!
MCP server for web research
A powerful web crawler designed specifically for LLM applications, capable of extracting clean, readable content from various web pages and converting it to Markdown format.
Free scraper for WhatsApp bot or REST API
🚀 An easy-to-handle Node.js scraper that allows you to scrape them all in record time.
A comprehensive TypeScript library for automatic proxy management with validation, rotation, and intelligent selection
High-performance web crawler implemented in Go with JavaScript bindings
Model Context Protocol (MCP) server for pure.md, the markdown delivery network for LLMs
A comprehensive web scraping library with resumable operations, middleware support, and built-in rate limiting
A TypeScript API client library for Firecrawl
MCP server for reading web content using Jina AI Reader API
Lightweight scraper written in TypeScript using ES6 generators.
Robust Node.js module for Google Custom Search with rate limiting, error handling, and offline testing capabilities. Supports parallel searches and comprehensive result formatting.