rebrowser-puppeteer-core
A drop-in replacement for puppeteer-core patched with rebrowser-patches. It lets you pass modern automation detection tests.
Found 307 results for web-scraping
A drop-in replacement for puppeteer-core patched with rebrowser-patches. It lets you pass modern automation detection tests.
MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, search, batch processing, structured data extraction, and LLM-powered content analysis.
Extract article content and metadata from web pages.
A drop-in replacement for puppeteer patched with rebrowser-patches. It lets you pass modern automation detection tests.
A simple yet powerful module to retrieve organic search results and much more from Google.
MCP server for fetching web content using Playwright browser
A fully typed Brave Search API wrapper, providing easy access to web search, local POI search, and automatic polling for web search summary feature.
A drop-in replacement for playwright patched with rebrowser-patches. It lets you pass modern automation detection tests.
A drop-in replacement for playwright-core patched with rebrowser-patches. It lets you pass modern automation detection tests.
🚀 MARIA v3.5.4 - Production Ready Release. /code command fully operational with verified 35% performance boost, enhanced error handling, and enterprise reliability. Features natural language code operations, progressive help system, intelligent model sel
A module for downloading TikTok videos by the URL
Hyperbrowser Model Context Protocol Server
A Puppeteer-based network scanner for analyzing web traffic, generating adblock filter rules, and identifying third-party requests. Features include fingerprint spoofing, Cloudflare bypass, content analysis with curl/grep, and multiple output formats.
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.
A Model Context Protocol (MCP) server that provides access to FetchSERP API for SEO analysis, SERP data, web scraping, and keyword research. Supports both stdio and HTTP transport modes.
MCP server for web research
MCP Document Converter Server — A Model Context Protocol server for seamless document format conversion and processing
A scraper library for WhatsApp bots or RESTful APIs
MCP server for JinaAI reader
MCP server for JinaAI grounding
Promptbook: Run AI apps in plain human language across multiple models and platforms
Model Context Protocol (MCP) integration for Scraper.is - A web scraping tool for AI assistants
Model Context Protocol (MCP) server that integrates AgentQL data extraction capabilities.
A simple Telegram channel scraper
MCP server for Deep Research. Provides specialized AI-powered deep research capabilities with no rate limits - faster than ChatGPT Deep Research, more thorough than Grok DeepSearch or Perplexity Deep Research.
Model Context Protocol (MCP) server for pure.md, the markdown delivery network for LLMs
A powerful and flexible web scraping library with concurrent processing and DOM hierarchy awareness
Linkd Model Context Protocol Server
TypeScript MCP server for Crawl4AI - web crawling and content extraction
A tool for extracting structured content from web pages with customizable selectors and crawling options
MCP server for JinaAI search
Model Context Protocol (MCP) server for fetching data from the web
n8n node to control browser-use AI-powered browser automation with Nodes-as-Tools support
MCP server for browser automation with custom scripts
🚀 MCP SERVER FIXED v3.7.9! Resolved import errors, middleware conflicts, type hints - NOW WORKING PERFECTLY!
A powerful website link crawling tool that supports deep crawling, authentication, and page analysis
JavaScript/TypeScript SDK for DeepCrawl API - A powerful web scraping and crawling service
A Model Context Protocol (MCP) server for Playwright browser automation with dynamic CDP endpoint support
An MCP protocol-based web content fetching tool that supports multiple modes and formats and can be integrated with AI assistants like Claude
Node.js native bindings for libcurl-impersonate. Impersonate Chrome, Edge, Firefox and Safari TLS fingerprints.
A high-performance web crawler powered by Bun that downloads pages and converts them to Markdown
A powerful HTTP client for Node.js based on libcurl with browser fingerprinting capabilities.
TypeScript SDK for Crawl4AI REST API - Bun & Node.js compatible
AI-powered hexagonal framework with OpenAI integration, database adapters, web scraping, and Next.js demo application
A custom n8n node for integrating with ScrapingDog to perform web scraping tasks.
A TypeScript service to interact with the SearXNG search engine API, enabling customizable searches and result retrieval.
A TypeScript wrapper around cURL-impersonate.
MCP server for extracting and categorizing images from web pages with intelligent classification
Puppeteer MCP Server for browser automation via Model Context Protocol
Lightweight, runtime-safe crawling → clean Markdown
Web crawler and API for aggregating and serving digital rights organizations' publications.
A Model Context Protocol server providing tools for HTTP requests, GraphQL queries, WebSocket connections, and browser automation
Plugin for browser actions and web scraping
Modern TypeScript library for collecting public Instagram content with smart delays, mobile-first approach, and media support
Made for scraping novels with Puppeteer
Core crawler framework functionality - TypeScript web crawling library
Utility functions for web crawling - sitemap processing, link extraction, system info
Package for Apify/Crawlee that allows storing encrypted text values in the Storages
Xnxx Search and information scraper
JavaScript client for Raggle API
AI-powered QA agent using LLM models for automated testing and web interaction
A simple yet powerful module to retrieve organic search results and much more from Google.
Efficient response caching for Playwright automation scripts.
Vuagen is a simple and flexible User-Agent generator for browser automation, testing, and web scraping.
API crawler for REST and GraphQL endpoint crawling with auto-detection
A powerful CLI tool for scraping Facebook Ads Library with infinite scroll support
A scraper module created by Sxyz and SuzakuTeam to make it easier to use scrapers in both ESM and CJS projects.
MCP client for MCPRelay proxy service - provides web access for AI agents
🍢 Skewer web data perfectly - Smart Indonesian web crawler library
A modern, fast Node.js CLI powered by arasadrahman
TypeScript job scraper for LinkedIn, Indeed, Glassdoor, ZipRecruiter & more - rewritten from python-jobspy
⚡ Lightning-fast MCP browser dev tool. Navigate → Get instant structured data. No screenshots needed! Puppeteer: 📸 → CSS selectors → JS eval. Supapup: semantic IDs ready to use. 10x faster, 90% fewer tokens.
A comprehensive web scraping library with resumable operations, middleware support, and built-in rate limiting
A powerful web scraping library built with Playwright
HTTP crawler for basic web scraping without JavaScript execution
n8n community node for ScrapeOps Proxy, Parser, and Data APIs for web scraping and data extraction
MCP server for extracting content from web pages
Complete n8n Playwright node with all Microsoft Playwright MCP tools and AI assistant support for advanced browser automation
MCP server for converting URLs to text using urltoany.com
A wrapper around cURL-impersonate, a binary which can be used to bypass TLS fingerprinting.
Türkiye shipment tracking module - Web scraping based shipment tracking system
MCP server and client for web search and page viewing tools - DuckDuckGo search and web scraping
A CLI tool to extract text from a static Next.js export and generate llm.txt for LLM ingestion.
MCP server for Supadata video & web scraping integration. Features include YouTube, TikTok, Instagram, Twitter, and file video transcription, web scraping, batch processing and structured data extraction.
MCP server for web search and content extraction with multiple URL support and memory optimizations
linktree-parser is a TypeScript library for scraping and extracting account, links, banners, and metadata from Linktree profiles.
Node.js wrapper for undetected-chromedriver with automatic setup and cross-platform support
Official JavaScript SDK for Backlab AI-Powered Web Scraping API
Use LLMs to robustly extract and enrich structured data from HTML and markdown
A Model Context Protocol (MCP) server for WaterCrawl, enabling AI systems to perform web crawling and search operations
Odysseus is a web scraping library built on top of Playwright, designed to handle dynamic web pages and CAPTCHA challenges with ease.
Complete Node.js toolkit for extracting anime data, metadata, and thumbnails from Crunchyroll with 2024/2025 anti-detection techniques
LangChain tools for Decodo's Scraper API
A TypeScript wrapper around cURL-impersonate.
Scraper for adult content from the xgrovy site.
1:1 (sorta) replacement for Puppeteer, but undetected
MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, batch processing, structured data extraction, and LLM-powered content analysis.
MCP server for extracting web content using web-content-extract library
Web application spider with screenshot capture and customer journey documentation. Automate user flow documentation with authentication support.
DeepSearch MCP Server with Brave Search API and Puppeteer content extraction
Lightweight, runtime-safe crawling → clean Markdown
Universal document-to-markdown and section splitter for HTML, URLs, and PDFs.
Professional website mirroring tool with intelligent framework preservation, AI-powered analysis, and comprehensive asset optimization
A TypeScript library for scraping model information from the Ollama model library website. Extract details, tags, and metadata from ollama.com/library with a simple, type-safe API.
The next generation web scraping framework
A powerful web crawler designed specifically for LLM applications, capable of extracting clean, readable content from various web pages and converting it to Markdown format.
An MCP server for intelligent logo extraction and processing that automatically identifies and extracts logo icons from website URLs
A comprehensive TypeScript library for automatic proxy management with validation, rotation, and intelligent selection
MCP (Model Context Protocol) server for working with the browser via Puppeteer
Professional email scraping tool with GUI
Olostep MCP server for web scraping, Google search, and website URL search.
FDFS MCP Server - Search for movies and theaters on BookMyShow using web scraping with Puppeteer to bypass Cloudflare protection
A TypeScript library for performing Google searches with support for proxy, pagination, and customization
An MCP server for intelligent logo extraction and processing that automatically identifies and extracts logo icons from website URLs
CLI tool for summarizing web articles in Japanese using Anthropic Claude API. Fetches content from URLs and generates both 3-line summaries and full translations in polite Japanese.
Tool for automatically analyzing and summarizing library documentation for use with LLMs
Puppeteer-based crawler for Chrome automation and dynamic content scraping
Playwright-based crawler for full browser automation and JavaScript rendering
Robust Node.js module for Google Custom Search with rate limiting, error handling, and offline testing capabilities. Supports parallel searches and comprehensive result formatting.
MCP server for Crawlbase API - enables web scraping through Model Context Protocol
Powerful async task runner for Node.js with concurrency control, smart retries, timeouts & comprehensive reporting. Perfect for web scraping, API processing, file operations & bulk async operations.
Command-line interface for creating and managing crawler projects
🤖 AI-friendly live session automation with REAL screenshot backgrounds (no transparency issues!) - control your EXISTING browser with visual debug panel. Perfect for AI agents!
A TypeScript library that fetches URLs and converts them to structured JSON and Markdown format.
Cheerio-based crawler for server-side HTML parsing and extraction
Intelligent web scraping with AI Q&A, PDF support and multi-level fallback system - 11x faster than traditional scrapers
Node.js libcurl bindings using koffi with browser fingerprint capabilities
A flexible and powerful library designed to extract and transform data from HTML documents using user-defined schemas
Google parser is a lightweight yet powerful Google Search result scraper/parser built on an HTTP client, designed to send browser-like requests out of the box. This is essential in web scraping for blending in with ordinary website traffic.
Model Context Protocol server for WebScraping.AI API. Provides LLM-powered web scraping tools with Chromium JavaScript rendering, rotating proxies, and HTML parsing.
A powerful NestJS HTML parsing service with XPath and CSS selector support, proxy configuration, random user agents, and rich response metadata including headers and status codes
MCP server for extracting content from URLs with proper citations
CLI tool for converting web pages to clean, LLM-friendly markdown. Fetches content from URLs and converts HTML to optimized markdown format perfect for LLM training, RAG systems, and AI applications.
`fdy-scraping` is a versatile HTTP client designed for making API requests with support for proxy configuration, debugging, and detailed error handling. It utilizes the [`got-scraping`](https://github.com/apify/got-scraping) library for HTTP operations.
A powerful web crawler designed specifically for LLM applications, capable of extracting clean, readable content from various web pages and converting it to Markdown format.
Clado Model Context Protocol Server
MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, batch processing, structured data extraction, and LLM-powered content analysis.
TypeScript SDK for Bright Data APIs - Web Unlocker, SERP, and Scraper APIs
A module using Puppeteer to scrape several search engines such as Google, Bing, and DuckDuckGo
A module using Puppeteer to scrape several search engines such as Google, Bing, and DuckDuckGo
A tool to scrape the clearance certificate status from the WSIB Online Services website.
An HTML parsing and data extraction utility library based on Cheerio
A Node.js CLI for web scraping any page and generating an intelligent summary using Google Gemini.
n8n community node for competitive intelligence analysis using LLM-powered web scraping
A lightweight and powerful proxy server
MCP server for web scraping with Cheerio
A module using Puppeteer to scrape several search engines such as Google and Bing
Work with the internet as if it were your own API. Automate web interactions across popular internet platforms.
High-performance, configurable, batch-generating User-Agent spoofing library. Supports multiple browsers, devices, and returns detailed meta information. Perfect for web scraping, automated testing, proxy pools and more.
A minimal TypeScript library for fetching and parsing Google Scholar pages.
A fork of googlethis to get specific data for different needs.
A professional library for processing, cleaning, filtering, and converting HTML content to Markdown. Features advanced customization options, presets, plugin support, fluent API, and TypeScript integration for reliable content extraction.
A lightweight MCP server for extracting clean web content with intelligent content filtering and Markdown conversion
Crawl websites and convert them to JSON with ease
Unofficial high performance API for SIGAA IFSC using web scraping.
A scraper library for WhatsApp bots or RESTful APIs
A simple yet powerful module to retrieve organic search results and much more from Google.
MCP server for web research
A simple Node.js SDK for the Bright Data API
Browser Native client SDK for web scraping and content extraction API
Webagent n8n nodes package
A comprehensive TypeScript toolkit for building robust web scrapers with Crawlee, featuring maximum configurability and CLI generator
Automatically detect and solve various captcha types in Playwright & Puppeteer with 2Captcha/CapMonster Cloud integration
Package for Crawlee that allows importing and using packages that rely on an older version of the Apify SDK.
Professional multi-dictionary scraper supporting WordReference and Linguee with unified API, TypeScript definitions, and comprehensive language coverage for 1000+ language pairs.
CacheServer is an efficient web page extractor that uses Puppeteer to launch a headless browser and fetch web page content.
Shared utilities and constants for Apify scrapers
n8n community nodes for Eddie Surf web crawling and search
A powerful web crawler and knowledge processing toolkit for extracting and managing web content
Model Context Protocol (MCP) server that integrates AgentQL data extraction capabilities.
A scraper module created by Sxyz and SuzakuTeam to make it easier to use scrapers in both ESM and CJS projects.
Harvester is a lightweight and highly optimized JavaScript library for extracting data from the DOM tree. It supports extraction of tag texts with specified types and attributes. It is tiny, has no dependencies, and also works with Puppeteer.
A Node.js package to generate link previews from URLs
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.
Utility for web scraping that fetches the HTML from a URL or uses Puppeteer to interact with the page. getHtml uses various strategies in a 'waterfall' approach to get the content of the URL, depending on priorities such as stealth, speed, and freshness.
Firecrawl API tools for OpenAI, Anthropic, and AI SDK
A scraper module created by Sxyz and SuzakuTeam to make it easier to use scrapers in both ESM and CJS projects.
Lightweight scraper written in TypeScript using ES6 generators.
A simple all-in-one scraper module to meet your data collection needs from 50+ websites!
Free scraper for WhatsApp bot or REST API
Automated browser navigation and visual mapping system for web applications
Enterprise-grade Fastify TypeScript API for Syosetu.com data extraction using official API and web scraping. Run instantly with 'npx @tomisakae/syosetu-api'
MCP server for web browsing and content extraction
Electron web scraper for Etherscan transactions - External and Internal transaction hash extractor
A command-line tool that searches GitHub for URLs
A powerful web crawler that extracts content from web pages and converts them to clean Markdown format, with support for code blocks and GitHub Flavored Markdown
Browser automation tool for detecting interactive elements on web pages
A simple yet powerful module to retrieve organic search results and much more from Google.
Unofficial repository responsible for retrieving information from the SIGAA system.
AI-powered Playwright automation framework that converts natural language instructions into executable Playwright code with step-by-step execution and intelligent error recovery
A command-line tool for monitoring financial data and market trends in real-time directly from your terminal.
A CLI tool to crawl sitemap.xml and convert all pages to LLM-friendly Markdown
n8n community node for Dumpling AI integration
A TypeScript API client library for Firecrawl
A library for converting HTML and XML into JSON
Scan given website recursively and report 404 links
Library to oEmbed a resource
An MCP protocol-based web content fetching tool that supports multiple modes and formats and can be integrated with AI assistants such as Claude
Lightfeed SDK for Node.js
Model Context Protocol (MCP) server for pure.md, the markdown delivery network for LLMs
A package to scrape images of manga chapters from mangaonline.biz
Scrape the web asynchronously and live, reusing Node.js, all in one file, with a few lines!
Model Context Protocol (MCP) server for Firecrawl Simple - provides web scraping and crawling capabilities to LLMs
An MCP service that fetches HTML content from the Baidu website, supporting mainstream MCP clients such as Cursor, Claude Desktop, and Cherry Studio
Pure MCP Server for Jina.AI Advanced web scraping
Hyperbrowser Model Context Protocol Server
🪡 Social account detection and extraction in js, e.g. for crawling/scraping.
Web crawler in NodeJS
Generic content processing framework for web scraping and AI extraction
Easily generate unique and optimized CSS or XPath selectors for any DOM element.
A Node.js tool for scraping Google Reviews using Puppeteer.
MCP server for JinaAI reader
A low level stock data aggregation tool, a boring lib for others to build upon
🚀 An easy-to-handle Node.js scraper that allows you to scrape them all in record time.
A powerful Node.js package to scrape news from popular Nepali news portals including Kathmandu Post and Kantipur
Telegram Channel Scraper
A utility for cataloguing the metadata for a URL
A command line interface to get French 🇫🇷 word definitions & synonyms from the Larousse website
A powerful web content extractor that converts articles to clean markdown
A wrapper around cURL-impersonate, a binary which can be used to bypass TLS fingerprinting.
MCP server for Firecrawl Simple — a web scraping and site mapping tool enabling LLMs to access and process web content
A URL scraper for extracting various metadata, including Open Graph, JSON-LD, and more
Apify extra is a Node.js library extension of the Apify SDK.
Extract structured content from any HTML website
movienews scraper for extracting movie news from all pages.
DeepSearch MCP Server with Brave Search API and Puppeteer content extraction
TypeScript implementation of OGP (Open Graph Protocol) information extraction server for Model Context Protocol (MCP)
A TypeScript wrapper around cURL-impersonate.
MCP Server for intelligent web page fetching with automatic cookie support
A simple yet powerful module to retrieve organic search results and much more from Google.
A command-line interface for extracting main content from web pages and articles
A TypeScript wrapper around cURL-impersonate.
movietorrent scraper for extracting movie news from all pages.
A module that provides 2 request methods (https/http)
Lightweight scraper written in TypeScript using ES6 generators.
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.
Papercut is a scraping/crawling library for Node.js, written in TypeScript.
A personal TypeScript wrapper around cURL-impersonate.
MCP server for extracting webpage creation and modification timestamps
MCP server for extracting content from web pages
A TypeScript library for scraping game data from itch.io with a clean, scalable architecture
MCP server for web research, stealthified, improved, forked from mzxrai
Olostep MCP server for web scraping, Google search, and website URL search.
Model Context Protocol (MCP) integration for Scraper.is - A web scraping tool for AI assistants
Convert any webpage to markdown using headless Chrome
A simple Node.js library to fetch Google Maps reviews
MCP server for JinaAI reader
n8n node to interact with a Browserless instance for web scraping
A robust Node.js utility to save webpage resources using Chrome DevTools Protocol via Puppeteer. Extract and download all static assets from any webpage for offline use or analysis.