JSPM

Found 307 results for web-scraping

rebrowser-puppeteer-core

A drop-in replacement for puppeteer-core patched with rebrowser-patches. It lets you pass modern automation detection tests.

  • v24.8.1
  • 60.77
  • Published

firecrawl-mcp

MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, search, batch processing, structured data extraction, and LLM-powered content analysis.

  • v2.0.2
  • 59.82
  • Published

defuddle

Extract article content and metadata from web pages.

  • v0.6.6
  • 57.41
  • Published

rebrowser-puppeteer

A drop-in replacement for puppeteer patched with rebrowser-patches. It lets you pass modern automation detection tests.

  • v24.8.1
  • 56.09
  • Published

googlethis

A simple yet powerful module to retrieve organic search results and much more from Google.

  • v1.8.0
  • 56.01
  • Published

fetcher-mcp

MCP server for fetching web content using Playwright browser

    • v0.3.0
    • 50.27
    • Published

    brave-search

    A fully typed Brave Search API wrapper, providing easy access to web search, local POI search, and automatic polling for web search summary feature.

    • v0.9.0
    • 49.06
    • Published

    rebrowser-playwright

    A drop-in replacement for playwright patched with rebrowser-patches. It lets you pass modern automation detection tests.

    • v1.52.0
    • 48.08
    • Published

    rebrowser-playwright-core

    A drop-in replacement for playwright-core patched with rebrowser-patches. It lets you pass modern automation detection tests.

    • v1.52.0
    • 47.41
    • Published

    @bonginkan/maria

    🚀 MARIA v3.5.4 - Production Ready Release. /code command fully operational with verified 35% performance boost, enhanced error handling, and enterprise reliability. Features natural language code operations, progressive help system, intelligent model sel

    • v3.5.4
    • 45.64
    • Published

    @faouzkk/tiktok-dl

    A module for downloading TikTok videos by the URL

    • v1.0.1
    • 42.97
    • Published

    hyperbrowser-mcp

    Hyperbrowser Model Context Protocol Server

    • v1.0.25
    • 42.07
    • Published

    @fanboynz/network-scanner

    A Puppeteer-based network scanner for analyzing web traffic, generating adblock filter rules, and identifying third-party requests. Features include fingerprint spoofing, Cloudflare bypass, content analysis with curl/grep, and multiple output formats.

    • v1.0.82
    • 41.07
    • Published

    rebrowser-patches

    Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.

    • v1.0.19
    • 40.68
    • Published

    fetchserp-mcp-server

    A Model Context Protocol (MCP) server that provides access to FetchSERP API for SEO analysis, SERP data, web scraping, and keyword research. Supports both stdio and HTTP transport modes.

    • v1.0.5
    • 40.32
    • Published

    doc-ops-mcp

    MCP Document Converter Server — A Model Context Protocol server for seamless document format conversion and processing

    • v0.3.8
    • 39.94
    • Published

    d-scrape

    A scraper library for WhatsApp bots or RESTful APIs

    • v1.2.0
    • 39.54
    • Published

    @promptbook/website-crawler

    Promptbook: Run AI apps in plain human language across multiple models and platforms

    • v0.100.0-44
    • 38.68
    • Published

    scraperis-mcp

    Model Context Protocol (MCP) integration for Scraper.is - A web scraping tool for AI assistants

    • v0.1.22
    • 38.31
    • Published

    agentql-mcp

    Model Context Protocol (MCP) server that integrates AgentQL data extraction capabilities.

    • v1.0.0
    • 38.26
    • Published

    octagon-deep-research-mcp

    MCP server for Deep Research. Provides specialized AI-powered deep research capabilities with no rate limits - faster than ChatGPT Deep Research, more thorough than Grok DeepSearch or Perplexity Deep Research.

    • v1.0.18
    • 38.01
    • Published

    puremd-mcp

    Model Context Protocol (MCP) server for pure.md, the markdown delivery network for LLMs

    • v1.0.3
    • 37.89
    • Published

    web-structure

    A powerful and flexible web scraping library with concurrent processing and DOM hierarchy awareness

    • v1.0.2
    • 37.81
    • Published

    linkd-mcp

    Linkd Model Context Protocol Server

    • v1.0.25
    • 37.81
    • Published

    mcp-crawl4ai-ts

    TypeScript MCP server for Crawl4AI - web crawling and content extraction

    • v2.6.12
    • 37.60
    • Published

    mcp-web-content-pick

    A tool for extracting structured content from web pages with customizable selectors and crawling options

    • v0.0.25
    • 37.56
    • Published

    @wordbricks/fetch-mcp

    Model Context Protocol (MCP) server for fetching data from the web

    • v1.3.0
    • 36.71
    • Published

    n8n-nodes-browser-use

    n8n node to control browser-use AI-powered browser automation with Nodes-as-Tools support

    • v0.1.6
    • 36.12
    • Published

    web-parser-mcp

    🚀 MCP SERVER FIXED v3.7.9! Resolved import errors, middleware conflicts, type hints - NOW WORKING PERFECTLY!

    • v3.7.9
    • 35.78
    • Published

    web-page-analyzer-cli

    A powerful website link crawling tool that supports deep crawling, authentication, and page analysis

    • v1.0.19
    • 35.02
    • Published

    deepcrawl

    JavaScript/TypeScript SDK for DeepCrawl API - A powerful web scraping and crawling service

    • v0.2.8
    • 34.93
    • Published

    @ejazullah/playwright-mcp-server

    A Model Context Protocol (MCP) server for Playwright browser automation with dynamic CDP endpoint support

    • v1.88.0
    • 34.91
    • Published

    @lmcc-dev/mult-fetch-mcp-server

    An MCP protocol-based web content fetching tool that supports multiple modes and formats, can be integrated with AI assistants like Claude

    • v1.3.2
    • 33.80
    • Published

    node-libcurl-ja3

    Node.js native bindings for libcurl-impersonate. Impersonate Chrome, Edge, Firefox and Safari TLS fingerprints.

    • v5.0.3
    • 33.32
    • Published

    @fwdslsh/inform

    A high-performance web crawler powered by Bun that downloads pages and converts them to Markdown

      • v0.1.3
      • 32.92
      • Published

      curl-cffi

      A powerful HTTP client for Node.js based on libcurl with browser fingerprinting capabilities.

      • v0.1.41
      • 32.78
      • Published

      crawl4ai

      TypeScript SDK for Crawl4AI REST API - Bun & Node.js compatible

      • v1.0.1
      • 32.54
      • Published

      @juhomat/hexagonal-ai-framework

      AI-powered hexagonal framework with OpenAI integration, database adapters, web scraping, and Next.js demo application

      • v0.2.2
      • 31.53
      • Published

      n8n-nodes-scraping-dog

      A custom n8n node for integrating with ScrapingDog to perform web scraping tasks.

      • v0.3.8
      • 31.09
      • Published

      searxng

      A TypeScript service to interact with the SearXNG search engine API, enabling customizable searches and result retrieval.

      • v0.0.5
      • 31.03
      • Published

      mcp-server-image-extractor

      MCP server for extracting and categorizing images from web pages with intelligent classification

        • v1.0.8
        • 30.95
        • Published

        @6digit/silktext

        Lightweight, runtime-safe crawling → clean Markdown

        • v0.1.5
        • 30.83
        • Published

        ethos-crawler

        Web crawler and API for aggregating and serving digital rights organizations' publications.

        • v1.1.1
        • 30.83
        • Published

        mcp-fetch

        A Model Context Protocol server providing tools for HTTP requests, GraphQL queries, WebSocket connections, and browser automation

        • v0.1.6
        • 30.76
        • Published

        @aduptive/instagram-scraper

        Modern TypeScript library for collecting public Instagram content with smart delays, mobile-first approach, and media support

        • v1.0.3
        • 30.47
        • Published

        novel-scraper

        Made for scraping novels with Puppeteer

        • v9.0.0
        • 30.36
        • Published

        @crawlus/core

        Core crawler framework functionality - TypeScript web crawling library

        • v0.9.0
        • 30.35
        • Published

        @crawlus/utils

        Utility functions for web crawling - sitemap processing, link extraction, system info

        • v0.9.0
        • 30.14
        • Published

        crawlee-storage-extensions

        Package for Apify/Crawlee that lets you store encrypted text values in the Storages

        • v1.0.11
        • 30.04
        • Published

        xnxx-scraper

        Xnxx Search and information scraper

        • v1.0.4
        • 29.40
        • Published

        raggle-js

        JavaScript client for Raggle API

        • v0.2.55
        • 29.39
        • Published

        qa-agent

        AI-powered QA agent using LLM models for automated testing and web interaction

        • v1.1.0-beta.0
        • 28.90
        • Published

        @victorsouzaleal/googlethis

        A simple yet powerful module to retrieve organic search results and much more from Google.

        • v1.8.1
        • 28.58
        • Published

        playwright-cache

        Efficient response caching for Playwright automation scripts.

        • v1.0.1
        • 28.45
        • Published

        vuagen

        Vuagen is a simple and flexible User-Agent generator for browser automation, testing, and web scraping.

        • v1.0.3
        • 28.00
        • Published

        @crawlus/api

        API crawler for REST and GraphQL endpoint crawling with auto-detection

        • v0.9.0
        • 27.58
        • Published

        facebook-ads-scraper

        A powerful CLI tool for scraping Facebook Ads Library with infinite scroll support

          • v1.0.3
          • 27.39
          • Published

          @suzakuteam/scraper-node

          A scraper module created by Sxyz and SuzakuTeam to make scrapers easier to use in both ESM and CJS projects.

            • v1.3.0
            • 27.01
            • Published

            @mcprelay/client

            MCP client for MCPRelay proxy service - provides web access for AI agents

            • v1.0.7
            • 26.75
            • Published

            sate.js

            🍢 Skewer web data perfectly - Smart Indonesian web crawler library

            • v1.1.1
            • 26.68
            • Published

            moshai-cli

            A modern, fast Node.js CLI powered by arasadrahman

              • v1.0.0
              • 26.56
              • Published

              ts-jobspy

              TypeScript job scraper for LinkedIn, Indeed, Glassdoor, ZipRecruiter & more - rewritten from python-jobspy

              • v1.3.1
              • 25.69
              • Published

              supapup

              ⚡ Lightning-fast MCP browser dev tool. Navigate → Get instant structured data. No screenshots needed! Puppeteer: 📸 → CSS selectors → JS eval. Supapup: semantic IDs ready to use. 10x faster, 90% fewer tokens.

              • v0.1.31
              • 25.56
              • Published

              @jambudipa/spider

              A comprehensive web scraping library with resumable operations, middleware support, and built-in rate limiting

              • v0.2.1
              • 25.44
              • Published

              stepwright

              A powerful web scraping library built with Playwright

              • v1.0.2
              • 25.29
              • Published

              @crawlus/http

              HTTP crawler for basic web scraping without JavaScript execution

              • v0.9.0
              • 25.28
              • Published

              @scrapeops/n8n-nodes-scrapeops

              n8n community node for ScrapeOps Proxy, Parser, and Data APIs for web scraping and data extraction

              • v0.2.4
              • 25.27
              • Published

              n8n-nodes-playwright-mcp

              Complete n8n Playwright node with all Microsoft Playwright MCP tools and AI assistant support for advanced browser automation

              • v1.0.0
              • 24.81
              • Published

              mcp-url-to-text

              MCP server for converting URLs to text using urltoany.com

              • v1.0.0
              • 24.68
              • Published

              node-curl-impersonate

              A wrapper around cURL-impersonate, a binary which can be used to bypass TLS fingerprinting.

                • v1.5.4
                • 24.63
                • Published

                kargo-takip

                Turkey cargo tracking module - a web-scraping-based cargo tracking system

                • v1.0.3
                • 24.60
                • Published

                mcp-search-tools

                MCP server and client for web search and page viewing tools - DuckDuckGo search and web scraping

                • v1.0.9
                • 24.22
                • Published

                llm-gen

                A CLI tool to extract text from a static Next.js export and generate llm.txt for LLM ingestion.

                • v1.0.3
                • 23.71
                • Published

                @supadata/mcp

                MCP server for Supadata video & web scraping integration. Features include YouTube, TikTok, Instagram, Twitter, and file video transcription, web scraping, batch processing and structured data extraction.

                • v1.0.1
                • 23.43
                • Published

                @pinkpixel/web-scout-mcp

                MCP server for web search and content extraction with multiple URL support and memory optimizations

                • v1.5.0
                • 23.20
                • Published

                linktree-parser

                linktree-parser is a TypeScript library for scraping and extracting account, links, banners, and metadata from Linktree profiles.

                • v1.5.0
                • 23.15
                • Published

                undetected-chromedriver-js

                Node.js wrapper for undetected-chromedriver with automatic setup and cross-platform support

                • v1.2.2
                • 23.03
                • Published

                backlab-sdk

                Official JavaScript SDK for Backlab AI-Powered Web Scraping API

                • v1.0.1
                • 22.75
                • Published

                @lightfeed/extractor

                Use LLMs to robustly extract and enrich structured data from HTML and markdown

                • v0.2.0
                • 22.70
                • Published

                @watercrawl/mcp

                A Model Context Protocol (MCP) server for WaterCrawl, enabling AI systems to perform web crawling and search operations

                • v1.1.0
                • 22.55
                • Published

                @rpidanny/odysseus

                Odysseus is a web scraping library built on top of Playwright, designed to handle dynamic web pages and CAPTCHA challenges with ease.

                • v2.6.0
                • 22.51
                • Published

                crunchyroll-toolkit

                Complete Node.js toolkit for extracting anime data, metadata, and thumbnails from Crunchyroll with 2024/2025 anti-detection techniques

                • v1.1.1
                • 22.41
                • Published

                ts-curl-impersonate

                A typescript wrapper around cURL-impersonate.

                  • v1.0.3
                  • 21.48
                  • Published

                  undetected-puppeteer

                  1:1 (sorta) replacement for Puppeteer, but undetected

                  • v1.0.1
                  • 21.07
                  • Published

                  @iflow-mcp/firecrawl-mcp

                  MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, batch processing, structured data extraction, and LLM-powered content analysis.

                  • v1.12.0
                  • 20.91
                  • Published

                  web-content-extract-mcp

                  MCP server for extracting web content using web-content-extract library

                    • v1.0.0
                    • 20.64
                    • Published

                    @knowcode/screenshotfetch

                    Web application spider with screenshot capture and customer journey documentation. Automate user flow documentation with authentication support.

                    • v1.0.0
                    • 20.50
                    • Published

                    silktext

                    Lightweight, runtime-safe crawling → clean Markdown

                    • v0.1.0
                    • 20.45
                    • Published

                    doc-to-readable

                    Universal document-to-markdown and section splitter for HTML, URLs, and PDFs.

                    • v1.5.3
                    • 20.44
                    • Published

                    mirror-web-cli

                    Professional website mirroring tool with intelligent framework preservation, AI-powered analysis, and comprehensive asset optimization

                    • v1.1.3
                    • 20.36
                    • Published

                    ollama-library-scraper

                    A TypeScript library for scraping model information from the Ollama model library website. Extract details, tags, and metadata from ollama.com/library with a simple, type-safe API.

                    • v1.0.0
                    • 20.11
                    • Published

                    ayakashi

                    The next generation web scraping framework

                    • v1.0.0-beta8.4
                    • 20.09
                    • Published

                    @langgraph-js/crawler

                    A powerful web crawler designed specifically for LLM applications, capable of extracting clean, readable content from various web pages and converting it to Markdown format.

                    • v1.7.0
                    • 20.03
                    • Published

                    @lucianaib/logo-mcp

                    An intelligent MCP server for logo extraction and processing that automatically identifies and extracts logo icons from website URLs

                    • v1.0.0
                    • 20.01
                    • Published

                    proxy-auto-ts

                    A comprehensive TypeScript library for automatic proxy management with validation, rotation, and intelligent selection

                    • v1.1.2
                    • 19.58
                    • Published

                    devchrome-mcp

                    MCP (Model Context Protocol) server for working with the browser through Puppeteer

                    • v1.4.0
                    • 19.56
                    • Published

                    email-scraper-tool

                    Professional email scraping tool with GUI

                      • v1.0.0
                      • 19.56
                      • Published

                      olostep-mcp

                      Olostep MCP server for web scraping, Google search, and website URL search.

                      • v1.0.4
                      • 18.85
                      • Published

                      fdfs-mcp

                      FDFS MCP Server - Search for movies and theaters on BookMyShow using web scraping with Puppeteer to bypass Cloudflare protection

                      • v1.0.0
                      • 18.72
                      • Published

                      google-search-ts

                      A TypeScript library for performing Google searches with support for proxy, pagination, and customization

                      • v1.0.1
                      • 18.57
                      • Published

                      @iflow-mcp/logo-mcp

                      An intelligent MCP server for logo extraction and processing that automatically identifies and extracts logo icons from website URLs

                      • v1.0.0
                      • 18.00
                      • Published

                      article-summarizer-jp

                      CLI tool for summarizing web articles in Japanese using Anthropic Claude API. Fetches content from URLs and generates both 3-line summaries and full translations in polite Japanese.

                      • v1.5.17
                      • 17.73
                      • Published

                      docs-to-markdown

                      Tool for automatically analyzing and summarizing library documentation for use with LLMs

                        • v1.0.0
                        • 17.19
                        • Published

                        @crawlus/puppeteer

                        Puppeteer-based crawler for Chrome automation and dynamic content scraping

                        • v0.6.0
                        • 16.71
                        • Published

                        @crawlus/playwright

                        Playwright-based crawler for full browser automation and JavaScript rendering

                        • v0.6.0
                        • 16.27
                        • Published

                        qserp

                        Robust Node.js module for Google Custom Search with rate limiting, error handling, and offline testing capabilities. Supports parallel searches and comprehensive result formatting.

                        • v1.0.9
                        • 16.02
                        • Published

                        @crawlbase/mcp

                        MCP server for Crawlbase API - enables web scraping through Model Context Protocol

                        • v1.0.3
                        • 15.76
                        • Published

                        @md-anas-sabah/async-task-runner

                        Powerful async task runner for Node.js with concurrency control, smart retries, timeouts & comprehensive reporting. Perfect for web scraping, API processing, file operations & bulk async operations.

                        • v1.0.2
                        • 15.66
                        • Published

                        @crawlus/cli

                        Command-line interface for creating and managing crawler projects

                        • v0.6.0
                        • 15.49
                        • Published

                        @sashbot/uibridge

                        🤖 AI-friendly live session automation with REAL screenshot backgrounds (no transparency issues!) - control your EXISTING browser with visual debug panel. Perfect for AI agents!

                        • v1.6.0
                        • 15.47
                        • Published

                        url-to-json-markdown

                        A TypeScript library that fetches URLs and converts them to structured JSON and Markdown format.

                        • v1.0.7
                        • 15.47
                        • Published

                        @crawlus/cheerio

                        Cheerio-based crawler for server-side HTML parsing and extraction

                        • v0.6.0
                        • 15.32
                        • Published

                        @monostate/node-scraper

                        Intelligent web scraping with AI Q&A, PDF support and multi-level fallback system - 11x faster than traditional scrapers

                        • v1.8.1
                        • 15.14
                        • Published

                        koffi-curl

                        Node.js libcurl bindings using koffi with browser fingerprint capabilities

                        • v0.1.23
                        • 15.10
                        • Published

                        xscrape

                        A flexible and powerful library designed to extract and transform data from HTML documents using user-defined schemas

                        • v3.0.4
                        • 15.02
                        • Published

                        @nrjdalal/google-parser

                        Google parser is a lightweight yet powerful HTTP-client-based Google Search result scraper/parser that sends browser-like requests out of the box. This is essential in the web scraping industry for blending in with website traffic.

                        • v2.3.0
                        • 15.01
                        • Published

                        webscraping-ai-mcp

                        Model Context Protocol server for WebScraping.AI API. Provides LLM-powered web scraping tools with Chromium JavaScript rendering, rotating proxies, and HTML parsing.

                        • v1.0.2
                        • 15.01
                        • Published

                        @hanivanrizky/nestjs-html-parser

                        A powerful NestJS HTML parsing service with XPath and CSS selector support, proxy configuration, random user agents, and rich response metadata including headers and status codes

                        • v1.3.1
                        • 14.86
                        • Published

                        url-to-markdown-cli-tool

                        CLI tool for converting web pages to clean, LLM-friendly markdown. Fetches content from URLs and converts HTML to optimized markdown format perfect for LLM training, RAG systems, and AI applications.

                        • v1.1.0
                        • 14.50
                        • Published

                        fdy-scraping

                        `fdy-scraping` is a versatile HTTP client designed for making API requests with support for proxy configuration, debugging, and detailed error handling. It utilizes the [`got-scraping`](https://github.com/apify/got-scraping) library for HTTP operations.

                        • v1.0.3
                        • 14.32
                        • Published

                        @langgraph-js/crawler-mcp

                        A powerful web crawler designed specifically for LLM applications, capable of extracting clean, readable content from various web pages and converting it to Markdown format.

                        • v1.5.3
                        • 14.26
                        • Published

                        @cladoai/mcp

                        Clado Model Context Protocol Server

                        • v1.0.27
                        • 13.96
                        • Published

                        @mseep/firecrawl-mcp

                        MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, batch processing, structured data extraction, and LLM-powered content analysis.

                        • v1.9.0
                        • 13.74
                        • Published

                        anil-brd-typescript-sdk

                        TypeScript SDK for Bright Data APIs - Web Unlocker, SERP, and Scraper APIs

                        • v1.0.1
                        • 13.72
                        • Published

                        @monibrand/se-scraper

                        A module using puppeteer to scrape several search engines such as Google, Bing and DuckDuckGo

                        • v1.15.0
                        • 13.48
                        • Published

                        @lyuboslavlyubenov/se-scraper

                        A module using puppeteer to scrape several search engines such as Google, Bing and DuckDuckGo

                        • v1.9.12
                        • 13.37
                        • Published

                        @cityssm/wsib-clearance-check

                        A tool to scrape the clearance certificate status from the WSIB Online Services website.

                        • v4.0.2
                        • 13.37
                        • Published

                        cparse

                        A Cheerio-based utility library for HTML parsing and data extraction

                        • v2.2.0
                        • 13.30
                        • Published

                        rtium-cli

                        A Node.js CLI for web scraping any page and generating an intelligent summary using Google Gemini.

                        • v1.0.0
                        • 13.24
                        • Published

                        cheerio-mcp

                        MCP server for web scraping with Cheerio

                          • v1.2.2
                          • 12.85
                          • Published

                          search-engine-scraper

                          A module using puppeteer to scrape several search engines such as Google, Bing

                          • v1.0.0
                          • 12.76
                          • Published

                          @actionbase/web-action-sdk

                          Work with the internet as if it were your own API. Automate web interactions across popular internet platforms.

                          • v0.1.9
                          • 12.73
                          • Published

                          @imaginerlabs/user-agent-generator

                          High-performance, configurable, batch-generating User-Agent spoofing library. Supports multiple browsers, devices, and returns detailed meta information. Perfect for web scraping, automated testing, proxy pools and more.

                          • v1.0.2
                          • 12.73
                          • Published

                          @rpidanny/google-scholar

                          A minimal TypeScript library for fetching and parsing Google Scholar pages.

                          • v3.3.0
                          • 12.59
                          • Published

                          googlethis-augmented

                          A fork of googlethis to get specific data for different needs.

                          • v0.0.2--canary.1.4307381028.0
                          • 12.43
                          • Published

                          html-content-processor

                          A professional library for processing, cleaning, filtering, and converting HTML content to Markdown. Features advanced customization options, presets, plugin support, fluent API, and TypeScript integration for reliable content extraction.

                          • v1.0.5
                          • 12.34
                          • Published

                          cleanweb-mcp

                          A lightweight MCP server for extracting clean web content with intelligent content filtering and Markdown conversion

                            • v1.0.1
                            • 12.33
                            • Published

                            crawltojson

                            Crawl websites and convert them to JSON with ease

                              • v1.11.11
                              • 12.13
                              • Published

                              sigaa-api

                              Unofficial high performance API for SIGAA IFSC using web scraping.

                              • v1.0.34
                              • 12.06
                              • Published

                              jann-scraper

                              A scraper library for WhatsApp bots or RESTful APIs

                              • v0.0.6
                              • 12.05
                              • Published

                              @nathanclevenger/googlethis

                              A simple yet powerful module to retrieve organic search results and much more from Google.

                              • v1.8.3
                              • 11.91
                              • Published

                              akbdsdk

                              A simple Node.js SDK for the Bright Data API

                              • v1.0.8
                              • 11.81
                              • Published

                              crawlee-scraper-toolkit

                              A comprehensive TypeScript toolkit for building robust web scrapers with Crawlee, featuring maximum configurability and CLI generator

                              • v2.0.2
                              • 11.47
                              • Published

                              auto-captcha-solver

                              Automatically detect and solve various captcha types in Playwright & Puppeteer with 2Captcha/CapMonster Cloud integration

                              • v1.3.7
                              • 11.47
                              • Published

                              apify-sdk-legacy

                              Package for Crawlee that allows importing and using packages that depend on an older version of the Apify SDK.

                              • v1.0.5
                              • 11.24
                              • Published

                              multi-dictionary-scraper

                              Professional multi-dictionary scraper supporting WordReference and Linguee with unified API, TypeScript definitions, and comprehensive language coverage for 1000+ language pairs.

                              • v1.1.6
                              • 11.17
                              • Published

                              @sapkotamadan/cache-server

                              CacheServer is an efficient web page extractor that uses Puppeteer to launch a headless browser and fetch web page content.

                              • v2.0.8
                              • 11.14
                              • Published

                              @darkbing/knowledge-retrieval

                              A powerful web crawler and knowledge processing toolkit for extracting and managing web content

                              • v1.0.2
                              • 10.79
                              • Published

                              @mseep/agentql-mcp

                              Model Context Protocol (MCP) server that integrates AgentQL data extraction capabilities.

                              • v1.0.0
                              • 10.74
                              • Published

                              @cifumo/scraper-node

                              A scraper module created by Sxyz and SuzakuTeam to make it easier to use scrapers in both ESM and CJS projects.

                                • v1.1.0
                                • 10.72
                                • Published

                                js-harvester

                                Harvester is a lightweight, highly optimized JavaScript library for extracting data from the DOM tree. It supports extraction of tag text with specified types and attributes. It is tiny, has no dependencies, and also works with Puppeteer.

                                • v0.3.14
                                • 10.72
                                • Published

                                link-view

                                A Node.js package to generate link previews from URLs

                                  • v1.0.3
                                  • 10.44
                                  • Published

                                  fadi-rebrowser-patches

                                  Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.

                                  • v0.0.6
                                  • 10.42
                                  • Published

                                  waterfall-fetch

                                  Utility for web scraping that fetches the HTML from a URL or uses Puppeteer to interact with the page. getHtml uses various strategies in a 'waterfall' approach to get the content of the URL, depending on priorities such as stealth, speed, and freshness.

                                  • v1.0.11
                                  • 10.42
                                  • Published

                                  @tooly/firecrawl

                                  Firecrawl API tools for OpenAI, Anthropic, and AI SDK

                                  • v0.0.3
                                  • 10.38
                                  • Published

                                  tester-scraper

                                  A scraper module created by Sxyz and SuzakuTeam to make it easier to use scrapers in both ESM and CJS projects.

                                    • v1.1.7
                                    • 10.38
                                    • Published

                                    hyperscraped-follower

                                    Lightweight scraper written in TypeScript using ES6 generators.

                                    • v1.0.6
                                    • 10.26
                                    • Published

                                    @sxyzdev/scrapers

                                    A simple all-in-one scraper module to meet your data collection needs from 50+ websites!

                                    • v0.0.5
                                    • 10.26
                                    • Published

                                    @tomisakae/syosetu-api

                                    Enterprise-grade Fastify TypeScript API for Syosetu.com data extraction using official API and web scraping. Run instantly with 'npx @tomisakae/syosetu-api'

                                    • v0.0.2
                                    • 9.59
                                    • Published

                                    @bcoders.gr/eth-scrapper

                                    Electron web scraper for Etherscan transactions - External and Internal transaction hash extractor

                                    • v1.4.0
                                    • 9.59
                                    • Published

                                    @zbo14/giturls

                                    A command-line tool that searches GitHub for URLs

                                    • v0.1.0
                                    • 9.59
                                    • Published

                                    markdown-crawler

                                    A powerful web crawler that extracts content from web pages and converts them to clean Markdown format, with support for code blocks and GitHub Flavored Markdown

                                    • v1.0.18
                                    • 9.45
                                    • Published

                                    rabbit-browser

                                    Browser automation tool for detecting interactive elements on web pages

                                    • v1.1.0
                                    • 9.42
                                    • Published

                                    buscar.io

                                    A simple yet powerful module to retrieve organic search results and much more from Google.

                                    • v1.8.1
                                    • 9.42
                                    • Published

                                    get-sigaa

                                    Unofficial repository responsible for fetching information from the SIGAA system.

                                    • v0.1.1
                                    • 9.41
                                    • Published

                                    plugin-books-pro

                                    • v0.0.11
                                    • 9.41
                                    • Published

                                    @scraminator/core

                                    AI-powered Playwright automation framework that converts natural language instructions into executable Playwright code with step-by-step execution and intelligent error recovery

                                    • v1.0.1
                                    • 9.41
                                    • Published

                                    finview

                                    A command-line tool for monitoring financial data and market trends in real-time directly from your terminal.

                                    • v1.0.5
                                    • 9.21
                                    • Published

                                    mapdown

                                    A CLI tool to crawl sitemap.xml and convert all pages to LLM-friendly Markdown

                                    • v1.0.2
                                    • 9.13
                                    • Published

                                    markup2json

                                    A library for converting HTML and XML into JSON

                                    • v1.0.5
                                    • 9.13
                                    • Published

                                    scan-link

                                    Scan given website recursively and report 404 links

                                    • v1.0.3
                                    • 8.92
                                    • Published

                                    oembedder

                                    Library to oEmbed a resource

                                    • v2.1.1
                                    • 8.92
                                    • Published

                                    mult-fetch-mcp-server

                                    An MCP-based tool for fetching web page content that supports multiple modes and formats and can be integrated with AI assistants such as Claude

                                    • v1.0.0
                                    • 8.90
                                    • Published

                                    @mseep/puremd-mcp

                                    Model Context Protocol (MCP) server for pure.md, the markdown delivery network for LLMs

                                    • v1.0.3
                                    • 8.88
                                    • Published

                                    web2os

                                    Scrape the web asynchronously and live, reusing Node.js, all in one file, with a few lines!

                                    • v1.1.0
                                    • 8.62
                                    • Published

                                    firecrawl-simple-mcp

                                    Model Context Protocol (MCP) server for Firecrawl Simple - provides web scraping and crawling capabilities to LLMs

                                    • v1.0.2
                                    • 8.31
                                    • Published

                                    mcp-baidu-curl

                                    An MCP service that fetches the HTML content of Baidu pages, supporting mainstream MCP clients such as Cursor, Claude Desktop, and Cherry Studio

                                    • v1.0.0
                                    • 8.31
                                    • Published

                                    socials_regex

                                    🪡 Social account detection and extraction in js, e.g. for crawling/scraping.

                                    • v1.0.3
                                    • 8.12
                                    • Published

                                    weavebot-core

                                    Generic content processing framework for web scraping and AI extraction

                                    • v0.1.1
                                    • 8.05
                                    • Published

                                    selektra

                                    Easily generate unique and optimized CSS or XPath selectors for any DOM element.

                                    • v1.0.5
                                    • 7.94
                                    • Published

                                    stonkinator

                                    A low level stock data aggregation tool, a boring lib for others to build upon

                                    • v1.0.0
                                    • 7.73
                                    • Published

                                    scrape-them-all

                                    🚀 An easy-to-handle Node.js scraper that allows you to scrape them all in record time.

                                    • v2.0.0
                                    • 7.69
                                    • Published

                                    nepali-news-scraper

                                    A powerful Node.js package to scrape news from popular Nepali news portals including Kathmandu Post and Kantipur

                                    • v1.0.1
                                    • 7.69
                                    • Published

                                    tg-scraper

                                    Telegram Channel Scraper

                                      • v1.0.1
                                      • 7.69
                                      • Published

                                      node-merle

                                      A utility for cataloguing the metadata for a URL

                                      • v0.0.1
                                      • 7.66
                                      • Published

                                      larousse

                                      A command-line interface to get French 🇫🇷 word definitions & synonyms from the Larousse website

                                      • v1.0.1
                                      • 7.65
                                      • Published

                                      ohmyreader

                                      A powerful web content extractor that converts articles to clean markdown

                                      • v0.1.1
                                      • 7.49
                                      • Published

                                      @mseep/firecrawl-simple-mcp

                                      MCP server for Firecrawl Simple — a web scraping and site mapping tool enabling LLMs to access and process web content

                                      • v1.0.2
                                      • 7.43
                                      • Published

                                      web-meta-scraper

                                      A URL scraper for extracting various metadata, including Open Graph, JSON-LD, and more

                                      • v0.1.1
                                      • 6.99
                                      • Published

                                      flay-js

                                      Extract structured content from any HTML website

                                        • v0.1.1
                                        • 6.94
                                        • Published

                                        movienews-scraper

                                        movienews scraper for extracting movie news from all pages.

                                        • v1.9.0
                                        • 6.91
                                        • Published

                                        @yareyaredesuyo/mcp-server-ogp

                                        TypeScript implementation of OGP (Open Graph Protocol) information extraction server for Model Context Protocol (MCP)

                                        • v0.1.1
                                        • 6.71
                                        • Published

                                        @pricething/curl

                                        A TypeScript wrapper around cURL-impersonate.

                                          • v1.1.6
                                          • 6.71
                                          • Published

                                          mcp-fetchpage

                                          MCP Server for intelligent web page fetching with automatic cookie support

                                          • v2.0.0
                                          • 6.71
                                          • Published

                                          @zentus/googlethis

                                          A simple yet powerful module to retrieve organic search results and much more from Google.

                                          • v1.7.1
                                          • 6.71
                                          • Published

                                          defuddler

                                          A command-line interface for extracting main content from web pages and articles

                                          • v1.0.1
                                          • 6.29
                                          • Published

                                          dynamic-request

                                          A module that provides two request methods (https/http)

                                          • v1.0.0
                                          • 6.06
                                          • Published

                                          hyperscraped

                                          Lightweight scraper written in TypeScript using ES6 generators.

                                          • v1.0.4
                                          • 6.06
                                          • Published

                                          rebrowser-patches-fadi-patch

                                          Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.

                                          • v1.0.18
                                          • 5.93
                                          • Published

                                          @armand1m/papercut

                                          Papercut is a scraping/crawling library for Node.js, written in TypeScript.

                                          • v2.0.5
                                          • 5.93
                                          • Published

                                          mcp-webpage-timestamps

                                          MCP server for extracting webpage creation and modification timestamps

                                            • v1.0.1
                                            • 5.88
                                            • Published

                                            a1hul-mcp

                                            MCP server for extracting content from web pages

                                            • v0.1.6
                                            • 5.88
                                            • Published

                                            chimi-scraper

                                            A TypeScript library for scraping game data from itch.io with a clean, scalable architecture

                                            • v1.0.0
                                            • 5.88
                                            • Published

                                            @mseep/olostep-mcp

                                            Olostep MCP server for web scraping, Google search, and website URL search.

                                            • v1.0.4
                                            • 5.08
                                            • Published

                                            @mseep/scraperis-mcp

                                            Model Context Protocol (MCP) integration for Scraper.is - A web scraping tool for AI assistants

                                            • v0.1.22
                                            • 5.08
                                            • Published

                                            extreme-scrap

                                            Convert any webpage to markdown using headless Chrome

                                            • v1.0.10
                                            • 5.08
                                            • Published

                                            google-reviews-api

                                            A simple Node.js library to fetch Google Maps reviews

                                            • v1.0.6
                                            • 5.06
                                            • Published

                                            n8n-nodes-my-browserless

                                            n8n node to interact with a Browserless instance for web scraping

                                              • v1.0.5
                                              • 5.06
                                              • Published

                                              resource-saver-headless

                                              A robust Node.js utility to save webpage resources using Chrome DevTools Protocol via Puppeteer. Extract and download all static assets from any webpage for offline use or analysis.

                                              • v0.9.1
                                              • 5.06
                                              • Published