JSPM

Found 1180 results for crawler

@nodelib/fs.walk

A library for efficiently walking a directory recursively

  • v3.0.1
  • 86.68
  • Published

fdir

The fastest directory crawler & globbing alternative to glob, fast-glob, & tiny-glob. Crawls 1m files in < 1s

  • v6.5.0
  • 84.93
  • Published

recrawl-sync

  • v2.2.3
  • 60.48
  • Published

json-crawl

Async and sync crawler for json object

    • v0.5.3
    • 57.55
    • Published

    apify-client

    Apify API client for JavaScript

    • v2.16.0
    • 57.29
    • Published

    @crawlee/core

    The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

    • v3.14.1
    • 55.29
    • Published

    simplecrawler

    Very straightforward, event driven web crawler. Features a flexible queue interface and a basic cache mechanism with extensible backend.

    • v1.1.9
    • 54.62
    • Published

    @crawlee/browser

    The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

    • v3.14.1
    • 54.45
    • Published

    @crawlee/playwright

    The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

    • v3.14.1
    • 54.18
    • Published

    @crawlee/puppeteer

    The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

    • v3.14.1
    • 53.69
    • Published

    @crawlee/jsdom

    The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

    • v3.14.1
    • 53.58
    • Published

    @crawlee/http

    The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

    • v3.14.1
    • 53.16
    • Published

    @crawlee/cheerio

    The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

    • v3.14.1
    • 52.85
    • Published

    crawlee

    The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

    • v3.14.1
    • 52.80
    • Published

    @crawlee/cli

    The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

    • v3.14.1
    • 52.77
    • Published

    @crawlee/linkedom

    The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

    • v3.14.1
    • 52.65
    • Published

    firecrawl-mcp

    MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, search, batch processing, structured data extraction, and LLM-powered content analysis.

    • v2.0.2
    • 51.29
    • Published

    npm-license-crawler

    Analyzes license information for multiple node.js modules (package.json files) as part of your software project.

    • v0.2.1
    • 50.75
    • Published

    isbot-fast

    JavaScript module detecting bots/crawlers/spiders via user-agent

    • v1.2.0
    • 50.74
    • Published

    notion-md-crawler

    A library to recursively retrieve and serialize Notion pages with customization for machine learning applications.

    • v1.0.2
    • 50.01
    • Published

    apify

    The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

    • v3.4.4
    • 49.17
    • Published

    web-auto-extractor

    Automatically extracts structured information from webpages

    • v1.0.17
    • 47.66
    • Published

    spider-detector

    A tiny node module to detect spiders/crawlers quickly and comes with optional middleware for ExpressJS

    • v2.1.0
    • 46.38
    • Published

    es6-crawler-detect

    An ES6 adaptation of the original PHP library CrawlerDetect. This library helps you detect bots/crawlers/spiders via the user agent.

    • v4.0.2
    • 45.22
    • Published

    sitemap-generator

    Easily create XML sitemaps for your website.

    • v8.5.1
    • 44.94
    • Published

    pdfdataextract

    Extract data from a pdf with pure javascript

    • v4.0.0
    • 44.72
    • Published

    puppeteer-afp

    Stop website fingerprinting techniques

    • v1.1.6
    • 43.38
    • Published

    crawler

    Crawler is a ready-to-use web spider that works with proxies, asynchrony, rate limit, configurable request pools, jQuery, and HTTP/2 support.

    • v2.0.2
    • 43.01
    • Published

    @nodebb/spider-detector

    A tiny node module to detect spiders/crawlers quickly and comes with optional middleware for ExpressJS

    • v2.0.3
    • 42.64
    • Published

    robots-txt-parser

    A lightweight robots.txt parser for Node.js with support for wildcards, caching and promises.

    • v2.0.3
    • 41.92
    • Published

    firecrawl

    JavaScript SDK for Firecrawl API

    • v4.0.0
    • 41.75
    • Published

    node-scrapy

    Simple, lightweight and expressive web scraping with Node.js

    • v0.5.0
    • 41.14
    • Published

    sqreen

    Node.js agent for Sqreen, please see https://www.sqreen.io/

    • v2.0.2
    • 40.44
    • Published

    crawler-request

    HTTP request module customized for crawlers.

    • v1.2.2
    • 38.56
    • Published

    playwright-afp

    Stop website fingerprinting techniques playwright edition

    • v0.0.3
    • 38.02
    • Published

    beautiful-dom

    Beautiful-dom is a lightweight library that mirrors the capabilities of the HTML DOM API needed for parsing crawled HTML/XML pages. It models the methods and properties of HTML nodes that are relevant for extracting data from HTML nodes. It is written in

    • v1.0.9
    • 36.88
    • Published

    recrawl

    • v2.2.1
    • 36.75
    • Published

    crawlab-sdk

    Node.js SDK for Crawlab

    • v0.6.0-12
    • 36.72
    • Published

    hyperbrowser-mcp

    Hyperbrowser Model Context Protocol Server

    • v1.0.25
    • 36.07
    • Published

    rebrowser-patches

    Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.

    • v1.0.19
    • 35.31
    • Published

    @algolia/netlify-plugin-crawler

    This plugin links your Netlify site with Algolia's Crawler. It will trigger a crawl on each successful build.

    • v1.0.15
    • 35.15
    • Published

    crawlbase

    Dependency free module for scraping and crawling websites using [Crawlbase](https://crawlbase.com) API

    • v1.0.2
    • 34.80
    • Published

    websearch-mcp

    A Model Context Protocol (MCP) server implementation that provides real-time web search capabilities through a simple API

    • v1.0.3
    • 34.21
    • Published

    @brightsec/cli

    Bright CLI is a CLI tool that can initialize, stop, poll and maintain scans in Bright solutions.

    • v13.7.0
    • 33.92
    • Published

    better-fetch-mcp

    Advanced MCP server for web scraping with nested URL fetching and intelligent markdown formatting

      • v1.0.0
      • 33.34
      • Published

      grunt-link-checker

      Finds broken links and resources on websites

      • v0.2.0
      • 33.27
      • Published

      simple-headless-chrome

      Headless Chrome abstraction to simplify the interaction with the browser. It may be used for crawling sites, test automation, etc

      • v4.3.10
      • 33.23
      • Published

      node-html-crawler

      Crawler (spider) of site web pages by domain name

      • v1.2.3
      • 33.10
      • Published

      chowdown

      A JavaScript library that allows for the quick transformation of DOM documents into useful formats.

      • v1.2.6
      • 32.96
      • Published

      puremd-mcp

      Model Context Protocol (MCP) server for pure.md, the markdown delivery network for LLMs

      • v1.0.3
      • 32.89
      • Published

      web-structure

      A powerful and flexible web scraping library with concurrent processing and DOM hierarchy awareness

      • v1.0.2
      • 32.82
      • Published

      linkd-mcp

      Linkd Model Context Protocol Server

      • v1.0.25
      • 32.82
      • Published

      @turingnova/robots

      Next.js robots.tsx generator - Automatically create and serve robots.txt for Next.js applications

        • v1.0.21
        • 32.30
        • Published

        fiftyone.devicedetection

        Parse HTTP headers to detect the device type, model, operating system, browser, and crawler information

        • v4.4.210
        • 31.73
        • Published

        cheerio-httpcli

        http client module with cheerio & iconv(-lite) & promise

        • v0.8.3
        • 31.70
        • Published

        webhead

        An easy-to-use Node web crawler storing cookies, following redirects, traversing pages and submitting forms.

        • v1.1.3
        • 31.44
        • Published

        web-parser-mcp

        🚀 MCP SERVER FIXED v3.7.9! Resolved import errors, middleware conflicts, type hints - NOW WORKING PERFECTLY!

        • v3.7.9
        • 31.06
        • Published

        web-page-analyzer-cli

        A powerful website link crawling tool that supports deep crawling, authentication, and page analysis

        • v1.0.19
        • 30.40
        • Published

        usetube

        Crawl YouTube without an API key (search videos and channels, or get all of a channel's or playlist's videos)

        • v2.2.7
        • 30.25
        • Published

        @crawlee/impit-client

        impit-based HTTP client implementation for Crawlee. Impersonates browser requests to avoid bot detection.

        • v3.14.1
        • 30.05
        • Published

        taki

        Take a snapshot of any website.

        • v3.0.0
        • 29.53
        • Published

        torrent-search-api

        Yet another node torrent scraper based on x-ray. (Support iptorrents, torrentleech, torrent9, Yyggtorrent, ThePiratebay, torrentz2, 1337x, KickassTorrent, Rarbg, TorrentProject, Yts, Limetorrents, Eztv)

        • v2.1.4
        • 28.92
        • Published

        @folder/readdir

        Recursively read a directory, blazing fast.

        • v3.1.0
        • 28.65
        • Published

        fakebrowser

        🤖 Fake fingerprints to bypass anti-bot systems. Simulate mouse and keyboard operations to make behavior like a real person.

        • v0.0.66
        • 28.60
        • Published

        @fwdslsh/inform

        A high-performance web crawler powered by Bun that downloads pages and converts them to Markdown

          • v0.1.3
          • 28.58
          • Published

          @just-every/crawl

          Fast, token-efficient web content extraction - fetch web pages and convert to clean Markdown

          • v1.0.8
          • 27.39
          • Published

          node-site-downloader

          An easy to use CLI for downloading websites for offline usage

          • v1.3.0
          • 27.27
          • Published

          gulp-license-crawler

          Analyzes license information for multiple node.js modules (package.json files) as part of your software project.

          • v0.0.10
          • 27.27
          • Published

          @spider-rs/spider-rs

          The [spider](https://github.com/spider-rs/spider) project ported to Node.js

          • v0.0.157
          • 27.05
          • Published

          extract-email

          A simple email extractor for obfuscated emails.

          • v1.1.3
          • 26.98
          • Published

          hltv

          The unofficial HLTV Node.js API

          • v3.5.0
          • 26.74
          • Published

          node-webcrawler

          Crawler is a web spider written with Nodejs. It gives you the full power of jQuery on the server to parse a big number of pages as they are downloaded, asynchronously

          • v0.8.0
          • 26.62
          • Published

          googlebot-verify

          Verify that a request is from Google using Google's recommended DNS verification steps

          • v0.1.3
          • 26.55
          • Published

          roboto

          A web crawler for Nodejs.

          • v0.8.2
          • 26.26
          • Published

          node-spider

          Generic web crawler powered by Node.js

          • v1.4.1
          • 26.16
          • Published

          express-nobots

          Keep Bots Away From Your Express App

          • v1.0.5
          • 26.04
          • Published

          @crawlus/core

          Core crawler framework functionality - TypeScript web crawling library

          • v0.9.0
          • 26.02
          • Published

          @kdinisv/sql-scanner

          Smart SQL injection scanner with crawler and optional Playwright capture.

          • v0.2.4
          • 25.86
          • Published

          @crawlus/utils

          Utility functions for web crawling - sitemap processing, link extraction, system info

          • v0.9.0
          • 25.84
          • Published

          novel-downloader

          novel downloader for node-novel style , include site ( dmzj / wenku8 / syosetu / ...etc )

          • v2.0.40
          • 25.66
          • Published

          nintendo-switch-eshop

          Unofficial API lib for Nintendo Switch eShop game listing and pricing information.

          • v8.0.1
          • 25.56
          • Published

          license-crawler

          Crawls an npm package and its dependencies for their licenses

          • v0.0.5
          • 25.50
          • Published

          x-crawl

          x-crawl is a flexible Node.js AI-assisted crawler library.

          • v10.1.0
          • 25.34
          • Published

          js-crawler

          Web crawler for Node.js

          • v0.3.21
          • 25.30
          • Published

          @6digit/silktext

          Lightweight, runtime-safe crawling → clean Markdown

          • v0.1.5
          • 25.11
          • Published

          crawl-server

          Efficient SEO-focused server for Wasm-generated pages

          • v1.8.2
          • 25.10
          • Published

          funnelweb

          Detect search engine crawlers by their User-Agent strings.

          • v0.0.1
          • 24.98
          • Published

          @letsscrapedata/controller

          Unified browser / HTML controller interfaces that support patchright, camoufox, playwright, puppeteer and cheerio

          • v0.0.68
          • 24.93
          • Published

          crawl-cli

          A Node crawler/scraper for retrieving data from websites

            • v0.2.0
            • 24.58
            • Published

            @crawlus/api

            API crawler for REST and GraphQL endpoint crawling with auto-detection

            • v0.9.0
            • 24.50
            • Published

            semantic-crawler

            Priority based Semantic Web Crawler.

            • v0.0.2
            • 24.48
            • Published

            linkedin-jobs-scraper

            Scrape publicly available jobs on LinkedIn using a headless browser

            • v18.0.1
            • 24.27
            • Published

            osmosis

            Web scraper for NodeJS

            • v1.1.10
            • 24.22
            • Published

            seo-checker

            A library for checking basic SEO signals of a website

            • v0.3.2
            • 23.61
            • Published

            crawler-ninja

            A web crawler made for the SEO based on plugins. Please wait or contribute ... still in beta

            • v0.2.7
            • 23.61
            • Published

            mcp-smart-crawler

            A command-line tool acting as an MCP (ModelContextProtocol) server, using Playwright to crawl web content for AI models.

            • v1.0.10
            • 23.59
            • Published

            @letsscrapedata/scraper

            Web scraper that scrapes web pages using LetsScrapeData XML templates

            • v0.0.87
            • 23.56
            • Published

            jopi-crawler

            A crawler for downloading websites

            • v1.0.4
            • 23.49
            • Published

            ghcrawler

            A robust GitHub API crawler that walks a queue of GitHub entities retrieving and storing their contents.

            • v0.2.23
            • 23.43
            • Published

            web-link-collector

            A library and CLI tool to recursively collect links from a given initial URL and output them as structured data

            • v1.0.10
            • 23.19
            • Published

            tse-client

            A client for fetching stock data from the Tehran Stock Exchange (TSETMC). Works in Browser, Node and as CLI.

            • v2.27.6
            • 23.06
            • Published

            @crawlus/http

            HTTP crawler for basic web scraping without JavaScript execution

            • v0.9.0
            • 22.45
            • Published

            @adncorp/apify

            The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

            • v2.7.6
            • 22.36
            • Published

            geoforge-cli

            Generate AI-ready optimization files for websites, including robots.txt, sitemaps, and AI manifests

            • v0.1.2
            • 22.15
            • Published

            @jambudipa/spider

            A comprehensive web scraping library with resumable operations, middleware support, and built-in rate limiting

            • v0.2.1
            • 22.08
            • Published

            @morioh/is-bot

            Detect whether a user agent is a bot/spider/crawler

            • v1.1.2
            • 22.07
            • Published

            website-crawler-sdk

            Node.js SDK for interacting with WebsiteCrawler.org API

            • v1.0.4
            • 22.05
            • Published

            udger-nodejs

            NodeJS User-Agent String Parser based on Udger SQLite databases https://udger.com/products/local_parser

            • v1.5.0
            • 21.76
            • Published

            puppeteer-cloak

            Secure your puppeteer for scraping

            • v1.0.6
            • 21.58
            • Published

            sauron-crawler

            Basic page crawler written in nodejs

            • v4.0.1
            • 21.48
            • Published

            scrapedin

            linkedin scraper for 2020 website

            • v1.0.21
            • 21.38
            • Published

            @warren-bank/node-request-cli

            An extremely lightweight HTTP request client for the command-line. Supports: http, https, proxy, redirects, cookies, content-encoding, multipart/form-data, multi-threading, recursive website crawling and mirroring.

            • v4.0.25
            • 21.26
            • Published

            nest-crawler

            An easy crawling and scraping module for NestJS

            • v1.9.0
            • 21.03
            • Published

            nodespider

            Simple, flexible, delightful web crawler/spider package

            • v0.11.4
            • 20.68
            • Published

            reporter-cli

            Crawler queue creation tool for paging

            • v0.2.5
            • 20.66
            • Published

            crawler-links

            Node.js web crawler to get all internal links from a website.

            • v1.0.1
            • 20.65
            • Published

            backstop-crawl

            Crawl a site to generate a backstopjs config

            • v2.3.1
            • 20.52
            • Published

            huntsman

            Super configurable async web spider

            • v0.3.0
            • 20.46
            • Published

            @crawlbyte/crawlbyte-sdk-ts

            Official TypeScript SDK for Crawlbyte – create tasks, poll results, and integrate data scraping into your JavaScript/TypeScript applications.

            • v1.0.1
            • 20.41
            • Published

            @supadata/mcp

            MCP server for Supadata video & web scraping integration. Features include YouTube, TikTok, Instagram, Twitter, and file video transcription, web scraping, batch processing and structured data extraction.

            • v1.0.1
            • 20.34
            • Published

            @supacrawler/js

            Typed TypeScript/JavaScript SDK for Supacrawler API (scrape, jobs, screenshots, watch)

            • v0.1.1
            • 20.29
            • Published

            nuxt3-bot-handler

            🛡️ Nuxt 3 middleware to block suspicious bots, protect SEO crawlers with reverse DNS checks, and enforce User-Agent rules.

            • v1.0.7-beta
            • 19.89
            • Published

            snapcrawl-vercel-ssr

            Vercel integration for SnapCrawl. Serve pre-rendered HTML to crawlers in Next.js middleware or Edge Functions for static SPAs and Express apps.

              • v1.3.7
              • 19.83
              • Published

              @dotsur/link-harvest

              Deterministic link harvesting for QA and website migration testing

              • v1.0.1
              • 19.68
              • Published

              images-downloader

              A Node.js module for downloading a single image or multiple images to disk from a given Url (checking if url exist and detecting image type)

              • v1.0.3
              • 19.41
              • Published

              flysh

              DOM Document Object Artifact Collector

              • v1.2.0
              • 19.39
              • Published

              anydownload

              A powerful website downloader with GUI support

              • v1.2.0
              • 19.32
              • Published

              spankbang

              spankbang.com api implementation

              • v0.0.9
              • 19.08
              • Published

              site-audit-seo

              Web service and CLI tool for SEO site audit: crawl site, lighthouse all pages, view public reports in browser. Also output to console, json, csv.

              • v6.0.1
              • 19.05
              • Published

              advanced-seo-checker

              A library for checking basic SEO signals of a website

              • v3.2.0
              • 19.03
              • Published

              crawlee-one

              CrawleeOne is a framework built on top of Crawlee and Apify for writing robust and highly configurable web scrapers

              • v2.0.4
              • 18.91
              • Published

              get-all-links

              A Node crawler that returns all links/hrefs from a website

              • v1.0.2
              • 18.61
              • Published

              @devjoyvn/fakebrowser

              🤖 Fake fingerprints to bypass anti-bot systems. Simulate mouse and keyboard operations to make behavior like a real person.

              • v0.0.67
              • 18.45
              • Published

              goose-parser

              Multi environment web page parser

              • v0.6.1
              • 18.39
              • Published

              express-bot

              Crawler(robots) decision middleware for Express

              • v1.0.7
              • 18.27
              • Published

              @upstash/search-crawler

              A CLI tool to crawl documentation sites and create a search index for Upstash Search.

              • v0.2.0
              • 18.15
              • Published

              @hardbulls/wbsc-crawler

              Tool to crawl events, leagues and statistics from WBSC based websites.

              • v0.6.1
              • 18.01
              • Published

              bas

              Behaviour Assertion Sheets: CSS-like declarative syntax for client-side integration testing and quality assurance.

              • v0.1.1
              • 18.01
              • Published

              site2pdf-cli

              Generate comprehensive PDFs of entire websites, ideal for RAG.

              • v0.1.10
              • 17.95
              • Published

              syphonx

              SyphonX is a tool that extracts data from HTML data, transforming it into JSON of any shape or size. It combines the power of CSS Selectors and jQuery, Regular Expressions, and Javascript into a declarative template format to elegantly solve the simplest

              • v1.2.66
              • 17.80
              • Published

              email-extractor

              Extracts email addresses from a website by following links

              • v0.2.9
              • 17.75
              • Published

              salticidae

              A utility library to make downloading & extracting specific content from a URL easy

              • v0.10.0
              • 17.74
              • Published

              puppeteer-prerender

              Fetch the pre-rendered content, meta, links and Open Graph of a webpage, especially Single-Page Application (SPA)

              • v0.14.0
              • 17.60
              • Published

              @iflow-mcp/firecrawl-mcp

              MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, batch processing, structured data extraction, and LLM-powered content analysis.

              • v1.12.0
              • 17.59
              • Published

              aliexpress-product-scraper

              Get AliExpress product details as a JSON response, including feedback, variants, description, images, etc.

                • v2.0.2
                • 17.58
                • Published

                @justcooldev/slwcrawl

                Crawl and download Snap Lenses from *lens.snapchat.com* with ease.

                • v1.2.4
                • 17.38
                • Published

                crawlyx

                Crawlyx is an open-source command-line interface (CLI) based web crawler built using Node.js. It is designed to crawl websites and extract useful information like links, images, and text. It is lightweight, fast, and easy to use.

                • v2.2.5
                • 17.25
                • Published

                @knowcode/screenshotfetch

                Web application spider with screenshot capture and customer journey documentation. Automate user flow documentation with authentication support.

                • v1.0.0
                • 17.25
                • Published

                directory-crawler

                The directory crawler library for Node.JS

                • v0.0.6
                • 17.24
                • Published

                @acwink/movies-search-mcp

                Smart MCP tool to find and validate movie/tv-show resources with multiple sources support

                • v1.0.18
                • 16.85
                • Published

                vue-seo-helper

                A Vue3 plugin to improve SEO and crawler accessibility

                  • v1.0.0
                  • 16.78
                  • Published

                  silktext

                  Lightweight, runtime-safe crawling → clean Markdown

                  • v0.1.0
                  • 16.66
                  • Published

                  gin-downloader

                  Simple manga scraper for famous online manga websites.

                  • v2.0.0-beta.6
                  • 16.57
                  • Published

                  flixhq-core

                  Node.js library that provides an API for obtaining movie information from the FlixHQ website.

                  • v1.1.1
                  • 16.54
                  • Published

                  page-scraper

                  Web page scraper with a jQuery-like syntax for Node.

                  • v2.0.5
                  • 16.50
                  • Published

                  axe-crawler

                  A highly configurable website crawler for automatically testing a website for accessibility issues using the axe-core library. Uses selenium and headless Chrome to load pages, inject axe-core, and run tests. Generates an html summary report in addition

                  • v0.5.5
                  • 16.27
                  • Published

                  schabbi-webscraper

                  Lightweight and easy to use crawling solution for websites.

                  • v1.2.2
                  • 16.26
                  • Published

                  @acq/environ

                  Environment variable collector

                  • v0.4.0
                  • 16.26
                  • Published

                  kaiser-crawler

                  Node.js module for crawling the web

                  • v1.0.5
                  • 16.19
                  • Published

                  headless-crawler

                  A crawler implemented using a headless browser (Chrome).

                  • v1.4.0
                  • 15.96
                  • Published

                  @acq/acq

                  A util tool

                  • v0.4.0
                  • 15.87
                  • Published

                  gpapi

                  use google play protobuf api in node

                  • v4.5.0
                  • 15.77
                  • Published

                  @duyquangnvx/story-spider

                  A TypeScript library for scraping stories from various Vietnamese websites

                  • v2.0.2
                  • 15.60
                  • Published

                  hydris

                  Generic node service to handle SSR for SPA made with any kind of frontend framework

                  • v1.3.0
                  • 15.51
                  • Published

                  @langgraph-js/crawler

                  A powerful web crawler designed specifically for LLM applications, capable of extracting clean, readable content from various web pages and converting it to Markdown format.

                  • v1.7.0
                  • 15.46
                  • Published

                  eztv-crawler

                  A promise-based Node module to scrape TV shows, episodes, and torrent info from EZTV.

                  • v1.3.6
                  • 15.42
                  • Published

                  xvideosx

                  xvideos.com api implementation.

                  • v1.6.4
                  • 15.41
                  • Published

                  fakebrowser-dev

                  🤖 Fake fingerprints to bypass anti-bot systems. Simulate mouse and keyboard operations to make behavior like a real person.

                  • v0.0.69-dev
                  • 15.20
                  • Published

                  cardinalis

                  A SOCKS and HTTP proxy written in Node.js for getting over the GFW

                    • v3.2.4
                    • 15.19
                    • Published

                    scrapefrom

                    Scrape data from any webpage.

                    • v2.6.7
                    • 15.17
                    • Published

                    dbcrawler

                    Crawls a MySQL database and creates insert queries or returns data from multiple tables, depending on the relationship information provided for the tables

                    • v0.0.42
                    • 14.92
                    • Published

                    flexible

                    Easily build flexible, scalable, and distributed web crawlers.

                    • v0.1.20
                    • 14.80
                    • Published

                    crawlercore

                    crawler with nodejs

                    • v1.5.51
                    • 14.71
                    • Published

                    crawlkit

                    A crawler based on Phantom. Allows discovery of dynamic content and supports custom scrapers.

                    • v2.0.2
                    • 14.65
                    • Published

                    easydomscrapper

                    An extremely simple module to scrape DOM element(s) from a web page

                    • v0.1.0
                    • 14.61
                    • Published

                    @gonetone/google-play-api

                    Access Google Play by logging in and making requests as an Android device!

                    • v1.3.1
                    • 14.61
                    • Published

                    floodesh

                    Floodesh is a distributed web spider/crawler written with Nodejs.

                    • v0.8.19
                    • 14.52
                    • Published

                    googlebot

                    Express middleware that returns the resulting HTML after executing JavaScript, allowing crawlers to read the page

                    • v0.1.41
                    • 14.49
                    • Published

                    bauer-crawler

                    Multi-thread crawler engine.

                    • v0.2.9
                    • 14.28
                    • Published

                    @botmation/twitter

                    Auxiliary package of functions for the TypeScript framework Botmation

                    • v1.0.2
                    • 14.16
                    • Published

                    aliexpress-product-scraper-ts

                    Get AliExpress product details as a JSON response, including feedback, variants, description, images, etc.

                      • v2.0.25
                      • 14.14
                      • Published

                      browser-bot-detector

                      A TypeScript library for detecting and categorizing bots from user agent strings

                        • v1.0.0
                        • 13.94
                        • Published

                        supercrawler

                        A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.

                        • v2.0.0
                        • 13.85
                        • Published

                        csdn-crawler

                        A crawler built specifically for crawling CSDN articles / A JS library for crawling CSDN articles.

                        • v1.0.8
                        • 13.81
                        • Published

                        @citoyasha/yt-search

                        YouTube crawler with no API that returns the first 3 videos.

                        • v1.0.1
                        • 13.76
                        • Published

                        robotto

                        A robots.txt reader, parser and matcher.

                        • v1.0.16
                        • 13.76
                        • Published

                        @crawlus/playwright

                        Playwright-based crawler for full browser automation and JavaScript rendering

                        • v0.6.0
                        • 13.69
                        • Published

                        @crawlbase/mcp

                        MCP server for Crawlbase API - enables web scraping through Model Context Protocol

                        • v1.0.3
                        • 13.68
                        • Published

                        yggtorrent

                        Web crawler to use as API

                        • v2.0.3
                        • 13.62
                        • Published

                        @crawlus/puppeteer

                        Puppeteer-based crawler for Chrome automation and dynamic content scraping

                        • v0.6.0
                        • 13.61
                        • Published