JSPM

Found 177 results for crawling

@tooly/firecrawl

Firecrawl API tools for OpenAI, Anthropic, and AI SDK

  • v0.0.3
  • 11.21
  • Published

tiny-crawler

tiny-crawler is a web crawler.

  • v0.0.5
  • 10.98
  • Published

scrapr

A tool for getting public website content using a browser engine or http get.

  • v0.0.15
  • 10.98
  • Published

enispider

A Node.js scraping framework built on puppeteer (to use a headless Chrome/Chromium browser)

  • v1.2.5
  • 10.87
  • Published

console-tourist

This script analyzes console errors on your website.

  • v1.2.0
  • 10.51
  • Published

node-raspar

Easily scrape the web for torrent and media files.

  • v1.2.6
  • 10.51
  • Published

notion-crawler

Easily crawl your public notion pages

  • v0.0.9
  • 10.47
  • Published

beautifulstew

A simple web scraping tool built for developers that can be utilized on both the client and server.

    • v1.1.4
    • 10.34
    • Published

    crawling

    A simple crawler made in JavaScript for Node.

    • v1.0.1
    • 10.34
    • Published

    realfish-yct

    Real Fish Youtube Trend Video Crawling

    • v0.3.0
    • 10.33
    • Published

    papermonk

    Streaming pdf fetcher for academic papers.

    • v0.0.3
    • 9.96
    • Published

    node-crawler-scraper

    Simple and powerful crawler. It scrapes content and collects links from websites using request or phantomjs. The whole magic and simplicity is behind configuration.

      • v1.0.1
      • 9.89
      • Published

      earthworm

      easily create crawlers based on self-replicated scrapers

      • v1.0.4
      • 9.89
      • Published

      img-cli

      An interactive command-line interface built in Node.js for downloading a single image or multiple images to disk from a URL

      • v1.2.0
      • 9.89
      • Published

      scrapingai

      Build web scraping agents using AI to auto-extract the data from websites

      • v1.0.1
      • 9.76
      • Published

      scrapingapi

      One API to scrape All the Web.

      • v0.3.1
      • 9.76
      • Published

      crawler-hq

      Crawler is a web spider written with Nodejs. It gives you the full power of jQuery on the server to parse a big number of pages as they are downloaded, asynchronously. Scraping should be simple and fun!

      • v0.2.7
      • 9.64
      • Published

      firecrawl-simple-mcp

      Model Context Protocol (MCP) server for Firecrawl Simple - provides web scraping and crawling capabilities to LLMs

      • v1.0.2
      • 9.41
      • Published

      realfish-yc

      Real Fish Youtube Video Crawling Module

      • v0.1.8
      • 9.24
      • Published

      hapi-goldwasher

      A plugin for Hapi.js to run goldwasher as a scraping API on the web.

      • v1.0.4
      • 9.01
      • Published

      scrapeasy

      Automated scraping module using patterns generated by the userscript Scrapeasy.

      • v0.4.2
      • 9.01
      • Published

      sasori-crawl

      Sasori is a dynamic web crawler powered by Puppeteer, designed for lightning-fast endpoint discovery.

      • v1.0.0
      • 9.01
      • Published

      nodecraw

      NodeCraw is a web crawling application that allows you to crawl specified URLs and extract information from web pages. It utilizes various modules and libraries to perform crawling and save the results.

        • v1.0.7
        • 8.79
        • Published

        spider2

        A 2nd generation spider to crawl any article site, automatically reading the title and content.

        • v0.0.7
        • 8.69
        • Published

        aragog-client

        Aragog web scraping framework client

        • v1.0.3
        • 8.69
        • Published

        tai-spider

        Scrapy Framework implemented by nodejs.

        • v0.1.21
        • 8.67
        • Published

        scrape-them-all

        🚀 An easy-to-handle Node.js scraper that allows you to scrape them all in record time.

        • v2.0.0
        • 8.31
        • Published

        @leoko/crawler

        Crawler is a web spider written with Nodejs. It gives you the full power of jQuery on the server to parse a big number of pages as they are downloaded, asynchronously

        • v1.3.1
        • 8.31
        • Published

        crawley

        A simple web crawler

        • v1.0.2
        • 8.31
        • Published

        doffy

        a headless browser automation library with easy-use API

        • v0.0.7
        • 8.10
        • Published

        @jifeon/goose-parser

        PhantomJS/Browser lib which allows parsing a webpage

        • v0.2.0-alpha.3
        • 8.10
        • Published

        @mseep/firecrawl-simple-mcp

        MCP server for Firecrawl Simple — a web scraping and site mapping tool enabling LLMs to access and process web content

        • v1.0.2
        • 8.10
        • Published

        crawlme

        Makes your ajax web application indexable by search engines by generating html snapshots on the fly. Caches results for blazing fast responses and better page ranking.

          • v0.0.7
          • 8.01
          • Published

          crawler-ts-fs

          Lightweight crawler written in TypeScript using ES6 generators.

          • v1.1.1
          • 8.00
          • Published

          crawler-ts

          Lightweight crawler written in TypeScript using ES6 generators.

          • v1.1.1
          • 8.00
          • Published

          crawlable-solidify

          Some tools to help you to render your application as a static web site using the crawlable module.

          • v1.0.2
          • 7.83
          • Published

          sitescrapr

          Simple website crawler and scraper

          • v0.0.1
          • 7.83
          • Published

          cookied-phantom-crawler

          PhantomJS and JSDOM based crawling tool. Used PhantomJS for full load of asynchronously-loaded resources and JSDOM for quick crawls. Allows custom [tough-cookie](https://www.npmjs.com/package/tough-cookie) insertion. Refer to [cheerio](https://www.npmj

            • v1.0.1
            • 7.70
            • Published

            crawly-automation

            A lightweight and modular web crawling framework built with Puppeteer.

              • v1.0.4
              • 7.60
              • Published

              netcrawler

              Net Crawler is a web spider written with Nodejs

                • v0.8.6
                • 7.35
                • Published

                spider-core

                A Node.js scraping framework built on puppeteer-core (to use a headless Chrome/Chromium browser). The core module without browser installation

                • v1.3.11
                • 7.35
                • Published

                headline-news-naver

                This extracts the top five news metadata from NAVER headlines.

                • v1.0.5
                • 7.32
                • Published

                spa-seo

                Single Page App SER

                • v0.0.3
                • 7.24
                • Published

                crawler-ts-htmlparser2

                Lightweight crawler written in TypeScript using ES6 generators.

                • v1.1.1
                • 7.24
                • Published

                hylsplider

                fork from headless-chrome-crawler and update puppeteer to the latest version

                • v1.0.0
                • 7.23
                • Published

                crawler2

                Crawler is a web spider written with Nodejs. It gives you the full power of jQuery on the server to parse a big number of pages as they are downloaded, asynchronously. Scraping should be simple and fun!

                • v0.0.2
                • 6.87
                • Published

                scrapyteer

                Web scraping/crawling framework built on top of headless Chrome

                • v1.4.0
                • 6.58
                • Published

                dcrawler

                DCrawler is a distributed web spider written in Nodejs and queued with Mongodb. It gives you the full power of jQuery to parse big pages as they are downloaded, asynchronously. Simplifying distributed crawler!

                • v0.0.8
                • 6.58
                • Published

                plucky-crawler

                The error crawler that powers http://plucky.io/

                • v0.0.1
                • 6.44
                • Published

                saintjs-score

                SoongSil UniverSity U-saint Score Crawling

                • v2.0.1
                • 6.44
                • Published

                krawler

                Fast and lightweight web crawler with built-in cheerio, xml and json parser.

                • v0.3.3
                • 6.42
                • Published

                rebrowser-patches-fadi-patch

                Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.

                • v1.0.18
                • 6.35
                • Published

                confession

                Helper to extract confessions from webpages

                • v3.1.0
                • 6.35
                • Published

                @crstn/redirect

                A small package to crawl a site and return a redirect template. This is helpful for migration from one to another website with different url schemes.

                • v1.2.0
                • 6.34
                • Published

                @datasco/sdk

                Datasco API SDK for Node.js to collect any data from any website

                • v1.0.4
                • 5.63
                • Published

                proxidoor

                proxidoor helps you make HTTP requests through a rotating proxy, you can use it for services such as web scraping, web crawling and more.

                • v1.0.3
                • 5.63
                • Published

                friday-sdk

                Official JavaScript/TypeScript SDK for the Friday API

                • v0.2.2
                • 5.37
                • Published

                cspider

                Distributed web crawler powered by Headless Chrome

                • v0.0.6
                • 5.29
                • Published

                goose-browser-environment

                Environment for Goose parser which allows running it in a common browser

                • v1.0.4
                • 5.29
                • Published

                spider-stealth

                A Node.js scraping framework built on puppeteer-extra (to use a headless Chrome/Chromium browser). Has the ability to solve reCaptcha

                • v1.2.2
                • 5.29
                • Published

                p4k-api

                web scraper for album reviews from pitchfork

                • v1.4.3
                • 5.29
                • Published

                nocrawler

                Crawler is a web spider written with Nodejs. It gives you the full power of jQuery on the server to parse a big number of pages as they are downloaded, asynchronously. Scraping should be simple and fun!

                • v0.0.1
                • 5.29
                • Published

                crawling-typer

                Transform your text with dynamic typing animations! crawling-typer lets you display an array of strings one at a time, each with its own color. Customize typing speed, delete speed, and pauses between strings. Enjoy full control with loop counts, post-loo

                • v1.1.1
                • 5.29
                • Published

                magnet-getter

                An API to get magnet links using Puppeteer.

                • v1.1.0
                • 4.33
                • Published

                spider-stealth-core

                A Node.js scraping framework built on puppeteer-extra (to use a headless Chrome/Chromium browser). Has the ability to solve reCaptcha. The core module without browser installation

                • v1.3.4
                • 4.26
                • Published

                dynamic-crawling

                Aims to run crawling routines from a JSON file using XPath, accepting for each step a callback function that receives the value and can pass it on to the next step.

                • v1.0.2
                • 4.20
                • Published

                planisphere

                A straightforward sitemap generator written in TypeScript.

                • v1.0.1
                • 4.20
                • Published

                robinbot

                robin web crawling engine with nodejs

                • v0.9.0
                • 4.15
                • Published

                imdb-scrapi

                An API to get data off of IMDB using Puppeteer.

                • v1.0.2
                • 4.06
                • Published

                declarative-scraper

                Simple & Human-Friendly HTML Scraper with Json-ld support

                • v0.1.1
                • 4.06
                • Published

                miniscraper

                Minimalist Node.js web scraper and crawler working with under-the-hood JSDOM

                • v0.3.2
                • 4.06
                • Published

                fiend

                The most advanced web crawler for JavaScript

                • v0.1.0
                • 4.00
                • Published

                @stacksleuth/browser-agent

                StackSleuth in-house browser automation agent for debugging and user simulation

                • v0.2.1
                • 4.00
                • Published

                keyworm

                keyword mention crawler

                • v0.1.1
                • 4.00
                • Published

                style-crawl

                Package to find style links from the site you want

                • v1.1.2
                • 4.00
                • Published

                crawler-mod

                based on node-crawler

                • v0.0.1
                • 2.49
                • Published

                instagram-crawling

                Simple Instagram Crawling without using public API

                • v1.1.2
                • 2.49
                • Published

                jason-the-miner

                Harvesting data at the <html> mine.

                • v1.1.1
                • 2.46
                • Published

                skrap

                Easily scrape web pages by providing json recipes

                • v0.1.1
                • 2.46
                • Published

                crawline

                Web crawler

                • v0.0.0
                • 2.43
                • Published

                node-pool-scraper

                Node.js web scraping utility powered by puppeteer pool

                • v0.1.6
                • 2.37
                • Published

                node-crawling-framework

                NodeJs crawling & scraping framework heavily inspired by Scrapy (Python)

                • v0.0.1-alpha.2
                • 2.34
                • Published

                ccht

                A simple command-line tool to crawl and test your website

                • v0.1.2
                • 2.34
                • Published

                wight-backend-web

                A Wight backend for fetching static web pages

                • v0.1.0
                • 2.34
                • Published

                spamlet

                spamlet is an efficient and simple crawler for playwright

                  • v0.1.6
                  • 2.34
                  • Published

                  crt-scrapper

                  Easily create a scraper api with the @web/scrapper library, which includes a scraper and advanced events for your website.

                  • v1.0.4
                  • 2.34
                  • Published

                  gumo

                  A web-crawler and scraper that extracts data from a family of nested dynamic webpages with added enhancements to assist in knowledge mining applications.

                  • v1.0.7
                  • 0.00
                  • Published

                  hcr

                  Easy To Use Web Crawler

                  • v1.4.1
                  • 0.00
                  • Published

                  @subtitles/providers

                  Providers are the core of applications, where the subtitles are collected. Each provider exports a unique strategy for gathering data. From legendastv's web scraping from opensubtitle API usage, you can collect subtitles from your favorite tv shows and mo

                  • v0.3.0-beta.2
                  • 0.00
                  • Published

                  press2blogger

                  Moving or backing up your Wordpress site to Blogger

                  • v1.0.3
                  • 0.00
                  • Published

                  parkour

                  Parkour the web like a yamakazi

                  • v1.0.0
                  • 0.00
                  • Published

                  ig-scrap-cache

                  Scrape Instagram and cache the results using Redis

                  • v3.0.0
                  • 0.00
                  • Published

                  n8n-nodes-firecrawl-tool

                  n8n node for Firecrawl v2 API - Web scraping, crawling, and data extraction tool for workflows and AI agents

                  • v0.1.2
                  • 0.00
                  • Published

                  sitemaps-getter

                  A tool to get sitemaps from websites and crawl them

                  • v1.0.3
                  • 0.00
                  • Published

                  nstock

                  naver stock data crawler

                  • v0.1.0-beta
                  • 0.00
                  • Published

                  malkovich-malkovich

                  A lightweight and simple API for web crawling built on chromium puppeteer

                  • v0.0.1
                  • 0.00
                  • Published