JSPM

Found 177 results for crawling

realfish-yc

Real Fish YouTube Video Crawling Module

  • v0.1.8
  • 9.24
  • Published

hapi-goldwasher

A plugin for Hapi.js to run goldwasher as a scraping API on the web.

  • v1.0.4
  • 9.01
  • Published

scrapeasy

Automated scraping module using patterns generated by the userscript Scrapeasy.

  • v0.4.2
  • 9.01
  • Published

sasori-crawl

Sasori is a dynamic web crawler powered by Puppeteer, designed for lightning-fast endpoint discovery.

  • v1.0.0
  • 9.01
  • Published

nodecraw

NodeCraw is a web crawling application that allows you to crawl specified URLs and extract information from web pages. It utilizes various modules and libraries to perform crawling and save the results.

    • v1.0.7
    • 8.79
    • Published

    spider2

    A second-generation spider that crawls any article site and automatically reads the title and content.

    • v0.0.7
    • 8.69
    • Published

    aragog-client

    Aragog web scraping framework client

    • v1.0.3
    • 8.69
    • Published

    tai-spider

    Scrapy framework implemented in Node.js.

    • v0.1.21
    • 8.67
    • Published

    scrape-them-all

    🚀 An easy-to-handle Node.js scraper that allows you to scrape them all in record time.

    • v2.0.0
    • 8.31
    • Published

    @leoko/crawler

    Crawler is a web spider written with Nodejs. It gives you the full power of jQuery on the server to parse a big number of pages as they are downloaded, asynchronously

    • v1.3.1
    • 8.31
    • Published

    crawley

    A simple web crawler

    • v1.0.2
    • 8.31
    • Published

    doffy

    A headless browser automation library with an easy-to-use API

    • v0.0.7
    • 8.10
    • Published

    @jifeon/goose-parser

    PhantomJS/browser library that lets you parse a webpage

    • v0.2.0-alpha.3
    • 8.10
    • Published

    @mseep/firecrawl-simple-mcp

    MCP server for Firecrawl Simple — a web scraping and site mapping tool enabling LLMs to access and process web content

    • v1.0.2
    • 8.10
    • Published

    crawlme

    Makes your ajax web application indexable by search engines by generating html snapshots on the fly. Caches results for blazing fast responses and better page ranking.

      • v0.0.7
      • 8.01
      • Published

      crawler-ts-fs

      Lightweight crawler written in TypeScript using ES6 generators.

      • v1.1.1
      • 8.00
      • Published

      crawler-ts

      Lightweight crawler written in TypeScript using ES6 generators.

      • v1.1.1
      • 8.00
      • Published

      crawlable-solidify

      Some tools to help you to render your application as a static web site using the crawlable module.

      • v1.0.2
      • 7.83
      • Published

      sitescrapr

      Simple website crawler and scraper

      • v0.0.1
      • 7.83
      • Published

      cookied-phantom-crawler

      PhantomJS and JSDOM based crawling tool. Used PhantomJS for full load of asynchronously-loaded resources and JSDOM for quick crawls. Allows custom [tough-cookie](https://www.npmjs.com/package/tough-cookie) insertion. Refer to [cheerio](https://www.npmj

        • v1.0.1
        • 7.70
        • Published

        crawly-automation

        A lightweight and modular web crawling framework built with Puppeteer.

          • v1.0.4
          • 7.60
          • Published

          netcrawler

          Net Crawler is a web spider written with Nodejs

            • v0.8.6
            • 7.35
            • Published

            spider-core

            A Node.js scraping framework built on puppeteer-core (to use a headless Chrome/Chromium browser). The core module without browser installation

            • v1.3.11
            • 7.35
            • Published

            headline-news-naver

            This extracts the top five news metadata from NAVER headlines.

            • v1.0.5
            • 7.32
            • Published

            spa-seo

            Single Page App SER

            • v0.0.3
            • 7.24
            • Published

            crawler-ts-htmlparser2

            Lightweight crawler written in TypeScript using ES6 generators.

            • v1.1.1
            • 7.24
            • Published

            hylsplider

            Fork of headless-chrome-crawler with puppeteer updated to the latest version

            • v1.0.0
            • 7.23
            • Published

            crawler2

            Crawler is a web spider written with Nodejs. It gives you the full power of jQuery on the server to parse a big number of pages as they are downloaded, asynchronously. Scraping should be simple and fun!

            • v0.0.2
            • 6.87
            • Published

            scrapyteer

            Web scraping/crawling framework built on top of headless Chrome

            • v1.4.0
            • 6.58
            • Published

            dcrawler

            DCrawler is a distributed web spider written in Nodejs and queued with Mongodb. It gives you the full power of jQuery to parse big pages as they are downloaded, asynchronously. Simplifying distributed crawler!

            • v0.0.8
            • 6.58
            • Published

            plucky-crawler

            The error crawler that powers http://plucky.io/

            • v0.0.1
            • 6.44
            • Published

            saintjs-score

            SoongSil UniverSity U-saint Score Crawling

            • v2.0.1
            • 6.44
            • Published

            krawler

            Fast and lightweight web crawler with built-in cheerio, xml and json parser.

            • v0.3.3
            • 6.42
            • Published

            rebrowser-patches-fadi-patch

            Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.

            • v1.0.18
            • 6.35
            • Published

            confession

            Helper to extract confessions from webpages

            • v3.1.0
            • 6.35
            • Published

            @crstn/redirect

            A small package to crawl a site and return a redirect template. This is helpful for migration from one to another website with different url schemes.

            • v1.2.0
            • 6.34
            • Published

            @datasco/sdk

            Datasco API SDK for Node.js to collect any data from any website

            • v1.0.4
            • 5.63
            • Published

            proxidoor

            proxidoor helps you make HTTP requests through a rotating proxy, you can use it for services such as web scraping, web crawling and more.

            • v1.0.3
            • 5.63
            • Published

            friday-sdk

            Official JavaScript/TypeScript SDK for the Friday API

            • v0.2.2
            • 5.37
            • Published

            cspider

            Distributed web crawler powered by Headless Chrome

            • v0.0.6
            • 5.29
            • Published

            goose-browser-environment

            Environment for Goose parser which allows it to run in a common browser

            • v1.0.4
            • 5.29
            • Published

            spider-stealth

            A Node.js scraping framework built on puppeteer-extra (to use a headless Chrome/Chromium browser). Has the ability to solve reCaptcha

            • v1.2.2
            • 5.29
            • Published

            p4k-api

            web scraper for album reviews from pitchfork

            • v1.4.3
            • 5.29
            • Published

            nocrawler

            Crawler is a web spider written with Nodejs. It gives you the full power of jQuery on the server to parse a big number of pages as they are downloaded, asynchronously. Scraping should be simple and fun!

            • v0.0.1
            • 5.29
            • Published

            crawling-typer

            Transform your text with dynamic typing animations! crawling-typer lets you display an array of strings one at a time, each with its own color. Customize typing speed, delete speed, and pauses between strings. Enjoy full control with loop counts, post-loo

            • v1.1.1
            • 5.29
            • Published

            magnet-getter

            An API to get magnet links using Puppeteer.

            • v1.1.0
            • 4.33
            • Published

            spider-stealth-core

            A Node.js scraping framework built on puppeteer-extra (to use a headless Chrome/Chromium browser). Has the ability to solve reCaptcha. The core module without browser installation

            • v1.3.4
            • 4.26
            • Published

            dynamic-crawling

            Aims to run CRAWLING routines from a JSON file using XPath, accepting for each step a callback function that receives the value and can pass it on to the next step.

            • v1.0.2
            • 4.20
            • Published

            planisphere

            A straightforward sitemap generator written in TypeScript.

            • v1.0.1
            • 4.20
            • Published

            robinbot

            robin web crawling engine with nodejs

            • v0.9.0
            • 4.15
            • Published

            imdb-scrapi

            An API to get data off of IMDB using Puppeteer.

            • v1.0.2
            • 4.06
            • Published

            declarative-scraper

            Simple & Human-Friendly HTML Scraper with Json-ld support

            • v0.1.1
            • 4.06
            • Published

            miniscraper

            Minimalist Node.js web scraper and crawler working with under-the-hood JSDOM

            • v0.3.2
            • 4.06
            • Published

            fiend

            The most advanced web crawler for JavaScript

            • v0.1.0
            • 4.00
            • Published

            @stacksleuth/browser-agent

            StackSleuth in-house browser automation agent for debugging and user simulation

            • v0.2.1
            • 4.00
            • Published

            keyworm

            keyword mention crawler

            • v0.1.1
            • 4.00
            • Published

            style-crawl

            Package to find style links from the site you want

            • v1.1.2
            • 4.00
            • Published

            crawler-mod

            based on node-crawler

            • v0.0.1
            • 2.49
            • Published

            instagram-crawling

            Simple Instagram Crawling without using public API

            • v1.1.2
            • 2.49
            • Published

            jason-the-miner

            Harvesting data at the <html> mine.

            • v1.1.1
            • 2.46
            • Published

            skrap

            Easily scrape web pages by providing JSON recipes

            • v0.1.1
            • 2.46
            • Published

            crawline

            Web crawler

            • v0.0.0
            • 2.43
            • Published

            node-pool-scraper

            Node.js web scraping utility powered by puppeteer pool

            • v0.1.6
            • 2.37
            • Published

            node-crawling-framework

            NodeJs crawling & scraping framework heavily inspired by Scrapy (Python)

            • v0.0.1-alpha.2
            • 2.34
            • Published

            ccht

            A simple command-line tool to crawl and test your website

            • v0.1.2
            • 2.34
            • Published

            wight-backend-web

            A Wight backend for fetching static web pages

            • v0.1.0
            • 2.34
            • Published

            spamlet

            spamlet is an efficient and simple crawler for playwright

              • v0.1.6
              • 2.34
              • Published

              crt-scrapper

              Easily create a scraper api with the @web/scrapper library, which includes a scraper and advanced events for your website.

              • v1.0.4
              • 2.34
              • Published

              gumo

              A web-crawler and scraper that extracts data from a family of nested dynamic webpages with added enhancements to assist in knowledge mining applications.

              • v1.0.7
              • 0.00
              • Published

              hcr

              Easy To Use Web Crawler

              • v1.4.1
              • 0.00
              • Published

              @subtitles/providers

              Providers are the core of applications, where the subtitles are collected. Each provider exports a unique strategy for gathering data. From legendastv's web scraping from opensubtitle API usage, you can collect subtitles from your favorite tv shows and mo

              • v0.3.0-beta.2
              • 0.00
              • Published

              press2blogger

              Moving or backing up your Wordpress site to Blogger

              • v1.0.3
              • 0.00
              • Published

              parkour

              Parkour the web like a yamakazi

              • v1.0.0
              • 0.00
              • Published

              ig-scrap-cache

              Scrape Instagram and cache the results using Redis

              • v3.0.0
              • 0.00
              • Published

              n8n-nodes-firecrawl-tool

              n8n node for Firecrawl v2 API - Web scraping, crawling, and data extraction tool for workflows and AI agents

              • v0.1.2
              • 0.00
              • Published

              sitemaps-getter

              A tool to get sitemaps from websites and crawl them

              • v1.0.3
              • 0.00
              • Published

              nstock

              naver stock data crawler

              • v0.1.0-beta
              • 0.00
              • Published

              malkovich-malkovich

              A lightweight and simple API for web crawling built on chromium puppeteer

              • v0.0.1
              • 0.00
              • Published