JSPM

Found 177 results for crawling

crawlable-solidify

Some tools to help you to render your application as a static web site using the crawlable module.

  • v1.0.2
  • 7.86
  • Published

sitescrapr

Simple website crawler and scraper

  • v0.0.1
  • 7.86
  • Published

cookied-phantom-crawler

PhantomJS- and JSDOM-based crawling tool. Uses PhantomJS for full loading of asynchronously loaded resources and JSDOM for quick crawls. Allows custom [tough-cookie](https://www.npmjs.com/package/tough-cookie) insertion. Refer to [cheerio](https://www.npmj

  • v1.0.1
  • 7.68
  • Published

crawly-automation

A lightweight and modular web crawling framework built with Puppeteer.

  • v1.0.4
  • 7.59
  • Published

headline-news-naver

Extracts metadata for the top five news items from NAVER headlines.

  • v1.0.5
  • 7.30
  • Published

spa-seo

Single Page App SER

  • v0.0.3
  • 7.26
  • Published

crawler-ts-htmlparser2

Lightweight crawler written in TypeScript using ES6 generators.

  • v1.1.1
  • 7.26
  • Published

hylsplider

Fork of headless-chrome-crawler with puppeteer updated to the latest version.

  • v1.0.0
  • 7.26
  • Published

netcrawler

Net Crawler is a web spider written in Node.js.

  • v0.8.6
  • 7.23
  • Published

crawler2

Crawler is a web spider written in Node.js. It gives you the full power of jQuery on the server to parse a large number of pages as they are downloaded, asynchronously. Scraping should be simple and fun!

  • v0.0.2
  • 6.89
  • Published

scrapyteer

Web scraping/crawling framework built on top of headless Chrome

  • v1.4.0
  • 6.58
  • Published

dcrawler

DCrawler is a distributed web spider written in Node.js and queued with MongoDB. It gives you the full power of jQuery to parse big pages as they are downloaded, asynchronously. Simplifying distributed crawling!

  • v0.0.8
  • 6.58
  • Published

krawler

Fast and lightweight web crawler with built-in cheerio, XML and JSON parsers.

  • v0.3.3
  • 6.40
  • Published

doffy

A headless browser automation library with an easy-to-use API

  • v0.0.7
  • 6.40
  • Published

confession

Helper to extract confessions from webpages

  • v3.1.0
  • 6.37
  • Published

rebrowser-patches-fadi-patch

Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.

  • v1.0.18
  • 6.37
  • Published

@crstn/redirect

A small package to crawl a site and return a redirect template. This is helpful when migrating from one website to another with a different URL scheme.

  • v1.2.0
  • 6.36
  • Published

saintjs-score

SoongSil UniverSity U-saint Score Crawling

  • v2.0.1
  • 6.34
  • Published

@datasco/sdk

Datasco API SDK for Node.js to collect any data from any website

  • v1.0.4
  • 5.62
  • Published

proxidoor

proxidoor helps you make HTTP requests through a rotating proxy. You can use it for tasks such as web scraping, web crawling and more.

  • v1.0.3
  • 5.62
  • Published

cspider

Distributed web crawler powered by Headless Chrome

  • v0.0.6
  • 5.31
  • Published

goose-browser-environment

Environment for the Goose parser that allows it to run in a common browser

  • v1.0.4
  • 5.31
  • Published

spider-stealth

A Node.js scraping framework built on puppeteer-extra (to use a headless Chrome/Chromium browser). Has the ability to solve reCaptcha.

  • v1.2.2
  • 5.31
  • Published

p4k-api

Web scraper for album reviews from Pitchfork

  • v1.4.3
  • 5.31
  • Published

nocrawler

Crawler is a web spider written in Node.js. It gives you the full power of jQuery on the server to parse a large number of pages as they are downloaded, asynchronously. Scraping should be simple and fun!

  • v0.0.1
  • 5.31
  • Published

crawling-typer

Transform your text with dynamic typing animations! crawling-typer lets you display an array of strings one at a time, each with its own color. Customize typing speed, delete speed, and pauses between strings. Enjoy full control with loop counts, post-loo

  • v1.1.1
  • 5.31
  • Published

friday-sdk

Official JavaScript/TypeScript SDK for the Friday API

  • v0.2.2
  • 5.29
  • Published

imdb-scrapi

An API to get data from IMDb using Puppeteer.

  • v1.0.2
  • 5.29
  • Published

spider-core

A Node.js scraping framework built on puppeteer-core (to use a headless Chrome/Chromium browser). The core module without browser installation.

  • v1.3.11
  • 5.29
  • Published

magnet-getter

An API to get magnet links using Puppeteer.

  • v1.1.0
  • 4.35
  • Published

spider-stealth-core

A Node.js scraping framework built on puppeteer-extra (to use a headless Chrome/Chromium browser). Has the ability to solve reCaptcha. The core module without browser installation.

  • v1.3.4
  • 4.25
  • Published

dynamic-crawling

Runs CRAWLING routines defined in a JSON file using XPath. Each step accepts a callback function that receives the extracted value and can pass it on to the next step.

  • v1.0.2
  • 4.20
  • Published

planisphere

A straightforward sitemap generator written in TypeScript.

  • v1.0.1
  • 4.20
  • Published

robinbot

Robin web crawling engine for Node.js

  • v0.9.0
  • 4.15
  • Published

fiend

The most advanced web crawler for JavaScript

  • v0.1.0
  • 4.02
  • Published

@stacksleuth/browser-agent

StackSleuth in-house browser automation agent for debugging and user simulation

  • v0.2.1
  • 4.02
  • Published

keyworm

Keyword mention crawler

  • v0.1.1
  • 4.02
  • Published

style-crawl

Package to find style links from the site you want

  • v1.1.2
  • 4.02
  • Published

plucky-crawler

The error crawler that powers http://plucky.io/

  • v0.0.1
  • 4.00
  • Published

crawler-mod

Based on node-crawler

  • v0.0.1
  • 2.49
  • Published

instagram-crawling

Simple Instagram crawling without using the public API

  • v1.1.2
  • 2.49
  • Published

jason-the-miner

Harvesting data at the <html> mine.

  • v1.1.1
  • 2.46
  • Published

skrap

Easily scrape web pages by providing JSON recipes

  • v0.1.1
  • 2.46
  • Published

crawline

Web crawler

  • v0.0.0
  • 2.43
  • Published

node-pool-scraper

Node.js web scraping utility powered by puppeteer pool

  • v0.1.6
  • 2.36
  • Published

node-crawling-framework

Node.js crawling & scraping framework heavily inspired by Scrapy (Python)

  • v0.0.1-alpha.2
  • 2.35
  • Published

ccht

A simple command-line tool to crawl and test your website

  • v0.1.2
  • 2.35
  • Published

wight-backend-web

A Wight backend for fetching static web pages

  • v0.1.0
  • 2.35
  • Published

spamlet

spamlet is an efficient and simple crawler for Playwright

  • v0.1.6
  • 2.35
  • Published

crt-scrapper

Easily create a scraper API with the @web/scrapper library, which includes a scraper and advanced events for your website.

  • v1.0.4
  • 2.35
  • Published

miniscraper

Minimalist Node.js web scraper and crawler that uses JSDOM under the hood

  • v0.3.2
  • 2.34
  • Published

gumo

A web crawler and scraper that extracts data from a family of nested dynamic webpages, with added enhancements to assist in knowledge mining applications.

  • v1.0.7
  • 0.00
  • Published

hcr

Easy-to-use web crawler

  • v1.4.1
  • 0.00
  • Published

@subtitles/providers

Providers are the core of applications, where the subtitles are collected. Each provider exports a unique strategy for gathering data. From legendastv's web scraping to opensubtitle API usage, you can collect subtitles from your favorite tv shows and mo

  • v0.3.0-beta.2
  • 0.00
  • Published

press2blogger

Moving or backing up your WordPress site to Blogger

  • v1.0.3
  • 0.00
  • Published

parkour

Parkour the web like a yamakazi

  • v1.0.0
  • 0.00
  • Published

ig-scrap-cache

Scrapes Instagram and caches the results using Redis

  • v3.0.0
  • 0.00
  • Published

n8n-nodes-firecrawl-tool

n8n node for Firecrawl v2 API - Web scraping, crawling, and data extraction tool for workflows and AI agents

  • v0.1.2
  • 0.00
  • Published

sitemaps-getter

A tool to get sitemaps from websites and crawl them

  • v1.0.3
  • 0.00
  • Published

nstock

NAVER stock data crawler

  • v0.1.0-beta
  • 0.00
  • Published

malkovich-malkovich

A lightweight and simple API for web crawling built on Chromium and Puppeteer

  • v0.0.1
  • 0.00
  • Published