Found 177 results for crawling

crawlable-solidify

Some tools to help you to render your application as a static web site using the crawlable module.

PhantomJS- and JSDOM-based crawling tool. Uses PhantomJS to fully load asynchronously loaded resources and JSDOM for quick crawls. Allows custom [tough-cookie](https://www.npmjs.com/package/tough-cookie) insertion. Refer to [cheerio](https://www.npmj

crawly-automation

A lightweight and modular web crawling framework built with Puppeteer.

headline-news-naver

This extracts the top five news metadata from NAVER headlines.

spa-seo

Single Page App SER

crawler-ts-htmlparser2

Lightweight crawler written in TypeScript using ES6 generators.

hylsplider

Fork of headless-chrome-crawler with puppeteer updated to the latest version

netcrawler

Net Crawler is a web spider written with Nodejs

crawler2

Crawler is a web spider written with Nodejs. It gives you the full power of jQuery on the server to parse a big number of pages as they are downloaded, asynchronously. Scraping should be simple and fun!

scrapyteer

Web scraping/crawling framework built on top of headless Chrome

dcrawler

DCrawler is a distributed web spider written in Nodejs and queued with Mongodb. It gives you the full power of jQuery to parse big pages as they are downloaded, asynchronously. Simplifying distributed crawling!

krawler

Fast and lightweight web crawler with built-in cheerio, xml and json parser.

doffy

A headless browser automation library with an easy-to-use API

confession

Helper to extract confessions from webpages

rebrowser-patches-fadi-patch

Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.

@crstn/redirect

A small package to crawl a site and return a redirect template. This is helpful for migration from one to another website with different url schemes.

simplecrawling

Crawler made simple

headless-chrome-crawler-x

Distributed web crawler powered by Headless Chrome

saintjs-score

SoongSil UniverSity U-saint Score Crawling

goose-jsdom-environment

Environment for Goose Parser which allows running it with JsDOM

puppeteer-for-crawling

Daily use crawling methods for puppeteer

@datasco/sdk

Datasco API SDK for Node.js to collect any data from any website

proxidoor

proxidoor helps you make HTTP requests through a rotating proxy, you can use it for services such as web scraping, web crawling and more.

cspider

Distributed web crawler powered by Headless Chrome

goose-browser-environment

Environment for Goose Parser which allows running it in a common browser

spider-stealth

A Node.js scraping framework built on puppeteer-extra (to use a headless Chrome/Chromium browser). Has the ability to solve reCaptcha

p4k-api

web scraper for album reviews from pitchfork

nocrawler

crawling-typer

Transform your text with dynamic typing animations! crawling-typer lets you display an array of strings one at a time, each with its own color. Customize typing speed, delete speed, and pauses between strings. Enjoy full control with loop counts, post-loo

friday-sdk

Official JavaScript/TypeScript SDK for the Friday API

imdb-scrapi

An API to get data off of IMDB using Puppeteer.

spider-core

A Node.js scraping framework built on puppeteer-core (to use a headless Chrome/Chromium browser). The core module without browser installation

magnet-getter

An API to get magnet links using Puppeteer.

spider-stealth-core

A Node.js scraping framework built on puppeteer-extra (to use a headless Chrome/Chromium browser). Has the ability to solve reCaptcha. The core module without browser installation

@jonnyprof/headless-chrome-crawler

Distributed web crawler powered by Headless Chrome

dynamic-crawling

Runs crawling routines from a JSON file using XPath, accepting for each step a callback function that receives the value and can pass it on to the next step.

planisphere

A straightforward sitemap generator written in TypeScript.

robinbot

robin web crawling engine with nodejs

@a-parser/webperl

twitter-crawler

NodeJS Crawler for Twitter

fiend

The most advanced web crawler for JavaScript

@stacksleuth/browser-agent

StackSleuth in-house browser automation agent for debugging and user simulation

crawler-by-sunbirder

Crawler second-system effect, the second development

keyworm

keyword mention crawler

billboard-chart-api

billboard chart crawling module

style-crawl

Package to find style links from the site you want

@satankebab/scraping-utils

Set of utils and queues to make web scraping easy.

plucky-crawler

The error crawler that powers http://plucky.io/

@karthikmam/job-manager

A Simple Job Manager

crawler-mod

based on node-crawler

instagram-crawling

Simple Instagram Crawling without using public API

jason-the-miner

Harvesting data at the <html> mine.

skrap

Easily scrape web pages by providing JSON recipes

crawline

Web crawler

node-pool-scraper

Node.js web scraping utility powered by puppeteer pool

node-crawling-framework

NodeJs crawling & scraping framework heavily inspired by Scrapy (Python)

ccht

A simple command-line tool to crawl and test your website

kick-off-crawling

make web scraping easy

wight-backend-web

A Wight backend for fetching static web pages

spamlet

spamlet is an efficient and simple crawler for playwright

crt-scrapper

Easily create a scraper api with the @web/scrapper library, which includes a scraper and advanced events for your website.

detect-crawling-react

This is the React Component for Detect Crawling

miniscraper

Minimalist Node.js web scraper and crawler working with under-the-hood JSDOM

gumo

A web-crawler and scraper that extracts data from a family of nested dynamic webpages with added enhancements to assist in knowledge mining applications.

@vladfrangu-dev/crawlee-utils

A set of shared utilities that can be used by crawlers

commodidolores

Web crawler for Node.js

hcr

Easy To Use Web Crawler

@subtitles/providers

Providers are the core of applications, where the subtitles are collected. Each provider exports a unique strategy for gathering data. From legendastv's web scraping to opensubtitle API usage, you can collect subtitles from your favorite tv shows and mo