scraptor

Version: 0.1.0
License: GPL-3.0

My way to use Chrome headless and scrape.

Warning: this library is a work in progress. The API will most likely change.

This library is my attempt to wrap puppeteer and cheerio so that I can construct web scrapers easily. A DSL implements common patterns while still allowing you to break out into the underlying libraries when necessary.

Synopsis

import {browse, once, fillForm, click, html, usingHeadlessBrowser} from "scraptor";
import {flowP} from "combinators-p";

// Expression that is true once the .spinner element carries the "hide" class.
const spinnerDone = "document.querySelector('.spinner').classList.contains('hide')";
// Wait step built from that expression.
const waitForSpinner = once(spinnerDone);
// Note: `term` is accepted here but is not passed to any step in this synopsis.
const search = (url, term) =>
  flowP([
    browse,
    waitForSpinner,
    fillForm("#search"),
    click("button.search"),
    waitForSpinner,
    html("body"),
  ], url);

usingHeadlessBrowser(search("https://example.org", "Keith Johnstone"))
  .then(console.log); // Prints full HTML
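
The synopsis reads as a left-to-right pipeline of promise-returning steps, with a wait step that holds the pipeline until a condition becomes true. The following is a minimal sketch of that shape only, and it runs without a browser. It is not scraptor's or combinators-p's implementation: `flow`, `waitUntil`, `visit`, `readLog`, and the fake `page` object are hypothetical stand-ins invented for illustration.

```javascript
// Hypothetical stand-in for a left-to-right composition helper:
// passes `initial` through each step, awaiting every result.
const flow = (steps, initial) =>
  steps.reduce((acc, step) => acc.then(step), Promise.resolve(initial));

// Hypothetical polling step: forwards the value it receives once
// `predicate()` returns true, or rejects after `timeoutMs`.
const waitUntil = (predicate, {intervalMs = 10, timeoutMs = 1000} = {}) => (value) =>
  new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;
    const check = () => {
      if (predicate()) return resolve(value);
      if (Date.now() >= deadline) return reject(new Error("condition not met in time"));
      setTimeout(check, intervalMs);
    };
    check();
  });

// Fake "page" state so the sketch needs no browser (assumption, not a real page object).
const page = {spinnerHidden: false, log: []};
setTimeout(() => { page.spinnerHidden = true; }, 30);

// Hypothetical steps: each takes the previous result and returns a promise.
const visit = async (url) => { page.log.push(`visit ${url}`); return page; };
const readLog = async (p) => p.log.join(", ");

const result = flow(
  [visit, waitUntil(() => page.spinnerHidden), readLog],
  "https://example.org",
);

result.then(console.log); // prints "visit https://example.org"
```

Because each step is just a function from the previous result to a promise, a custom step can be placed anywhere in the array, which is one plausible way to picture "breaking out" of a DSL of this kind.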

API

usingBrowser

usingHeadlessBrowser

browse

html

fillForm

click

once

onceLoaded

onceMs

doUntil