`scraptor`

!!This library is a work in progress. The API will most likely change.!!

This library is my attempt to wrap puppeteer and cheerio in a way that makes it easy to construct web scrapers. A DSL implements common patterns while still allowing you to break out into the underlying libraries if necessary.

Synopsis

import {browse, once, fillForm, click, html, usingHeadlessBrowser} from "scraptor";
import {flowP} from "combinators-p";

const spinnerDone = "document.querySelector('.spinner').classList.contains('hide')";
const waitForSpinner = once(spinnerDone);
const search = (url, term) =>
  flowP([
    browse,
    waitForSpinner,
    fillForm("#search"),
    click("button.search"),
    waitForSpinner,
    html("body"),
  ], url);

usingHeadlessBrowser(search("https://example.org", "Keith Johnstone"))
  .then(console.log); // Prints the inner HTML of the <body> element
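
The synopsis threads a URL through a list of steps, each of which waits for the previous one to finish. As a rough illustration of that composition pattern only, here is a minimal sketch in plain JavaScript. The names `pipeline`, `visit`, `wait` and `extract` are hypothetical stand-ins invented for this example. They are not part of scraptor or combinators-p, and nothing here should be read as how `flowP` is actually implemented.

```javascript
// Hypothetical helper: thread an initial value through a list of
// functions, awaiting each result before calling the next one.
const pipeline = (steps, initial) =>
  steps.reduce((acc, step) => acc.then(step), Promise.resolve(initial));

// Stand-in steps. In a real scraper these would act on a browser page;
// here they only record what happened so the flow is easy to follow.
const visit = async (url) => ({url, log: [`visit ${url}`]});
const wait = async (session) => ({...session, log: [...session.log, "wait"]});
const extract = async (session) => session.log.join(" -> ");

pipeline([visit, wait, extract], "https://example.org")
  .then(console.log); // Prints "visit https://example.org -> wait"
```

Because every step receives the result of the previous one, a step such as `wait` can be reused at several points in the list, much like `waitForSpinner` appears twice in the synopsis.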

API

usingBrowser: Execute a scrape in a browser session.
usingHeadlessBrowser: Execute a scrape in a headless browser session.
browse: Visit a URL and load the page.
html: Select the inner HTML of a DOM node.
fillForm: Input a string into a form field.
click: Click on a DOM node.
once: Continue the browser session once a predicate is fulfilled.
onceLoaded: Continue the browser session once the page has loaded.
onceMs: Continue the browser session once a set time has passed.
doUntil: Run an action once a predicate is fulfilled.
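
Several of the entries above describe waiting for a condition before continuing. The general idea can be sketched with a small polling helper in plain JavaScript. This is an assumption-laden illustration, not scraptor's implementation: `waitFor`, `intervalMs` and `timeoutMs` are hypothetical names, and the predicate here is an ordinary function, whereas the synopsis passes `once` a string that is evaluated in the page.

```javascript
// Hypothetical helper: resolve once `predicate` returns a truthy value,
// checking every `intervalMs` and giving up after `timeoutMs`.
const waitFor = (predicate, {intervalMs = 50, timeoutMs = 2000} = {}) =>
  new Promise((resolve, reject) => {
    const started = Date.now();
    const check = async () => {
      try {
        if (await predicate()) return resolve();
        if (Date.now() - started >= timeoutMs) {
          return reject(new Error("Timed out waiting for predicate"));
        }
        setTimeout(check, intervalMs);
      } catch (err) {
        reject(err);
      }
    };
    check();
  });

// Stand-in for a page whose spinner disappears after roughly 200 ms.
let spinnerHidden = false;
setTimeout(() => { spinnerHidden = true; }, 200);

waitFor(() => spinnerHidden).then(() => console.log("spinner done"));
```

A timeout is included so that a predicate that never becomes true rejects the promise instead of polling forever.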
