scraptor
!!This library is a work in progress. The API will most likely change.!!
This library is my attempt to wrap puppeteer and cheerio in a library
that makes it easy to construct web scrapers. A DSL implements common
patterns while still allowing you to break out into the underlying
libraries if necessary.
Synopsis
import {browse, once, fillForm, click, html, usingHeadlessBrowser} from "scraptor";
import {flowP} from "combinators-p";

const spinnerDone = "document.querySelector('.spinner').classList.contains('hide')";
const waitForSpinner = once(spinnerDone);

const search = (url, term) =>
  flowP([
    browse,
    waitForSpinner,
    fillForm("#search"),
    click("button.search"),
    waitForSpinner,
    html("body"),
  ], url);

usingHeadlessBrowser(search("https://example.org", "Keith Johnstone"))
  .then(console.log); // Prints full HTML

API
- usingBrowser: Execute a scrape in a browser session.
- usingHeadlessBrowser: Execute a scrape in a headless browser session.
- browse: Visit a URL and load the page.
- html: Select the inner HTML of a DOM node.
- fillForm: Input a string into a form field.
- click: Click on a DOM node.
- once: Continue the browser session once a predicate is fulfilled.
- onceLoaded: Continue the browser session once the page has loaded.
- onceMs: Continue the browser session once a set time has passed.
- doUntil: Run an action once a predicate is fulfilled.
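To make the "continue once a predicate is fulfilled" idea more concrete, here is a minimal sketch in plain JavaScript. It is not scraptor's or combinators-p's implementation: `flow`, `pollUntil`, `fakePage`, and the `intervalMs`/`timeoutMs` options are hypothetical names made up for this illustration, and a plain object stands in for a real browser page. The sketch only shows how a list of steps might be threaded left to right and how one step can pause the chain until a condition holds.

```javascript
// Hypothetical helpers for illustration only. These are NOT the actual
// implementations of scraptor or combinators-p.

// Thread a value through a list of (possibly async) steps, left to right.
const flow = (steps, initial) =>
  steps.reduce((acc, step) => acc.then(step), Promise.resolve(initial));

// Build a step that resolves with its input once `predicate(input)` is
// truthy, polling every `intervalMs`, and rejects after `timeoutMs`.
const pollUntil = (predicate, {intervalMs = 10, timeoutMs = 1000} = {}) => (value) =>
  new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;
    const check = async () => {
      try {
        if (await predicate(value)) return resolve(value);
        if (Date.now() >= deadline) {
          return reject(new Error("pollUntil: timed out"));
        }
        setTimeout(check, intervalMs);
      } catch (err) {
        reject(err);
      }
    };
    check();
  });

// A stand-in for a page whose spinner disappears after a short delay.
const fakePage = {spinnerHidden: false, body: "<p>results</p>"};
setTimeout(() => {
  fakePage.spinnerHidden = true;
}, 30);

// Wait for the spinner flag, then extract the body.
const demo = flow(
  [
    pollUntil((page) => page.spinnerHidden),
    (page) => page.body,
  ],
  fakePage,
);

demo.then(console.log); // logs the body string once the flag has flipped
```

Because every step takes the previous step's result and returns a value or a promise, a waiting step and an extracting step compose in the same list, which is the shape the synopsis relies on.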