hapi-goldwasher
A plugin for hapi to run goldwasher as a scraping API on the web. Basically a scraper proxy that will return information in the selected format, defaulting to JSON.
Installation
npm install hapi-goldwasher
If you aren't already running a hapi server, you need to install this too, to run the example:
npm install hapi
Options
When registering the plugin with hapi, you have several options, none of them required:
- path - the endpoint you mount the plugin on. Defaults to /goldwasher.
- maxRedirects - the maximum number of redirects the scraper will accept before giving up. Defaults to 5.
- cors - a CORS object. Defaults to false. See hapi docs for more information.
- raw - enable raw output mode. This will enable output=raw that will return the raw, scraped result, usually HTML.
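As a rough illustration of the options listed above, an options object might look like the sketch below. The option names come from this README; the values are made up for the example, and treating raw as a boolean switch is an assumption.

```javascript
// Option names are taken from the list above; every value is illustrative.
var goldwasherOptions = {
  path: '/scrape',   // mount point, replacing the default /goldwasher
  maxRedirects: 3,   // give up after three redirects (the default is 5)
  cors: false,       // the documented default; a CORS object is also accepted
  raw: true          // assumption: a boolean that switches on output=raw
};

console.log(Object.keys(goldwasherOptions).join(', '));
```

An object like this would be passed as the options value when registering the plugin.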
Parameters
- url - url to scrape. Required.
- selector - cheerio (jQuery) selector, a selection of target tags. Defaults to the default of goldwasher, usually 'h1, h2, h3, h4, h5, h6, p'.
- search - only pick results containing these terms. Not case or special character sensitive.
- limit - limit number of results.
- output - output format (json, xml, atom, rss or - if enabled - raw).
- filterTexts - stop texts that should be excluded.
- filterKeywords - stop words that should be excluded as keywords.
- filterLocale - stop words from external JSON file (see documentation on goldwasher).
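The parameters above can be combined into a request URL. The sketch below assumes they are sent as query-string parameters (as the output=raw notation suggests) to a server on localhost port 7979 with the default /goldwasher path. buildScrapeUrl is a hypothetical helper name, not part of the plugin.

```javascript
var querystring = require('querystring');

// Hypothetical helper: joins a base endpoint and a parameter object into
// a request URL, percent-encoding the values along the way.
function buildScrapeUrl(base, params) {
  return base + '?' + querystring.stringify(params);
}

// Assumed endpoint: a local server on port 7979 with the default path.
var requestUrl = buildScrapeUrl('http://localhost:7979/goldwasher', {
  url: 'https://example.com/', // page to scrape (required)
  selector: 'h1, h2',          // cheerio selector for the target tags
  limit: 5,                    // return at most five results
  output: 'xml'                // one of json, xml, atom, rss
});

console.log(requestUrl);
```

Encoding the values matters here because the url and selector parameters contain characters such as colons, slashes, commas and spaces, which are not safe to put in a query string unescaped.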
Example
var Hapi = require('hapi');
// './index' resolves inside this repository; use require('hapi-goldwasher') when installed from npm.
var HapiGoldwasher = require('./index');

var server = new Hapi.Server();
server.connection({ port: 7979 });

// Register the plugin with its mount path and a permissive CORS setting.
server.register({
  register: HapiGoldwasher,
  options: {
    path: '/goldwasher',
    cors: {
      origin: ['*']
    }
  }
}, function(err) {
  if (err) {
    throw err;
  }

  server.start(function() {
    console.log('Server running at: ' + server.info.uri);
  });
});

Go to the server URI and you will be presented with a JSON response containing documentation. I recommend using something like the Chrome JSON Formatter for readability.