get-hrefs
Get all href URLs from an HTML string
Installation
Install get-hrefs using npm:
npm install --save get-hrefs
Usage
Module usage
const getHrefs = require('get-hrefs');
getHrefs(`
<body>
<a href="http://example.com">Example</a>
</body>
`);
// ["http://example.com"]
getHrefs(`
<head>
<base href="http://example.com/path1/">
</head>
<body>
<a href="path2/index.html">Example</a>
</body>
`);
// ["http://example.com/path1/path2/index.html"]
CLI usage
$> get-hrefs --help
Get all href urls from an HTML string
Usage:
get-hrefs <html file>
cat <html file> | get-hrefs
Options:
-b, --base-url Set baseUrl
<all other flags are passed to normalize-url>
Examples:
curl -s example.com | get-hrefs
echo '<a href="http://www.example.com">Link</a>' | get-hrefs --strip-w-w-w
API
getHrefs(html, [options])
Name | Type | Description |
---|---|---|
html | String | The HTML string to extract hrefs from |
options | Object | Optional options |
Returns: Array<String>, all unique and normalized hrefs, resolved against any provided baseUrl and any <base href="..."> in the HTML document.
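To illustrate what "unique and normalized" can mean in practice, here is a minimal sketch that uses Node's built-in WHATWG URL class. The helper name uniqueNormalized is hypothetical and is not part of get-hrefs. The module itself delegates normalization to normalize-url, whose rules differ (for example, the usage output above shows no trailing slash), so treat this only as an approximation of the idea.

```javascript
// Hypothetical helper, NOT part of get-hrefs: de-duplicate hrefs after a
// basic normalization pass using the WHATWG URL serializer.
function uniqueNormalized(hrefs) {
  const seen = new Set(); // a Set keeps first-seen insertion order
  for (const href of hrefs) {
    try {
      // URL#href lowercases the scheme and host and adds a root path.
      seen.add(new URL(href).href);
    } catch (err) {
      // Skip strings that cannot be parsed as absolute URLs.
    }
  }
  return [...seen];
}

console.log(uniqueNormalized([
  'http://example.com',
  'http://example.com/',
  'HTTP://EXAMPLE.com',
]));
```

All three inputs serialize to the same string, so the result contains a single entry.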
options.baseUrl
Type: String
Default: ""
The baseUrl to use for relative hrefs. The module also takes <base ...> tags into account.
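As a rough illustration of how a relative href is resolved against a base, the following sketch uses Node's built-in URL class. The helper resolveHref is hypothetical and is not how get-hrefs is necessarily implemented; it only demonstrates standard URL resolution, which matches the <base href> example in the Usage section.

```javascript
// Hypothetical helper, NOT part of get-hrefs: resolve an href against a base.
function resolveHref(href, baseUrl) {
  try {
    // With an empty baseUrl, only absolute hrefs can be parsed.
    return baseUrl ? new URL(href, baseUrl).href : new URL(href).href;
  } catch (err) {
    return null; // relative href with no usable base
  }
}

console.log(resolveHref('path2/index.html', 'http://example.com/path1/'));
// "http://example.com/path1/path2/index.html"
console.log(resolveHref('path2/index.html', ''));
// null
```

An href that is already absolute ignores the base entirely, which is standard URL resolution behavior.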
options.allowedProtocols
Type: Object
Default: {"http": true, "https": true}
Specifies which protocols to allow. Set a protocol's key (the protocol name without ":") in allowedProtocols to true to enable it, or to false to disable one of the defaults. For example, allowedProtocols: {tel: true, http: false} returns only the found URLs with the protocols tel: or https:.
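The merge of user settings with the defaults can be sketched as follows. This is an assumption-based illustration using Node's built-in URL class; filterByProtocol and DEFAULT_PROTOCOLS are hypothetical names, not part of the get-hrefs API.

```javascript
// Hypothetical sketch, NOT the get-hrefs implementation.
// Defaults mirror the documented default value of allowedProtocols.
const DEFAULT_PROTOCOLS = { http: true, https: true };

function filterByProtocol(urls, allowedProtocols = {}) {
  // User-supplied keys override the defaults; unspecified defaults remain.
  const allowed = { ...DEFAULT_PROTOCOLS, ...allowedProtocols };
  return urls.filter((url) => {
    let protocol;
    try {
      protocol = new URL(url).protocol.slice(0, -1); // "https:" -> "https"
    } catch (err) {
      return false; // not an absolute URL
    }
    return allowed[protocol] === true;
  });
}

console.log(filterByProtocol(
  ['http://a.example', 'https://b.example', 'tel:+123456'],
  { tel: true, http: false }
));
// [ 'https://b.example', 'tel:+123456' ]
```

Because https is left at its default of true, it survives even though only tel was explicitly enabled.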
options.<any>
All other options are passed to normalize-url. See its documentation for the available options.
Related modules
- get-urls - Get all urls in a string
License
MIT © Joakim Carlstein