JSPM

Get all href urls from an HTML string

Package Exports

  • get-hrefs

This package does not declare an exports field, so JSPM has automatically detected and optimized the exports listed above. If a package subpath is missing, consider opening an issue on the original package (get-hrefs) asking it to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

get-hrefs

Get all href urls from an HTML string

Installation

Install get-hrefs using npm:

npm install --save get-hrefs

Usage

Module usage

const getHrefs = require('get-hrefs');

getHrefs(`
    <body>
        <a href="http://example.com">Example</a>
    </body>
`);
// ["http://example.com"]

getHrefs(`
    <head>
        <base href="http://example.com/path1/">
    </head>
    <body>
        <a href="path2/index.html">Example</a>
    </body>
`);
// ["http://example.com/path1/path2/index.html"]

CLI usage

$> get-hrefs --help

Get all href urls from an HTML string

  Usage:
    get-hrefs <html file>
    cat <html file> | get-hrefs

  Options:
    -b, --base-url	Set baseUrl
    <all other flags are passed to normalize-url>

  Examples:
    curl -s example.com | get-hrefs
    echo '<a href="http://www.example.com">Link</a>' | get-hrefs --strip-w-w-w

API

getHrefs(html, [options])

Name     Type    Description
html     String  The HTML string to extract hrefs from
options  Object  Optional options

Returns: Array<String>. All unique, normalized hrefs, with relative hrefs resolved against the provided baseUrl and any <base href="..."> in the HTML document.

options.baseUrl

Type: String
Default: ""

The baseUrl to use for relative hrefs. The module also takes <base ...> tags into account.
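To make "resolving a relative href against a base" concrete, here is a minimal sketch using the standard URL class built into Node.js and browsers. It is not the get-hrefs implementation, and resolveHref is a hypothetical helper introduced only for illustration.

```javascript
// Hypothetical helper, not part of get-hrefs: resolves a single href against
// a base URL using the standard WHATWG URL class.
function resolveHref(href, baseUrl) {
  try {
    return new URL(href, baseUrl).href;
  } catch (err) {
    // A relative href with no usable base cannot be resolved.
    return null;
  }
}

console.log(resolveHref('path2/index.html', 'http://example.com/path1/'));
// http://example.com/path1/path2/index.html
console.log(resolveHref('/root.html', 'http://example.com/path1/'));
// http://example.com/root.html
console.log(resolveHref('path2/index.html', ''));
// null
```

The first call mirrors the <base href="..."> example in the Usage section: the relative path is appended to the directory of the base URL.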

options.allowedProtocols

Type: Object
Default: {"http": true, "https": true}

Specifies which protocols to allow. Set a protocol's key (the protocol name without ":") to true to allow it, or to false to disable one of the defaults. For example, allowedProtocols: {tel: true, http: false} returns only URLs that use the tel: or https: protocol, because https stays enabled by default.
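The way these keys combine with the defaults can be sketched as follows. This is an illustration under the assumption that caller-supplied keys are merged over the documented defaults; it is not taken from the get-hrefs source, and filterByProtocol and DEFAULT_PROTOCOLS are hypothetical names.

```javascript
// Hypothetical sketch, not the get-hrefs source: merge the documented defaults
// with the caller's overrides, then keep URLs whose protocol maps to true.
const DEFAULT_PROTOCOLS = { http: true, https: true };

function filterByProtocol(urls, allowedProtocols = {}) {
  const allowed = { ...DEFAULT_PROTOCOLS, ...allowedProtocols };
  return urls.filter((url) => {
    // Capture the scheme, i.e. everything before the first ":".
    const match = /^([a-z][a-z0-9+.-]*):/i.exec(url);
    return match !== null && allowed[match[1].toLowerCase()] === true;
  });
}

const found = [
  'http://example.com/',
  'https://example.com/',
  'tel:+15551234567',
  'mailto:someone@example.com',
];

console.log(filterByProtocol(found, { tel: true, http: false }));
// [ 'https://example.com/', 'tel:+15551234567' ]
```

Here http is switched off explicitly, https survives because its default is untouched, and mailto is dropped because it was never enabled.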

options.<any>

All other options are passed through to normalize-url. See the normalize-url documentation for the options it accepts.
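One way to picture this pass-through is with object rest destructuring. The sketch below is an assumption about how such a split could be written, not the get-hrefs source; splitOptions is a hypothetical helper, and stripWWW is the normalize-url option that the --strip-w-w-w flag in the CLI example corresponds to.

```javascript
// Hypothetical sketch, not the get-hrefs source: pull out the two documented
// options and treat everything else as options for normalize-url.
function splitOptions(options = {}) {
  const { baseUrl = '', allowedProtocols = {}, ...normalizeOptions } = options;
  return { baseUrl, allowedProtocols, normalizeOptions };
}

const parts = splitOptions({ baseUrl: 'http://example.com', stripWWW: true });
console.log(parts.baseUrl);          // http://example.com
console.log(parts.normalizeOptions); // { stripWWW: true }
```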

License

MIT © Joakim Carlstein