JSPM

  • License MIT

Get all links from HTML markup

Package Exports

  • html-urls

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (html-urls) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

html-urls

Get all links from HTML markup. It is based on the W3C Link Checker.

Install

$ npm install html-urls --save

Usage

const got = require('got')
const getLinks = require('html-urls')

;(async () => {
  const url = process.argv[2]
  if (!url) throw new TypeError('Need to provide a URL as the first argument.')

  // Fetch the page, then extract every link found in its markup
  const { body: html } = await got(url)
  const links = getLinks({ html, url })

  // Print one normalized URL per line
  links.forEach(({ normalizedUrl }) => console.log(normalizedUrl))

  // Example output:
  //   https://microlink.io/component---src-layouts-index-js-86b5f94dfa48cb04ae41.js
  //   https://microlink.io/component---src-pages-index-js-a302027ab59365471b7d.js
  //   https://microlink.io/path---index-709b6cf5b986a710cc3a.js
  //   https://microlink.io/app-8b4269e1fadd08e6ea1e.js
  //   https://microlink.io/commons-8b286eac293678e1c98c.js
  //   https://microlink.io
  //   ...
})()

See examples.

API

htmlUrls([options])

options

html

Type: string
Default: ''

The HTML markup.

url

Type: string
Default: ''

The URL associated with the HTML markup.

It is used to resolve relative links that may be present in the HTML markup.
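
To illustrate what resolving a relative link against a base URL means, here is a minimal sketch that uses Node's built-in URL class. It is not html-urls' own code, and resolveLink is a hypothetical helper name introduced only for this example.

```javascript
// Sketch only: relies on Node's built-in WHATWG URL class, not on html-urls.
// `resolveLink` is a hypothetical helper, not part of the package API.
const resolveLink = (href, baseUrl) => {
  try {
    // Absolute hrefs are returned as-is; relative ones are joined to baseUrl
    return new URL(href, baseUrl).toString()
  } catch (err) {
    // Invalid href or unusable base: there is nothing to resolve against
    return null
  }
}

const base = 'https://microlink.io/docs/'
const resolved = ['/pricing', 'api/intro', 'https://example.com/x'].map(href =>
  resolveLink(href, base)
)

console.log(resolved)
```

With an empty base such as the default '', a relative href like '/pricing' cannot be resolved in this sketch and yields null, which is why passing url matters when the markup contains relative links.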

whitelist

Type: array
Default: []

A list of links to be excluded from the final output. It supports regex patterns.

See matcher (https://github.com/sindresorhus/matcher) to learn more.
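
As a rough illustration of pattern-based exclusion, the sketch below filters a list of URLs with simple `*` wildcard patterns. It assumes wildcard-style patterns and does not use matcher or html-urls; wildcardToRegExp and excludeMatching are hypothetical helpers, and the real matching rules may differ.

```javascript
// Hypothetical helper: turns a `*` wildcard pattern into an anchored RegExp.
// This is a simplified stand-in, not the matcher library's implementation.
const wildcardToRegExp = pattern => {
  const escaped = pattern
    .split('*')
    .map(part => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*')
  return new RegExp(`^${escaped}$`)
}

// Hypothetical filter: drops every URL that matches at least one pattern.
const excludeMatching = (urls, patterns) => {
  const regexes = patterns.map(wildcardToRegExp)
  return urls.filter(url => !regexes.some(re => re.test(url)))
}

const urls = [
  'https://microlink.io/app-8b4269e1fadd08e6ea1e.js',
  'https://microlink.io',
  'https://example.com/about'
]
const kept = excludeMatching(urls, ['*.js', 'https://example.com*'])

console.log(kept)
```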

removeDuplicates

Type: boolean
Default: true

Remove duplicate links found across all HTML tags.
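
One plausible way to picture de-duplication is to keep the first link seen for each normalized URL. The sketch below assumes links shaped like the { url, normalizedUrl } objects from the usage example and assumes that duplicates are compared by normalizedUrl; dedupeLinks is a hypothetical helper, not the package's implementation.

```javascript
// Hypothetical helper: keeps the first link seen for each normalized URL.
// Assumption: two links are duplicates when their normalizedUrl is equal.
const dedupeLinks = links => {
  const seen = new Set()
  return links.filter(({ normalizedUrl }) => {
    if (seen.has(normalizedUrl)) return false
    seen.add(normalizedUrl)
    return true
  })
}

const links = [
  { url: 'https://microlink.io/', normalizedUrl: 'https://microlink.io' },
  { url: 'https://microlink.io', normalizedUrl: 'https://microlink.io' },
  { url: '/docs', normalizedUrl: 'https://microlink.io/docs' }
]
const unique = dedupeLinks(links)

console.log(unique.map(link => link.normalizedUrl))
```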

License

html-urls © Kiko Beats, released under the MIT License.
Authored and maintained by Kiko Beats with help from contributors.

kikobeats.com · GitHub @Kiko Beats · Twitter @Kikobeats