html-urls
Get all links from HTML markup. It is based on the W3C Link Checker.
Install
$ npm install html-urls --save
Usage
const got = require('got')
const getLinks = require('html-urls')

;(async () => {
  // The page to inspect is taken from the first CLI argument
  const url = process.argv[2]
  if (!url) throw new TypeError('Need to provide a URL as first argument.')

  // Fetch the markup, then extract every link found in it
  const { body: html } = await got(url)
  const links = getLinks({ html, url })

  // Prints one normalized URL per line, for example:
  links.forEach(({ normalizedUrl }) => console.log(normalizedUrl))
  // https://microlink.io/component---src-layouts-index-js-86b5f94dfa48cb04ae41.js
  // https://microlink.io/component---src-pages-index-js-a302027ab59365471b7d.js
  // https://microlink.io/path---index-709b6cf5b986a710cc3a.js
  // https://microlink.io/app-8b4269e1fadd08e6ea1e.js
  // https://microlink.io/commons-8b286eac293678e1c98c.js
  // https://microlink.io
  // ...
})()
See examples.
API
htmlUrls([options])
options
html
Type: string
Default: ''
The HTML markup.
url
Type: string
Default: ''
The URL associated with the HTML markup.
It is used to resolve relative links that may be present in the HTML markup.
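To illustrate why a base URL is needed, here is a minimal sketch of relative link resolution using the standard WHATWG URL class available in Node.js. The helper name resolveLinks and the sample hrefs are hypothetical, and this is not necessarily how html-urls resolves links internally.

```javascript
// Sketch only: `resolveLinks` is a hypothetical helper, not part of html-urls.
// It shows how a base URL turns relative hrefs into absolute URLs.
function resolveLinks (hrefs, baseUrl) {
  return hrefs.map(href => new URL(href, baseUrl).toString())
}

const resolved = resolveLinks(
  ['/app.js', 'docs/intro', 'https://example.com/'],
  'https://microlink.io/blog/'
)

console.log(resolved)
// => [
//   'https://microlink.io/app.js',
//   'https://microlink.io/blog/docs/intro',
//   'https://example.com/'
// ]
```

Absolute hrefs pass through unchanged, while root-relative and path-relative hrefs are resolved against the base. Without a base URL, relative hrefs cannot be turned into absolute links.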
whitelist
Type: array
Default: []
A list of links to be excluded from the final output. It supports wildcard patterns, as implemented by matcher.
See [matcher](https://github.com/sindresorhus/matcher) to learn more.
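The idea of pattern-based exclusion can be sketched as follows, assuming simple `*` wildcards. Both wildcardToRegExp and excludeWhitelisted are hypothetical helpers written for illustration; they do not reproduce matcher's full syntax or the actual html-urls implementation.

```javascript
// Hypothetical helper: converts a `*` wildcard pattern into an anchored RegExp.
// Every character other than `*` is matched literally.
function wildcardToRegExp (pattern) {
  const escaped = pattern
    .split('*')
    .map(part => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*')
  return new RegExp('^' + escaped + '$')
}

// Hypothetical helper: drops every URL that matches at least one pattern.
function excludeWhitelisted (urls, whitelist) {
  const patterns = whitelist.map(wildcardToRegExp)
  return urls.filter(url => !patterns.some(re => re.test(url)))
}

const kept = excludeWhitelisted(
  [
    'https://microlink.io/app.js',
    'https://microlink.io',
    'https://twitter.com/microlinkhq'
  ],
  ['https://twitter.com*']
)

console.log(kept)
// => [ 'https://microlink.io/app.js', 'https://microlink.io' ]
```

With the real package, the equivalent would presumably be passing the same pattern list as the whitelist option.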
removeDuplicates
Type: boolean
Default: true
Remove duplicate links detected across all HTML tags.
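As a rough illustration of what removing duplicates means, the sketch below keeps the first link seen for each normalized URL. The helper dedupeByNormalizedUrl and the sample data are hypothetical, and keying on normalizedUrl is an assumption; the package may compare links differently.

```javascript
// Hypothetical helper: keeps the first link seen for each normalized URL.
function dedupeByNormalizedUrl (links) {
  const seen = new Map()
  for (const link of links) {
    if (!seen.has(link.normalizedUrl)) seen.set(link.normalizedUrl, link)
  }
  return [...seen.values()]
}

// Hand-written sample data using the { url, normalizedUrl } shape from the usage example.
const sampleLinks = [
  { url: 'https://microlink.io/', normalizedUrl: 'https://microlink.io' },
  { url: 'https://microlink.io', normalizedUrl: 'https://microlink.io' },
  { url: 'https://microlink.io/app.js', normalizedUrl: 'https://microlink.io/app.js' }
]

const unique = dedupeByNormalizedUrl(sampleLinks)

console.log(unique.map(link => link.normalizedUrl))
// => [ 'https://microlink.io', 'https://microlink.io/app.js' ]
```

Using a Map preserves the order in which links first appear, so the output order follows the order of the markup.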
License
html-urls © Kiko Beats, released under the MIT License.
Authored and maintained by Kiko Beats with help from contributors.
kikobeats.com · GitHub @Kiko Beats · Twitter @Kikobeats