JSPM

url-scraper

1.0.2
  • License MIT

URL scraper that takes input text, finds the links/URLs in it, scrapes them using cheerio, and returns an object with the original text, the parsed text (using npm-text-parser), and an array of objects, each containing a scraped webpage's information.

Package Exports

  • url-scraper

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (url-scraper) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

url-scraper

Installation

npm i url-scraper

Usage

var urlScraper = require('url-scraper');

scrap(text)

Receives the input text, finds the links/URLs in it, scrapes them, and returns a promise that resolves to an object with the original text, the parsed text, and an array of scraped website info.
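To make the "parsed text" step concrete, here is a minimal sketch of how URLs in plain text could be wrapped in anchor tags. It is not the actual npm-text-parser or url-scraper implementation; the function name `linkify` and the regular expression are assumptions chosen for illustration only.

```javascript
// Hypothetical helper (not part of url-scraper or npm-text-parser):
// wraps every http/https URL found in the text in an <a> tag.
function linkify(text) {
  // Assumed pattern: a URL starts with http:// or https:// and runs
  // until whitespace, a quote, or an angle bracket.
  var urlPattern = /https?:\/\/[^\s<>"']+/g;
  return text.replace(urlPattern, function (url) {
    return '<a href="' + url + '" target="_blank">' + url + '</a>';
  });
}

console.log(linkify('Visit http://heartynote.com today'));
```

A real parser would likely handle more cases (bare domains, trailing punctuation), so treat this only as an approximation of the idea.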

  var inputString = "This is awesome it scraps the sites dude and http://heartynote.com done !";

  urlScraper
    .scrap(inputString)
    .then(function (response) {
      console.log(response); // Logs the response object once the promise resolves.
    });

  // Example response:
  // {
  //   original_text: 'This is awesome it scraps the sites dude and http://heartynote.com done !',
  //   parsed_text: 'This is awesome it scraps the sites dude and <a href="http://heartynote.com" target="_blank">http://heartynote.com</a> done !',
  //   scraped_data:
  //     [
  //       { domain: 'heartynote.com',
  //         title: 'heartynote welcomes u !!',
  //         description: 'Bring your life',
  //         thumb: 'http://heartynote.com/pngs/thums/hearty.png',
  //         canonical: 'http://krishcdbry.com',
  //         isValid: true,
  //         _links: {
  //           self: 'http://krishcdbry.com'
  //         }
  //       }
  //     ]
  // }
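
Once the promise resolves, the response can be processed like any plain object. The sketch below shows one possible way to pull a short summary out of `scraped_data`. The helper name `summarizeScrape` is hypothetical and not part of url-scraper, and the field names are assumed from the sample response above rather than from a documented schema.

```javascript
// Hypothetical helper (not part of url-scraper): keeps only the valid
// entries of scraped_data and returns a compact summary of each.
function summarizeScrape(response) {
  // Guard against a missing or malformed scraped_data field.
  var pages = Array.isArray(response.scraped_data) ? response.scraped_data : [];
  return pages
    .filter(function (page) { return page.isValid; })
    .map(function (page) {
      return {
        domain: page.domain,
        title: page.title,
        thumb: page.thumb || null
      };
    });
}

// Sample data copied from the example response above, so this runs
// without any network access.
var sampleResponse = {
  original_text: 'This is awesome it scraps the sites dude and http://heartynote.com done !',
  parsed_text: 'This is awesome it scraps the sites dude and <a href="http://heartynote.com" target="_blank">http://heartynote.com</a> done !',
  scraped_data: [
    {
      domain: 'heartynote.com',
      title: 'heartynote welcomes u !!',
      description: 'Bring your life',
      thumb: 'http://heartynote.com/pngs/thums/hearty.png',
      canonical: 'http://krishcdbry.com',
      isValid: true
    }
  ]
};

console.log(summarizeScrape(sampleResponse));
```

In real use, the same helper could be called inside the `.then` callback, for example `urlScraper.scrap(inputString).then(summarizeScrape)`.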

Demo

Demo @url-scraper | https://tonicdev.com/npm/url-scraper

Author

Krishcdbry [krishcdbry@gmail.com]

Licence

MIT @krishcdbry