Truncate-html
Truncate an html string while keeping its tags intact. You can customize the ellipsis sign, ignore unwanted elements, and truncate html by words.
Notice: This is a Node.js module that depends on cheerio, so it can only run on Node.js. If you need a browser version, consider truncate or nodejs-html-truncate.
Method
truncate(html, [length], [options])
Available options
{
length: Number, content length to truncate to
byWords: Boolean, whether to truncate by words, in which case `length` is a word count
stripTags: Boolean, whether to remove tags
ellipsis: String, custom ellipsis sign; set it to an empty string to remove the ellipsis suffix
excludes: String or Array, selectors of the elements you want to ignore
decodeEntities: Boolean, whether to decode html entities in the html string automatically
keepWhitespaces: Boolean, whether to keep whitespace as-is instead of replacing consecutive spaces with a single space
}
Default options
truncate.defaultOptions = {
byWords: false,
stripTags: false,
ellipsis: '...',
decodeEntities: false,
keepWhitespaces: false
};
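The two call shapes, truncate(html, length, options) and truncate(html, options), both end up with one set of options layered over the defaults. The sketch below only illustrates that idea: resolveOptions is a hypothetical helper, not part of truncate-html, and letting a positional length override options.length is an assumption.

```javascript
// Hypothetical helper, NOT part of truncate-html: shows how per-call
// options could be layered over the documented defaults.
const defaultOptions = {
  byWords: false,
  stripTags: false,
  ellipsis: '...',
  decodeEntities: false,
  keepWhitespaces: false
};

function resolveOptions(length, options) {
  // Support both call shapes: (length, options) and (options) alone.
  if (typeof length === 'object' && length !== null) {
    options = length;
    length = undefined;
  }
  // Later sources win, so per-call options override the defaults.
  const resolved = Object.assign({}, defaultOptions, options);
  // Assumption: a positional length takes precedence over options.length.
  if (typeof length === 'number') {
    resolved.length = length;
  }
  return resolved;
}

console.log(resolveOptions(10, { stripTags: true }));
console.log(resolveOptions({ length: 10, ellipsis: '~' }));
```

Copying into a fresh object keeps the shared defaults untouched between calls.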
Install
npm install truncate-html
Usage
Notice: Extra blank spaces in the html content will be removed. If the content of the html string is shorter than options.length, no ellipsis is appended to the final html string. If it is longer, the length of the final html content is options.length plus the length of options.ellipsis.
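The length rule in the notice can be sketched on plain text. truncateText below is a hypothetical helper that ignores tags entirely; it is not the library's implementation, and treating content of exactly `length` characters as "short enough" is an assumption.

```javascript
// Hypothetical plain-text helper, NOT the library's implementation.
// It only demonstrates the length rule: no ellipsis when the content
// fits, otherwise `length` characters followed by the ellipsis.
function truncateText(text, length, ellipsis = '...') {
  // Assumption: content of exactly `length` characters is left as-is.
  if (text.length <= length) {
    return text;
  }
  return text.slice(0, length) + ellipsis;
}

const result = truncateText('This is a string for test.', 10);
console.log(result);        // This is a ...
console.log(result.length); // 13, i.e. 10 + '...'.length
console.log(truncateText('short', 10)); // short
```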
var truncate = require('truncate-html');
// truncate html
var html = '<p><img src="abc.png">This is a string</p> for test.';
truncate(html, 10);
// returns: <p><img src="abc.png">This is a ...</p>
// with options, remove all tags
var html = '<p><img src="abc.png">This is a string</p> for test.';
truncate(html, 10, {stripTags: true});
// returns: This is a ...
// with options, truncate by words.
// note: for non-alphabetic languages (such as CJK),
// truncating by words will not behave as you expect
var html = '<p><img src="abc.png">This is a string</p> for test.';
truncate(html, 3, {byWords: true});
// returns: <p><img src="abc.png">This is a ...</p>
// with options, keep whitespaces
var html = '<p> <img src="abc.png">This is a string</p> for test.';
truncate(html, 10, {keepWhitespaces: true});
// returns: <p> <img src="abc.png">This is a ...</p>
// combine length and options
var html = '<p><img src="abc.png">This is a string</p> for test.';
truncate(html, {
length: 10,
stripTags: true
});
// returns: This is a ...
// custom ellipsis sign
var html = '<p><img src="abc.png">This is a string</p> for test.';
truncate(html, {
length: 10,
ellipsis: '~'
});
// returns: <p><img src="abc.png">This is a ~</p>
// exclude certain elements (by selector); they are removed before the content length is counted
var html = '<p><img src="abc.png">This is a string</p> for test.';
truncate(html, {
length: 10,
ellipsis: '~',
excludes: 'img'
});
// returns: <p>This is a ~</p>
// exclude more than one kind of element
var html = '<p><img src="abc.png">This is a string</p><div class="something-unwanted"> unwanted string inserted ( ´•̥̥̥ω•̥̥̥` )</div> for test.';
truncate(html, {
length: 20,
stripTags: true,
ellipsis: '~',
excludes: ['img', '.something-unwanted']
});
// returns: This is a string for~
// handling encoded characters
var html = '<p> test for &lt;p&gt; encoded string</p>'
truncate(html, {
length: 20,
decodeEntities: true
});
// returns: <p> test for &lt;p&gt; encode...</p>
// when decodeEntities is set to false
var html = '<p> test for &lt;p&gt; encoded string</p>'
truncate(html, {
length: 20,
decodeEntities: false // this is the default value
});
// returns: <p> test for &lt;p...</p>
// setting `decodeEntities` to true may give a surprising result when handling CJK characters
var html = '<p> test for &lt;p&gt; 中文 string</p>'
truncate(html, {
length: 20,
decodeEntities: true
});
// returns: <p> test for &lt;p&gt; &#x4E2D;&#x6587; str...</p>
// to fix this, see below for instructions
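The byWords caveat for CJK text noted in the examples above comes down to how words are delimited. As a minimal sketch, assuming words are runs of non-whitespace characters (the library's real tokenizer may differ), countWords below is a hypothetical helper showing that a CJK sentence without spaces counts as a single word.

```javascript
// Hypothetical helper, NOT part of truncate-html.
// Assumption: a "word" is a run of non-whitespace characters.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

console.log(countWords('This is a string for test.')); // 6
// No spaces between CJK characters, so the whole sentence is one "word".
console.log(countWords('这是一个用于测试的字符串')); // 1
```

Under this assumption, a word-based length of 3 would never split the CJK sentence, which is why byWords does not help there.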
Known issues
Known issues with handling CJK characters when the option decodeEntities is set to true.
The option decodeEntities is convenient: when it is true, encoded html entities are decoded automatically, so &amp; is treated as a single character. This is probably what you want. However, if the html string contains CJK characters, they are replaced by character references such as &#xF6; in the final html, which is confusing.
To fix this, you have two choices:
- keep the option decodeEntities false, in which case &amp; is counted as five characters.
- modify cheerio's source code: find the function getInverse in the file ./node_modules/cheerio/node_modules/entities/lib/decode.js and comment out its last line, which reads .replace(re_nonASCII, singleCharReplacer);
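The two effects described above can be reproduced in isolation. This is a sketch under stated assumptions: decodeBasicEntities and escapeNonAscii are hypothetical helpers written for illustration, they cover only a handful of cases, and they are not the code used by cheerio or the entities package.

```javascript
// Hypothetical helpers for illustration only; NOT cheerio's or
// the entities package's implementation.
const NAMED = { '&amp;': '&', '&lt;': '<', '&gt;': '>', '&quot;': '"' };

// Decode a few named entities so each counts as one character.
function decodeBasicEntities(text) {
  return text.replace(/&(?:amp|lt|gt|quot);/g, (match) => NAMED[match]);
}

// Re-encode every non-ASCII character as a hexadecimal character
// reference, which is the kind of output the known issue describes.
function escapeNonAscii(text) {
  return Array.from(text, (ch) => {
    const codePoint = ch.codePointAt(0);
    return codePoint > 0x7f
      ? '&#x' + codePoint.toString(16).toUpperCase() + ';'
      : ch;
  }).join('');
}

console.log('&amp;'.length);                      // 5 when left encoded
console.log(decodeBasicEntities('&amp;').length); // 1 once decoded
console.log(escapeNonAscii('中文'));               // &#x4E2D;&#x6587;
```

The first two lines show why the undecoded entity consumes five characters of the length budget; the last shows how readable CJK text turns into character references when non-ASCII characters are re-encoded.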