JSPM

unicode-substring

1.0.0
  • Downloads 272175
  • License MIT

Unicode-aware substring

Package Exports

  • unicode-substring

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (unicode-substring) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

unicode-substring

Unicode-aware substring for JavaScript. Surrogate pairs are counted as a single character.

What?

Characters in JavaScript strings are exposed as 16-bit code units, an encoding also known as UCS-2. This is usually good enough, but since Unicode has more than 2^16 characters, 16 bits cannot represent all of them. To overcome this limitation, characters with a scalar value above 0xFFFF are encoded as surrogate pairs. This encoding is known as UTF-16.
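As a small illustration of the point above, the following sketch inspects the emoji used later in this README using only built-in string methods. The variable names are chosen for this example and are not part of the library.

```javascript
// U+1F4A5 (💥) lies outside the Basic Multilingual Plane,
// so UTF-16 stores it as two 16-bit code units (a surrogate pair).
const boom = "💥";

const codeUnitCount = boom.length;                      // 2, not 1
const highSurrogate = boom.charCodeAt(0).toString(16);  // "d83d"
const lowSurrogate = boom.charCodeAt(1).toString(16);   // "dca5"
const codePoint = boom.codePointAt(0).toString(16);     // "1f4a5"

console.log(codeUnitCount, highSurrogate, lowSurrogate, codePoint);
```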

The purpose of this library is to treat a surrogate pair as one character when extracting substrings from a string. This can be preferable when the indices come from an environment that counts whole Unicode characters rather than UTF-16 code units.
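To see why this matters, the sketch below shows what the built-in String.prototype.substring does with the same indices. It uses only native methods, and the variable names are illustrative.

```javascript
const text = "💥Emoji Rule💥";

// The native method counts UTF-16 code units, so the emoji uses up
// two of the six requested positions and the result is one letter short.
const nativeSlice = text.substring(0, 6);   // "💥Emoj"

// Cutting inside the pair leaves a lone high surrogate,
// which is not a valid character on its own.
const brokenSlice = text.substring(0, 1);   // "\ud83d"

console.log(nativeSlice, brokenSlice.length);
```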

Usage

var unicodeSubstring = require('unicode-substring')
// unicodeSubstring(string, start, end)
unicodeSubstring("💥Emoji Rule💥", 0, 6)
// => "💥Emoji"

The start and end parameters behave similarly to those of String.prototype.substring.
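For readers curious how such a function can work, here is a minimal sketch of a code-point-aware substring. It is not the library's source: codePointSubstring is a hypothetical name, and the edge-case handling (clamping out-of-range indices, swapping start and end when start is larger) is an assumption modelled on String.prototype.substring rather than confirmed behaviour of unicode-substring.

```javascript
// Hypothetical helper, not part of unicode-substring.
function codePointSubstring(string, start, end) {
  // Array.from iterates a string by code point, so each surrogate
  // pair becomes a single array element.
  const chars = Array.from(string);

  // Assumed to mirror substring: NaN becomes 0, values are clamped to [0, length].
  const clamp = (n) =>
    Math.min(Math.max(Number.isNaN(n) ? 0 : n, 0), chars.length);

  let from = clamp(Number(start));
  let to = end === undefined ? chars.length : clamp(Number(end));

  // Assumed to mirror substring: swap the indices if start > end.
  if (from > to) [from, to] = [to, from];

  return chars.slice(from, to).join("");
}

console.log(codePointSubstring("💥Emoji Rule💥", 0, 6)); // "💥Emoji"
```

Splitting into an array costs memory proportional to the string length, which keeps the sketch short; walking the string and stopping at the end index would avoid that for long inputs.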