khmer-normalizer

1.0.2
  • Downloads 9
  • License MIT

Normalize Khmer strings according to https://www.unicode.org/L2/L2022/22290-khmer-encoding.pdf

Package Exports

  • khmer-normalizer
  • khmer-normalizer/build/src/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (khmer-normalizer) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

khmer-normalizer

This module normalizes Khmer text according to the proposed normal encoding structure at https://www.unicode.org/L2/L2022/22290-khmer-encoding.pdf.

It does not attempt to identify faulty text, merely to ensure that two strings that would have rendered the same are output as the same string.

A live online version is available at https://convert.ភាសាខ្មែរ.com/ (see https://ភាសាខ្មែរ.com for related tools).

Example

ខែ្មរ is corrected to ខ្មែរ:

  • Input: ខែ្មរ (U+1781 U+17C2 U+17D2 U+1798 U+179A)
  • Output: ខ្មែរ (U+1781 U+17D2 U+1798 U+17C2 U+179A)
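
To see the reordering in this example at the code point level, the following sketch lists the code points of both strings. It uses only standard string methods and does not call khmer-normalizer; the helper name `codePoints` is made up for illustration.

```typescript
// Hypothetical helper: format every code point of a string as U+XXXX.
function codePoints(text: string): string[] {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0)!.toString(16).toUpperCase().padStart(4, "0"),
  );
}

// The two strings from the example above, written as escapes.
const input = "\u1781\u17C2\u17D2\u1798\u179A";  // vowel sign before the coeng sequence
const output = "\u1781\u17D2\u1798\u17C2\u179A"; // coeng sequence before the vowel sign

console.log(codePoints(input).join(" "));  // U+1781 U+17C2 U+17D2 U+1798 U+179A
console.log(codePoints(output).join(" ")); // U+1781 U+17D2 U+1798 U+17C2 U+179A

// The strings differ as sequences, but contain the same characters.
const sameCharacters =
  Array.from(input).sort().join("") === Array.from(output).sort().join("");
console.log(input === output, sameCharacters); // false true
```

Only the order changes: the vowel sign U+17C2 moves after the subscript sequence U+17D2 U+1798.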

Installation

npm install khmer-normalizer

API Usage

import { khnormal } from 'khmer-normalizer';

// Normal use -- Modern Khmer
const cleanKhmer = khnormal(inputKhmer);

// Specifying the Modern Khmer language tag is optional; this is equivalent to the call above
const cleanKhmerTagged = khnormal(inputKhmer, 'km');

// For Middle Khmer, use the language tag 'xhm'
const cleanMiddleKhmerText = khnormal(inputMiddleKhmerText, 'xhm');
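
Because the goal is that two strings which would have rendered the same come out as the same string, a natural use is comparing or grouping text by its normalized form. The sketch below is one way to do that. The helpers `sameAfterNormalizing` and `groupByNormalForm` are hypothetical and not part of the package; they take the normalizer as a parameter so the block stands on its own.

```typescript
// Any string-to-string function; with the package installed this could be
// (s) => khnormal(s) or (s) => khnormal(s, 'xhm').
type Normalizer = (text: string) => string;

// Hypothetical helper: true when both strings normalize to the same sequence.
function sameAfterNormalizing(a: string, b: string, normalize: Normalizer): boolean {
  return normalize(a) === normalize(b);
}

// Hypothetical helper: group strings by normalized form, for example to
// find differently encoded spellings of the same word in a word list.
function groupByNormalForm(words: string[], normalize: Normalizer): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const word of words) {
    const key = normalize(word);
    const group = groups.get(key);
    if (group) {
      group.push(word);
    } else {
      groups.set(key, [word]);
    }
  }
  return groups;
}
```

With khmer-normalizer imported as shown above, a call might look like `sameAfterNormalizing(a, b, (s) => khnormal(s))`. Wrapping `khnormal` in an arrow function keeps the optional language tag explicit at the call site.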

Command line usage

khnormal [options] [inputFile...]

If no input files are specified, khnormal reads UTF-8 text from stdin.

# Options

--outfile, -o   Write concatenated output to file; if not specified, writes to stdout, utf-8
--fail, -f      Highlight places where khnormal was unable to regularize text
--lang, -l      Specify processing language, km (Modern Khmer, default) or xhm (Middle Khmer)
--help, -h      Print this help