JSPM

  • Downloads 735650

Character encoding auto-detection in JavaScript (port of Python's chardet)

Package Exports

  • jschardet
  • jschardet/package.json
  • jschardet/src/big5freq
  • jschardet/src/big5prober
  • jschardet/src/chardistribution
  • jschardet/src/charsetgroupprober
  • jschardet/src/charsetprober
  • jschardet/src/codingstatemachine
  • jschardet/src/constants
  • jschardet/src/escprober
  • jschardet/src/escsm
  • jschardet/src/eucjpprober
  • jschardet/src/euckrfreq
  • jschardet/src/euckrprober
  • jschardet/src/euctwfreq
  • jschardet/src/euctwprober
  • jschardet/src/gb2312freq
  • jschardet/src/gb2312prober
  • jschardet/src/hebrewprober
  • jschardet/src/init
  • jschardet/src/jisfreq
  • jschardet/src/jpcntx
  • jschardet/src/langbulgarianmodel
  • jschardet/src/langcyrillicmodel
  • jschardet/src/langgreekmodel
  • jschardet/src/langhebrewmodel
  • jschardet/src/langhungarianmodel
  • jschardet/src/langthaimodel
  • jschardet/src/latin1prober
  • jschardet/src/mbcharsetprober
  • jschardet/src/mbcsgroupprober
  • jschardet/src/mbcssm
  • jschardet/src/sbcharsetprober
  • jschardet/src/sbcsgroupprober
  • jschardet/src/sjisprober
  • jschardet/src/universaldetector
  • jschardet/src/utf8prober

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If a package subpath is missing, consider opening an issue with the original package (jschardet) asking it to add an "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

JsChardet

Port of Python's chardet (http://chardet.feedparser.org/).

License

LGPL

How To Use It

npm install jschardet

var jschardet = require("jschardet")

// "àíàçã" in UTF-8
jschardet.detect("\xc3\xa0\xc3\xad\xc3\xa0\xc3\xa7\xc3\xa3")
// { encoding: "utf-8", confidence: 0.9690625 }

// "次常用國字標準字體表" in Big5 
jschardet.detect("\xa6\xb8\xb1\x60\xa5\xce\xb0\xea\xa6\x72\xbc\xd0\xb7\xc7\xa6\x72\xc5\xe9\xaa\xed")
// { encoding: "Big5", confidence: 0.99 }
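
The inputs above are byte-per-character strings: each character stands for one byte of the encoded text. As a minimal sketch, assuming Node.js (where `Buffer` is a global), such a string can be built from ordinary text, and a result object of the shape shown above can be accepted only above a confidence threshold. The names `toBinaryString` and `pickEncoding`, the threshold, and the fallback value are hypothetical helpers for illustration, not part of jschardet.

```javascript
// Hypothetical helper: encode `text` with `encoding`, then expose each
// resulting byte as one character (code points 0x00-0xFF).
// Assumes Node.js, where Buffer is available globally.
function toBinaryString(text, encoding) {
  return Buffer.from(text, encoding).toString("latin1");
}

// Hypothetical helper: trust a { encoding, confidence } result only when
// its confidence reaches `minConfidence`; otherwise return `fallback`.
function pickEncoding(result, minConfidence, fallback) {
  if (result && result.encoding && result.confidence >= minConfidence) {
    return result.encoding;
  }
  return fallback;
}

// "àíàçã" written with escapes so the source file's own encoding is irrelevant.
const sample = toBinaryString("\u00e0\u00ed\u00e0\u00e7\u00e3", "utf8");
// sample is now "\xc3\xa0\xc3\xad\xc3\xa0\xc3\xa7\xc3\xa3", the first input above.

// The result literal is copied from the first example above.
const chosen = pickEncoding(
  { encoding: "utf-8", confidence: 0.9690625 },
  0.9,            // illustrative threshold, an assumption
  "windows-1252"  // illustrative fallback, an assumption
);
```

A suitable threshold depends on the data; short inputs tend to give the detector less evidence to work with.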

Supported Charsets

  • Big5, GB2312/GB18030, EUC-TW, HZ-GB-2312, and ISO-2022-CN (Traditional and Simplified Chinese)
  • EUC-JP, SHIFT_JIS, and ISO-2022-JP (Japanese)
  • EUC-KR and ISO-2022-KR (Korean)
  • KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, and windows-1251 (Russian)
  • ISO-8859-2 and windows-1250 (Hungarian)
  • ISO-8859-5 and windows-1251 (Bulgarian)
  • windows-1252
  • ISO-8859-7 and windows-1253 (Greek)
  • ISO-8859-8 and windows-1255 (Visual and Logical Hebrew)
  • TIS-620 (Thai)
  • UTF-32 BE, LE, 3412-ordered, or 2143-ordered (with a BOM)
  • UTF-16 BE or LE (with a BOM)
  • UTF-8 (with or without a BOM)
  • ASCII
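
Several entries in the list above depend on a byte order mark (BOM) at the start of the input. The following is a rough sketch of BOM sniffing in general, not jschardet's actual implementation. The function name `sniffBom` and the returned labels are assumptions chosen for illustration, and the encoding names jschardet reports may differ.

```javascript
// Illustrative BOM table. The 4-byte UTF-32 marks come first because two of
// them begin with the same bytes as the 2-byte UTF-16 marks.
const BOMS = [
  { name: "UTF-32BE", bytes: [0x00, 0x00, 0xfe, 0xff] },
  { name: "UTF-32LE", bytes: [0xff, 0xfe, 0x00, 0x00] },
  { name: "UTF-32 (3412-ordered)", bytes: [0xfe, 0xff, 0x00, 0x00] },
  { name: "UTF-32 (2143-ordered)", bytes: [0x00, 0x00, 0xff, 0xfe] },
  { name: "UTF-8", bytes: [0xef, 0xbb, 0xbf] },
  { name: "UTF-16BE", bytes: [0xfe, 0xff] },
  { name: "UTF-16LE", bytes: [0xff, 0xfe] },
];

// Hypothetical helper: return the label of the first BOM that prefixes
// `bytes` (an Array, Uint8Array, or Buffer), or null when none matches.
function sniffBom(bytes) {
  for (const bom of BOMS) {
    const matches =
      bytes.length >= bom.bytes.length &&
      bom.bytes.every((value, index) => bytes[index] === value);
    if (matches) {
      return bom.name;
    }
  }
  return null;
}

const utf8WithBom = sniffBom(Uint8Array.from([0xef, 0xbb, 0xbf, 0x41]));
const utf16le = sniffBom(Uint8Array.from([0xff, 0xfe, 0x41, 0x00]));
const noBom = sniffBom(Uint8Array.from([0x41, 0x42]));
```

Checking the longer marks first is a design choice with a known trade-off: a UTF-16 LE stream whose first character is U+0000 starts with the same four bytes as the UTF-32 LE mark, so this sketch would label it UTF-32LE.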

Technical Information

I haven't been able to create tests to correctly detect:

  • ISO-2022-CN
  • windows-1250 in Hungarian
  • windows-1251 in Bulgarian
  • windows-1253 in Greek
  • EUC-CN

A single-file minified build is not available.

Authors