Package Exports

detect-character-encoding
detect-character-encoding/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (detect-character-encoding) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

detect-character-encoding

Detect character encoding using ICU

Tip: If you don’t need ICU in particular, consider using ced, which is based on Google’s lighter compact_enc_det library.

Installation

$ npm install detect-character-encoding

detect-character-encoding is a C++ addon. Therefore, you may need to install various build tools. Check node-gyp’s readme for more information.

Usage

const fs = require('fs');
const detectCharacterEncoding = require('detect-character-encoding');

const fileBuffer = fs.readFileSync('file.txt');
const charsetMatch = detectCharacterEncoding(fileBuffer);

console.log(charsetMatch);
// {
//   encoding: 'UTF-8',
//   confidence: 60
// }

detect-character-encoding may return null if no charset matches.

Supported operating systems

macOS Sonoma
Ubuntu 22.04 and 20.04
Debian 12, 11, and 10

detect-character-encoding does not support 32-bit operating systems.

Supported character sets

As listed in ICU’s user guide:

UTF-8
UTF-16BE
UTF-16LE
UTF-32BE
UTF-32LE
Shift_JIS
ISO-2022-JP
ISO-2022-CN
ISO-2022-KR
GB18030
Big5
EUC-JP
EUC-KR
ISO-8859-1
ISO-8859-2
ISO-8859-5
ISO-8859-6
ISO-8859-7
ISO-8859-8
ISO-8859-9
windows-1250
windows-1251
windows-1252
windows-1253
windows-1254
windows-1255
windows-1256
KOI8-R
IBM420
IBM424

License

detect-character-encoding is licensed under the BSD 2-clause license but includes third-party software under different licenses. See LICENSE.md for the full license text.