detect-character-encoding
Detect character encoding using ICU
Tip: If you don’t need ICU in particular, consider using ced, which is based on Google’s lighter compact_enc_det library.
Installation
$ npm install detect-character-encoding
detect-character-encoding is a C++ addon. Therefore, you may need to install various build tools. Check node-gyp’s readme for more information.
Usage
const fs = require('fs');
const detectCharacterEncoding = require('detect-character-encoding');
const fileBuffer = fs.readFileSync('file.txt');
const charsetMatch = detectCharacterEncoding(fileBuffer);
console.log(charsetMatch);
// {
// encoding: 'UTF-8',
// confidence: 60
// }
detect-character-encoding may return null if no charset matches.
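Because the result can be null, callers may want to guard against it before reading encoding or confidence. The following is a minimal sketch, not part of the package: describeEncoding and the minConfidence threshold of 50 are hypothetical names and values chosen for illustration, and the detector is passed in as a parameter so the helper does not depend on the addon being installed.

```javascript
// Hypothetical helper (not part of detect-character-encoding).
// `detect` is assumed to behave like the package's exported function:
// it takes a Buffer and returns { encoding, confidence } or null.
// `minConfidence` is an arbitrary illustrative threshold.
function describeEncoding(buffer, detect, minConfidence = 50) {
  const match = detect(buffer);

  // No charset matched: return a neutral result instead of throwing.
  if (match === null) {
    return { encoding: null, confidence: 0, reliable: false };
  }

  return {
    encoding: match.encoding,
    confidence: match.confidence,
    reliable: match.confidence >= minConfidence,
  };
}
```

With the addon installed, this could be called as describeEncoding(fs.readFileSync('file.txt'), require('detect-character-encoding')).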
Supported operating systems
- macOS Sonoma
- Ubuntu 22.04 and 20.04
- Debian 12, 11, and 10
detect-character-encoding does not support 32-bit operating systems.
Supported character sets
As listed in ICU’s user guide:
- UTF-8
- UTF-16BE
- UTF-16LE
- UTF-32BE
- UTF-32LE
- Shift_JIS
- ISO-2022-JP
- ISO-2022-CN
- ISO-2022-KR
- GB18030
- Big5
- EUC-JP
- EUC-KR
- ISO-8859-1
- ISO-8859-2
- ISO-8859-5
- ISO-8859-6
- ISO-8859-7
- ISO-8859-8
- ISO-8859-9
- windows-1250
- windows-1251
- windows-1252
- windows-1253
- windows-1254
- windows-1255
- windows-1256
- KOI8-R
- IBM420
- IBM424
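Once a charset has been detected, a common next step is decoding the buffer to a string. The sketch below is one possible approach and makes some assumptions: decodeWithDetectedEncoding is a hypothetical helper, it uses Node's built-in TextDecoder rather than anything provided by this package, and TextDecoder may not accept every label in the list above, so unsupported labels are reported as null.

```javascript
// Hypothetical helper (not part of detect-character-encoding).
// `charsetMatch` is assumed to be the value returned by the detector:
// either null or an object with an `encoding` string.
function decodeWithDetectedEncoding(buffer, charsetMatch) {
  // Nothing was detected, so there is nothing to decode with.
  if (charsetMatch === null) {
    return null;
  }

  let decoder;
  try {
    // TextDecoder throws if it does not know the given label.
    decoder = new TextDecoder(charsetMatch.encoding);
  } catch (err) {
    return null;
  }

  return decoder.decode(buffer);
}
```

Returning null for both "no match" and "unsupported label" keeps the sketch short; a real caller might prefer to distinguish the two cases.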
License
detect-character-encoding is licensed under the BSD 2-clause license but includes third-party software under different licenses. See LICENSE.md for the full license text.