JSPM

@cloudrumbles/indic-transliterate

1.1.1
  • License MIT

High-accuracy transliteration for 21 Indian languages using AI4Bharat's IndicXlit models via ONNX.

Package Exports

  • @cloudrumbles/indic-transliterate
  • @cloudrumbles/indic-transliterate/dist/index.js

This package does not declare an exports field, so JSPM detected and optimized the exports above automatically. If a package subpath is missing, open an issue with the original package (@cloudrumbles/indic-transliterate) asking it to add an "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

@cloudrumbles/indic-transliterate

Node.js transliteration for 21 Indian languages using AI4Bharat's IndicXlit models via ONNX Runtime.

This is an unofficial port. Models (~47MB) are bundled with the package.

Install

npm install @cloudrumbles/indic-transliterate

Usage

import { IndicTransliterator } from '@cloudrumbles/indic-transliterate'

const transliterator = new IndicTransliterator()

// Tamil
const tamil = await transliterator.transliterate('amma', 'ta')
// => ['அம்மா', 'அம்ம', 'ஆம்மா', ...]

// Hindi
const hindi = await transliterator.transliterate('namaste', 'hi')
// => ['नमस्ते', 'नमस्ती', ...]

// Get more candidates
const results = await transliterator.transliterate('chennai', 'ta', 10)

API

new IndicTransliterator(options?)

Creates a new transliterator instance. Models are loaded lazily on first use.

const transliterator = new IndicTransliterator({
  beamWidth: 4,   // beam search width (default: 4)
  maxLen: 20,     // max output length (default: 20)
  modelPath: '...' // custom model path (default: bundled)
})

transliterate(word, langCode, count?): Promise<string[]>

Transliterates a romanized word to the target script. Returns count candidates (default 5) ranked by likelihood.
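
Because transliterate() takes a single word, longer input has to be split by the caller. The following is a minimal sketch of one way to do that, not part of the package: the WordTransliterator interface only mirrors the documented transliterate() signature, and transliterateSentence is a hypothetical helper. It assumes whitespace-separated words and does not handle punctuation.

```typescript
// Minimal shape of the documented API that this sketch relies on.
interface WordTransliterator {
  transliterate(word: string, langCode: string, count?: number): Promise<string[]>;
}

// Hypothetical helper: transliterates each whitespace-separated word and
// keeps the top-ranked candidate. If no candidate comes back, the original
// word is kept unchanged.
async function transliterateSentence(
  t: WordTransliterator,
  sentence: string,
  langCode: string,
): Promise<string> {
  const words = sentence.split(/\s+/).filter((w) => w.length > 0);
  const out: string[] = [];
  // Words are processed one at a time; whether concurrent calls are safe
  // is not stated in the README, so this sketch does not assume it.
  for (const word of words) {
    const candidates = await t.transliterate(word, langCode, 1);
    out.push(candidates[0] ?? word);
  }
  return out.join(" ");
}
```

With the real package, an IndicTransliterator instance should fit the interface as documented, for example transliterateSentence(new IndicTransliterator(), 'vanakkam chennai', 'ta').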

getSupportedLanguages(): string[]

Returns an array of supported language codes.
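
The README does not say what transliterate() does when given an unsupported code, so a caller may want to check first. A small sketch of such a guard follows; LanguageLister mirrors only the documented getSupportedLanguages() method, and assertSupportedLanguage is a hypothetical name.

```typescript
// Minimal shape of the documented API used by this sketch.
interface LanguageLister {
  getSupportedLanguages(): string[];
}

// Hypothetical guard: throws a descriptive error for unknown codes so the
// failure surfaces before any model inference is attempted.
function assertSupportedLanguage(t: LanguageLister, langCode: string): void {
  const supported = t.getSupportedLanguages();
  if (!supported.includes(langCode)) {
    throw new Error(
      `Unsupported language code "${langCode}". Supported: ${supported.join(", ")}`,
    );
  }
}
```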

initialize(): Promise<void>

Pre-loads the ONNX models. Optional: models load automatically on the first transliterate() call.

dispose(): Promise<void>

Releases ONNX sessions to free memory. Call when done.

isInitialized: boolean

Whether models are currently loaded.
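
The lifecycle methods above can be combined so that models are loaded up front and always released, even when the work in between throws. This is a sketch under the assumption that initialize() and dispose() behave as described; ManagedTransliterator and withTransliterator are hypothetical names, not package exports.

```typescript
// Minimal shape of the documented lifecycle methods.
interface ManagedTransliterator {
  initialize(): Promise<void>;
  dispose(): Promise<void>;
}

// Hypothetical wrapper: pre-loads models, runs the callback, and releases
// the ONNX sessions in a finally block so they are freed on error too.
async function withTransliterator<T extends ManagedTransliterator, R>(
  t: T,
  work: (t: T) => Promise<R>,
): Promise<R> {
  await t.initialize(); // pay the model-loading cost before the first request
  try {
    return await work(t);
  } finally {
    await t.dispose();
  }
}
```

A short-lived script might wrap its whole body this way, while a long-running server would more likely call initialize() at startup and dispose() on shutdown.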

Supported Languages

Code  Language   Code  Language   Code  Language
as    Assamese   hi    Hindi      ne    Nepali
bn    Bengali    kn    Kannada    or    Odia
brx   Bodo       ks    Kashmiri   pa    Punjabi
gom   Konkani    mai   Maithili   sa    Sanskrit
gu    Gujarati   ml    Malayalam  sd    Sindhi
mr    Marathi    mni   Manipuri   si    Sinhala
ta    Tamil      te    Telugu     ur    Urdu

Credits

Models and vocabulary from IndicXlit by AI4Bharat (MIT License).

@article{Madhani2022AksharantarTB,
  title={Aksharantar: Towards building open transliteration tools for the next billion users},
  author={Yash Madhani and Sushane Parthan and Priyanka A. Bedekar and Ruchi Khapra and Vivek Seshadri and Anoop Kunchukuttan and Pratyush Kumar and Mitesh M. Khapra},
  journal={ArXiv},
  year={2022},
  volume={abs/2205.03018}
}

License

MIT