node-es-transformer

A Node.js-based library to (re)index and transform data from/to Elasticsearch.

This is experimental code; use at your own risk. Nonetheless, I encourage you to give it a try so I can gather feedback.

Features

  • While I'd generally recommend Logstash and Filebeat for established use cases, this tool may help with customized ingestion and data transformation, especially in a JavaScript-based setup.
  • Buffering/streaming for both reading and indexing: files are read as streams, and documents are sent to Elasticsearch using buffered bulk indexing. This is tailored towards ingesting large files. It has been tested successfully so far with JSON and CSV files in the range of 20-30 GB. On a single machine running both node-es-transformer and Elasticsearch, ingestion rates of up to 20k documents/second were achieved (2.9 GHz Intel Core i7, 16 GB RAM, SSD).
  • Supports wildcards to ingest/transform a range of files in one go.
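
The streaming-plus-buffering idea can be illustrated with a minimal sketch. This is not the library's actual implementation: batchLines, ingest and the indexBatch callback are hypothetical names, the input is assumed to be newline-delimited JSON, and the callback stands in for whatever performs the bulk request against Elasticsearch.

```javascript
const readline = require('node:readline');

// Hypothetical helper (not part of node-es-transformer): reads
// newline-delimited JSON from a readable stream and yields arrays of at
// most `batchSize` parsed documents, so the whole file never has to be
// held in memory at once.
async function* batchLines(input, batchSize) {
  const rl = readline.createInterface({ input, crlfDelay: Infinity });
  let batch = [];
  for await (const line of rl) {
    if (line.trim() === '') continue; // skip blank lines
    batch.push(JSON.parse(line));
    if (batch.length >= batchSize) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // flush the remaining documents
}

// Hypothetical driver: hands each batch to a caller-supplied async
// function (assumed to perform one bulk indexing request) and returns
// the number of documents processed.
async function ingest(input, batchSize, indexBatch) {
  let total = 0;
  for await (const batch of batchLines(input, batchSize)) {
    await indexBatch(batch);
    total += batch.length;
  }
  return total;
}
```

With a real file, input would typically be fs.createReadStream('filename.json'). The batch size trades memory use against the number of bulk requests.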

Getting started

In your Node.js project, add node-es-transformer as a dependency (yarn add node-es-transformer or npm install node-es-transformer).

Use the library in your code as shown in the following examples.

Read from a file

const transformer = require('node-es-transformer');

transformer({
  // File to read from and index to write to
  fileName: 'filename.json',
  targetIndexName: 'my-index',
  typeName: 'doc',
  // Mappings for the target index
  mappings: {
    doc: {
      properties: {
        '@timestamp': {
          type: 'date'
        },
        'first_name': {
          type: 'keyword'
        },
        'last_name': {
          type: 'keyword'
        },
        'full_name': {
          type: 'keyword'
        }
      }
    }
  },
  // Add a full_name field built from first_name and last_name
  transform(line) {
    return {
      ...line,
      full_name: `${line.first_name} ${line.last_name}`
    };
  }
});
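
Since transform is a plain function, it can be tried on its own before running a full ingestion. The sketch below reuses the function from the example above; the sample document is made up for illustration, and it assumes that transform receives an already parsed object, as the property access in the example suggests.

```javascript
// Same logic as the `transform` option in the example above.
function transform(line) {
  return {
    ...line,
    full_name: `${line.first_name} ${line.last_name}`
  };
}

// Made-up sample document, for illustration only.
const sample = {
  '@timestamp': '2019-01-01T00:00:00Z',
  first_name: 'Ada',
  last_name: 'Lovelace'
};

const result = transform(sample);
console.log(result.full_name); // Ada Lovelace
```

The spread copies the original fields into a new object, so the input document is left unchanged.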

Read from another index

const transformer = require('node-es-transformer');

transformer({
  // Index to read from and index to write to
  sourceIndexName: 'my-source-index',
  targetIndexName: 'my-target-index',
  typeName: 'doc',
  // Mappings for the target index
  mappings: {
    doc: {
      properties: {
        '@timestamp': {
          type: 'date'
        },
        'first_name': {
          type: 'keyword'
        },
        'last_name': {
          type: 'keyword'
        },
        'full_name': {
          type: 'keyword'
        }
      }
    }
  },
  // Add a full_name field built from first_name and last_name
  transform(doc) {
    return {
      ...doc,
      full_name: `${doc.first_name} ${doc.last_name}`
    };
  }
});

Development

Clone this repository and install its dependencies:

git clone https://github.com/walterra/node-es-transformer
cd node-es-transformer
yarn

yarn build builds the library to dist, generating two files:

  • dist/node-es-transformer.cjs.js: a CommonJS bundle, suitable for use in Node.js, that requires the external dependency. This corresponds to the "main" field in package.json.
  • dist/node-es-transformer.esm.js: an ES module bundle, suitable for use in other people's libraries and applications, that imports the external dependency. This corresponds to the "module" field in package.json.

yarn dev builds the library, then uses rollup-watch to keep rebuilding it whenever the source files change.

yarn test builds the library, then tests it.

License

Apache 2.0.