JSPM

  • Downloads 1868
  • License ISC

a library to import events in mixpanel for node

Package Exports

  • mixpanel-import
  • mixpanel-import/index.js

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (mixpanel-import) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme

mixpanel-import

note: if you're trying to add real-time mixpanel tracking to a node.js web application, this module is NOT what you want; you want mixpanel-node, the official node.js SDK.

wat.

stream events, users, and groups into mixpanel

This module is designed for streaming large amounts of event or object data to Mixpanel from a node.js environment. It implements the /import, /engage, and /groups APIs by streaming JSON files that are compliant with Mixpanel's data model.
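for reference, a minimal event record in the shape Mixpanel's /import API expects looks like this (one JSON object per line in a .jsonl file; all values below are illustrative placeholders):

```json
{"event": "sign up", "properties": {"time": 1650000000, "distinct_id": "user-123", "$insert_id": "a1b2c3", "plan": "free"}}
```

time is a unix timestamp, distinct_id ties the event to a user, and $insert_id lets Mixpanel deduplicate retried records.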

This utility is particularly useful for running one-time backfills or streaming large sets of data into Mixpanel from cloud-based data pipelines where RETL is not available.

tldr;

this module can be used in two ways:

  • as a module via require('mixpanel-import')
  • as a standalone script via npx mixpanel-import

module usage

install mixpanel-import as a dependency

npm i mixpanel-import --save

use it in code:

const mpImport = require('mixpanel-import')
    ...
const importedData = await mpImport(credentials, data, options);
console.log(importedData) // array of responses from Mixpanel

read more about credentials, data, and options

stand-alone usage

clone the module:

$ git clone https://github.com/ak--47/mixpanel-import.git

run it, providing a path to the data you wish to import:

$ node index.js ./pathToData

alternatively:

$ npx mixpanel-import ./pathToData

when running stand-alone, pathToData can be a .json, .jsonl, .ndjson, or .txt file OR a directory which contains said files.

you will also need a .env configuration file for authentication.

arguments

when using mixpanel-import in code, you will pass in 3 arguments: credentials, data, and options

credentials

Mixpanel's ingestion APIs authenticate with service accounts OR API secrets; service accounts are the preferred authentication method.

service account:

const creds = {
    acct: `{{my-service-acct}}`, //service acct username
    pass: `{{my-service-secret}}`, //service acct secret
    project: `{{my-project-id}}`, //project id
    token: `{{my-project-token}}`  //project token
}
const importedData = await mpImport(creds, data, options);

API secret:

const creds = {
    secret: `{{my-api-secret}}`, //api secret (deprecated auth)
    token: `{{my-project-token}}`  //project token
}
const importedData = await mpImport(creds, data, options);

environment variables:

it is possible to delegate the authentication details to environment variables, using a .env file of the form:

# if using service account auth; these 3 values are required:
MP_PROJECT={{your-mp-project}}
MP_ACCT={{your-service-acct}}
MP_PASS={{your-service-pass}}

# if using secret based auth; only this value is required
MP_SECRET={{your-api-secret}}

# this is optional (but strongly encouraged)
MP_TOKEN={{your-mp-token}}

when using environment variables for authentication, pass null as the creds (first argument) to the module:

const importedData = await mpImport(null, data, options);

data

the data param represents the data you wish to import; this might be events, user profiles, or group profiles

the value of data can be:

  • a path to a file, which contains records as .json, .jsonl, .ndjson, or .txt
const data = `./myEventsToImport.json`
const importedData = await mpImport(creds, data, options);
  • a path to a directory, which contains files that have records as .json, .jsonl, .ndjson, or .txt

const data = `./myEventsToImport/`
const importedData = await mpImport(creds, data, options);
  • an array of objects (records), in memory
const data = require('./myEventsToImport.json')
const importedData = await mpImport(creds, data, options);
  • a stringified array of objects
const records = require('./myEventsToImport.json')
const data = JSON.stringify(records)
const importedData = await mpImport(creds, data, options);
  • a node.js JSON (or JSONL) stream
const myStream = fs.createReadStream('./testData/lines.json')
const res = await mpImport(creds, myStream, {streamFormat: `json`})	

important note: you will use the options (below) to specify what type of records you are importing; event is the default type
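for comparison, user (and group) records have a different shape than events; a minimal user profile record for the /engage API looks roughly like this (values are placeholders):

```json
{"$token": "{{my-project-token}}", "$distinct_id": "user-123", "$set": {"name": "Jane", "plan": "free"}}
```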

options

options is an object that allows you to configure the behavior of this module.

Below, the default values are given, but you can override them with your own values:

const options = {
    recordType: `event`, //event, user, OR group
    streamSize: 27, // highWaterMark for streaming chunks (2^27 ~= 134MB)
    region: `US`, //US or EU
    recordsPerBatch: 2000, //max # of records in each batch
    bytesPerBatch: 2 * 1024 * 1024, //max # of bytes in each batch
    strict: true, //use strict mode?
    logs: false, //print to stdout?
    streamFormat: 'json', //or jsonl... required if source is a Readable or Transform stream

    //a function reference to be called on every record
    //useful if you need to transform the data
    transformFunc: function noop(a) { return a }
}

note: the recordType param is very important; by default this module assumes you wish to import event records, but change this value to user or group if you are importing other entities.

recipes

the transformFunc is useful because it can preprocess records in the pipeline using arbitrary javascript.

here are some examples:

  • putting a token on every user record:
function addToken(user) {
    user.token = `{{my token}}`
    return user
}

let res = await mpImport(creds, data, { transformFunc: addToken, recordType: 'user' })
  • constructing an $insert_id for each event:
const md5 = require('md5')

function addInsert(event) {
    let hash = md5(event);
    event.properties.$insert_id = hash;
    return event
}
let res = await mpImport(creds, data, { transformFunc: addInsert })

test data

sometimes it's helpful to generate test data, so this module includes a separate utility to do that:

$ npm run generate

someTestData.json will be written to ./testData ... so you can then run node index.js ./testData/someTestData.json

why?

because... i needed this and it didn't exist... so i made it.

then i made it public because i thought it would be useful to others