mixpanel-import
wat.
create data streams to mixpanel... quickly

mixpanel-import implements Mixpanel's /import, /engage, /groups, and /lookup APIs with best practices, providing a clean, configurable interface to stream JSON (or NDJSON) files compliant with Mixpanel's data model.
using streams in node.js, high-throughput backfills are possible with no intermediate storage; this is particularly useful for cloud/lambda data pipelines where rETL is not available.
note: if you're trying to add real-time mixpanel tracking to a node.js web application, this module is NOT what you want; you want mixpanel-node, the official node.js SDK.
tldr;
this module can be used in two ways:
- as a CLI, standalone script via:
npx mixpanel-import file --options
- as a module in code via:
//for esm:
import mpStream from 'mixpanel-import'
//for cjs:
const mpStream = require('mixpanel-import')
const myImportedData = await mpStream(creds, data, options)
CLI usage
npx --yes mixpanel-import ./pathToData
when running stand-alone, pathToData can be a .json, .jsonl, .ndjson, .csv or .txt file OR a directory which contains said files.
when using the CLI, you will supply params to specify options of the form --option value, for example your project credentials:
npx --yes mixpanel-import ./data.ndjson --secret abc123
many other options are available; to see a full list of CLI params, use the --help option:
npx --yes mixpanel-import --help
alternatively, you may use an .env configuration file to provide your project credentials (and some other values).
the CLI will write response logs to a ./logs directory by default. you can specify a --where dir option as well!
module usage
install mixpanel-import as a dependency in your project
npm i mixpanel-import --save
then use it in code:
const mpStream = require("mixpanel-import");
const importedData = await mpStream(credentials, data, options);
console.log(importedData);
/*
{
  success: 5003,
  failed: 0,
  total: 5003,
  batches: 3,
  rps: 3,
  eps: 5000,
  recordType: "event",
  duration: 1.299,
  retries: 0,
  responses: [ ... ],
  errors: [ ... ]
}
*/
read more about credentials, data, and options below
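the summary object shown above can be checked in code before a pipeline moves on. a minimal sketch, assuming only the success, failed, and total fields from that sample output; checkImport is a hypothetical helper, not something mixpanel-import exports:

```javascript
// Hypothetical helper (not part of mixpanel-import): inspects the summary
// object that mpStream() resolves with, using fields from the sample output.
function checkImport(summary) {
  const { success, failed, total } = summary;
  // avoid dividing by zero when no records were sent
  const failureRate = total > 0 ? failed / total : 0;
  return {
    ok: failed === 0 && success === total,
    failureRate,
    message: `${success}/${total} records imported, ${failed} failed`,
  };
}

// values copied from the sample summary; in real use, pass the object
// returned by: await mpStream(credentials, data, options)
const report = checkImport({ success: 5003, failed: 0, total: 5003 });
console.log(report.message);
```

a check like this makes it easy to fail a job loudly when some records were rejected.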
arguments
when using mixpanel-import in code, you will pass in 3 arguments: credentials, data, and options
credentials
Mixpanel's ingestion APIs authenticate with service accounts OR API secrets; service accounts are the preferred authentication method.
service account:
const creds = {
  acct: `my-service-acct`, //service acct username
  pass: `my-service-secret`, //service acct secret
  project: `my-project-id`, //project id
};
const importedData = await mpStream(creds, data, options);
API secret:
const creds = {
  secret: `my-api-secret`, //api secret (deprecated auth)
};
const importedData = await mpStream(creds, data, options);
profiles + tables:
if you are importing user profiles, group profiles, or lookup tables, you should also provide the corresponding values in your creds configuration:
const creds = {
  token: `my-project-token`, //for user/group profiles
  groupKey: `my-group-key`, //for group profiles
  lookupTableId: `my-lookup-table-id`, //for lookup tables
}
environment variables:
it is possible to delegate the authentication details to environment variables, using a .env file of the form:
# if using service account auth; these 3 values are required:
MP_PROJECT={{your-mp-project}}
MP_ACCT={{your-service-acct}}
MP_PASS={{your-service-pass}}
# if using secret based auth; only this value is required
MP_SECRET={{your-api-secret}}
# type of records to import; valid options are event, user, group or table
MP_TYPE=event
# required for user profiles + group profiles
MP_TOKEN={{your-mp-token}}
# required for group profiles
MP_GROUP_KEY={{your-group-key}}
# required for lookup tables
MP_TABLE_ID={{your-lookup-id}}
note: pass null as the creds to the module to use .env variables for authentication:
const importedData = await mpStream(null, data, options);
data
the data param represents the data you wish to import; this might be events, user profiles, group profiles, or lookup tables
the value of data can be:
- a path to a file, which contains records as .json, .jsonl, .ndjson, or .txt
const data = `./myEventsToImport.json`;
const importedData = await mpStream(creds, data, options);
- a path to a directory, which contains files that have records as .json, .jsonl, .ndjson, or .txt
const data = `./myEventsToImport/`;
const importedData = await mpStream(creds, data, options);
- an array of objects (records), in memory
const data = require("./myEventsToImport.json");
const importedData = await mpStream(creds, data, options);
- a stringified array of objects, in memory
const records = require("./myEventsToImport.json");
const data = JSON.stringify(records);
const importedData = await mpStream(creds, data, options);
- a JSON (or JSONL) readable file stream
const fs = require("fs");
const myStream = fs.createReadStream("./testData/lines.json");
const imported = await mpStream(creds, myStream, { streamFormat: `json` });
note: please specify streamFormat as json or jsonl in the options
- an "object mode" readable stream:
const { Readable, PassThrough } = require('stream');
const { createMpStream } = require('mixpanel-import');
const mixpanelStream = createMpStream(creds, options, (results) => { ... })
const myStream = Readable.from(data, { objectMode: true });
const myOtherStream = new PassThrough()
myOtherStream.on('data', (response) => { ... });
myStream.pipe(mixpanelStream).pipe(myOtherStream)
note: object mode streams use a different named import: createMpStream() ... the callback receives a summary of the import, and downstream consumers of the stream will receive API responses from Mixpanel.
you will use the options (below) to specify what type of records you are importing; event is the default type
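the difference between the json and jsonl stream formats can be illustrated with a small parser. this is a sketch only, assuming json means a single JSON array of records and jsonl means one JSON object per line; parseRecords is a hypothetical helper and the module does its own parsing internally:

```javascript
// Illustrative only: mixpanel-import handles parsing itself.
// This just shows how the two text layouts map to the same records.
function parseRecords(text, streamFormat = "jsonl") {
  if (streamFormat === "json") {
    // assumption: "json" is one JSON array holding every record
    const parsed = JSON.parse(text);
    if (!Array.isArray(parsed)) {
      throw new TypeError("expected a JSON array of records");
    }
    return parsed;
  }
  if (streamFormat === "jsonl") {
    // assumption: "jsonl" is one JSON object per line; blank lines are skipped
    return text
      .split(/\r?\n/)
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line));
  }
  throw new RangeError(`unknown streamFormat: ${streamFormat}`);
}

const asJson = parseRecords('[{"event":"a"},{"event":"b"}]', "json");
const asJsonl = parseRecords('{"event":"a"}\n{"event":"b"}\n', "jsonl");
console.log(asJson.length, asJsonl.length);
```

both calls yield the same two records, which is why a stream needs the streamFormat hint: the bytes alone do not say which layout to expect.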
options
options is an object that allows you to configure the behavior of this module. you can specify options as the third argument in module mode or as flags in CLI mode.
Below, the default values are given, but you can override them with your own values:
module options
const options = {
  recordType: `event`, // event, user, group or table
  compress: false, // gzip payload on egress (events only)
  region: `US`, // US or EU
  recordsPerBatch: 2000, // records in each req; max 2000
  bytesPerBatch: 2 * 1024 * 1024, // max bytes in each req
  strict: true, // use strict mode
  logs: false, // write results to a log file
  verbose: true, // show progress bar
  fixData: false, // apply transforms on the data to fix common mistakes
  streamFormat: "jsonl", // json or jsonl ... only relevant for streams
  // will be called on every record
  transformFunc: function noop(a) {
    return a;
  },
};
cli options
use npx mixpanel-import --help to see the full list.
option, alias             description                 default
----------------------------------------------------------------
--type, --recordType      event/user/group/table      "event"
--compress, --gzip        gzip on egress              false
--strict                  /import strict mode         true
--logs                    log import results to file  true
--verbose                 show progress bar           true
--streamFormat, --format  either json or jsonl        "jsonl"
--region                  either US or EU             "US"
--fixData                 fix common mistakes         false
--streamSize              2^n value of highWaterMark  27
--recordsPerBatch         # records in each request   2000
--bytesPerBatch           max size of each request    2MB
--where                   directory to put logs
note: the recordType param is very important; by default this module assumes you wish to import event records.
change this value to user, group, or table if you are importing other entities.
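to see how recordsPerBatch and bytesPerBatch interact, here is a rough sketch of batching by both limits. it is not the module's actual batching code; chunkRecords is a hypothetical helper and measuring size via JSON.stringify is an assumption:

```javascript
// Sketch only: a batch is closed when it hits the record limit or when
// adding the next record would push it past the byte limit.
function chunkRecords(records, recordsPerBatch = 2000, bytesPerBatch = 2 * 1024 * 1024) {
  const batches = [];
  let current = [];
  let currentBytes = 0;
  for (const record of records) {
    // assumption: size is measured on the serialized JSON of each record
    const size = Buffer.byteLength(JSON.stringify(record), "utf8");
    const full =
      current.length >= recordsPerBatch || currentBytes + size > bytesPerBatch;
    if (full && current.length > 0) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(record);
    currentBytes += size;
  }
  // flush whatever is left over
  if (current.length > 0) batches.push(current);
  return batches;
}

// 5003 small events with the default limits
const events = Array.from({ length: 5003 }, (_, i) => ({
  event: "test",
  properties: { i },
}));
const batches = chunkRecords(events);
console.log(batches.map((b) => b.length));
```

with small records the record limit is what closes each batch; with very large records the byte limit takes over and batches hold fewer than recordsPerBatch records.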
recipes
the transformFunc is useful because it can pre-process records in the pipeline using arbitrary javascript.
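since transformFunc is a single function, several small transforms can be chained into one before passing it in. a minimal sketch; composeTransforms, lowercaseName, and tagSource are hypothetical names used only for illustration:

```javascript
// Hypothetical helper: runs each transform in order, feeding the output
// of one into the next, and returns a single function for transformFunc.
function composeTransforms(...fns) {
  return (record) => fns.reduce((acc, fn) => fn(acc), record);
}

// two small example transforms (illustrative only)
const lowercaseName = (e) => ({ ...e, event: e.event.toLowerCase() });
const tagSource = (e) => ({
  ...e,
  properties: { ...e.properties, source: "backfill" },
});

const transformFunc = composeTransforms(lowercaseName, tagSource);
const out = transformFunc({ event: "Page View", properties: { time: 1 } });
console.log(out);

// in real use: await mpStream(creds, data, { transformFunc })
```

keeping each transform small makes it easier to test them on a handful of records before running a large import.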
here are some examples:
- putting a token on every user record:
function addToken(user) {
  user.token = `{{my token}}`;
  return user;
}
let imported = await mpStream(creds, data, {
  transformFunc: addToken,
  recordType: "user",
});
- constructing an $insert_id for each event:
const md5 = require('md5')
function addInsert(event) {
  // hash the serialized event; md5() expects a string or buffer, not an object
  let hash = md5(JSON.stringify(event));
  event.properties.$insert_id = hash;
  return event
}
let imported = await mpStream(creds, data, { transformFunc: addInsert })
- reshape/rename profile data with a proper $set key and $distinct_id value
function fixProfiles(user) {
  const mpUser = { $set: { ...user } };
  // $distinct_id identifies the profile, so it sits beside $set rather than inside it
  mpUser.$distinct_id = user.uuid;
  return mpUser
}
let imported = await mpStream(creds, data, { transformFunc: fixProfiles, recordType: "user"});
test data
sometimes it's helpful to generate test data, so this module includes a separate utility to do that:
$ npm run generate
someTestData.json will be written to ./testData ... so you can then node index.js ./testData/someTestData.json
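for quick experiments, test events can also be built in memory and passed as an array. this sketch is not the module's generator; makeTestEvents is a hypothetical helper, and the distinct_id and time properties are assumed from Mixpanel's event shape:

```javascript
// Hypothetical generator (separate from `npm run generate`): builds an
// array of synthetic events that can be passed as the data argument.
function makeTestEvents(count, startTime = 1700000000) {
  return Array.from({ length: count }, (_, i) => ({
    event: "test event",
    properties: {
      distinct_id: `user-${i % 10}`, // cycle through 10 fake users
      time: startTime + i, // one event per second
      $insert_id: `test-${startTime}-${i}`, // unique per generated event
    },
  }));
}

const data = makeTestEvents(25);
// the same records as NDJSON text, one event per line
const ndjson = data.map((e) => JSON.stringify(e)).join("\n");
console.log(data.length, ndjson.split("\n").length);

// in real use: await mpStream(creds, data, { recordType: "event" })
```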
why?
because... i needed this and it didn't exist... so i made it.
then i made it public because i thought it would be useful to others
found a bug? have an idea?