Jataframe
JavaScript Dataframe library, familiar to Pandas users, made for idiots.
Installation
npm i --save jataframe
What is it?
A Dataframe library similar to Pandas with a few annoying differences. I needed something simple enough for an idiot to use (me).
Intro
const Jataframe = require('jataframe');
const data = [{price: 2.12, name: 'apple'}, {price: 3.12, name: 'banana'}, {price: 154.12, name: 'eggs'}];
const df = new Jataframe(data);
df.columns // ['price', 'name']
df['price'] // [2.12, 3.12, 154.12]
df['name'] // ['apple', 'banana', 'eggs']
df.length // 3
df.head() // [{price: 2.12, name: 'apple'}, {price: 3.12, name: 'banana'}]
df.print()
Access
Here is the annoying difference: in Jataframe, columns are just raw arrays of data, so every function is called on the dataframe itself. Aggregation functions such as sum, max, and mean are called as df.sum('column'), as opposed to pandas' df['column'].sum().
const assert = require('assert'); // Node's built-in assert module, used by the checks below
const df = new Jataframe(data);
// Aggregation functions are on the Jataframe object, pass the column name to the agg function
df.mean('price') // ≈ 53.12
df.sum('price') // ≈ 159.36
df.max('price') // 154.12
df.min('price') // 2.12
df.std('price') // standard deviation of the price column
// To filter data, use the filter method; it returns a Jataframe
const filtered = df.filter((row) => row.price > 3);
assert(filtered.length == 2);
// Arrays must be compared element by element; == on two arrays is always false
assert.deepStrictEqual(filtered['price'], [3.12, 154.12]);
// It can slice by indices
const sliced = df.slice(1, 3);
assert(sliced.length == 2);
assert.deepStrictEqual(sliced['price'], [3.12, 154.12]);
// You can also slice by a timestamp range on a date column
const tsliced_df = df.col_slice('TS_COLUMN', new Date('2018-01-01'), new Date('2018-01-03'));
// Sorting
const sorted = df.sort('price');
assert.deepStrictEqual(sorted['price'], [2.12, 3.12, 154.12]);
// You can specify an order
const sortedDesc = df.sort('price', 'desc'); // 'descending'
assert.deepStrictEqual(sortedDesc['price'], [154.12, 3.12, 2.12]);
// Get the contents of a row as a Jataframe
const row = df.row(42);
// Get the contents of a row as JSON
const same_row = df.iloc(42);
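To make the "columns are raw arrays" design concrete, here is a rough sketch of the idea in plain JavaScript. MiniFrame and its col method are hypothetical stand-ins invented for this illustration; this is not Jataframe's source, only a guess at the shape of the design described above.

```javascript
// Hypothetical stand-in, NOT Jataframe's source: it only illustrates the
// "columns are plain arrays, aggregations live on the frame" design.
class MiniFrame {
  constructor(rows) {
    this.rows = rows;
  }

  // A column is just a plain array of values, with no methods of its own.
  col(name) {
    return this.rows.map((row) => row[name]);
  }

  // Aggregations take the column name, mirroring df.sum('price').
  sum(name) {
    return this.col(name).reduce((acc, value) => acc + value, 0);
  }

  mean(name) {
    return this.sum(name) / this.rows.length;
  }

  max(name) {
    return Math.max(...this.col(name));
  }

  min(name) {
    return Math.min(...this.col(name));
  }
}

const rows = [
  { price: 2.12, name: 'apple' },
  { price: 3.12, name: 'banana' },
  { price: 154.12, name: 'eggs' },
];
const mini = new MiniFrame(rows);

console.log(Array.isArray(mini.col('price'))); // true
console.log(mini.max('price')); // 154.12
console.log(mini.mean('price').toFixed(2)); // 53.12
```

Because a column is only an array, there is nowhere to hang a .sum() method on it, which is why the aggregation has to live on the frame and take the column name.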
Manipulation
const df = new Jataframe(data);
// You can add a column
df['new_column'] = [1, 2, 3];
assert.deepStrictEqual(df['new_column'], [1, 2, 3]);
// fillna will fill undefined values in a column with the given value
df.fillna('sketchy_col', 0);
If more than one row is returned from a Jataframe function, it is returned as a Jataframe, which makes chaining easy.
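As a rough illustration of what "fill undefined with a value" means, the sketch below does the same thing on plain row objects. fillUndefined is a hypothetical helper written for this example, not Jataframe's implementation; whether the real fillna mutates in place or also treats null or NaN as missing is not stated above, so the sketch assumes a non-mutating copy that only handles undefined.

```javascript
// Hypothetical helper, not Jataframe's source: returns a copy of `rows`
// where `column` is set to `value` wherever it is undefined.
function fillUndefined(rows, column, value) {
  return rows.map((row) =>
    row[column] === undefined ? { ...row, [column]: value } : row
  );
}

const rows = [
  { name: 'apple', sketchy_col: 5 },
  { name: 'banana' },                       // sketchy_col is missing
  { name: 'eggs', sketchy_col: undefined }, // explicitly undefined
];

const filled = fillUndefined(rows, 'sketchy_col', 0);
console.log(filled.map((row) => row.sketchy_col)); // [ 5, 0, 0 ]
```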
GroupBy
const data = [
{group: 'A', name: 'Babe Lincoln'},
{group: 'A', name: 'Franklin Brosevelt'},
{group: 'B', name: 'Beninjamin Franklins'},
];
const df = new Jataframe(data);
const groups = df.groupBy('group');
groups is now an object whose keys are the groups and whose values are Jataframes of the rows in that group.
// A and B are Jataframes
assert(groups.A.length == 2);
assert(groups.B.length == 1);
assert.deepStrictEqual(groups.A.unique('name'), ['Babe Lincoln', 'Franklin Brosevelt']);
assert.deepStrictEqual(groups.B.unique('name'), ['Beninjamin Franklins']);
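The grouping step itself can be sketched in a few lines of plain JavaScript. groupRows is a hypothetical helper for illustration only; unlike groupBy above, it returns plain arrays of rows rather than Jataframes, and it makes no claim about how Jataframe does this internally.

```javascript
// Hypothetical helper, not Jataframe's source: groups plain row objects by
// the value of one column and returns { groupValue: [rows...] }.
function groupRows(rows, column) {
  const groups = {};
  for (const row of rows) {
    const key = row[column];
    // Own-property check so keys like 'constructor' are handled safely.
    if (!Object.prototype.hasOwnProperty.call(groups, key)) {
      groups[key] = [];
    }
    groups[key].push(row);
  }
  return groups;
}

const people = [
  { group: 'A', name: 'Babe Lincoln' },
  { group: 'A', name: 'Franklin Brosevelt' },
  { group: 'B', name: 'Beninjamin Franklins' },
];

const grouped = groupRows(people, 'group');
console.log(Object.keys(grouped)); // [ 'A', 'B' ]
console.log(grouped.A.length);     // 2
console.log(grouped.B.length);     // 1
```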
AggregateBy
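aggregateBy reduces the rows to one per distinct value of the grouping column and aggregates the columns you supply. Each key of the second argument names an output column and maps a source column to the function that aggregates it.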
const data = [
{group: 'A', name: 'Babe Lincoln', price: 2.12},
{group: 'A', name: 'Franklin Brosevelt', price: 3.12},
{group: 'B', name: 'Beninjamin Franklins', price: 154.12},
];
const df = new Jataframe(data);
const groups = df.aggregateBy('group', {
'price_ttl': {'price': Jataframe.sum},
'price_avg': {'price': Jataframe.mean},
});
// Now it contains just two rows, one for group A, and one for group B
expect(groups.length).toBe(2);
expect(groups['price_ttl']).toEqual([5.24, 154.12]);
expect(groups['price_avg']).toEqual([2.62, 154.12]);
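For readers who want to see the mechanics, here is a minimal sketch of the same group-then-aggregate idea on plain row objects. aggregateRows, sum, and mean are local, hypothetical stand-ins (not Jataframe.sum or Jataframe.mean), and the sketch assumes each aggregation function receives the array of column values for one group.

```javascript
// Local stand-ins for the aggregation functions (assumed to take an array).
const sum = (values) => values.reduce((acc, v) => acc + v, 0);
const mean = (values) => sum(values) / values.length;

// Hypothetical helper, not Jataframe's source. `spec` has the same shape as
// in the example above: { outputColumn: { sourceColumn: aggregationFn } }.
function aggregateRows(rows, groupColumn, spec) {
  // Bucket the rows by group value, preserving first-seen order.
  const buckets = new Map();
  for (const row of rows) {
    const key = row[groupColumn];
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(row);
  }

  // Emit one output row per group.
  const result = [];
  for (const [key, members] of buckets) {
    const out = { [groupColumn]: key };
    for (const [outName, mapping] of Object.entries(spec)) {
      for (const [sourceColumn, aggFn] of Object.entries(mapping)) {
        out[outName] = aggFn(members.map((row) => row[sourceColumn]));
      }
    }
    result.push(out);
  }
  return result;
}

const fruit = [
  { group: 'A', name: 'Babe Lincoln', price: 2.12 },
  { group: 'A', name: 'Franklin Brosevelt', price: 3.12 },
  { group: 'B', name: 'Beninjamin Franklins', price: 154.12 },
];

const aggregated = aggregateRows(fruit, 'group', {
  price_ttl: { price: sum },
  price_avg: { price: mean },
});
console.log(aggregated.length); // 2
console.log(aggregated.map((row) => row.price_ttl.toFixed(2))); // [ '5.24', '154.12' ]
```

The totals are printed with toFixed because sums of floating-point prices are not guaranteed to compare exactly equal to a literal such as 5.24.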
AggregateBy include_full_rows
The optional include_full_rows parameter passes each group's full rows to the aggregation function instead of only the values of the named column. This makes it possible to collect several values per group, such as an array of items, rather than a single data point.
const closersGrouped = closers.aggregateBy('date', {
'win_amount': {
'pnl': (data) => {
const all_rows = data.filter(row => row.pnl > 0).data;
return all_rows.reduce((acc, row) => acc + row.pnl, 0);
}
},
'win_symbols': {
'symbol': (data) => {
return data.filter(row => row.pnl > 0).data.map(row => row.symbol);
}
},
}, true);
closersGrouped['2023-01-01']['win_symbols'] // ['AAPL', 'GOOG']
closersGrouped['2023-01-01']['win_amount'] // 212.21
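The difference between aggregating a column and aggregating full rows can be sketched without the library. Everything here is hypothetical: aggregateFullRows is an invented helper, the spec is simplified to { outputName: fn }, each function receives a plain array of rows rather than a Jataframe, and the trades are made-up illustration data, not the closers dataset used above.

```javascript
// Hypothetical helper, not Jataframe's source: each aggregation function
// receives the group's full row objects rather than one column's values.
function aggregateFullRows(rows, groupColumn, spec) {
  const buckets = new Map();
  for (const row of rows) {
    const key = row[groupColumn];
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(row);
  }

  const result = {};
  for (const [key, members] of buckets) {
    result[key] = {};
    for (const [outName, aggFn] of Object.entries(spec)) {
      result[key][outName] = aggFn(members);
    }
  }
  return result;
}

// Made-up trades, for illustration only.
const trades = [
  { date: '2023-01-01', symbol: 'AAPL', pnl: 100 },
  { date: '2023-01-01', symbol: 'MSFT', pnl: -30 },
  { date: '2023-01-01', symbol: 'GOOG', pnl: 50 },
  { date: '2023-01-02', symbol: 'TSLA', pnl: -10 },
];

const byDate = aggregateFullRows(trades, 'date', {
  // Needs the pnl of every row to decide which ones count as wins.
  win_amount: (members) =>
    members.filter((row) => row.pnl > 0).reduce((acc, row) => acc + row.pnl, 0),
  // Needs two fields per row (pnl and symbol), so one column is not enough.
  win_symbols: (members) =>
    members.filter((row) => row.pnl > 0).map((row) => row.symbol),
});

console.log(byDate['2023-01-01'].win_amount);  // 150
console.log(byDate['2023-01-01'].win_symbols); // [ 'AAPL', 'GOOG' ]
```

The win_symbols aggregation is the case that column-only aggregation cannot express, since it filters on one field and collects another.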