Package Exports
- wasm-dist-bhtsne
- wasm-dist-bhtsne/wasm_dist_bhtsne.js
This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (wasm-dist-bhtsne) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
Readme
wasm-dist-bhtsne
This is a WebAssembly (WASM) port of the modified Barnes-Hut t-SNE algorithm that works with pre-computed distance vector. The original Rust implementation can be found here.
Installation
Install the wasm-dist-bhtsne package from npm:
npm i wasm-dist-bhtsneExample
import { threads } from 'wasm-feature-detect';
import init, { initThreadPool, DistbhtSNEf64 } from 'wasm-dist-bhtsne';
// Final embedding dimension
const outputDim = 2;
// example data format
const exampleData = {
distances: [1.0, 2.0, 3.0, 4.0, ..., 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
indices: [0, 11, 23, 43, ..., 4, 5, 64, 7, 8, 9],
n_samples: 1000
};
(async {
const { memory } = await init();
if (await threads()) {
console.log("Browser supports threads");
initThreadPool(navigator.hardwareConcurrency);
}
else {
console.log("Browser does not support threads");
return;
}
// set hyperparameters
const opt = {
learning_rate: 150.0,
perplexity: 30.0,
theta: 0.5,
};
try {
const tsneEncoder = new DistbhtSNEf64(
exampleData.distances, // precomputed distance matrix
exampleData.indices, // precomputed indices matrix
exampleData.n_samples, // number of vectors
opt
);
// run the algorithm for 1000 iterations
for (let i = 0; i < 1000; i++) {
tsneEncoder.step(1);
}
// get the embedding
const embedding = tsneEncoder.get_embedding();
const result = new Float64Array(
memory.buffer,
embedding,
exampleData.n_samples * outputDim
);
}
catch (e) {
console.error(e);
}
})();Features
- Harnesses multi-threading capabilities through wasm-bindgen-rayon.
- Allows passing t-SNE hyperparameters through a JavaScript object, where you only need to include the parameters you want to change from the defaults. If you don't specify any, default values are used.
- Supports running the algorithm in iterations, enabling progressive refinement of the embedding.
- Supports both Float32Array and Float64Array for data input.
Requirements
To use the multithreading feature, you need to enable SharedArrayBuffer on the Web. As stated in the wasm-bindgen-rayon readme:
In order to use SharedArrayBuffer on the Web, you need to enable cross-origin isolation policies. Check out the linked article for details.
Hyperparameters
Here is a list of hyperparameters that can be set in the JavaScript object, along with their default values and descriptions:
learning_rate(default:200.0): controls the step size during the optimization.momentum(default:0.5): helps accelerate gradients vectors in the right directions, thus leading to faster converging.final_momentum(default:0.8): momentum value used after a certain number of iterations.momentum_switch_epoch(default:250): the epoch after which the algorithm switches to final_momentum for the map update.stop_lying_epoch(default:250): the epoch after which the P distribution values become true. For epochs < stop_lying_epoch, the values of the P distribution are multiplied by a factor equal to12.0.theta(default:0.5): Determines the accuracy of the approximation. Larger values increase the speed but decrease accuracy. Must be strictly greater than0.0.embedding_dim(default:2): the dimensionality of the embedding space.perplexity(default:20.0): the perplexity value. It determines the balance between local and global aspects of the data. A good value lies between5.0and50.0.
Acknowledgements
Here are the list of people whom I would like to express my gratitude for helping me with this project: