`npm i efrt`

`efrt` is a prefix/suffix trie optimised for compression of english words.
it is based on mckoss/lookups by Mike Koss and bits.js by Steve Hanov.
- squeeze a list of words into a very compact form
- reduce filesize/bandwidth a bunch
- ensure unpacking overhead is negligible
- keep word-lookups fast, since they're on the critical-path
By doing the fancy stuff ahead-of-time, efrt lets you ship much bigger word-lists to the client-side, without much hassle.
```js
var efrt = require('efrt')
var words = [
  'coolage',
  'cool',
  'cool cat',
  'cool.com',
  'coolamungo'
];

// pack these words as tightly as possible
var compressed = efrt.pack(words);
// cool0;! cat,.com,a0;ge,mungo

// create a lookup-trie
var trie = efrt.unpack(compressed);

// hit it!
console.log(trie.has('cool'));        // true
console.log(trie.has('miles davis')); // false
```
Demo!
the words you input should be pretty normalized. Spaces and unicode are good, but numbers, case-sensitivity, and some punctuation are not (yet) supported.
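for example, a small pre-normalization step (this helper is just an illustration, not part of efrt) keeps the input inside those constraints:

```js
var efrt = require('efrt')

// hypothetical helper: lowercase and strip characters the packer may not handle
var normalize = function (str) {
  return str
    .toLowerCase()                            // case-sensitivity isn't supported
    .replace(/[0-9]/g, '')                    // neither are numbers
    .replace(/[^a-z\u00C0-\u024F'. -]/g, '')  // keep letters, accents, spaces, and a little punctuation
    .trim()
}

var words = ['Cool.com', 'COOL CAT', 'coolage'].map(normalize)
var compressed = efrt.pack(words)
```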
## Performance
there are two modes that `efrt` can run in, depending on what you want to optimise for.

By itself, it will be ready instantly, but must look up words by their prefixes in the trie. This is not super-fast. If you want lookups to go faster, you can call `trie.cache()` first, to pre-compute the queries. Things will run much faster after this:
```js
var compressed = efrt.pack(skateboarders); // 1k words (on a macbook)
var trie = efrt.unpack(compressed)

trie.has('tony hawk')
// trie-lookup: 1.1ms

trie.cache()
// caching-step: 5.1ms

trie.has('tony hawk')
// cached-lookup: 0.02ms
```
the `trie.cache()` command will spin the trie into a good-old javascript object, for faster lookups. It takes some time to build, though.
In this example, with 1k words, it makes sense to hit `.cache()` if you are going to do more than 5 lookups on the trie (roughly, the 5.1ms caching step divided by the ~1ms saved per lookup gives a break-even of about 5), but your mileage may vary.
You can access the object from `trie.toObject()`, or `trie.toArray()`, if you'd like to use it directly.
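a quick sketch of pulling the data back out (the exact shape of what comes back is an implementation detail of efrt, so treat this as illustrative):

```js
var efrt = require('efrt')

var trie = efrt.unpack(efrt.pack(['larry', 'curly', 'moe']))
trie.cache()

var arr = trie.toArray()   // a plain array of the packed words
console.log(arr.length)    // 3

var obj = trie.toObject()  // a plain js object (assumed here to be keyed by word)
console.log('moe' in obj)  // true, assuming that word-keyed shape
```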
## Size
`efrt` will pack filesize down as much as possible, depending upon the redundancy of the prefixes/suffixes in the words, and the size of the list.
- list of countries - `1.5k -> 0.8k` (46% compressed)
- all adverbs in wordnet - `58k -> 24k` (58% compressed)
- all adjectives in wordnet - `265k -> 99k` (62% compressed)
- all nouns in wordnet - `1,775k -> 692k` (61% compressed)
but there are some things to consider:
- bigger files compress further (see 🎈 birthday problem)
- using efrt will reduce gains from gzip compression, which most webservers quietly use (the sketch after this list shows one way to measure this for your own word-list)
- english is more suffix-redundant than prefix-redundant, so non-english words may benefit from other styles
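to see what that trade-off looks like in practice, a rough comparison with node's built-in zlib might look like this (the word-list file name is just a placeholder):

```js
var zlib = require('zlib')
var efrt = require('efrt')

var words = require('./my-wordlist.json') // assumed: a json array of strings
var raw = JSON.stringify(words)
var packed = efrt.pack(words)

console.log('raw json :', Buffer.byteLength(raw), 'bytes,', zlib.gzipSync(raw).length, 'gzipped')
console.log('packed   :', Buffer.byteLength(packed), 'bytes,', zlib.gzipSync(packed).length, 'gzipped')
```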
## Use (IE9+)
```html
<script src="https://unpkg.com/efrt@latest/builds/efrt.min.js"></script>
<script>
  var smaller = efrt.pack(['larry', 'curly', 'moe'])
  var trie = efrt.unpack(smaller)
  console.log(trie.has('moe'))
</script>
```
if you're doing the second step in the client, you can load just the unpack-half of the library (~3k):
```html
<script src="https://unpkg.com/efrt@latest/builds/efrt-unpack.min.js"></script>
<script>
  var trie = unpack(compressedStuff);
  trie.has('miles davis');
</script>
```
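the matching build-step, run ahead-of-time in node, might look something like this (file names here are placeholders, not anything efrt requires):

```js
// build-time sketch: pack the word-list once, so the client only needs efrt-unpack
var fs = require('fs')
var efrt = require('efrt')

var words = fs.readFileSync('./words.txt', 'utf8').split('\n').filter(Boolean)
fs.writeFileSync('./words.efrt.txt', efrt.pack(words))
```

the client can then fetch `words.efrt.txt` and hand the string straight to `unpack()`.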
MIT