Fuzzball.js
Easy to use and powerful fuzzy string matching.
This is a JavaScript port of the fuzzywuzzy Python library. It uses fast-levenshtein for distance calculations, with a slight modification to match the behavior of python-Levenshtein: substitutions are weighted 2 instead of 1 in ratio calculations. Specify options.subcost to override this weight.
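To make the substitution weighting concrete, here is a minimal sketch of a Levenshtein distance with a configurable substitution cost, and a ratio derived from it. The names weightedDistance and ratioSketch are hypothetical, and the ratio formula (length sum minus distance, divided by length sum) is an assumption based on how python-Levenshtein defines its ratio. It is not fuzzball's actual implementation, and it skips pre-processing.

```javascript
// Hypothetical helper: edit distance where insert/delete cost 1
// and a substitution costs `subcost` (assumed default: 2).
function weightedDistance(a, b, subcost = 2) {
  // prev[j] = cost of turning the first i-1 chars of a into the first j chars of b
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const sub = prev[j - 1] + (a[i - 1] === b[j - 1] ? 0 : subcost);
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, sub);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: similarity from 0 to 100, assuming
// ratio = (len(a) + len(b) - distance) / (len(a) + len(b)).
function ratioSketch(a, b, subcost = 2) {
  const lensum = a.length + b.length;
  if (lensum === 0) return 0; // arbitrary choice for two empty strings
  return Math.round((100 * (lensum - weightedDistance(a, b, subcost))) / lensum);
}

console.log(ratioSketch("this is a test", "this is a test!")); // 97
console.log(ratioSketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")); // 91
console.log(weightedDistance("fuzzy was a bear", "fozzy was a bear", 1)); // 1
```

With a substitution cost of 2, replacing a character costs the same as deleting it and inserting another, which is why two swapped letters in the second example cost 4 in total.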
Try it out on runkit!
Requirements
- jsdifflib
- heap.js
Installation
Using NPM
npm install fuzzball
Usage
var fuzz = require('fuzzball');
fuzz.ratio("this is a test", "this is a test");
100
Browser
<script src="fuzzball_browser.js"></script>
<script>
var fuzz = require('fuzzball');
alert(fuzz.ratio("hello world", "hiyyo wyrld"));
</script>
Simple Ratio
fuzz.ratio("this is a test", "this is a test!"); // "!" stripped in pre-processing by default
100
Partial Ratio
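As a rough picture of what a partial ratio measures, the sketch below slides the shorter string across the longer one and keeps the best score of any same-length window. This brute-force approach is a simplification chosen for readability; the library may locate candidate substrings differently, so scores can differ from fuzz.partial_ratio. The names editCost, windowRatio and partialRatioSketch are hypothetical, and inputs are assumed to be already pre-processed.

```javascript
// Hypothetical helper: edit distance with substitution cost 2 (insert/delete cost 1).
function editCost(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const sub = prev[j - 1] + (a[i - 1] === b[j - 1] ? 0 : 2);
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, sub);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: 0-100 similarity of two strings (assumed ratio formula).
function windowRatio(a, b) {
  const lensum = a.length + b.length;
  return lensum === 0 ? 0 : Math.round((100 * (lensum - editCost(a, b))) / lensum);
}

// Best score of the shorter string against every same-length slice of the longer one.
function partialRatioSketch(a, b) {
  const [shorter, longer] = a.length <= b.length ? [a, b] : [b, a];
  if (shorter.length === 0) return 0;
  let best = 0;
  for (let i = 0; i + shorter.length <= longer.length; i++) {
    const slice = longer.slice(i, i + shorter.length);
    best = Math.max(best, windowRatio(shorter, slice));
  }
  return best;
}

console.log(partialRatioSketch("this is a test", "this is a test again")); // 100
```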
fuzz.partial_ratio("this is a test", "this is a test!");
100
fuzz.partial_ratio("this is a test", "this is a test again!"); // still 100: a substring of the second string matches the first exactly
100
Token Sort Ratio
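The idea behind token sorting can be sketched as a normalization step: split each string into words, sort them, and join them back before comparing. The helper name sortTokens is hypothetical, and splitting on whitespace after lowercasing is an assumption about tokenization rather than the library's exact behavior.

```javascript
// Hypothetical helper: put the words of a string into a canonical order.
function sortTokens(s) {
  return s
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean) // drop empty tokens from leading/trailing spaces
    .sort()
    .join(" ");
}

const left = sortTokens("fuzzy wuzzy was a bear");
const right = sortTokens("wuzzy fuzzy was a bear");

console.log(left); // "a bear fuzzy was wuzzy"
console.log(left === right); // true
```

Once both strings reduce to the same sorted form, an ordinary ratio on the normalized strings no longer penalizes word order.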
fuzz.ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear");
91
fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear");
100
Token Set Ratio
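Token set scoring can be pictured as comparing sets of words rather than sequences, so repeated words stop mattering. The sketch below follows the approach described for fuzzywuzzy: build the sorted intersection of the two token sets, then append each string's leftover tokens to it. A scorer would then compare these strings pairwise and keep the best score. The name tokenSetParts and its return shape are hypothetical, and the tokenization is an assumption.

```javascript
// Hypothetical helper: build the strings that a token-set comparison would score.
function tokenSetParts(a, b) {
  const tokens = (s) => new Set(s.toLowerCase().split(/\s+/).filter(Boolean));
  const setA = tokens(a);
  const setB = tokens(b);
  const sorted = (arr) => arr.sort().join(" ");

  const sect = sorted([...setA].filter((t) => setB.has(t))); // shared words
  const onlyA = sorted([...setA].filter((t) => !setB.has(t))); // words only in a
  const onlyB = sorted([...setB].filter((t) => !setA.has(t))); // words only in b

  return {
    sect,
    combinedA: (sect + " " + onlyA).trim(),
    combinedB: (sect + " " + onlyB).trim(),
  };
}

const parts = tokenSetParts("fuzzy was a bear", "fuzzy fuzzy was a bear");
console.log(parts.sect); // "a bear fuzzy was"
console.log(parts.combinedA === parts.combinedB); // true, the duplicate "fuzzy" is gone
```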
fuzz.token_sort_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear");
84
fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear");
100
Distance (Levenshtein distance without any ratio calculations)
fuzz.distance("fuzzy was a bear", "fozzy was a bear");
1
Other Scoring Options
- partial_token_set_ratio
- partial_token_sort_ratio
- WRatio (weighted ratio: chooses which tests to run and how to weight them based on the relative lengths of the strings, then returns the top score)
A blog post with an overview of the scoring algorithms can be found here.
Pre-Processing
// eh, don't need to clean it up..
var options = {full_process: false}; //non-alphanumeric will not be converted to whitespace if false, default true
fuzz.ratio("this is a test", "this is a test!", options);
97
Pre-processing runs by default unless options.full_process is set to false, but it can also be run separately (so when searching the same list repeatedly, it only needs to run once).
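The run-once idea can be sketched as follows: clean the list of choices a single time, keep the cleaned copy, and reuse it for every query. The function fullProcessSketch is a hypothetical stand-in that assumes pre-processing means replacing non-alphanumeric characters with spaces, lowercasing, and trimming; the library's own full_process may treat some characters differently.

```javascript
// Hypothetical stand-in for pre-processing (an assumption, not the library's code).
function fullProcessSketch(s) {
  return s.replace(/[^a-z0-9]/gi, " ").toLowerCase().trim();
}

// Clean the list once, up front.
const rawChoices = ["Polar Bear!", "brown-bear", "KOALA bear"];
const cleanedChoices = rawChoices.map(fullProcessSketch);

console.log(cleanedChoices); // [ 'polar bear', 'brown bear', 'koala bear' ]
console.log(fullProcessSketch("myt^eXt!")); // "myt ext"
```

Later comparisons against the cleaned list can then pass {full_process: false} so the work is not repeated on each call.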
fuzz.full_process("myt^eXt!");
myt ext
International (a.k.a. non-ascii)
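One way to picture a collator-based comparison is the standard Intl.Collator with base sensitivity, which treats letters that differ only by accent or case as equal. Whether fuzzball configures its collator with exactly these options is an assumption; the helper name sameBaseLetter and the "en" locale are chosen only for illustration.

```javascript
// Base sensitivity: accents and case are ignored, different base letters are not.
const collator = new Intl.Collator("en", { sensitivity: "base" });

// Hypothetical helper: do two characters count as the same letter?
function sameBaseLetter(x, y) {
  return collator.compare(x, y) === 0;
}

console.log(sameBaseLetter("ä", "a")); // true
console.log(sameBaseLetter("a", "A")); // true
console.log(sameBaseLetter("a", "b")); // false
```

Comparing characters this way is what would let "this is ä test" and "this is a test" count as identical.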
// currently full_process must be set to false if useCollator is true
// or non-roman alphanumeric will be removed (got a good locale-specific alphanumeric check in js?)
var options = {full_process: false, useCollator: true};
fuzz.ratio("this is ä test", "this is a test", options);
100
Extract (search a list of choices for top results)
Simple: array of strings
var query = "polar bear";
var choices = ["brown bear", "polar bear", "koala bear"];
results = fuzz.extract(query, choices);
[ [ 'polar bear', 100 ],
[ 'koala bear', 80 ],
[ 'brown bear', 60 ] ]
Less simple: array of objects with options
The processor function takes a choice and returns the string that will be used for scoring. The default scorer is ratio.
var query = "126abzx";
var choices = [{id: 345, modelnumber: "123abc"},{id: 346, modelnumber: "123efg"},{id: 347, modelnumber: "456abdzx"}];
var options = {
    scorer: fuzz.partial_ratio,
    processor: function(choice) {return choice['modelnumber']},
    limit: 2, // max number of results, default: no limit
    cutoff: 50 // lowest score to return, default: 0
};
results = fuzz.extract(query, choices, options);
[ [ { id: 347, modelnumber: '456abdzx' }, 71 ],
[ { id: 345, modelnumber: '123abc' }, 67 ] ]
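The options above can be read as a small pipeline: map each choice through the processor, score it, drop results below the cutoff, sort by score, and apply the limit. The sketch below shows that flow with a deliberately simple stand-in scorer. The names extractSketch and prefixScorer are hypothetical, the inclusive cutoff comparison is an assumption, and none of this is the library's own code.

```javascript
// Hypothetical sketch of the extract flow: process, score, filter, sort, limit.
function extractSketch(query, choices, options = {}) {
  const scorer = options.scorer;
  const processor = options.processor || ((choice) => choice);
  const cutoff = options.cutoff || 0;

  const scored = choices
    .map((choice) => [choice, scorer(query, processor(choice))])
    .filter(([, score]) => score >= cutoff) // assumed inclusive
    .sort((x, y) => y[1] - x[1]); // highest score first

  return options.limit ? scored.slice(0, options.limit) : scored;
}

// Toy scorer for illustration only: shared prefix length as a percentage.
function prefixScorer(a, b) {
  let n = 0;
  while (n < a.length && n < b.length && a[n] === b[n]) n++;
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 0 : Math.round((100 * n) / longest);
}

const models = [
  { id: 345, modelnumber: "123abc" },
  { id: 346, modelnumber: "123efg" },
  { id: 347, modelnumber: "456abdzx" },
];

const top = extractSketch("123abd", models, {
  scorer: prefixScorer,
  processor: (choice) => choice.modelnumber,
  limit: 2,
  cutoff: 40,
});

console.log(top);
// [ [ { id: 345, modelnumber: '123abc' }, 83 ],
//   [ { id: 346, modelnumber: '123efg' }, 50 ] ]
```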