Flexible Obfuscation Dictionary For Sanitization Enforcement (FOD4SE)
A powerful text sanitization library with support for multiple languages, symbol normalization, and flexible configuration options.
Features
- Built-in dictionaries for multiple languages
- Symbol normalization (e.g., @ → a, $ → s)
- Partial or full word replacement
- Left-to-right or right-to-left replacement direction
- Customizable replacement characters
- Ignore list support
- Full or partial word matching
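To illustrate what symbol normalization means here, the following is a minimal sketch in plain JavaScript. It is not fod4se's implementation: normalizeSymbols and SYMBOL_MAP are hypothetical names, and only the @ → a and $ → s mappings come from this README (the 4 → a mapping is inferred from the "c4t" example in the Match Templates section).

```javascript
// Hypothetical symbol map. "@" -> "a" and "$" -> "s" come from the README;
// "4" -> "a" is an assumption inferred from the "c4t" example.
const SYMBOL_MAP = { "@": "a", "$": "s", "4": "a" };

// Replace each mapped symbol with its letter and leave every other
// character untouched. Each symbol maps to exactly one letter, so the
// normalized text has the same length as the input.
function normalizeSymbols(text, map = SYMBOL_MAP) {
  return Array.from(text, (ch) => map[ch] ?? ch).join("");
}

console.log(normalizeSymbols("c4t")); // "cat"
console.log(normalizeSymbols("cla$$ @ct")); // "class act"
```

Keeping the mapping one-to-one means a match found in the normalized text can be masked at the same position in the original text.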
Installation
npm install fod4se
Quick Start
import { LanguageFilter } from "fod4se";
// Create a filter with English dictionary
const filter = new LanguageFilter({ baseLanguage: "en" });
// Clean text
const cleaned = filter.getSafe("Your text here");
Alternative Usages
You can also use the getSafeText and analyzeText functions directly without creating a LanguageFilter instance.
Using getSafeText
import { getSafeText } from "fod4se";
const cleaned = getSafeText("Your text here", { baseLanguage: "en" });
console.log(cleaned); // Prints the sanitized text
Using analyzeText
import { analyzeText } from "fod4se";
const result = analyzeText("Text to analyze", { baseLanguage: "en" });
console.log(result.cleaned); // Sanitized text
console.log(result.profanity); // true if anything was found
console.log(result.matches); // Array of matches with details
Detailed Usage
Basic Usage with Built-in Dictionary
import { LanguageFilter } from "fod4se";
const filter = new LanguageFilter({
  baseLanguage: "en", // Use built-in English dictionary
});
filter.getSafe("Text to clean"); // Returns sanitized text
The base dictionaries are at an early stage of development and are still very incomplete. If a word you need is missing, see the Contributing section.
Custom Dictionary
import { LanguageFilter } from "fod4se";
const filter = new LanguageFilter({
  baseLanguage: "none",
  config: {
    profanity: ["word1", "word2"],
    ignore: ["goodword1", "goodword2"],
  },
});
Advanced Analysis
import { LanguageFilter } from "fod4se";
const filter = new LanguageFilter({ baseLanguage: "en" });
const result = filter.analyze("Text to analyze");
console.log(result.cleaned); // Sanitized text
console.log(result.profanity); // true if anything was found
console.log(result.matches); // Array of matches with details
Custom Configuration
import { LanguageFilter, regexTemplate } from "fod4se";
const filter = new LanguageFilter({
  baseLanguage: "en",
  config: {
    replaceString: "#@", // Pattern repeated to build the replacement, e.g. "What is this #@#@#"
    replaceRatio: 0.5, // Replace 50% of each matched word
    replaceDirection: "LTR", // Replace from left to right
    matchTemplate: regexTemplate.partialMatch, // Match partial words
    ignoreSymbols: true, // Disable symbol normalization
  },
});
Configuration Options
LanguageFilter Options
| Option | Type | Default | Description |
|---|---|---|---|
| baseLanguage | "none" \| "en" \| "pt-br" | - | Built-in dictionary to use |
| config | FSConfig | - | Configuration object |
FSConfig Options
| Option | Type | Default | Description |
|---|---|---|---|
| profanity | string[] | [] | Custom list of words to filter |
| ignore | string[] | [] | Words to exclude from filtering |
| replaceString | string | "*" | Character(s) used for replacement |
| replaceRatio | number | 1 | Portion of word to replace (0 to 1) |
| replaceDirection | "LTR" \| "RTL" | "RTL" | Direction of partial replacement |
| matchTemplate | string | regexTemplate.fullWord | Word matching pattern |
| ignoreSymbols | boolean | false | Disable symbol normalization |
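The way replaceString, replaceRatio, and replaceDirection might interact can be sketched in plain JavaScript. This is an illustrative sketch only, not fod4se's implementation: maskWord is a hypothetical helper, and both the rounding rule and the reading of "LTR" as "mask the leading characters" are assumptions. The defaults mirror the FSConfig table above.

```javascript
// Hypothetical helper (not part of fod4se): mask part of a matched word.
function maskWord(word, options = {}) {
  const {
    replaceString = "*",
    replaceRatio = 1,
    replaceDirection = "RTL",
  } = options;
  if (replaceString.length === 0) return word; // nothing to mask with

  // Clamp the ratio to [0, 1]; rounding up is an assumption.
  const ratio = Math.min(1, Math.max(0, replaceRatio));
  const count = Math.ceil(word.length * ratio);

  // Repeat the pattern and cut it to the number of masked characters.
  const mask = replaceString
    .repeat(Math.ceil(count / replaceString.length))
    .slice(0, count);

  // Assumption: "LTR" masks from the start of the word, "RTL" from the end.
  return replaceDirection === "LTR"
    ? mask + word.slice(count)
    : word.slice(0, word.length - count) + mask;
}

console.log(maskWord("hello")); // "*****"
console.log(maskWord("hello", { replaceString: "#@" })); // "#@#@#"
console.log(maskWord("hello", { replaceRatio: 0.5, replaceDirection: "LTR" })); // "***lo"
console.log(maskWord("hello", { replaceRatio: 0.5 })); // "he***"
```

Under these assumptions, a multi-character replaceString such as "#@" is tiled across the masked portion, which is consistent with the "#@#@#" example in the Custom Configuration section.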
Match Templates
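Judging from the custom "\\[{0}\\]" template used in this section, a match template appears to be a regular-expression pattern in which {0} marks where each dictionary word is substituted. The sketch below shows that idea in plain JavaScript. It is not fod4se's implementation: escapeRegExp, buildMatcher, and censor are hypothetical helpers, and the "\\b{0}\\b" and "{0}" patterns are stand-ins for full-word and partial matching, not the actual values of regexTemplate.fullWord and regexTemplate.partialMatch.

```javascript
// Escape regex metacharacters so a dictionary word is matched literally.
function escapeRegExp(word) {
  return word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: substitute the escaped word for every {0}
// placeholder and compile a global, case-insensitive regex.
function buildMatcher(template, word) {
  return new RegExp(template.split("{0}").join(escapeRegExp(word)), "gi");
}

// Replace every match with asterisks of the same length.
function censor(text, template, word) {
  return text.replace(buildMatcher(template, word), (m) => "*".repeat(m.length));
}

const sample = "cat category [cat]";
console.log(censor(sample, "\\b{0}\\b", "cat")); // "*** category [***]"
console.log(censor(sample, "{0}", "cat")); // "*** ***egory [***]"
console.log(censor(sample, "\\[{0}\\]", "cat")); // "cat category *****"
```

Escaping the word before substitution keeps dictionary entries that contain characters such as "." or "$" from being interpreted as regex syntax.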
import { getSafeText, regexTemplate } from "fod4se";
const text = "c4t category [cat]";
const profanity = ["cat"];
const templates = [
  // regexTemplate.fullWord matches "cat" but not "category"
  regexTemplate.fullWord,
  // regexTemplate.partialMatch matches both "cat" and "category"
  regexTemplate.partialMatch,
  // Custom template that matches only "[cat]"
  "\\[{0}\\]",
];
templates
  .map((matchTemplate) => getSafeText(text, profanity, { matchTemplate }))
  .forEach((result) => console.log(result));
/*
Outputs:
*** category [***] // full
*** ***egory [***] // partial
c4t category ***** // custom
*/
License
MIT License - see LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.