stopword
A module for Node.js and the browser that takes in text and returns it stripped of stopwords. It has predefined stopword lists for 62 languages and also accepts lists of custom stopwords as input.
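As a rough illustration of what stopword stripping means, here is a minimal self-contained sketch. It does not use the stopword package or reproduce its API: the helper name `stripStopwords` and the tiny `SAMPLE_STOPWORDS` list are hypothetical and exist only for this example. Real stopword lists are much longer, and this sketch does not handle punctuation attached to words.

```typescript
// Hypothetical helper for illustration only; NOT the stopword package's API.
// A tiny sample list; real per-language lists contain many more entries.
const SAMPLE_STOPWORDS: ReadonlySet<string> = new Set([
  "a", "an", "and", "the", "of", "is", "to", "in",
]);

// Split on whitespace, drop any word found in the stopword set
// (compared case-insensitively), and join the rest with single spaces.
function stripStopwords(
  text: string,
  stopwords: ReadonlySet<string> = SAMPLE_STOPWORDS,
): string {
  return text
    .split(/\s+/)
    .filter((word) => word.length > 0)
    .filter((word) => !stopwords.has(word.toLowerCase()))
    .join(" ");
}

// Default sample list.
console.log(stripStopwords("The quick fox is in the garden"));
// Custom list passed by the caller.
console.log(stripStopwords("le chat et le chien", new Set(["le", "et"])));
```

Passing the stopword set as a parameter mirrors the idea of supplying a custom list instead of a predefined one.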
Found 80 results for document-processing
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilities.
🚀 MARIA v4.3.46 - Enterprise AI Development Platform with identity system and character voice implementation. Features 74 production-ready commands with comprehensive fallback implementation, local LLM support, and zero external dependencies. Includes na
Promptbook: Run AI apps in plain human language across multiple models and platforms
A developer-friendly transformation engine for programmatic document manipulation
JavaScript SDK for Sensible, the developer-first platform for extracting structured data from documents so that you can build document-automation features into your SaaS products
MCP server for PageIndex
Official Node.js client for Docstrange API - Extract data from PDFs, images, and documents in multiple formats
n8n nodes for processing PDF and Excel files
UniCraft n8n custom nodes: a unified AI model router with multi-modal support by CloudCraft Labs for OpenAI, Anthropic, Google Gemini, and more
Solar LLM and Embeddings nodes for n8n
n8n community nodes for Docutray OCR, document identification, and knowledge base search services
Production-ready MCP server for document ingestion and knowledge management with vector search. Supports PDF, DOCX, TXT, MD, CSV, JSON, HTML with ChromaDB and multiple embedding providers.
MCP server for Ignidor IDP B2B API integration - enables Claude to process documents through Ignidor's enterprise document processing pipeline
A complete Retrieval-Augmented Generation system using pgvector, LangChain, and LangGraph for Node.js applications with buffer, URL processing, and advanced filtering support - fully configurable without environment variables
AI-powered PDF accessibility automation for n8n: comprehensive WCAG compliance analysis, intelligent remediation, and professional audit reporting with 5 integrated accessibility tools
MCP server for document-to-Markdown conversion using Mistral AI OCR
TypeScript SDK for document processing with zero-friction framework adapters. Features intelligent coordinate handling, semantic regions, React UI overlays, and automatic route detection for Remix and Next.js. Transform raw bounding boxes into interactive
n8n node for Mistral OCR API integration with structured annotations
Powerful PDF data extraction library powered by AI vision models. Transform PDFs into structured, validated data using TypeScript, Zod, and AI providers like Scaleway and Ollama.
A REST wrapper for SAP AI Core Vector API with document grounding capabilities
MCP Document Converter Server — A Model Context Protocol server for seamless document format conversion and processing
An intelligent text chunking library that respects document structure and semantic boundaries
AI-powered Mongoose plugin for intelligent document processing with auto-summarization, semantic search, MongoDB Vector Search, and function calling
n8n community node with intelligent batched chain summarization for processing large documents efficiently
n8n nodes for Unstract services including LLMWhisperer and Unstract API
Node.js client for Chunkr API
Universal queue abstraction library supporting RabbitMQ, AWS SQS, Azure Service Bus, and GCP Pub/Sub with a single unified interface
Modern JavaScript-first RAG framework with contextual embeddings, professional CLI, and one-command deployment
Blazing-fast and lightweight PaddleOCR library for Node.js and Bun. Perform accurate text detection, recognition, and image deskew with a simple, modern, and type-safe API. Ideal for document processing, data extraction, and computer vision tasks.
Local-first TypeScript retrieval engine for semantic search over static documents
TypeScript SDK for Saral Structura, providing Zod schemas and validation for document processing outputs.
**context1000** is a documentation format for software systems, designed for integration with artificial intelligence tools. The key artifacts are ADRs and RFCs, enriched with formalized links between documents.
Modern TypeScript library for converting Office documents (DOCX) to Markdown format, optimized for Bun runtime with enhanced table support and math equation conversion.
Instafill AI Node.js library for automating PDF form filling using AI-powered technology.
Passport OCR API client for extracting passport data from images and PDF files using OCR technology.
High-performance PDF manipulation library with native processing capabilities. Supports encryption, decryption, merging, splitting, watermarking, optimization, and comprehensive PDF operations with both file and buffer support.
Document processing tools for majk chat - PDF, Excel, Word, PowerPoint parsing and analysis
An SDK providing helpers to create Lakechain middlewares in TypeScript.
TypeScript MCP server for DocRouter API
n8n node to extract text, images and tables from PDF with multilingual support, language detection and comprehensive test suite
Document processing application with CLI and API interfaces
MCP server for Upstage AI document processing - Node.js implementation
Advanced n8n node for Puter.js AI with RAG agentic capabilities, document processing, audio transcription, Supabase integration, and cost-optimized model priorities
A Node.js package to interact with the Peslac API for document processing.
Universal document-to-markdown and section splitter for HTML, URLs, and PDFs.
A Node.js package for processing ODG (OpenDocument Graphics) files using LibreOffice API
n8n node package for DOCX document manipulation and processing
A flexible and customizable React chat component that supports context-aware conversations and document processing
Node.js SDK for the Nanonets API: OCR, document extraction, and workflow automation.
An SDK for intelligent document processing using state-of-the-art AI models.
A CLI tool to extract text from a static Next.js export and generate llm.txt for LLM ingestion.
A comprehensive React TypeScript component library for viewing and interacting with PDF files using Mozilla PDF.js. Features include text selection, highlighting, search, sidebar, multiple view modes, and complete PDF.js web viewer functionality.
Universal MCP Server for Multi-Rendering PDF Quality Assurance System with AI-powered optimization
Pure JavaScript MCP server for Unstructured.io - No Python required!
A simple and tiny traversal library for MarkDoc AST
Enhanced n8n community node for DOCX to text conversion with RAG capabilities, page-aware chunking, and metadata extraction. Fork of n8n-nodes-docx-converter with advanced features for AI/ML workflows.
Hierarchical markdown chunking for RAG systems with AI-powered context summarization
JavaScript SDK for the Koncile Intelligent Document Processing API
PDF scraping library for Chilean tax documents. Extract emitter name, economic activities, and address from structured PDF documents like 'CARPETA TRIBUTARIA ELECTRÓNICA PARA SOLICITAR CRÉDITOS'
A comprehensive Model Context Protocol (MCP) server for document processing, PDF manipulation, format conversion, and text extraction with robust error handling. Now includes advanced features like document conversion, image processing, PDF comparison, se
Advanced n8n node for Agentic RAG with Supabase pgvector - handles structured/unstructured documents with AI-powered query refinement
A reliable Model Context Protocol server for PDF text extraction using pdftotext from poppler-utils
Node-RED custom nodes for LCP
📟 The official CLI for Project Lakechain.
MCP server for batch processing of step-by-step reviews of documents such as feasibility study reports
JavaScript/TypeScript SDK for Hashub Document Processing API
A modular document processing system for converting HTML to Markdown
AI-powered invoice to JSON converter using Mistral AI with dynamic field detection and master schema management
Minimal recursive text chunking functionality extracted from @mastra/rag for edge deployments
Semantic document layer for AI-to-Office pipeline. Transform markdown into professional PDF/DOCX with preserved document structure.
TypeScript SDK for DocRouter API
Extract text from images using docTR OCR in n8n workflows