Found 13 results for "gguf"

node-llama-cpp
Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.
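As a sketch of the schema enforcement this entry describes, here is roughly how JSON-schema-constrained generation looks with node-llama-cpp's documented v3 API (the model path and schema are placeholder assumptions):

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"}); // placeholder path
const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// Compile a JSON schema into a grammar; sampling is then constrained
// so the model can only emit tokens that keep the output schema-valid.
const grammar = await llama.createGrammarForJsonSchema({
    type: "object",
    properties: {
        name: {type: "string"},
        age: {type: "number"}
    }
});

const answer = await session.prompt("Describe a person as JSON.", {grammar});
console.log(grammar.parse(answer)); // guaranteed to match the schema
```

Because the constraint is applied during sampling rather than by validating (and retrying) finished output, the result always parses against the schema.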
A GGUF parser that works on remotely hosted files
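This description matches the @huggingface/gguf package; assuming that is the package indexed here, its documented usage fetches only the header byte ranges of a remote file, so no full download is needed:

```typescript
import {gguf} from "@huggingface/gguf";

// Parse metadata and tensor layout from a remotely hosted GGUF file
// (example URL; only the header is fetched, not the model weights).
const {metadata, tensorInfos} = await gguf(
    "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf"
);

console.log(metadata["general.architecture"]); // e.g. "llama"
console.log(tensorInfos.length);               // number of tensors in the file
```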
A llama.cpp GGUF file parser for JavaScript
A native Capacitor plugin that embeds llama.cpp directly into mobile apps, enabling offline AI inference with a chat-first API design. Supports both simple text generation and advanced chat conversations with system prompts, multimodal processing, TTS, and LoRA.
Various utilities for maintaining Ollama compatibility with models on the Hugging Face Hub
Chat UI and local API for Llama models
Lightweight JavaScript package for running GGUF language models
A native Node.js plugin to run LLaMA inference directly on your machine with no other dependencies.
A browser-friendly library for running LLM inference using Wllama, with preset and dynamic model loading, caching, and download capabilities.
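For orientation, a rough sketch of what browser inference through the underlying Wllama engine looks like, based on the @wllama/wllama README; the asset paths and model URL are placeholders, and the preset loading, caching, and download features described above would be layered on top of calls like these:

```typescript
import {Wllama} from "@wllama/wllama";

// The constructor takes paths to wllama's WASM build artifacts
// (placeholder paths; in practice they come from the package's dist files).
const wllama = new Wllama({
    "single-thread/wllama.wasm": "/assets/single-thread/wllama.wasm",
    "multi-thread/wllama.wasm": "/assets/multi-thread/wllama.wasm",
});

// Download (and cache) a GGUF model, then run a completion entirely in the browser.
await wllama.loadModelFromUrl("https://example.com/model.gguf"); // placeholder URL
const output = await wllama.createCompletion("Why is the sky blue?", {
    nPredict: 64,
    sampling: {temp: 0.7, top_p: 0.9},
});
console.log(output);
```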
Universal LLM loader and inference router: model-agnostic, fast, and intelligent