# llm-pulse
Zero-config CLI that tells you what LLMs your PC can run. Scans hardware, finds runtimes, recommends models.
```sh
npx llm-pulse
```

## Install

```sh
# Run directly (no install)
npx llm-pulse

# Or install globally
npm install -g llm-pulse
```

Requires Node.js 18+.
## Commands
### `llm-pulse` / `llm-pulse scan`
Hardware scan + model recommendations.
```sh
llm-pulse                            # Full scan (default)
llm-pulse --format json              # JSON output
llm-pulse --category coding --top 3  # Top 3 coding models
```

| Flag | Description | Default |
|---|---|---|
| `-f, --format` | `table` or `json` | `table` |
| `-c, --category` | `general`, `coding`, `reasoning`, `creative`, `multilingual` | all |
| `-t, --top <n>` | Number of recommendations | `5` |
| `-v, --verbose` | Detailed output | `false` |
### `llm-pulse doctor`
System health check — scores your setup and gives suggestions.
```sh
llm-pulse doctor
llm-pulse doctor --format json
```

### `llm-pulse models`
Browse the model database filtered for your hardware.
```sh
llm-pulse models                    # All 45+ models
llm-pulse models --search llama     # Search by name
llm-pulse models --category coding  # Filter by category
llm-pulse models --fits             # Only models that fit your VRAM
```

### `llm-pulse monitor`
Live TUI dashboard — like htop for LLMs. Press Tab to switch views, q to quit.
- Overview — CPU/GPU/RAM/VRAM bars with sparklines + smart alerts
- Inference — Throughput chart + session stats
- GPU — Per-GPU utilization, temperature, VRAM, and power sparklines with peak stats + temperature alerts
- VRAM Map — Visual VRAM breakdown (model weights / KV cache / overhead / free)
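For intuition, the gauges in a view like this can be rendered with very little code. Here is an illustrative sketch of a usage bar and a sparkline (not llm-pulse's actual implementation):

```typescript
// Render a fixed-width usage bar, e.g. "[████████████░░░░░░░░] 62%".
function usageBar(percent: number, width = 20): string {
  const filled = Math.round((percent / 100) * width);
  return "[" + "█".repeat(filled) + "░".repeat(width - filled) + `] ${percent.toFixed(0)}%`;
}

// Map a series of samples onto eight block glyphs, scaled to the max sample.
function sparkline(samples: number[]): string {
  const glyphs = "▁▂▃▄▅▆▇█";
  const max = Math.max(...samples, 1);
  return samples
    .map((s) => glyphs[Math.min(glyphs.length - 1, Math.floor((s / max) * (glyphs.length - 1)))])
    .join("");
}

console.log(usageBar(62));                        // e.g. current GPU utilization
console.log(sparkline([10, 35, 60, 80, 55, 90])); // e.g. recent VRAM samples
```

A real TUI redraws these on an interval and overlays alerts; the string-building idea is the same.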
```sh
llm-pulse monitor
```

### `llm-pulse benchmark`
Quick inference benchmark via Ollama.
```sh
llm-pulse benchmark               # Auto-picks smallest model
llm-pulse benchmark --model phi3  # Specific model
llm-pulse benchmark --rounds 5    # 5 rounds (default: 3)
```

## Programmatic API
```ts
import { detectHardware, getRecommendations } from "llm-pulse";

const hardware = await detectHardware();
const recs = getRecommendations(hardware, { category: "coding", top: 3 });

console.log(recs[0].score.model.name); // "Qwen 2.5 Coder 14B"
console.log(recs[0].score.fitLevel);   // "comfortable"
console.log(recs[0].pullCommand);      // "ollama pull qwen2.5-coder:14b"
```

## MCP Server
Use llm-pulse as an MCP tool from Claude Code, Cursor, or any MCP-compatible AI assistant. The assistant can scan your hardware, check model compatibility, and snapshot live GPU/VRAM state — all without leaving the chat.
Add to your Claude Code config (~/.claude.json or your project's .mcp.json):
```json
{
  "mcpServers": {
    "llm-pulse": {
      "command": "npx",
      "args": ["-y", "llm-pulse-mcp"]
    }
  }
}
```

Exposed tools:
| Tool | What it does |
|---|---|
| `scan` | Full hardware scan + ranked model recommendations |
| `check` | "Can I run this model?" verdict (yes/maybe/no) with best quantization + speed estimate |
| `recommend` | Ranked model list for your hardware, filterable by category |
| `doctor` | System health score with actionable suggestions |
| `models` | Browse / search the model database, optionally filtered to models that fit |
| `monitor` | One-shot live snapshot — CPU/GPU%, VRAM, temp, power, active Ollama model + tok/s |
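Under the hood, the assistant invokes these tools through MCP's standard JSON-RPC `tools/call` method. A `scan` request looks roughly like this (a sketch of the wire format; the exact argument shape accepted by each tool is an assumption):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "scan",
    "arguments": {}
  }
}
```

You normally never write this by hand; the MCP client (Claude Code, Cursor, etc.) handles the protocol once the server is registered in the config above.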
## Supported
- **Hardware:** NVIDIA GPUs (full CUDA/VRAM), AMD, Intel, Apple Silicon, any CPU (AVX2/NEON), DDR4/DDR5, NVMe/SSD/HDD
- **Runtimes:** Ollama, llama.cpp, LM Studio
- **Models:** 45+ models across general, coding, reasoning, creative, multilingual — each with Q4/Q5/Q8/F16 quantization variants
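For a feel of what those quantization variants mean for memory, here is a rough weights-only estimate using common bytes-per-weight figures. This is a heuristic sketch, not llm-pulse's actual sizing logic:

```typescript
// Approximate bytes stored per model weight at each quantization level.
const BYTES_PER_WEIGHT: Record<string, number> = {
  F16: 2.0,   // 16-bit floats
  Q8: 1.0,    // ~8 bits per weight
  Q5: 0.625,  // ~5 bits per weight
  Q4: 0.5,    // ~4 bits per weight
};

// Weights-only VRAM estimate in GiB; KV cache and runtime overhead add more.
function estimateGiB(paramsBillions: number, quant: keyof typeof BYTES_PER_WEIGHT): number {
  const bytes = paramsBillions * 1e9 * BYTES_PER_WEIGHT[quant];
  return bytes / 1024 ** 3;
}

// A 7B model at Q4 needs roughly 3.3 GiB for weights alone.
console.log(estimateGiB(7, "Q4").toFixed(1));
```

This is why a 7B model that won't fit at F16 (~13 GiB) can still run comfortably on an 8 GB GPU at Q4.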
## License
MIT