Found 72 results for multimodal

promptbook

Promptbook: Run AI apps in plain human language across multiple models and platforms

@promptbook/javascript

Promptbook: Run AI apps in plain human language across multiple models and platforms

@promptbook/ollama

Promptbook: Run AI apps in plain human language across multiple models and platforms

@promptbook/wizard

Promptbook: Run AI apps in plain human language across multiple models and platforms

@promptbook/browser

Promptbook: Run AI apps in plain human language across multiple models and platforms

@promptbook/anthropic-claude

Promptbook: Run AI apps in plain human language across multiple models and platforms

@promptbook/cli

Promptbook: Run AI apps in plain human language across multiple models and platforms

@promptbook/deepseek

Promptbook: Run AI apps in plain human language across multiple models and platforms

@speechly/speech-recognition-polyfill

Polyfill for the Speech Recognition API using Speechly

@promptbook/website-crawler

Promptbook: Run AI apps in plain human language across multiple models and platforms

ptbk

Promptbook: Run AI apps in plain human language across multiple models and platforms

@promptbook/components

Promptbook: Run AI apps in plain human language across multiple models and platforms

contextual-agent-sdk

SDK for building AI agents with seamless voice-text context switching

empathai-core

Non-intrusive behavior-based emotion detection SDK (keyboard & mouse) — EmpathAI core.

icon-generator-mcp

Zero-dependency MCP server for AI-powered SVG icon generation with multimodal LLM support

modelmix

🧬 ModelMix - Unified API for Diverse AI LLM.

lucid-mcp-server

Model Context Protocol (MCP) server for Lucid App integration with multimodal AI analysis

botrun-pdf-multimodal

PDF multimodal conversion MCP tool for Claude Code and Gemini CLI

n8n-nodes-gemini-ai

n8n community node for Google Gemini AI integration with text generation, file upload & analysis, and TTS (Text-to-Speech) support

n8n-nodes-siliconflow

n8n community node for SiliconFlow AI models - chat completions, vision language models, embeddings, and reranking

channel3-sdk

The official TypeScript/JavaScript SDK for Channel3 AI Shopping API

@kortexa-ai/react-multimodal

A set of react components and hooks to help with multimodal input

A Node.js library harnessing the power of Bard's Large Language Model (LLM) for seamless chat experiences and streamlined accessibility to Google's Gemini. Empower your applications with advanced conversational AI, leveraging Bard's LLM to answer question

mmir-lib

MMIR (Mobile Multimodal Interaction and Relay) library

@morphik/mcp

MCP server for Morphik multimodal database

jimeng-ai-mcp

MCP tool for Volcano Engine's Jimeng AI multimodal generation service

modelpilot

Official JavaScript/TypeScript library for the ModelPilot API - OpenAI-compatible interface for intelligent model routing

duckduckgo-chat-interface

A powerful Node.js interface for DuckDuckGo AI Chat with advanced configuration, rate limiting, and image support

@multiface.js/sensors

Device sensor integration for multimodal interactions

@multiface.js/react-native

React Native specific components and utilities for multimodal UI

zerolabel

Zero-shot multimodal classification SDK - classify text and images with custom labels, no training required

gemini-cli-sidx1fork

Fork of Google's Gemini CLI by sidx1. CLI tool for accessing Gemini AI with enhancements by sidx1.

@multiface.js/context

Context awareness and memory management for multimodal interactions

ai-pp3

CLI tool combining multimodal AI analysis with RawTherapee's engine to generate optimized PP3 profiles for RAW photography. Features automatic histogram analysis for enhanced AI processing.

@multiface.js/fusion

Multi-modal input fusion engine for simultaneous interaction handling

@promptbook/wizzard

Promptbook: Run AI apps in plain human language across multiple models and platforms

gemini-multimodal-mcp

MCP server with multimodal capabilities - process documents, images, videos, audio using Gemini Pro with 1M context window

@unum-cloud/uform

Pocket-Sized Multimodal AI for Content Understanding and Generation

@callmedayz/ai-prompt-toolkit

Professional AI prompt engineering toolkit with advanced template features, real-time dashboards, conditional logic, template inheritance, live monitoring, OpenRouter integration, and 310+ model support

@iteleport/speechly-browser-client

Browser client for Speechly API

claude-gemini-multimodal-bridge

Enterprise-grade AI integration bridge connecting Claude Code, Gemini CLI, and Google AI Studio with intelligent routing and advanced multimodal processing capabilities

llama-latex

Image to LaTeX with Llama 3.2 Vision.

unimodaly-ingest

A unified data-ingestion CLI that auto-detects and converts text, image, audio and tabular sources into standardized training datasets

google-genai-live-lib

A library for interacting with Google's Generative AI models in real-time

llmplug

A library to easily integrate various LLM models and vendors into applications, with advanced features.

llama-cpp-capacitor

A native Capacitor plugin that embeds llama.cpp directly into mobile apps, enabling offline AI inference with comprehensive support for text generation, multimodal processing, TTS, LoRA adapters, and more.

chakra-multi-modal

A Chakra UI Multi Modal - one modal with multiple, switchable sections

rate-limiter-multimodal

Rate limiter middleware for Express.js that allows very tight limits while providing a seamless experience to the users.

@ashvardanian/uform

Pocket-Sized Multimodal AI for Content Understanding and Generation

multimodal-search-bar

Multimodal search bar component for React with support for multiline text and images