JSPM

@glogwa/llama-roblox 1.0.4 (ISC License)

LLaMA model inference implementation for Roblox using llama.cpp architecture

Package Exports

    This package does not declare an exports field, so its exports have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to file an issue with the original package (@glogwa/llama-roblox) requesting support for the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.
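
    For reference, a minimal "exports" field in the package's package.json could look like the sketch below; the ./dist paths are illustrative assumptions, not the package's actual file layout:

    {
        "name": "@glogwa/llama-roblox",
        "exports": {
            ".": {
                "types": "./dist/index.d.ts",
                "default": "./dist/index.js"
            }
        }
    }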

    Readme

    @glogwa/llama-roblox

    Complete LLaMA model inference for Roblox using llama.cpp architecture


    A production-ready implementation of llama.cpp for Roblox, enabling on-device LLM inference with GGUF model support.

    ✨ Features

    • 🚀 Full GGUF v3 Support - Load quantized models directly
    • 🎯 10 Quantization Formats - Q4_0, Q4_1, Q5_0, Q5_1, Q8_0, Q8_1, Q2_K-Q6_K, F16, F32, BF16
    • 🧠 Complete Transformer - Multi-head attention, RoPE, feed-forward networks
    • 💬 Chat Templates - ChatML, Llama 2, Alpaca, Vicuna
    • 🎲 7 Sampling Strategies - Temperature, Top-K, Top-P, Min-P, Mirostat, and more
    • ⚡ Optimized Performance - Cache-blocked matrix multiplication (see the sketch after this list), KV cache
    • 📦 Zero Dependencies - Pure TypeScript implementation
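
    The cache-blocked matrix multiplication mentioned above tiles the three matmul loops so each small tile of the operands is reused while it is still in cache. A minimal sketch of the idea, assuming square row-major matrices and an illustrative block size (this is not the package's internal code):

    // Illustrative sketch of cache-blocked (tiled) matrix multiplication.
    // Not the package's actual internals; the block size is an assumption.
    const BLOCK = 64;

    // Computes c += a * b for n x n row-major matrices stored as flat
    // arrays; c must be zero-initialized by the caller.
    function matmulBlocked(a: number[], b: number[], c: number[], n: number): void {
        for (let i0 = 0; i0 < n; i0 += BLOCK) {
            for (let k0 = 0; k0 < n; k0 += BLOCK) {
                for (let j0 = 0; j0 < n; j0 += BLOCK) {
                    const iMax = i0 + BLOCK < n ? i0 + BLOCK : n;
                    const kMax = k0 + BLOCK < n ? k0 + BLOCK : n;
                    const jMax = j0 + BLOCK < n ? j0 + BLOCK : n;
                    // Work one BLOCK x BLOCK tile at a time so the operands
                    // stay hot in cache while they are reused.
                    for (let i = i0; i < iMax; i++) {
                        for (let k = k0; k < kMax; k++) {
                            const aik = a[i * n + k];
                            for (let j = j0; j < jMax; j++) {
                                c[i * n + j] += aik * b[k * n + j];
                            }
                        }
                    }
                }
            }
        }
    }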

    📥 Installation

    npm install @glogwa/llama-roblox

    🚀 Quick Start

    import { quickSetup } from "@glogwa/llama-roblox";
    
    // Load your GGUF model bytes (e.g., Qwen 3 0.6B Q4_K_M).
    // loadModelFromStorage() stands in for your own loading logic,
    // such as reading the model from an asset or a remote store.
    const modelBuffer = loadModelFromStorage();
    
    // Quick setup with sensible defaults
    const llm = quickSetup(modelBuffer, {
        n_ctx: 2048,
        temperature: 0.7,
    });
    
    // Generate text
    const response = llm.generate("Hello, world!", 100);
    print(response);
    
    // Clean up
    llm.free();

    💬 Chat Example

    import { createLLM, ChatTemplateType } from "@glogwa/llama-roblox";
    
    const llm = createLLM();
    
    // Load and configure
    llm.loadModel(modelBuffer);
    llm.createContext({ n_ctx: 2048 });
    llm.setupSampler({ temperature: 0.8 });
    
    // Setup chat
    llm.setupConversation(ChatTemplateType.CHATML);
    llm.setSystemPrompt("You are a helpful AI assistant.");
    
    // Multi-turn conversation
    const response1 = llm.chat("What is TypeScript?", 100);
    print(response1);
    
    const response2 = llm.chat("How is it different from JavaScript?", 100);
    print(response2);
    
    llm.free();
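
    For context, the ChatML template used above wraps every turn in <|im_start|> / <|im_end|> markers and leaves the final assistant turn open for the model to complete. A minimal sketch of that format (illustrative only; the package's own formatter may differ in whitespace details):

    // Illustrative ChatML prompt assembly; the package's own formatter
    // may differ in details such as trailing newlines.
    interface ChatMessage {
        role: "system" | "user" | "assistant";
        content: string;
    }

    function formatChatML(messages: ChatMessage[]): string {
        let prompt = "";
        for (const msg of messages) {
            prompt += `<|im_start|>${msg.role}\n${msg.content}<|im_end|>\n`;
        }
        // Leave an open assistant turn for the model to complete.
        prompt += "<|im_start|>assistant\n";
        return prompt;
    }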

    🎯 Supported Models

    Works with any GGUF model, including:

    • Qwen 3 (0.6B, 1.5B, 3B, 7B)
    • LLaMA 2/3 (7B, 13B, 70B)
    • Mistral (7B)
    • Phi-2/3 (2.7B, 3.8B)
    • TinyLlama (1.1B)
    • And many more!

    📊 Quantization Support

    Format      Bits  Description               Size Reduction
    F32         32    Full precision            1x (baseline)
    F16         16    Half precision            2x
    Q8_0        8     8-bit quantization        4x
    Q6_K        6     6-bit K-quants            5.3x
    Q5_0/Q5_1   5     5-bit quantization        6.4x
    Q4_0/Q4_1   4     4-bit quantization        8x
    Q4_K_M      4     4-bit K-quants (medium)   8x
    Q3_K        3     3-bit K-quants            10.7x
    Q2_K        2     2-bit K-quants            16x
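
    As a rough sanity check on the table (counting weight bytes only, and ignoring the small per-block scale overhead that quantized formats carry), weight memory is roughly parameter count times bits per weight divided by eight:

    // Rough weight-memory estimate: bytes = parameters * bits / 8.
    // Ignores per-block scale overhead, the KV cache, and activations.
    function approxModelGB(paramCount: number, bitsPerWeight: number): number {
        return (paramCount * bitsPerWeight) / 8 / 1e9;
    }

    // e.g., a 0.6B-parameter model:
    print(approxModelGB(0.6e9, 32)); // F32  -> ~2.4 GB
    print(approxModelGB(0.6e9, 4));  // Q4_0 -> ~0.3 GB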

    🎲 Sampling Strategies

    // Greedy (deterministic)
    llm.setupSampler({ temperature: 0.0 });
    
    // Balanced
    llm.setupSampler({
        temperature: 0.7,
        top_k: 40,
        top_p: 0.95,
    });
    
    // Creative
    llm.setupSampler({
        temperature: 1.0,
        top_p: 0.98,
        repeat_penalty: 1.1,
    });
    
    // Mirostat (perplexity control)
    llm.setupSampler({
        mirostat: 2,
        mirostat_tau: 5.0,
        mirostat_eta: 0.1,
    });
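
    To make those knobs concrete: temperature rescales the logits before the softmax, and top-k keeps only the k most likely tokens before drawing one. A minimal sketch of that pipeline (illustrative only, assuming temperature > 0; this is not the package's internal sampler):

    // Illustrative temperature + top-k sampling over raw logits;
    // not the package's internal sampler. Assumes temperature > 0
    // (temperature 0 is the greedy/argmax case handled separately).
    function sampleTopK(logits: number[], temperature: number, topK: number): number {
        // Pair each logit with its token id and sort descending.
        const indexed = logits.map((logit, id) => ({ id, logit }));
        indexed.sort((x, y) => x.logit > y.logit);
        const kept = indexed.filter((_, i) => i < topK);

        // Temperature-scaled softmax over the surviving tokens.
        const maxLogit = kept[0].logit;
        const weights = kept.map((t) => math.exp((t.logit - maxLogit) / temperature));
        let total = 0;
        for (const w of weights) total += w;

        // Draw one surviving token in proportion to its weight.
        let r = math.random() * total;
        for (let i = 0; i < kept.size(); i++) {
            r -= weights[i];
            if (r <= 0) {
                return kept[i].id;
            }
        }
        return kept[kept.size() - 1].id;
    }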

    Building from Source

    To build the project from scratch, use:

    npm install
    npm run build

    Or with Rojo:

    rojo build -o "LLM-on-roblox.rbxlx"

    For development with live sync:

    rojo serve

    For more help, check out the Rojo documentation.

    Documentation

    See the full documentation for detailed usage, API reference, and examples.

    License

    ISC License

    Credits

    Based on llama.cpp by Georgi Gerganov