Build autonomous AI agents for React Native and Expo apps. Provides AI-native UI traversal, tool calling, and structured reasoning.

Package Exports

  • @mobileai/react-native
  • @mobileai/react-native/package.json


AI Agent SDK for React Native & Expo

Add an autonomous AI agent to any React Native app - no rewrite needed. Wrap your app with <AIAgent> and get natural language UI control, real-time voice conversations, and a built-in knowledge base. Fully customizable, production-grade security, performant, and lightweight. Plus: an MCP bridge that lets any AI connect to and test your app.

npm install @mobileai/react-native
# or
npm install react-native-agentic-ai

🤖 AI Agent - Autonomous UI Control

AI Agent autonomously controlling a React Native app UI via natural language

🧪 AI-Powered Testing - Test Your App in English, Not Code

AI-Powered Testing via Model Context Protocol - finding bugs in a React Native app without test code

Google Antigravity running 5 checks on the emulator and finding 5 real bugs: zero test code, zero selectors, just English.



Two names, one package - install either @mobileai/react-native or react-native-agentic-ai.

โญ If this helped you, star this repo โ€” it helps others find it!


🧠 How It Works - Structure-First Agentic AI

What if your AI could understand your app the way a real user does - not by looking at pixels, but by reading the actual UI structure?

That's what this SDK does. It reads your app's live UI natively - every button, label, input, and screen - in real time. The AI understands your app's structure, not a screenshot of it.

No OCR. No image pipelines. No selectors. No annotations. No view wrappers.

The result: an AI that truly understands your app and can act on it autonomously.

| | This SDK | Screenshot-based AI | Build It Yourself |
|---|---|---|---|
| Setup | <AIAgent> - one wrapper | Vision model + custom pipeline | Months of custom code |
| How it reads UI | Native structure, real time | Screenshot → OCR | Custom integration |
| AI agent loop | ✅ Built-in multi-step | ❌ Build from scratch | ❌ Build from scratch |
| Voice mode | ✅ Real-time bidirectional | ❌ | ❌ |
| Custom business logic | ✅ useAction hook | Custom code | Custom code |
| MCP bridge (any AI connects) | ✅ One command | ❌ | ❌ |
| Knowledge base | ✅ Built-in retrieval | ❌ | ❌ |

✨ What's Inside

Ship to Production

🤖 Autonomous AI Agent - Natural Language UI Automation

Your users describe what they want in natural language. The SDK reads the live screen, plans a sequence of actions, and executes them end-to-end - tapping buttons, filling forms, navigating screens - all autonomously. Powered by Google Gemini.

  • Zero-config - wrap your app with <AIAgent>, done. No annotations, no selectors
  • Multi-step reasoning - navigates across screens to complete complex tasks
  • Custom actions - expose any business logic (checkout, API calls, mutations) via useAction
  • Knowledge base - the AI queries your FAQs, policies, and product data on demand
  • Human-in-the-loop - native Alert.alert confirmation before critical actions

🎤 Real-time Voice AI Agent - Bidirectional Audio with the Gemini Live API

Full bidirectional voice AI powered by the Gemini Live API. Users speak naturally; the agent responds with voice AND controls your app simultaneously.

  • Sub-second latency - real-time audio over WebSockets, not turn-based
  • Full UI control - the same tap, type, navigate, and custom actions as text mode, all by voice
  • Screen-aware - auto-detects screen changes and updates its context instantly

💡 Speech-to-text in text mode: install expo-speech-recognition and a mic button appears in the chat bar, letting users dictate messages instead of typing. This is separate from voice mode.


Supercharge Your Dev Workflow

🔌 MCP Bridge - Connect Any AI to Your App

Your app becomes MCP-compatible with one prop. Any AI that speaks the Model Context Protocol - editors, autonomous agents, CI/CD pipelines, custom scripts - can remotely read and control your app.

The MCP bridge uses the same AgentRuntime that powers the in-app AI agent. If the agent can do it via chat, an external AI can do it via MCP.

MCP-only mode - just want testing? No chat popup needed:

<AIAgent
  showChatBar={false}
  mcpServerUrl="ws://localhost:3101"
  apiKey="YOUR_KEY"
  navRef={navRef}
>
  <App />
</AIAgent>

🔮 Looking ahead: the same architecture can power production use cases - imagine a user's personal AI assistant ordering food through your app via MCP. The runtime is ready; auth and multi-session support are on the roadmap.

🧪 AI-Powered Testing via MCP

The most powerful use case: test your app without writing test code. Connect your AI (Antigravity, Claude Desktop, or any MCP client) to the emulator and describe what to check - in English. No selectors to maintain, no flaky tests, self-healing by design.

Skip the test framework. Just ask:

Ad-hoc - ask your AI anything about the running app:

"Is the Laptop Stand price consistent between the home screen and the product detail page?"

YAML Test Plans - commit reusable checks to your repo:

# tests/smoke.yaml
checks:
  - id: price-sync
    check: "Read the Laptop Stand price on home, tap it, compare with detail page"
  - id: profile-email
    check: "Go to Profile tab. Is the email displayed under the user's name?"

Then tell your AI: "Read tests/smoke.yaml and run each check on the emulator"

Real Results - 5 bugs found autonomously:

| # | What was checked | Bug found | AI steps |
|---|---|---|---|
| 1 | Price consistency (list → detail) | Laptop Stand: $45.99 vs $49.99 | 2 |
| 2 | Profile completeness | Email missing - only name shown | 2 |
| 3 | Settings navigation | Help Center missing from Support section | 2 |
| 4 | Description vs specifications | "breathable mesh" vs "Leather Upper" | 3 |
| 5 | Cross-screen price sync | Yoga Mat: $39.99 vs $34.99 | 4 |

📦 Installation

npm install @mobileai/react-native
# or
npm install react-native-agentic-ai

No native modules are required by default. Works with the Expo managed workflow out of the box - no eject needed.

Optional Dependencies

📸 Screenshots - for image/video content understanding
npx expo install react-native-view-shot
🎙️ Speech-to-Text in Text Mode - dictate messages instead of typing
npx expo install expo-speech-recognition

Automatically detected, no extra config needed - a mic icon appears in the text chat bar, letting users speak their message instead of typing. This is separate from voice mode.

🎤 Voice Mode - real-time bidirectional voice agent
npm install react-native-audio-api

Expo Managed - add to app.json:

{
  "expo": {
    "android": { "permissions": ["RECORD_AUDIO", "MODIFY_AUDIO_SETTINGS"] },
    "ios": { "infoPlist": { "NSMicrophoneUsageDescription": "Required for voice chat with AI assistant" } }
  }
}

Then rebuild: npx expo prebuild && npx expo run:android (or run:ios)

Expo Bare / React Native CLI - add RECORD_AUDIO and MODIFY_AUDIO_SETTINGS to AndroidManifest.xml and NSMicrophoneUsageDescription to Info.plist, then rebuild.

Hardware echo cancellation (AEC) is enabled automatically - no extra setup.


🚀 Quick Start

React Navigation

import { AIAgent } from '@mobileai/react-native';
import { NavigationContainer, useNavigationContainerRef } from '@react-navigation/native';

export default function App() {
  const navRef = useNavigationContainerRef();

  return (
    <AIAgent
      // โš ๏ธ Prototyping ONLY โ€” don't ship API keys in production
      apiKey="YOUR_GEMINI_API_KEY"

      // ✅ Production: route through your secure backend proxy
      // proxyUrl="https://api.yourdomain.com/gemini-proxy"
      // proxyHeaders={{ Authorization: `Bearer ${userToken}` }}

      navRef={navRef}
    >
      <NavigationContainer ref={navRef}>
        {/* Your existing screens - zero changes needed */}
      </NavigationContainer>
    </AIAgent>
  );
}

Expo Router

In your root layout (app/_layout.tsx):

import { AIAgent } from '@mobileai/react-native';
import { Slot, useNavigationContainerRef } from 'expo-router';

export default function RootLayout() {
  const navRef = useNavigationContainerRef();

  return (
    <AIAgent
      apiKey={process.env.EXPO_PUBLIC_GEMINI_API_KEY!}
      navRef={navRef}
    >
      <Slot />
    </AIAgent>
  );
}

A floating chat bar appears automatically. Ask the AI to navigate, tap buttons, fill forms, or answer questions.

Knowledge-Only Mode - AI Assistant Without UI Automation

Set enableUIControl={false} for a lightweight FAQ / support assistant. This makes a single LLM call and uses roughly 70% fewer tokens:

<AIAgent enableUIControl={false} knowledgeBase={KNOWLEDGE} />

| | Full Agent (default) | Knowledge-Only |
|---|---|---|
| UI analysis | ✅ Full structure read | ❌ Skipped |
| Tokens per request | ~500-2000 | ~200 |
| Agent loop | Up to 10 steps | Single call |
| Tools available | 7 | 2 (done, query_knowledge) |

🧠 Knowledge Base

Give the AI domain knowledge it can query on demand - policies, FAQs, product details. It uses a query_knowledge tool to fetch only the relevant entries (no token waste).
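To build intuition for what the query_knowledge tool does over a static array, retrieval can be thought of as simple keyword/tag matching with per-screen scoping - a hypothetical sketch, not the SDK's actual ranking logic (the function name retrieveEntries is illustrative):

```typescript
// Hypothetical sketch of static-array knowledge retrieval. The SDK's real
// query_knowledge matching may differ; this only illustrates the idea.
interface KnowledgeEntry {
  id: string;
  title: string;
  content: string;
  tags?: string[];
  screens?: string[]; // if set, the entry only surfaces on these screens
}

function retrieveEntries(
  entries: KnowledgeEntry[],
  query: string,
  screenName?: string
): KnowledgeEntry[] {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  return entries.filter((entry) => {
    // Respect per-screen scoping first
    if (entry.screens && screenName && !entry.screens.includes(screenName)) {
      return false;
    }
    // Match any query word against title, content, or tags
    const haystack = [entry.title, entry.content, ...(entry.tags ?? [])]
      .join(' ')
      .toLowerCase();
    return words.some((w) => haystack.includes(w));
  });
}
```

The screens field here mirrors the `screens: ['product/[id]', 'order-history']` option shown below: an entry scoped to specific screens is skipped everywhere else.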

Static Array

import type { KnowledgeEntry } from '@mobileai/react-native';

const KNOWLEDGE: KnowledgeEntry[] = [
  {
    id: 'shipping',
    title: 'Shipping Policy',
    content: 'Free shipping on orders over $75. Standard: 5-7 days. Express: 2-3 days.',
    tags: ['shipping', 'delivery'],
  },
  {
    id: 'returns',
    title: 'Return Policy',
    content: '30-day returns on all items. Refunds in 5-7 business days.',
    tags: ['return', 'refund'],
    screens: ['product/[id]', 'order-history'], // only surface on these screens
  },
];

<AIAgent knowledgeBase={KNOWLEDGE} />

Dynamic Retriever

<AIAgent
  knowledgeBase={{
    retrieve: async (query: string, screenName?: string) => {
      const results = await fetch(
        `/api/knowledge?q=${encodeURIComponent(query)}&screen=${screenName ?? ''}`
      );
      return results.json();
    },
  }}
/>

🔌 MCP Bridge Setup - Connect AI Editors to Your App

Architecture

┌──────────────────┐                  ┌──────────────────┐                  ┌──────────────────┐
│  Antigravity     │  Streamable HTTP │                  │    WebSocket     │                  │
│  Claude Desktop  │ ◄──────────────► │  @mobileai/      │ ◄──────────────► │  Your React      │
│  or any MCP      │    (port 3100)   │  mcp-server      │   (port 3101)    │  Native App      │
│  compatible AI   │  + Legacy SSE    │                  │                  │                  │
└──────────────────┘                  └──────────────────┘                  └──────────────────┘

Setup in 3 Steps

1. Start the MCP bridge - no install needed:

npx @mobileai/mcp-server

2. Connect your React Native app:

<AIAgent
  apiKey="YOUR_GEMINI_KEY"
  mcpServerUrl="ws://localhost:3101"
/>

3. Connect your AI:

Google Antigravity

Add to ~/.gemini/antigravity/mcp_config.json:

{
  "mcpServers": {
    "mobile-app": {
      "command": "npx",
      "args": ["@mobileai/mcp-server"]
    }
  }
}

Click Refresh in MCP Store. You'll see mobile-app with 2 tools: execute_task and get_app_status.

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "mobile-app": {
      "url": "http://localhost:3100/mcp/sse"
    }
  }
}
Other MCP Clients
  • Streamable HTTP: http://localhost:3100/mcp
  • Legacy SSE: http://localhost:3100/mcp/sse

MCP Tools

| Tool | Description |
|---|---|
| execute_task(command) | Send a natural language command to the app |
| get_app_status() | Check if the React Native app is connected |

Environment Variables

| Variable | Default | Description |
|---|---|---|
| MCP_PORT | 3100 | HTTP port for MCP clients |
| WS_PORT | 3101 | WebSocket port for the React Native app |

🔌 API Reference

<AIAgent> Props

| Prop | Type | Default | Description |
|---|---|---|---|
| apiKey | string | - | Gemini API key (prototyping only). |
| proxyUrl | string | - | Backend proxy URL (production). |
| proxyHeaders | Record<string, string> | - | Auth headers for the proxy. |
| voiceProxyUrl | string | - | Dedicated proxy for Voice Mode WebSockets. |
| voiceProxyHeaders | Record<string, string> | - | Auth headers for the voice proxy. |
| model | string | 'gemini-2.5-flash' | Gemini model name. |
| navRef | NavigationContainerRef | - | Navigation ref for auto-navigation. |
| maxSteps | number | 10 | Max agent steps per task. |
| showChatBar | boolean | true | Show the floating chat bar. |
| enableVoice | boolean | true | Enable the voice mode tab. |
| enableUIControl | boolean | true | When false, the AI becomes knowledge-only. |
| instructions | { system?, getScreenInstructions? } | - | Custom system prompt + per-screen instructions. |
| customTools | Record<string, ToolDefinition \| null> | - | Override or remove built-in tools. |
| knowledgeBase | KnowledgeEntry[] \| KnowledgeRetriever | - | Domain knowledge the AI can query. |
| knowledgeMaxTokens | number | 2000 | Max tokens for knowledge results. |
| mcpServerUrl | string | - | WebSocket URL for the MCP bridge. |
| accentColor | string | - | Accent color for the chat bar. |
| theme | ChatBarTheme | - | Full chat bar color customization. |
| onResult | (result) => void | - | Called when the agent finishes. |
| onBeforeStep | (stepCount) => void | - | Called before each step. |
| onAfterStep | (history) => void | - | Called after each step. |
| onTokenUsage | (usage) => void | - | Token usage per step. |
| stepDelay | number | - | Delay between steps (ms). |
| router | { push, replace, back } | - | Expo Router instance. |
| pathname | string | - | Current pathname (Expo Router). |
| debug | boolean | false | Enable SDK debug logging. |
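The per-step callbacks compose naturally for observability. As an illustration, a small accumulator can be fed from onTokenUsage to track cost per task - a sketch only: the inputTokens/outputTokens field names below are assumptions, since the exact shape of the usage object isn't documented here.

```typescript
// Hypothetical token-usage accumulator for the onTokenUsage callback.
// The `inputTokens` / `outputTokens` field names are assumptions; adapt
// them to whatever shape the SDK actually reports.
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

class UsageTracker {
  private steps: TokenUsage[] = [];

  // Call from onTokenUsage once per agent step
  record(usage: TokenUsage): void {
    this.steps.push(usage);
  }

  get totalInput(): number {
    return this.steps.reduce((sum, u) => sum + u.inputTokens, 0);
  }

  get totalOutput(): number {
    return this.steps.reduce((sum, u) => sum + u.outputTokens, 0);
  }

  get stepCount(): number {
    return this.steps.length;
  }
}

// Wiring it up (sketch):
//   const tracker = new UsageTracker();
//   <AIAgent onTokenUsage={(usage) => tracker.record(usage)} ... />
```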

🎨 Customization

// Quick - one color:
<AIAgent accentColor="#6C5CE7" />

// Full theme:
<AIAgent
  accentColor="#6C5CE7"
  theme={{
    backgroundColor: 'rgba(44, 30, 104, 0.95)',
    inputBackgroundColor: 'rgba(255, 255, 255, 0.12)',
    textColor: '#ffffff',
    successColor: 'rgba(40, 167, 69, 0.3)',
    errorColor: 'rgba(220, 53, 69, 0.3)',
  }}
/>

useAction - Custom AI-Callable Business Logic

import { useAction } from '@mobileai/react-native';
import { Alert } from 'react-native';

function CartScreen() {
  const { cart, clearCart, getTotal } = useCart();

  useAction('checkout', 'Place the order and checkout', {}, async () => {
    if (cart.length === 0) return { success: false, message: 'Cart is empty' };

    // Human-in-the-loop: AI pauses until user taps Confirm
    return new Promise((resolve) => {
      Alert.alert('Confirm Order', `Place order for $${getTotal()}?`, [
        { text: 'Cancel', onPress: () => resolve({ success: false, message: 'User denied.' }) },
        { text: 'Confirm', onPress: () => { clearCart(); resolve({ success: true, message: `Order placed!` }); } },
      ]);
    });
  });
}

useAI - Headless / Custom Chat UI

import { useAI } from '@mobileai/react-native';
import { View, Text, TextInput, FlatList } from 'react-native';

function CustomChat() {
  const { send, isLoading, status, messages } = useAI();

  return (
    <View style={{ flex: 1 }}>
      <FlatList data={messages} renderItem={({ item }) => <Text>{item.content}</Text>} />
      {isLoading && <Text>{status}</Text>}
      <TextInput onSubmitEditing={(e) => send(e.nativeEvent.text)} placeholder="Ask the AI..." />
    </View>
  );
}

Chat history persists across navigation. Override settings per-screen:

const { send } = useAI({
  enableUIControl: false,
  onResult: (result) => router.push('/(tabs)/chat'),
});

🔒 Security & Production

Backend Proxy - Keep API Keys Secure

<AIAgent
  proxyUrl="https://myapp.vercel.app/api/gemini"
  proxyHeaders={{ Authorization: `Bearer ${userToken}` }}
  voiceProxyUrl="https://voice-server.render.com"  // only if the text proxy is serverless
  navRef={navRef}
>
  {/* ... */}
</AIAgent>

voiceProxyUrl falls back to proxyUrl if not set. It is only needed when your text API runs on a serverless platform that can't hold WebSocket connections.

Next.js Text Proxy Example
import { NextResponse } from 'next/server';

export async function POST(req: Request) {
  const body = await req.json();
  const response = await fetch('https://generativelanguage.googleapis.com/...', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'x-goog-api-key': process.env.GEMINI_API_KEY! },
    body: JSON.stringify(body),
  });
  return NextResponse.json(await response.json());
}
Express WebSocket Proxy (Voice Mode)
const express = require('express');
const { createProxyMiddleware } = require('http-proxy-middleware');

const app = express();
const geminiProxy = createProxyMiddleware({
  target: 'https://generativelanguage.googleapis.com',
  changeOrigin: true,
  ws: true,
  pathRewrite: (path) => `${path}${path.includes('?') ? '&' : '?'}key=${process.env.GEMINI_API_KEY}`,
});

app.use('/v1beta/models', geminiProxy);
const server = app.listen(3000);
server.on('upgrade', geminiProxy.upgrade);

Element Gating - Hide Elements from AI

<Pressable aiIgnore={true}><Text>Admin Panel</Text></Pressable>

Content Masking - Sanitize Before the LLM Sees It

<AIAgent transformScreenContent={(c) => c.replace(/\b\d{13,16}\b/g, '****-****-****-****')} />
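The inline regex above can be factored into a reusable sanitizer and extended - a sketch: the card-number pattern matches the example above, while the email pattern is an illustrative addition, not something the SDK requires.

```typescript
// Sketch of a reusable sanitizer for transformScreenContent.
// The card-number regex mirrors the inline example above; the email
// pattern is an illustrative extension.
function maskSensitive(content: string): string {
  return content
    .replace(/\b\d{13,16}\b/g, '****-****-****-****')    // 13-16 digit card numbers
    .replace(/\b[\w.+-]+@[\w-]+\.[\w.]+\b/g, '***@***'); // email addresses
}

// <AIAgent transformScreenContent={maskSensitive} />
```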

Screen-Specific Instructions

<AIAgent instructions={{
  system: 'You are a food delivery assistant.',
  getScreenInstructions: (screen) => screen === 'Cart' ? 'Confirm total before checkout.' : undefined,
}} />
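With more than a couple of screens, getScreenInstructions can be driven from a plain lookup table instead of a chain of conditionals - a sketch with hypothetical screen names:

```typescript
// Sketch: table-driven per-screen instructions. The screen names
// ('Cart', 'Checkout') are hypothetical examples.
const SCREEN_INSTRUCTIONS: Record<string, string> = {
  Cart: 'Confirm the total before checkout.',
  Checkout: 'Never autofill payment fields; ask the user instead.',
};

// Returns undefined for screens with no extra instructions
const getScreenInstructions = (screen: string): string | undefined =>
  SCREEN_INSTRUCTIONS[screen];

// <AIAgent instructions={{ system: 'You are a food delivery assistant.', getScreenInstructions }} />
```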

Lifecycle Hooks

| Hook | When |
|---|---|
| onBeforeStep | Before each agent step |
| onAfterStep | After each step (with full history) |
| onBeforeTask | Before task execution |
| onAfterTask | After the task completes |

๐Ÿ› ๏ธ Built-in Tools

Tool What it does
tap(index) Tap any interactive element โ€” buttons, switches, checkboxes, custom components
type(index, text) Type into a text input
navigate(screen) Navigate to any screen
capture_screenshot(reason) Capture the screen as an image (requires react-native-view-shot)
done(text) Finish the task with a response
ask_user(question) Ask the user for clarification
query_knowledge(question) Search the knowledge base
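Per the customTools prop (Record<string, ToolDefinition | null>), mapping a built-in tool's name to null removes it. A minimal sketch, sticking to removal since the ToolDefinition shape for full overrides isn't documented here:

```typescript
// Sketch: removing built-in tools via the customTools prop.
// Mapping a tool name to null removes it; the names here match the
// Built-in Tools table above.
const customTools: Record<string, null> = {
  capture_screenshot: null, // never capture screen images
  ask_user: null,           // fail fast instead of asking for clarification
};

// <AIAgent customTools={customTools} ... />
```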

📋 Requirements

  • React Native 0.72+
  • Expo SDK 49+ (or bare React Native)
  • Gemini API key - get one free

Currently supports Google Gemini models only. Text mode defaults to gemini-2.5-flash (configurable via the model prop; any Gemini model works). Voice mode uses gemini-2.5-flash-native-audio-preview (fixed). Additional providers may be added in future releases.

📄 License

MIT © Mohamed Salah

👋 Let's connect - LinkedIn