Local-first LLM inference

Your hardware.
Any website.
No exposed ports.

Robot Queue connects browser chat requests to Ollama workers running on machines you control — without exposing a single port to the internet.

🌐 Browser (SDK request) → ⚙️ Robot Queue (cloud broker) → 🤖 Ollama Worker (local models) or Desktop AI App (Claude, ChatGPT…)
How it works
Three steps from sign-up to running
01

Create a site

Register your website, configure allowed origins and models, and receive a public site key plus worker credentials.

2 minutes
02

Embed the SDK

Add one script tag to your page and initialize the client with your site key. Call chat.completions.create() like any LLM API.

5 minutes
03

Connect a worker

Run the Ollama CLI on any local machine, or connect your desktop AI app via MCP — Claude, ChatGPT, and others work out of the box.

3 minutes
Simple browser SDK
Works like any LLM API.
index.html
<!-- 1. Include the hosted SDK -->
<script src="https://cdn.robot-queue.robrighter.com/sdk.js"></script>

<script type="module">
// 2. Initialize with your public site key
const client = createRobotQueueClient({
  siteKey: 'rq_site_public_...'
});

// 3. Call chat completions — runs on your local Ollama worker
const result = await client.chat.completions.create({
  messages: [{ role: 'user', content: userMessage }]
});

document.getElementById('output').textContent = result.message.content;
</script>
Two ways to process jobs
Your hardware or your AI app.
🤖

Ollama CLI

Run open-source models locally. The worker script polls silently in the background — pull any Ollama-compatible model and it's ready to process jobs from your queue.

npx tsx worker.ts start --worker-key <key>
  • Llama, Mistral, Phi, Gemma, and more
  • Runs on any machine — laptop, desktop, server
  • No data leaves your network
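The worker's behavior boils down to an outbound poll loop: lease a job, run it against the local model, post the result back. A minimal sketch — `fetchJob`, `runModel`, and `completeJob` are hypothetical names standing in for the real worker internals, and the demo below substitutes an in-memory queue for the cloud broker:

```javascript
// Hypothetical sketch of the worker's outbound poll loop.
// fetchJob / runModel / completeJob are illustrative, not the real API.
async function runWorkerOnce({ fetchJob, runModel, completeJob }) {
  const job = await fetchJob();                 // outbound request — no inbound port needed
  if (!job) return null;                        // queue was empty this round
  const output = await runModel(job.messages);  // e.g. a call to local Ollama
  await completeJob(job.id, output);            // post the completion back to the queue
  return job.id;
}

// In-memory demo standing in for the cloud broker:
const queue = [{ id: 'job-1', messages: [{ role: 'user', content: 'hi' }] }];
const done = {};
const processedId = await runWorkerOnce({
  fetchJob: async () => queue.shift() ?? null,
  runModel: async (msgs) => `echo: ${msgs[0].content}`,
  completeJob: async (jobId, out) => { done[jobId] = out; },
});
```

Because every connection is initiated by the worker, the machine running your models never has to accept traffic from the internet.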

Desktop AI App via MCP

Connect Claude Desktop, ChatGPT, or any MCP-compatible AI app as a worker. Your AI assistant polls the queue and processes jobs using its own frontier models — no GPU required.

npx tsx mcp-worker.ts --worker-key <key>
  • Claude, ChatGPT, and any MCP client
  • Frontier model quality, zero local setup
  • Switch workers without changing your website
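For MCP clients that are configured with a JSON server list (Claude Desktop's `mcpServers` format, for example), the worker entry might look like the following — the `"robot-queue"` server name is an assumption, and the command mirrors the one shown above:

```json
{
  "mcpServers": {
    "robot-queue": {
      "command": "npx",
      "args": ["tsx", "mcp-worker.ts", "--worker-key", "<key>"]
    }
  }
}
```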
Why Robot Queue
Built for developers who want control.
🔒

Your hardware, your data

Prompts and completions run on machines you own. No third-party model provider ever sees your data.

🌐

No exposed servers

Workers connect outbound to the queue — no inbound firewall rules, no dynamic DNS, no reverse proxies.

Use your GPU for free

Stop paying per-token API fees. If you have a local machine with a GPU and Ollama, you already have an inference server.

🔑

Per-worker keys

Each worker installation gets its own revocable key. Add workers for throughput, remove them instantly if needed.

🔄

Offline-tolerant

Workers reconnect automatically after network interruptions. Job leases expire and are retried if a worker goes silent.
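The lease mechanism can be sketched in a few lines — the 30-second lease duration and the `leasedAt` field are assumptions for illustration, not the broker's actual values:

```javascript
// Hypothetical sketch of lease-based retry. A job is claimable if it has
// never been leased, or if its lease expired because a worker went silent.
const LEASE_MS = 30_000; // assumed lease duration

function claimJob(jobs, now) {
  const job = jobs.find(j => j.leasedAt == null || now - j.leasedAt > LEASE_MS);
  if (job) job.leasedAt = now; // start (or restart) the lease
  return job ?? null;
}

const jobs = [{ id: 'job-1', leasedAt: null }];
const first = claimJob(jobs, 1_000);    // worker A leases the job
const tooSoon = claimJob(jobs, 10_000); // lease still active → nothing to claim
const retried = claimJob(jobs, 40_000); // lease expired → job is handed out again
```

The net effect: a crashed or disconnected worker never strands a job — it simply becomes claimable again once its lease runs out.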

📦

Single JS include

The browser SDK has no dependencies and works in any modern browser. One script tag is all it takes to integrate.

Connect your first site
in under 15 minutes.

Free to use. Runs on your hardware. No credit card required.

Create Free Account