Cloud AI proxy

How Trellis Cloud routes agent LLM calls through a metered broker proxy and Vercel AI Gateway.

On studio.trellis.computer, agent prompts do not call Google (or other providers) directly from your sandbox. The broker at api.studio.trellis.computer exposes a metered AI proxy. OpenCode in your E2B sandbox talks to that proxy; the proxy routes through Vercel AI Gateway to the default hosted model.

Prerequisite: the in-sandbox OpenCode backend must accept POST /session/{id}/prompt_async on port 4096 (via the turtlecode proxy on 3333). If the composer shows Method Not Allowed, the Studio stack is mismatched after a CLI upgrade. Fix Cloud hosting first; at that point the request has not reached the gateway yet.

Architecture

Cloud dashboard (iframe)
  → Studio app receives InstantDB bearer token via postMessage
  → CloudProviderSync registers trellis-cloud provider on OpenCode
  → User prompt → POST …/session/{id}/prompt_async (OpenCode on :4096)
  → OpenCode → POST api.studio.trellis.computer/ai/v1/chat/completions
  → Broker ai-proxy → Vercel AI Gateway → Gemini Flash Lite
  → Per-user token metering in InstantDB

The gateway API key lives on the broker only. Sandboxes authenticate with the user's InstantDB token (Authorization: Bearer …). Studio syncs that token into OpenCode as the trellis-cloud provider key when cloud mode is active.
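To make the authentication step concrete, the following is a minimal sketch of the request a sandbox client could send to the OpenAI-compatible path. It only builds the request and does not send it. The helper name buildProxyRequest, the https scheme, and the use of a model field carrying gemini-flash-lite-latest are assumptions for illustration, not the actual OpenCode provider code.

```typescript
// Hypothetical helper: builds (but does not send) a request to the broker's
// OpenAI-compatible chat route, authenticated with the user's InstantDB token.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ProxyRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Host taken from the text above; the https scheme is an assumption.
const BROKER_BASE = "https://api.studio.trellis.computer";

function buildProxyRequest(
  instantDbToken: string,
  messages: ChatMessage[],
  stream = false,
): ProxyRequest {
  return {
    url: `${BROKER_BASE}/ai/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // The sandbox never sees the gateway key; it only presents the user token.
        Authorization: `Bearer ${instantDbToken}`,
      },
      // Assumed body shape, following the OpenAI chat completions convention.
      body: JSON.stringify({ model: "gemini-flash-lite-latest", messages, stream }),
    },
  };
}
```

The result can be passed to fetch as fetch(req.url, req.init). Keeping the gateway key out of this request is the point of the design: a leaked sandbox exposes only a per-user token that the broker can meter and revoke.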

Default model

Cloud workspaces temporarily default to MiniMax M2.5 Free on OpenCode Zen (opencode / minimax-m2.5-free) while the metered trellis-cloud → Vercel AI Gateway path is stabilized.

When re-enabled, the hosted default is Gemini Flash-Lite Latest via trellis-cloud (gemini-flash-lite-latest), mapped by the broker to GEMINI_DEFAULT_MODEL (for example google/gemini-2.5-flash-lite). If neither path is available, Studio falls back to Nemotron 3 Super Free on OpenCode Zen.
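The fallback order described above can be sketched as a small selection function. This is an illustration under stated assumptions, not Studio's implementation: the names ModelRef and pickDefaultModel are hypothetical, the Nemotron model id is a placeholder because the text does not give it, and the candidate ordering when the hosted path is enabled is a guess.

```typescript
// Hypothetical sketch of default-model selection for cloud workspaces.
interface ModelRef {
  providerID: string;
  modelID: string;
}

// Ids taken from the text above.
const TEMPORARY_DEFAULT: ModelRef = { providerID: "opencode", modelID: "minimax-m2.5-free" };
const HOSTED_DEFAULT: ModelRef = { providerID: "trellis-cloud", modelID: "gemini-flash-lite-latest" };
// Placeholder id: the text names "Nemotron 3 Super Free" but not its model id.
const LAST_RESORT: ModelRef = { providerID: "opencode", modelID: "nemotron-3-super-free" };

function pickDefaultModel(
  hostedEnabled: boolean,
  isAvailable: (model: ModelRef) => boolean,
): ModelRef {
  // Assumed ordering: prefer the hosted path when it is enabled, then the
  // temporary default; fall back to the last resort if neither is available.
  const candidates = hostedEnabled
    ? [HOSTED_DEFAULT, TEMPORARY_DEFAULT]
    : [TEMPORARY_DEFAULT];
  return candidates.find(isAvailable) ?? LAST_RESORT;
}
```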

Connect your own providers under Settings → Providers as on local Studio. ChatGPT Pro/Plus (Codex) via device code still works on cloud; see the Studio introduction.

Usage gauge

The status bar shows the daily token budget when the IDE runs inside the cloud dashboard iframe. Studio polls GET /ai/usage on the broker every 30 seconds. The response includes dailyBudget, usedTokens, remainingTokens, and resetsAt (the next UTC midnight rollover).
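As a rough sketch of how a client could consume this endpoint, the code below types the response, derives a gauge fraction, computes the next UTC midnight, and polls on a 30-second interval. The field names come from the text; the field types (numbers and a timestamp string) and the helper names are assumptions.

```typescript
// Assumed shape of the GET /ai/usage response; field names are from the text,
// field types are a guess.
interface UsageResponse {
  dailyBudget: number;
  usedTokens: number;
  remainingTokens: number;
  resetsAt: string; // assumed to be an ISO 8601 timestamp
}

// Fraction of the daily budget consumed, clamped to [0, 1] for display.
function usageFraction(usage: UsageResponse): number {
  if (usage.dailyBudget <= 0) return 1;
  return Math.min(1, Math.max(0, usage.usedTokens / usage.dailyBudget));
}

// Next UTC midnight after `now`; Date.UTC normalizes day overflow at month end.
function nextUtcMidnight(now: Date): Date {
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1));
}

// Hypothetical poller: fetches immediately, then every intervalMs.
// Returns a function that stops polling.
function startUsagePolling(
  fetchUsage: () => Promise<UsageResponse>,
  onUpdate: (usage: UsageResponse) => void,
  intervalMs = 30_000,
): () => void {
  const tick = () => {
    // Swallow transient errors so one failed poll does not stop the gauge.
    fetchUsage().then(onUpdate).catch(() => {});
  };
  tick();
  const timer = setInterval(tick, intervalMs);
  return () => clearInterval(timer);
}
```

Passing fetchUsage in as a parameter keeps the poller independent of how the bearer token is attached, so the same function can be exercised in tests with a stub.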

The default free-tier budget is controlled by the broker environment variable AI_DAILY_BUDGET (default 2000000 tokens per user per day). Rate limits on the shared proxy path are 10 requests per minute and 2 concurrent streams per user.
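One way to enforce limits like these is a per-user sliding window combined with a concurrency counter. The sketch below is illustrative only: the class name UserLimiter and the sliding-window approach are assumptions, and the broker may implement its limits differently.

```typescript
// Hypothetical per-user limiter: at most maxPerMinute requests in any
// 60-second window and at most maxConcurrent requests in flight.
class UserLimiter {
  private timestamps: number[] = [];
  private active = 0;
  private readonly maxPerMinute: number;
  private readonly maxConcurrent: number;

  constructor(maxPerMinute = 10, maxConcurrent = 2) {
    this.maxPerMinute = maxPerMinute;
    this.maxConcurrent = maxConcurrent;
  }

  // Returns true and records the request if both limits allow it.
  tryAcquire(now: number = Date.now()): boolean {
    const cutoff = now - 60_000;
    // Drop requests that have aged out of the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.maxPerMinute) return false;
    if (this.active >= this.maxConcurrent) return false;
    this.timestamps.push(now);
    this.active += 1;
    return true;
  }

  // Call when a request or stream finishes, whether it succeeded or failed.
  release(): void {
    if (this.active > 0) this.active -= 1;
  }
}
```

A caller would keep one UserLimiter per user id, call tryAcquire before forwarding a request, and call release in a finally block once the stream closes.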

Broker endpoints

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /ai/chat | Non-streaming completion (OpenAI-compatible response) |
| POST | /ai/chat/stream | SSE streaming (OpenAI-compatible chunks) |
| POST | /ai/v1/chat/completions | OpenAI-compatible path for OpenCode createOpenAICompatible clients; honors stream: true in the body |
| GET | /ai/usage | Current user's daily usage (InstantDB bearer required) |

All chat routes accept an OpenAI-style message list. An optional body.apiKey enables BYOK (bring your own key): the request bypasses the gateway and calls Google directly with the supplied key. Metering and rate limits are skipped on the BYOK path.
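The BYOK branch can be pictured as a routing decision made on the request body. The following is a minimal sketch, assuming the body shape shown here and treating an empty apiKey as absent; the names ChatBody, Route, and chooseRoute are hypothetical and not taken from ai-proxy.ts.

```typescript
// Assumed request body for the chat routes; only apiKey is named in the text.
interface ChatBody {
  messages: { role: string; content: string }[];
  stream?: boolean;
  apiKey?: string;
}

// Where the request goes and whether metering and rate limits apply.
type Route =
  | { kind: "byok"; metered: false; apiKey: string }
  | { kind: "gateway"; metered: true };

function chooseRoute(body: ChatBody): Route {
  // Assumption: a blank key is treated the same as no key.
  const key = body.apiKey?.trim();
  if (key) {
    // BYOK: call the provider directly with the caller's key, unmetered.
    return { kind: "byok", metered: false, apiKey: key };
  }
  // Shared path: route through the gateway with metering and rate limits.
  return { kind: "gateway", metered: true };
}
```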

Operator configuration

Deploy from the cloud/ repo (broker project on Vercel). Required env vars for the shared default model:

| Variable | Purpose |
| --- | --- |
| AI_GATEWAY_API_KEY | Vercel AI Gateway key (server-side only) |
| GEMINI_DEFAULT_MODEL | Gateway model slug (for example google/gemini-2.5-flash-lite) |
| AI_DAILY_BUDGET | Per-user daily token cap (default 2000000) |
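For illustration, a broker could validate these variables at startup along the following lines. This is a sketch under assumptions: loadBrokerAiConfig and BrokerAiConfig are hypothetical names, and failing fast on a missing key or model slug is a design choice made here, not documented broker behavior.

```typescript
// Hypothetical startup validation for the broker's AI settings.
interface BrokerAiConfig {
  gatewayApiKey: string;
  defaultModel: string;
  dailyBudget: number;
}

function loadBrokerAiConfig(env: Record<string, string | undefined>): BrokerAiConfig {
  const gatewayApiKey = env.AI_GATEWAY_API_KEY;
  if (!gatewayApiKey) throw new Error("AI_GATEWAY_API_KEY is required");

  const defaultModel = env.GEMINI_DEFAULT_MODEL;
  if (!defaultModel) throw new Error("GEMINI_DEFAULT_MODEL is required");

  // AI_DAILY_BUDGET is optional; the documented default is 2000000 tokens.
  const rawBudget = env.AI_DAILY_BUDGET;
  const dailyBudget = rawBudget === undefined || rawBudget === "" ? 2_000_000 : Number(rawBudget);
  if (!Number.isInteger(dailyBudget) || dailyBudget <= 0) {
    throw new Error("AI_DAILY_BUDGET must be a positive integer");
  }

  return { gatewayApiKey, defaultModel, dailyBudget };
}
```

In a Bun or Node process this would be called once as loadBrokerAiConfig(process.env), so a misconfigured deploy fails at boot rather than on the first user request.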

Set a spend cap in the Vercel AI Gateway dashboard before enabling production traffic. See cloud/DEPLOY.md in the trellis-cloud repo for full broker and web deploy steps.

Verification

Unit tests (CI-safe, no live gateway): cloud/src/ai-proxy.test.ts mocks the AI SDK and covers model aliases, message conversion, auth, rate limits, budget caps, streaming SSE shape, and /ai/usage. The file runs as part of bun test src/, and the broker package passes bun run typecheck. Studio covers the default model constants in studio/packages/app/src/lib/opencode-zen-model.test.ts.

cd cloud && bun run typecheck
cd cloud && bun test src/ai-proxy.test.ts
cd studio/packages/app && bun test src/lib/opencode-zen-model.test.ts

Manual smoke (requires AI_GATEWAY_API_KEY in cloud/.env): hits the real Vercel AI Gateway and the full broker proxy path.

cd cloud && bun run test:ai-smoke

Source layout

| Path | Role |
| --- | --- |
| cloud/src/ai-proxy.ts | Gateway routing, metering, SSE, BYOK |
| cloud/src/ai-proxy.test.ts | Unit tests for ai-proxy routes |
| cloud/scripts/smoke-ai-gateway.ts | Manual gateway key smoke test |
| cloud/scripts/smoke-ai-proxy.ts | Manual broker proxy smoke test |
| cloud/src/server.ts | Mounts /ai/* routes |
| studio/packages/opencode/src/provider/provider.ts | trellis-cloud provider loader |
| studio/packages/app/src/context/cloud-provider-sync.tsx | Pushes iframe auth token to OpenCode |
| studio/packages/app/src/lib/opencode-zen-model.ts | Default cloud model constants |
| studio/packages/app/src/lib/opencode-zen-model.test.ts | Unit tests for hosted model keys |