Quick Start
Once the daemon is running, Lattis listens on 127.0.0.1:1234 by default. Point
any OpenAI- or Anthropic-compatible client at it.
1. Confirm it’s up
    curl http://127.0.0.1:1234/health
    # {"status":"ok"}

2. Add a model
Open the app and either:
- Download a local model from the Library (a GGUF from Hugging Face), or
- Connect a cloud provider (Anthropic or OpenAI / Codex) — see Cloud Providers.
List what is currently available:
    curl http://127.0.0.1:1234/v1/models

This merges your local models and any connected cloud models.
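To pick a model id programmatically, the listing can also be read from a script. The following is a minimal sketch using only the Python standard library. It assumes the endpoint returns the OpenAI-style list shape, `{"data": [{"id": "..."}]}`, which this guide does not confirm; the helper names `extract_model_ids` and `list_model_ids` are illustrative and not part of Lattis.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:1234"  # default address from this guide


def extract_model_ids(payload: dict) -> list[str]:
    """Pull model ids out of a /v1/models response body.

    Assumption: the body follows the OpenAI-style shape
    {"data": [{"id": "..."}, ...]}. The real response may differ.
    """
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_model_ids(base_url: str = BASE_URL, timeout: float = 5.0) -> list[str]:
    """Fetch /v1/models from a running daemon and return the model ids."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
        return extract_model_ids(json.load(resp))
```

With the daemon running, `print(list_model_ids())` should show the ids you can pass as the model in the requests that follow. Keeping the parsing in its own function makes it easy to adjust if the actual response shape differs from the assumed one.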
3. Send a request
OpenAI-compatible:
    curl http://127.0.0.1:1234/v1/chat/completions \
      -H 'Content-Type: application/json' \
      -d '{
        "model": "qwen3-4b-instruct-2507",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'

Anthropic-compatible:
    curl http://127.0.0.1:1234/v1/messages \
      -H 'Content-Type: application/json' \
      -d '{
        "model": "qwen3-4b-instruct-2507",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Hello!"}]
      }'

4. Switch to a cloud model
To use a connected cloud model, pass its id as the model field. Lattis routes
the request to the provider and translates formats as needed:
    curl http://127.0.0.1:1234/v1/chat/completions \
      -H 'Content-Type: application/json' \
      -d '{
        "model": "claude-opus-4-8",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'

Point your SDK at it
Any OpenAI or Anthropic SDK works; just set the base URL:
    from openai import OpenAI

    client = OpenAI(base_url="http://127.0.0.1:1234/v1", api_key="local")
    resp = client.chat.completions.create(
        model="qwen3-4b-instruct-2507",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(resp.choices[0].message.content)

Next: the full Public API, or learn about Local Models and Usage & Cost.