OpenAI¶
The OpenAI provider connects locus directly to OpenAI's API
(api.openai.com). It's what you reach for when you want the latest
OpenAI model the day it ships without going through any gateway,
translation layer, or middleware.
It's also the fastest way to try locus — one env var, one line of code, you're talking to GPT-5 or the o-series reasoning models.
When to pick OpenAI¶
| You want… | This is the right provider |
|---|---|
| GPT-5, GPT-4o, or any latest OpenAI release | ✓ |
| The o-series reasoning models (o3, o4-mini) | ✓ |
| To go through Azure / Portkey / LiteLLM / vLLM | ✓ — same class, different base_url |
| Claude or Llama | use Anthropic or OCI instead |
| To run on Oracle infrastructure | use OCI — you'll get the same OpenAI models without a separate key |
Getting started¶
1. Set your API key¶
That's the only setup. locus reads the env var on import.
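For example, in a POSIX shell (the key value below is a placeholder):

```shell
# locus picks this up from the environment at import time.
export OPENAI_API_KEY="sk-...your-key..."
```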
2. Pick a model¶
The string "openai:gpt-5.5" encodes two things: which provider locus
should use (the openai: prefix) and which model id to call
(gpt-5.5). Any model id OpenAI accepts, locus accepts.
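To illustrate the convention (a sketch of the idea, not locus's actual parser), the provider and model id are just the two halves of the string around the first colon:

```python
def split_model_string(spec: str) -> tuple[str, str]:
    """Split a "provider:model-id" spec into its two halves.

    Illustrative only -- locus's real parsing may differ.
    """
    provider, _, model_id = spec.partition(":")
    if not model_id:
        raise ValueError(f"expected 'provider:model-id', got {spec!r}")
    return provider, model_id

print(split_model_string("openai:gpt-5.5"))  # ('openai', 'gpt-5.5')
```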
3. Run it¶
Done. Streaming, tool calls, structured output — all of it works without further configuration.
What you get out of the box¶
Chat completions across the GPT family¶
Every chat-shaped OpenAI model: gpt-4o, gpt-4.1, gpt-5, gpt-5.5,
gpt-image-1. Vision input (image URLs / base64), audio input, and
function calling work the same way you'd use them on the OpenAI SDK
directly — locus just normalises the events the model emits.
Reasoning models — the o-series¶
o1, o3, o4-mini route through the same Agent(model="openai:o3")
call. They're slower and more expensive but think before they answer.
locus surfaces the model's thinking blocks as ThinkEvents so your
UI can show "thinking…" without parsing the response yourself.
```python
agent = Agent(
    model="openai:o3",
    model_config={"reasoning_effort": "high"},  # low | medium | high
)
```
reasoning_effort is OpenAI's knob for how long the model spends
thinking. Default is medium.
Real SSE streaming¶
Token-level streaming over Server-Sent Events. The model emits
deltas, locus turns them into ModelChunkEvents, your async for
loop reads them as they arrive — no buffering, no fake chunking.
```python
async for event in agent.run("Write a haiku about latency."):
    if isinstance(event, ModelChunkEvent) and event.content:
        print(event.content, end="", flush=True)
```
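The mechanics behind that loop can be sketched with plain asyncio. This is a toy stand-in: the deltas are hard-coded and ModelChunkEvent here is a minimal local class, not locus's:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class ModelChunkEvent:
    # Minimal stand-in for locus's event type.
    content: str

async def fake_stream():
    # Simulates SSE deltas arriving one token at a time.
    for delta in ["Packets ", "wait in ", "quiet queues"]:
        await asyncio.sleep(0)  # yield control, as a real socket read would
        yield ModelChunkEvent(content=delta)

async def consume() -> str:
    text = ""
    async for event in fake_stream():
        if event.content:
            text += event.content  # a real UI would print incrementally here
    return text

print(asyncio.run(consume()))
```

The point is that nothing is buffered: each delta is handed to the consumer the moment it arrives.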
Tool calling — the OpenAI protocol¶
@tool functions are converted to OpenAI's tool-call schema and
the structured tool_calls field in the response is parsed back into
locus ToolCall objects. Parallel tool calls are supported (the
model can request multiple tools per turn; locus runs them
concurrently via the ConcurrentExecutor).
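Concurrent execution of parallel tool calls can be sketched with asyncio.gather. The tool functions below are hypothetical, and ConcurrentExecutor's real ordering and error handling may differ:

```python
import asyncio

# Hypothetical tool implementations standing in for @tool functions.
async def get_weather(city: str) -> str:
    await asyncio.sleep(0)  # pretend network I/O
    return f"{city}: sunny"

async def get_time(city: str) -> str:
    await asyncio.sleep(0)
    return f"{city}: 12:00"

async def run_tool_calls(calls):
    # Run every tool the model requested concurrently,
    # preserving the request order in the results.
    return await asyncio.gather(*(fn(arg) for fn, arg in calls))

results = asyncio.run(run_tool_calls([(get_weather, "Oslo"), (get_time, "Oslo")]))
print(results)  # ['Oslo: sunny', 'Oslo: 12:00']
```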
Structured output — Pydantic models in, validated objects out¶
```python
from pydantic import BaseModel

class Answer(BaseModel):
    summary: str
    confidence: float

agent = Agent(
    model="openai:gpt-5.5",
    output_schema=Answer,
    system_prompt="Reply as JSON matching the schema.",
)

result = agent.run_sync("Was the meeting productive?")
print(result.parsed)  # Answer(summary='...', confidence=0.83)
```
Under the hood, locus sends an OpenAI response_format with the
schema and a strict-mode flag; if the model produces invalid JSON,
locus retries with the validation errors in the prompt
(output_schema_retries=2 by default).
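The retry loop can be sketched with stdlib json. This is a simplification: the hand-written validator and the way errors are fed back are illustrative stand-ins, not locus's internals:

```python
import json

def validate_answer(raw: str) -> dict:
    # Stand-in validator: parse JSON and check the two required fields.
    obj = json.loads(raw)
    for field, typ in (("summary", str), ("confidence", float)):
        if not isinstance(obj.get(field), typ):
            raise ValueError(f"field {field!r} missing or wrong type")
    return obj

def parse_with_retries(responses, retries=2):
    # `responses` simulates successive model outputs; on a validation
    # failure the error would be fed back into the next prompt.
    last_error = None
    for raw in responses[: retries + 1]:
        try:
            return validate_answer(raw)
        except (ValueError, json.JSONDecodeError) as exc:
            last_error = exc  # in locus this goes back into the prompt
    raise RuntimeError(f"validation failed after retries: {last_error}")

good = parse_with_retries(['not json', '{"summary": "yes", "confidence": 0.83}'])
print(good)
```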
Going through a gateway¶
A base_url override turns OpenAIModel into a client for any
OpenAI-compatible endpoint:
| Gateway | When to use it | base_url |
|---|---|---|
| Azure OpenAI | Enterprise / regulated workloads, Azure billing | https://<resource>.openai.azure.com/openai/deployments/<deployment-id> |
| Portkey | Virtual keys, request routing across providers, retries | https://api.portkey.ai/v1 |
| LiteLLM Proxy | Self-hosted control plane in front of N providers | https://<your-litellm-host>/v1 |
| vLLM | Self-hosted inference for open models with the OpenAI shape | http://localhost:8000/v1 |
| together.ai / fireworks / groq | Hosted open-model inference at OpenAI-shape | their published /v1 |
Whatever OPENAI_API_KEY holds is forwarded as the api_key: for Azure
that's the Azure resource key, for Portkey it's the Portkey virtual
key, and so on.
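To see why one class can cover all of these, note that OpenAI-compatible endpoints share the same relative paths and only the host changes. A sketch (the exact path shapes can differ per gateway, notably Azure's deployment-scoped URLs):

```python
def chat_completions_url(base_url: str) -> str:
    # OpenAI-compatible endpoints share the same relative path;
    # only base_url changes per gateway.
    return base_url.rstrip("/") + "/chat/completions"

print(chat_completions_url("https://api.openai.com/v1"))
print(chat_completions_url("http://localhost:8000/v1"))  # e.g. vLLM
```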
Common gotchas¶
| Symptom | Likely cause |
|---|---|
| 401 Unauthorized | OPENAI_API_KEY not set, or set to the wrong project's key |
| 429 Rate limit exceeded | OpenAI quota; ModelRetryHook (if installed) retries with backoff |
| model_not_found | Model id doesn't exist for your tier — check https://platform.openai.com/docs/models |
| Empty tool_calls | Model decided not to call a tool; check the system prompt |
| reasoning_effort rejected | Only valid for o-series models, not GPT-4o / GPT-5 |
Source¶
OpenAIModel in src/locus/models/native/openai.py
See also¶
- Models overview — the full provider tree.
- Anthropic — Claude family direct.
- OCI Generative AI — same OpenAI models without a separate key, on Oracle infrastructure.