OpenAI¶
The OpenAI provider connects locus directly to OpenAI's API
(api.openai.com). It's what you reach for when you want the latest
OpenAI model the day it ships without going through any gateway,
translation layer, or middleware.
It's also the fastest way to try locus — one env var, one line of code, you're talking to GPT-5 or the o-series reasoning models.
When to pick OpenAI¶
| You want… | This is the right provider |
|---|---|
| GPT-5, GPT-4o, or any latest OpenAI release | ✓ |
| The o-series reasoning models (o3, o4-mini) | ✓ |
| To go through Azure / Portkey / LiteLLM / vLLM | ✓ — same class, different base_url |
| Claude or Llama | use Anthropic or OCI instead |
| To run on Oracle infrastructure | use OCI — you'll get the same OpenAI models without a separate key |
Getting started¶
1. Set your API key¶
That's the only setup. locus reads the env var on import.
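For example, in a POSIX shell (the key value below is a placeholder):

```shell
# locus picks this up from the environment at import time.
export OPENAI_API_KEY="sk-...your-key..."
```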
2. Pick a model¶
The string "openai:gpt-5.5" encodes two things: which provider locus
should use (the openai: prefix) and which model id to call
(gpt-5.5). Any model id OpenAI accepts, locus accepts.
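To illustrate the convention (a sketch of the idea, not locus's actual parser), the provider and model id are just the two halves of the string around the first colon:

```python
def split_model_string(spec: str) -> tuple[str, str]:
    """Split a "provider:model-id" spec into its two halves.

    Illustrative only -- locus's real parsing may differ.
    """
    provider, _, model_id = spec.partition(":")
    if not model_id:
        raise ValueError(f"expected 'provider:model-id', got {spec!r}")
    return provider, model_id

print(split_model_string("openai:gpt-5.5"))  # ('openai', 'gpt-5.5')
```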
3. Run it¶
Done. Streaming, tool calls, structured output — all of it works without further configuration.
What you get out of the box¶
Chat completions across the GPT family¶
Every chat-shaped OpenAI model: gpt-4o, gpt-4.1, gpt-5, gpt-5.5,
gpt-image-1. Vision input (image URLs / base64), audio input, and
function calling work the same way you'd use them on the OpenAI SDK
directly — locus just normalises the events the model emits.
Reasoning models — the o-series¶
o1, o3, o4-mini route through the same Agent(model="openai:o3")
call. They're slower and more expensive but think before they answer.
locus surfaces the model's thinking blocks as ThinkEvents so your
UI can show "thinking…" without parsing the response yourself.
```python
agent = Agent(
    model="openai:o3",
    model_config={"reasoning_effort": "high"},  # low | medium | high
)
```
reasoning_effort is OpenAI's knob for how long the model spends
thinking. Default is medium.
Real SSE streaming¶
Token-level streaming over Server-Sent Events. The model emits
deltas, locus turns them into ModelChunkEvents, your async for
loop reads them as they arrive — no buffering, no fake chunking.
```python
async for event in agent.run("Write a haiku about latency."):
    if isinstance(event, ModelChunkEvent) and event.content:
        print(event.content, end="", flush=True)
```
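The mechanics behind that loop can be sketched with plain asyncio. This is a toy stand-in: the deltas are hard-coded and ModelChunkEvent here is a minimal local class, not locus's:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class ModelChunkEvent:
    # Minimal stand-in for locus's event type.
    content: str

async def fake_stream():
    # Simulates SSE deltas arriving one token at a time.
    for delta in ["Packets ", "wait in ", "quiet queues"]:
        await asyncio.sleep(0)  # yield control, as a real socket read would
        yield ModelChunkEvent(content=delta)

async def consume() -> str:
    text = ""
    async for event in fake_stream():
        if event.content:
            text += event.content  # a real UI would print incrementally here
    return text

print(asyncio.run(consume()))
```

The point is that nothing is buffered: each delta is handed to the consumer the moment it arrives.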
Tool calling — the OpenAI protocol¶
@tool functions are converted to OpenAI's tool-call schema and
the structured tool_calls field in the response is parsed back into
locus ToolCall objects. Parallel tool calls are supported (the
model can request multiple tools per turn; locus runs them
concurrently via the ConcurrentExecutor).
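Concurrent execution of parallel tool calls can be sketched with asyncio.gather. The tool functions below are hypothetical, and ConcurrentExecutor's real ordering and error handling may differ:

```python
import asyncio

# Hypothetical tool implementations standing in for @tool functions.
async def get_weather(city: str) -> str:
    await asyncio.sleep(0)  # pretend network I/O
    return f"{city}: sunny"

async def get_time(city: str) -> str:
    await asyncio.sleep(0)
    return f"{city}: 12:00"

async def run_tool_calls(calls):
    # Run every tool the model requested concurrently,
    # preserving the request order in the results.
    return await asyncio.gather(*(fn(arg) for fn, arg in calls))

results = asyncio.run(run_tool_calls([(get_weather, "Oslo"), (get_time, "Oslo")]))
print(results)  # ['Oslo: sunny', 'Oslo: 12:00']
```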
Structured output — Pydantic models in, validated objects out¶
```python
from pydantic import BaseModel

class Answer(BaseModel):
    summary: str
    confidence: float

agent = Agent(
    model="openai:gpt-5.5",
    output_schema=Answer,
    system_prompt="Reply as JSON matching the schema.",
)

result = agent.run_sync("Was the meeting productive?")
print(result.parsed)  # Answer(summary='...', confidence=0.83)
```
Under the hood, locus sends an OpenAI response_format with the
schema and a strict-mode flag; if the model produces invalid JSON,
locus retries with the validation errors in the prompt
(output_schema_retries=2 by default).
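The retry loop can be sketched with stdlib json. This is a simplification: the hand-written validator and the way errors are fed back are illustrative stand-ins, not locus's internals:

```python
import json

def validate_answer(raw: str) -> dict:
    # Stand-in validator: parse JSON and check the two required fields.
    obj = json.loads(raw)
    for field, typ in (("summary", str), ("confidence", float)):
        if not isinstance(obj.get(field), typ):
            raise ValueError(f"field {field!r} missing or wrong type")
    return obj

def parse_with_retries(responses, retries=2):
    # `responses` simulates successive model outputs; on a validation
    # failure the error would be fed back into the next prompt.
    last_error = None
    for raw in responses[: retries + 1]:
        try:
            return validate_answer(raw)
        except (ValueError, json.JSONDecodeError) as exc:
            last_error = exc  # in locus this goes back into the prompt
    raise RuntimeError(f"validation failed after retries: {last_error}")

good = parse_with_retries(['not json', '{"summary": "yes", "confidence": 0.83}'])
print(good)
```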
Going through a gateway¶
A base_url override turns OpenAIModel into a client for any
OpenAI-compatible endpoint:
| Gateway | When to use it | base_url |
|---|---|---|
| Azure OpenAI | Enterprise / regulated workloads, Azure billing | https://<resource>.openai.azure.com/openai/deployments/<deployment-id> |
| Portkey | Virtual keys, request routing across providers, retries | https://api.portkey.ai/v1 |
| LiteLLM Proxy | Self-hosted control plane in front of N providers | https://<your-litellm-host>/v1 |
| vLLM | Self-hosted inference for open models with the OpenAI shape | http://localhost:8000/v1 |
| together.ai / fireworks / groq | Hosted open-model inference at OpenAI-shape | their published /v1 |
Whatever OPENAI_API_KEY holds is forwarded as the api_key: for Azure
that's the Azure resource key, for Portkey it's the Portkey virtual
key, and so on.
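To see why one class can cover all of these, note that OpenAI-compatible endpoints share the same relative paths and only the host changes. A sketch (the exact path shapes can differ per gateway, notably Azure's deployment-scoped URLs):

```python
def chat_completions_url(base_url: str) -> str:
    # OpenAI-compatible endpoints share the same relative path;
    # only base_url changes per gateway.
    return base_url.rstrip("/") + "/chat/completions"

print(chat_completions_url("https://api.openai.com/v1"))
print(chat_completions_url("http://localhost:8000/v1"))  # e.g. vLLM
```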
Common gotchas¶
| Symptom | Likely cause |
|---|---|
| 401 Unauthorized | OPENAI_API_KEY not set, or set to the wrong project's key |
| 429 Rate limit exceeded | OpenAI quota; ModelRetryHook (if installed) retries with backoff |
| model_not_found | Model id doesn't exist for your tier — check https://platform.openai.com/docs/models |
| Empty tool_calls | Model decided not to call a tool; check the system prompt |
| reasoning_effort rejected | Only valid for o-series models, not GPT-4o / GPT-5 |
Source¶
OpenAIModel in src/locus/models/native/openai.py
See also¶
- Models overview — the full provider tree.
- Anthropic — Claude family direct.
- OCI Generative AI — same OpenAI models without a separate key, on Oracle infrastructure.