Models reference

GPT-5.4, GPT-5.5, Codex-5.3, GPT-5.4-mini, Fast and Priority tiers — cost coefficients, reasoning effort levels, selection guide.

Summary table

| Model | Purpose | Cost coefficient | Context window (tokens) |
| --- | --- | --- | --- |
| gpt-5.4 | Base GPT-5, balanced default | 1.0× | 256k |
| gpt-5.5 | Extended, better for long-context tasks | 1.4× | 400k |
| gpt-5.4-mini | Cheap and fast, for classification | 0.2× | 128k |
| codex-5.3 | Code-specialized for Codex CLI | 1.1× | 256k |

The billed price is the base OpenAI token price multiplied by the model's coefficient and then by the 1.09 margin.
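The pricing rule can be sketched in a few lines of Python. This is only an illustration: the coefficients and the 1.09 margin come from this page, while the function name `billed_price` and the base price of 10.00 are hypothetical placeholders, not real OpenAI prices.

```python
from decimal import Decimal

# Margin stated on this page.
MARGIN = Decimal("1.09")

# Cost coefficients from the summary table.
COEFFICIENTS = {
    "gpt-5.4": Decimal("1.0"),
    "gpt-5.5": Decimal("1.4"),
    "gpt-5.4-mini": Decimal("0.2"),
    "codex-5.3": Decimal("1.1"),
}


def billed_price(base_price, model):
    """Return base price x model coefficient x margin (hypothetical helper).

    base_price may be in any unit (for example USD per 1M tokens);
    the result is in the same unit.
    """
    return Decimal(str(base_price)) * COEFFICIENTS[model] * MARGIN


if __name__ == "__main__":
    base = "10.00"  # made-up base price, for illustration only
    for model in COEFFICIENTS:
        print(model, billed_price(base, model))
```

Decimal is used instead of float so that the multiplication does not introduce binary rounding noise into prices.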

GPT-5.4

The default general-purpose model. Good for chat, content generation, simple reasoning. Pick it when unsure — it offers the best price/quality trade-off.

{"model": "gpt-5.4", "messages": [...]}

GPT-5.5

Extended version with larger context and deeper reasoning. Use it for:

  • long documents (>100k tokens),
  • complex code analysis,
  • multi-step reasoning,
  • math and logic problems.

It costs 40% more than gpt-5.4 (1.4× against 1.0×), which pays off on tasks where gpt-5.4 underperforms.

GPT-5.4-mini

The cheapest model. Suitable for:

  • classification, tagging, labeling,
  • structured-data extraction from text,
  • simple fallback flows,
  • embedding-alternative use cases.

Do not use it for long-form generation or code — quality drops.

Codex-5.3

Specialized for Codex CLI and code generation. Supports the Responses API (POST /v1/responses) for interactive agent sessions. Pick it for:

  • IDE code completion,
  • patch/diff generation,
  • agent loops with tool use.

codex --model codex-5.3 "Generate a REST API in FastAPI"
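For callers that talk to the Responses API directly rather than through the CLI, the request might be assembled as below. This is a sketch under several assumptions: the base URL `https://api.example.com` is a placeholder, the `input` field and Bearer authentication mirror the OpenAI Responses API and are not confirmed by this page, and `build_responses_request` is a hypothetical helper. The request is only constructed, not sent.

```python
import json
import urllib.request

# Placeholder host: replace with the service's real base URL.
BASE_URL = "https://api.example.com"


def build_responses_request(prompt, api_key):
    """Build (but do not send) a POST /v1/responses request for codex-5.3."""
    body = {
        "model": "codex-5.3",
        # Assumption: the prompt goes in an "input" field,
        # as in the OpenAI Responses API.
        "input": prompt,
    }
    return urllib.request.Request(
        BASE_URL + "/v1/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: Bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; check the service's own API reference for the actual payload shape before relying on this.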

Tiers: Fast and Priority

You can choose a processing tier with the service_tier request parameter:

  • default — standard queue.
  • priority — higher priority, lower latency, +30% price.
  • flex — batched processing, −40% price, latency up to several minutes.

{"model": "gpt-5.4", "service_tier": "priority", "messages": [...]}
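A small helper can guard against typos in the tier name before a request goes out. The sketch below uses the three tier values and price adjustments listed above; the helper name `build_request` and the choice to send `service_tier` explicitly even for `default` are assumptions made for illustration.

```python
# Relative price per tier, from the list above (+30% and -40%).
TIER_PRICE_FACTOR = {"default": 1.0, "priority": 1.3, "flex": 0.6}


def build_request(model, messages, service_tier="default"):
    """Return a request payload, rejecting unknown tier names (hypothetical helper)."""
    if service_tier not in TIER_PRICE_FACTOR:
        allowed = ", ".join(sorted(TIER_PRICE_FACTOR))
        raise ValueError(f"unknown service_tier {service_tier!r}; expected one of: {allowed}")
    return {"model": model, "service_tier": service_tier, "messages": messages}


if __name__ == "__main__":
    payload = build_request(
        "gpt-5.4",
        [{"role": "user", "content": "Hello"}],
        service_tier="priority",
    )
    print(payload)
```

How the tier adjustment combines with the model coefficient and the margin is not stated on this page, so the factors are kept here as reference data only.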

Reasoning effort

GPT-5.x models accept a reasoning_effort parameter that controls internal reasoning depth:

| Value | Output-token multiplier | When to use |
| --- | --- | --- |
| minimal | 1.0× | Plain answers, chat, classification |
| low | 1.5× | Basic reasoning, typical tasks |
| medium | 2.5× | Complex analysis, multi-step logic |
| high | 4.0× | Math, proof checking, debugging |

{
  "model": "gpt-5.4",
  "reasoning_effort": "medium",
  "messages": [{"role": "user", "content": "Solve this equation..."}]
}

Higher effort spends more internal reasoning tokens, so the request costs more (see the output-token multiplier in the table above). Start with minimal and increase only if the response quality is insufficient.
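The multipliers and the "start low, then escalate" advice can be sketched as follows. The multipliers are taken from the table above; the reading that a multiplier scales the visible output-token count is an assumption, and both function names are hypothetical.

```python
# Output-token multipliers from the reasoning-effort table.
EFFORT_MULTIPLIER = {"minimal": 1.0, "low": 1.5, "medium": 2.5, "high": 4.0}

# Escalation order, cheapest first.
EFFORT_ORDER = ["minimal", "low", "medium", "high"]


def estimated_output_tokens(visible_tokens, effort):
    """Estimate billed output tokens, assuming the multiplier scales visible output."""
    return visible_tokens * EFFORT_MULTIPLIER[effort]


def next_effort(effort):
    """Return the next-higher effort level, or None if already at the top."""
    i = EFFORT_ORDER.index(effort)
    return EFFORT_ORDER[i + 1] if i + 1 < len(EFFORT_ORDER) else None


if __name__ == "__main__":
    # A 500-token answer at each level, under the assumption above.
    for level in EFFORT_ORDER:
        print(level, estimated_output_tokens(500, level))
```

A caller could retry with `next_effort(current)` only when its own quality check fails, which keeps most requests at the cheapest level.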

How to choose

  1. General-purpose chat / agent: gpt-5.4 + minimal.
  2. Long PDF / large-repo analysis: gpt-5.5 + medium.
  3. Codex CLI / IDE assistant: codex-5.3.
  4. Bulk classification: gpt-5.4-mini + minimal.
  5. Production agent with tool use: codex-5.3 + low/medium.
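The guide above maps naturally onto a lookup table. In this sketch the task keys and the helper `request_defaults` are hypothetical names; for the production-agent case, where the guide allows low or medium, `low` is chosen as the starting point, and for the Codex CLI case no effort is set because the guide does not name one.

```python
# (model, reasoning_effort) per task, following the selection guide.
# None means the guide does not specify an effort level.
RECOMMENDATIONS = {
    "chat": ("gpt-5.4", "minimal"),
    "long_document": ("gpt-5.5", "medium"),
    "codex_cli": ("codex-5.3", None),
    "bulk_classification": ("gpt-5.4-mini", "minimal"),
    "production_agent": ("codex-5.3", "low"),  # guide allows low or medium
}


def request_defaults(task):
    """Return the model (and effort, if any) recommended for a task key."""
    model, effort = RECOMMENDATIONS[task]
    payload = {"model": model}
    if effort is not None:
        payload["reasoning_effort"] = effort
    return payload


if __name__ == "__main__":
    for task in RECOMMENDATIONS:
        print(task, request_defaults(task))
```

The returned dictionary can be merged with a `messages` list to form a full request body.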

See Billing for token-cost details.