API reference

Full Codex Key endpoint reference: /v1/chat/completions, /v1/responses, /v1/models. Parameters, request and response examples, error codes.

May 19, 2026

Base URL

https://codexkey.ru/v1

All endpoints are OpenAI-compatible, so you can use the official OpenAI SDK by overriding `base_url`.
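
A minimal sketch of the `base_url` override, assuming the official `openai` Python package (version 1.x or later) is installed. The helper names `client_kwargs` and `make_client` are hypothetical and exist only for this illustration.

```python
BASE_URL = "https://codexkey.ru/v1"  # base URL taken from this reference


def client_kwargs(api_key: str) -> dict:
    """Keyword arguments that point an OpenAI-style client at Codex Key."""
    return {"api_key": api_key, "base_url": BASE_URL}


def make_client(api_key: str):
    """Build an SDK client. The import is lazy so this module loads without the SDK."""
    from openai import OpenAI  # assumption: installed via `pip install openai`

    return OpenAI(**client_kwargs(api_key))
```

With a client built this way, the usual SDK calls such as `client.chat.completions.create(...)` should be sent to Codex Key instead of the default host.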

POST /v1/chat/completions

Main endpoint for GPT-5.x chat models and gpt-5.4-mini.

Request parameters

Parameter	Type	Required	Description
`model`	string	yes	Model identifier (`gpt-5.4`, `gpt-5.5`, …).
`messages`	array	yes	Array of `{role, content}` messages.
`temperature`	number	no	0.0–2.0. Default 1.0.
`top_p`	number	no	Nucleus sampling, 0.0–1.0.
`max_tokens`	integer	no	Maximum tokens in the response.
`stream`	boolean	no	If `true`, the response is streamed as Server-Sent Events (SSE).
`reasoning_effort`	string	no	`minimal` / `low` / `medium` / `high`.
`service_tier`	string	no	`default` / `priority` / `flex`.
`tools`	array	no	Function declarations for tool use.
`tool_choice`	string | object	no	`auto` / `none` / `{type: "function", ...}`.
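
The ranges and enumerations in the table can be checked on the client before a request is sent. The sketch below is one possible way to do that; `build_chat_payload` is a hypothetical helper, and whether the server rejects out-of-range values in the same way is an assumption.

```python
REASONING_EFFORTS = {"minimal", "low", "medium", "high"}
SERVICE_TIERS = {"default", "priority", "flex"}


def build_chat_payload(model, messages, **options):
    """Return a JSON-ready dict for POST /v1/chat/completions.

    Raises ValueError when a value falls outside the documented range.
    """
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")

    temperature = options.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be within 0.0-2.0")

    top_p = options.get("top_p")
    if top_p is not None and not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be within 0.0-1.0")

    effort = options.get("reasoning_effort")
    if effort is not None and effort not in REASONING_EFFORTS:
        raise ValueError(f"unknown reasoning_effort: {effort!r}")

    tier = options.get("service_tier")
    if tier is not None and tier not in SERVICE_TIERS:
        raise ValueError(f"unknown service_tier: {tier!r}")

    payload = {"model": model, "messages": list(messages)}
    # Optional parameters are sent only when explicitly set.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```

Passing the result to `json.dumps` gives a request body with the same shape as the curl example in this section.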

Example

curl https://codexkey.ru/v1/chat/completions \
  -H "Authorization: Bearer sk-clb-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write hello world in Rust."}
    ],
    "temperature": 0.2,
    "max_tokens": 500
  }'

Response

{
  "id": "chatcmpl-9X...",
  "object": "chat.completion",
  "created": 1747641600,
  "model": "gpt-5.4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "fn main() { println!(\"Hello, world!\"); }"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 14,
    "reasoning_tokens": 0,
    "total_tokens": 38
  }
}
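
Reading the fields of the response above could look like the following sketch. The function names are hypothetical, and the code relies only on the fields shown in the sample response.

```python
def first_choice_text(response: dict) -> str:
    """Return the assistant text of the first choice."""
    return response["choices"][0]["message"]["content"]


def finished_normally(response: dict) -> bool:
    """True when the first choice ended with finish_reason "stop"."""
    return response["choices"][0]["finish_reason"] == "stop"


def token_usage(response: dict) -> tuple:
    """Return (prompt_tokens, completion_tokens, total_tokens)."""
    usage = response["usage"]
    return (
        usage["prompt_tokens"],
        usage["completion_tokens"],
        usage["total_tokens"],
    )
```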

POST /v1/responses

Endpoint for Codex models and agent sessions with tool use. Supports streamed events and stateful mode.

Parameters

Parameter	Type	Required	Description
`model`	string	yes	`codex-5.3` or `gpt-5.4` with Responses API support.
`input`	string | array	yes	Prompt text or array of messages.
`instructions`	string	no	System instructions.
`tools`	array	no	Functions / built-in tools.
`previous_response_id`	string	no	ID of a previous response; continues that session.
`stream`	boolean	no	If `true`, the response is streamed as SSE events.
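
When `stream` is enabled, the body arrives as Server-Sent Events. The sketch below parses the generic SSE wire format (`data:` lines, with a blank line ending each event) from any iterable of text lines. The JSON payloads and the literal `[DONE]` end marker follow a common convention in OpenAI-compatible streams; whether Codex Key sends that marker is an assumption, and the function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates the current event
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):  # comment or keep-alive line
            continue
        elif line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):  # one optional space after the colon
                value = value[1:]
            buffer.append(value)
    if buffer:  # lenient: also emit a final event that lacks a blank line
        yield "\n".join(buffer)


def iter_json_events(lines: Iterable[str]) -> Iterator[dict]:
    """Decode each event payload as JSON, stopping at the assumed end marker."""
    for data in iter_sse_data(lines):
        if data == "[DONE]":  # assumption: OpenAI-style end-of-stream sentinel
            return
        yield json.loads(data)
```

The `lines` argument can be any line iterator, for example an HTTP response body decoded line by line.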

Example

curl https://codexkey.ru/v1/responses \
  -H "Authorization: Bearer sk-clb-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "codex-5.3",
    "input": "Create a file hello.py with print(\"hi\")"
  }'

Response

{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1747641600,
  "model": "codex-5.3",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{"type": "output_text", "text": "Done. File hello.py created."}]
    }
  ],
  "usage": {
    "input_tokens": 18,
    "output_tokens": 12,
    "reasoning_tokens": 4,
    "total_tokens": 34
  }
}
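
A sketch of how a client might read the output text from this response and continue the session with `previous_response_id`. Both helper names are hypothetical, and reusing the previous response's `model` for the follow-up is a choice made for this illustration.

```python
def output_text(response: dict) -> str:
    """Concatenate every output_text part from a /v1/responses result."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items such as tool calls
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block["text"])
    return "".join(parts)


def follow_up_payload(previous: dict, user_input: str) -> dict:
    """Build the next request body so it continues the previous response."""
    return {
        "model": previous["model"],
        "input": user_input,
        "previous_response_id": previous["id"],
    }
```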

GET /v1/models

Returns the list of models available to the current key.

curl https://codexkey.ru/v1/models \
  -H "Authorization: Bearer sk-clb-..."

{
  "object": "list",
  "data": [
    {"id": "gpt-5.4", "object": "model", "owned_by": "codexkey"},
    {"id": "gpt-5.5", "object": "model", "owned_by": "codexkey"},
    {"id": "gpt-5.4-mini", "object": "model", "owned_by": "codexkey"},
    {"id": "codex-5.3", "object": "model", "owned_by": "codexkey"}
  ]
}
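
Because the list depends on the key, a client may want to choose a model at runtime. The following sketch picks the first available model from a preference list; `available_model_ids` and `pick_model` are hypothetical helpers.

```python
def available_model_ids(listing: dict) -> list:
    """Extract model IDs from a GET /v1/models response."""
    return [m["id"] for m in listing.get("data", []) if m.get("object") == "model"]


def pick_model(listing: dict, preferred: list):
    """Return the first preferred model the key can use, or None."""
    ids = set(available_model_ids(listing))
    for name in preferred:
        if name in ids:
            return name
    return None
```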

Error codes

Code	Meaning	What to do
200	OK	Success.
400	Bad Request	Check JSON and required parameters.
401	Unauthorized	Verify the `Authorization` header and the key.
402	Payment Required	Balance depleted; top up at `/cabinet`.
429	Too Many Requests	Back off and retry, respecting the `Retry-After` header.
500	Internal Server Error	Transient error — retry with delay.
503	Service Unavailable	Upstream overload — retry in 5–30 seconds.
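
The retry guidance in the table could be turned into a delay policy along these lines. This is a sketch under assumptions: the base delays, the exponential growth, and the jitter are illustrative choices, and `retry_delay` is a hypothetical helper. Only the numeric (seconds) form of `Retry-After` is handled.

```python
import random

RETRYABLE = {429, 500, 503}


def retry_delay(status, attempt, retry_after=None, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based), or None to give up."""
    if status not in RETRYABLE:
        return None  # 400, 401, 402: fix the request, key, or balance instead
    if status == 429 and retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not parsed here; fall back to backoff
    # 503 starts at 5 s to match the documented 5-30 s window (assumed mapping).
    base = 5.0 if status == 503 else 1.0
    cap = 30.0
    delay = min(cap, base * 2 ** attempt)
    # Add up to one `base` of jitter so parallel clients do not retry in lockstep.
    return min(cap, delay + rng() * base)
```

A caller would sleep for the returned number of seconds and resend, stopping after a fixed number of attempts.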

Error format

{
  "error": {
    "type": "insufficient_quota",
    "code": "balance_zero",
    "message": "Account balance is zero. Top up at https://codexkey.ru/cabinet/billing.",
    "param": null
  }
}
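
An error body in this format can be mapped to an exception so that callers can branch on `type` and `code`. The class name `CodexKeyError` and the function `parse_error` are hypothetical, and the fallback for bodies that are not in this format is an assumption.

```python
import json


class CodexKeyError(Exception):
    """Hypothetical exception carrying the fields of the error object."""

    def __init__(self, status, type_, code, message, param=None):
        super().__init__(f"{status} {type_}/{code}: {message}")
        self.status = status
        self.type = type_
        self.code = code
        self.message = message
        self.param = param


def parse_error(status: int, body: str) -> CodexKeyError:
    """Build a CodexKeyError from an HTTP status and a raw response body."""
    try:
        err = json.loads(body).get("error") or {}
    except (ValueError, AttributeError):
        err = {}  # body was not JSON, or not a JSON object
    if not isinstance(err, dict):
        err = {"message": str(err)}
    return CodexKeyError(
        status,
        err.get("type", "unknown"),
        err.get("code", "unknown"),
        err.get("message", body),
        err.get("param"),
    )
```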

Rate limits

Default per-key limits:

60 RPM (requests per minute),
200k TPM (tokens per minute).

Limits increase automatically as your spend grows. Current limits are shown at /cabinet/limits.
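
To stay under the request limit on the client side, a sliding-window counter is one option. The sketch below covers only the RPM limit, not TPM. The class name is hypothetical, and the default of 60 requests per 60 seconds mirrors the default limit stated above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Tracks request timestamps and reports how long to wait before the next one."""

    def __init__(self, max_requests=60, window=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.stamps = deque()

    def wait_time(self) -> float:
        """Seconds until another request fits in the window (0.0 if it fits now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.max_requests:
            return 0.0
        return self.window - (now - self.stamps[0])

    def record(self) -> None:
        """Register that a request was just sent."""
        self.stamps.append(self.clock())
```

A caller would sleep for `wait_time()` seconds when it is positive, send the request, and then call `record()`.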