Endpoint
Authentication
Requires authentication using Bearer token or x-api-key header. See Authentication.Request Body
The model to use for completion. Can be:
- Specific model ID (e.g.,
gpt-4o,claude-3-5-sonnet-20241022) - Provider-prefixed model (e.g.,
openai/gpt-4o,anthropic/claude-3-5-sonnet-20241022) autofor automatic model selection based on cost and capabilities
"gpt-5"Array of message objects in the conversation.Each message has:
role(string):"user","assistant","system", or"tool"content(string | array): Message content or array of content parts for multimodal messagesname(string, optional): Name of the message sendertool_call_id(string, optional): ID of the tool call this message is responding totool_calls(array, optional): Tool calls made by the assistant
Sampling temperature between 0 and 2. Higher values make output more random.Example:
0.7Maximum number of tokens to generate in the completion.Example:
1000Nucleus sampling parameter. Alternative to temperature.Example:
0.9Penalty for repeated tokens based on frequency. Range: -2.0 to 2.0.Example:
0.0Penalty for repeated tokens based on presence. Range: -2.0 to 2.0.Example:
0.0Format for the model response. Options:
{"type": "text"}- Plain text (default){"type": "json_object"}- Valid JSON object{"type": "json_schema", "json_schema": {...}}- JSON matching schema
Whether to stream the response as Server-Sent Events.Example:
falseArray of tools the model can use. Each tool has:
type:"function"or"web_search"function: For function tools, includesname,description, andparameters
Controls which tools the model uses:
"auto"- Model decides (default)"none"- Never use tools"required"- Must use at least one tool{"type": "function", "function": {"name": "..."}}- Force specific function
Controls reasoning effort for reasoning-capable models.Options:
"minimal", "low", "medium", "high", "xhigh"Example: "medium"Unified reasoning configuration. Alternative to
reasoning_effort.Properties:effort: Same asreasoning_effortmax_tokens: Exact number of tokens for reasoning (overrides effort)
Computational effort for supported models (currently claude-opus-4-5).Options:
"low", "medium", "high"Example: "medium"Enable native web search for models that support it.Example:
trueWhen using auto routing, only route to free models.Example:
falseWhen using auto routing, exclude reasoning models from selection.Example:
falsePlugins to enable for this request.Example:Available plugins:
response-healing: Automatically repairs malformed JSON responses
Response
Unique identifier for the completion.
Object type, always
"chat.completion".Unix timestamp of when the completion was created.
The model used for completion.
Array of completion choices.Each choice contains:
index(number): Choice indexmessage(object): The generated messagerole(string): Always"assistant"content(string | null): Message contentreasoning(string | null, optional): Internal reasoning for reasoning modelstool_calls(array, optional): Tool calls made by the modelimages(array, optional): Generated images
finish_reason(string): Why generation stopped ("stop","length","tool_calls", etc.)
Token usage information.Contains:
prompt_tokens(number): Tokens in the promptcompletion_tokens(number): Tokens in the completiontotal_tokens(number): Total tokens usedreasoning_tokens(number, optional): Tokens used for reasoningprompt_tokens_details(object, optional): Breakdown of prompt tokenscached_tokens(number): Tokens served from cache
cost_usd_total(number, optional): Total cost in USDcost_usd_input(number, optional): Input cost in USDcost_usd_output(number, optional): Output cost in USDcost_usd_cached_input(number, optional): Cached input cost in USDcost_usd_request(number, optional): Per-request cost in USD
Routing and provider information.Contains:
requested_model(string): Model requested by clientrequested_provider(string | null): Provider requested by clientused_model(string): Actual model usedused_provider(string): Actual provider usedunderlying_used_model(string): Provider’s native model namerouting(array, optional): Routing attempts and errors