Endpoint
Request Examples
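No request example survived on this page. As a sketch only, a request body using the parameters documented below can be built and sent like this; the base URL, the `/v1/chat/completions` path, and the `Bearer` auth scheme are assumptions, not confirmed by this page:

```python
import json
from urllib import request

# Hypothetical base URL and path; substitute your deployment's real endpoint.
API_URL = "https://api.example.com/v1/chat/completions"

payload = {
    "model": "claude-3-5-sonnet",  # see Supported Models below
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "temperature": 0.2,  # lower temperature: more deterministic output
    "max_tokens": 256,
}

req = request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_API_KEY",  # auth scheme assumed
    },
    method="POST",
)
# response = request.urlopen(req)  # uncomment with a real endpoint and key
```

Set `"stream": true` in the payload to receive the response incrementally (see Streaming below).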
Response
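The response example was also lost. Assuming an OpenAI-compatible response shape (a `choices` array with one entry per generated completion, consistent with the `n` parameter below; the exact schema is not shown on this page), extracting the reply might look like:

```python
import json

# Hypothetical response body; the actual schema is not confirmed by this page.
raw = """
{
  "model": "claude-3-5-sonnet",
  "choices": [
    {"index": 0, "message": {"role": "assistant", "content": "Hello! How can I help?"}}
  ]
}
"""

resp = json.loads(raw)
# With n > 1 there would be one "choices" entry per generated completion.
reply = resp["choices"][0]["message"]["content"]
print(reply)  # -> Hello! How can I help?
```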
Streaming
Enable streaming to receive partial responses in real time via Server-Sent Events (SSE).

Streaming Response Format
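As a sketch of consuming the stream client-side (this assumes each event is an OpenAI-style `data: <json>` line carrying a `delta` payload, terminated by `data: [DONE]`; neither convention is confirmed by this page):

```python
import json

def iter_sse_chunks(lines):
    """Yield parsed JSON chunks from an SSE line stream.

    Assumes 'data: <json>' events and a 'data: [DONE]' terminator,
    which is a common convention but not confirmed by this page.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        yield json.loads(data)

# Canned stream; with stream=true the server would send lines like these.
stream = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(c["choices"][0]["delta"]["content"] for c in iter_sse_chunks(stream))
print(text)  # -> Hello
```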
Each SSE event contains a JSON chunk.

Request Parameters
- `model`: The model ID to use. Examples: claude-3-5-sonnet, claude-3-5-haiku, codex. See Supported Models for the full list.
- `messages`: A list of messages in the conversation. Each message has a role (system, user, or assistant) and content (string).
- `temperature`: Sampling temperature between 0 and 2. Higher values (e.g. 0.8) make output more random; lower values (e.g. 0.2) make it more deterministic. Default: 1.
- `max_tokens`: Maximum number of tokens to generate in the response.
- `stream`: If true, partial responses are sent as Server-Sent Events. Default: false.
- `top_p`: Nucleus sampling parameter. Only tokens with cumulative probability up to top_p are considered. Default: 1.
- `frequency_penalty`: Penalizes tokens based on their frequency in the text so far. Range: -2.0 to 2.0. Default: 0.
- `presence_penalty`: Penalizes tokens based on whether they have appeared in the text so far. Range: -2.0 to 2.0. Default: 0.
- `stop`: Up to 4 sequences where the API will stop generating further tokens.
- `n`: Number of chat completion choices to generate for each input message. Default: 1.

Supported Models
| Provider | Models |
|---|---|
| Anthropic | claude-sonnet-4-20250514, claude-3-5-sonnet, claude-3-5-haiku, claude-3-opus |
| OpenAI | codex |
View All Models
See the complete list