AI API
Chat completion. Auth: Authorization: Bearer <token> or x-api-key: <your-api-key>. Non-stream: JSON with choices[].message.content. Stream: SSE chunks with choices[].delta.content.
- Version: 1.0
- Base URL: https://api.b.ai
- OpenAPI: 3.1.0
Authentication
Bearer Token
- Type: HTTP Bearer token
- Header: Authorization: Bearer <token>
- Format: Use the same API key-style secret issued by the platform, for example sk-xxx
- Example: Bearer sk-xxx
API Key
- Type: API Key
- Header: x-api-key: <your-api-key>
- Note: The Chat Completions and Messages endpoints both accept either x-api-key or Authorization: Bearer <token>. In practice, both use the same platform-issued secret.
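The two header styles described above can be sketched as a small helper. This is a minimal illustration, not part of any official SDK: the function name `build_auth_headers` and the `style` argument are assumptions made for this example.

```python
def build_auth_headers(api_key: str, style: str = "bearer") -> dict:
    """Return request headers for one of the two documented auth styles.

    `style` is a hypothetical switch for this sketch:
    "bearer" -> Authorization: Bearer <token>, "x-api-key" -> x-api-key: <key>.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    headers = {"Content-Type": "application/json"}
    if style == "bearer":
        headers["Authorization"] = f"Bearer {api_key}"
    elif style == "x-api-key":
        headers["x-api-key"] = api_key
    else:
        raise ValueError(f"unknown auth style: {style!r}")
    return headers
```

Because both styles carry the same platform-issued secret, switching between them should only change the header name and prefix, not the key itself.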
An API Key is an important credential for accessing API services. To protect your account and projects, we recommend the following practices:
- Do not expose API Keys in public code repositories, frontend pages, or public documentation.
- Use separate API Keys for different projects to reduce the impact if one key is leaked.
- Rotate API Keys regularly and avoid using the same key for an extended period.
- If you suspect an API Key has been leaked, delete the old key promptly and create a new one.
- For team collaboration, establish clear rules for API Key management and permission usage.
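One way to follow the first practice in the list above is to read the key from the environment instead of hard-coding it. The sketch below assumes an environment variable named `BAI_API_KEY`; that name, and both helper functions, are hypothetical choices for this example.

```python
import os


def load_api_key(env_var: str = "BAI_API_KEY", environ=None) -> str:
    """Read the API key from the environment (variable name is an assumption)."""
    env = os.environ if environ is None else environ
    key = env.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; refusing to use a hard-coded key")
    return key


def mask_key(key: str) -> str:
    """Hide all but the last four characters so the key is safe to log."""
    if len(key) <= 4:
        return "*" * len(key)
    return "*" * (len(key) - 4) + key[-4:]
```

Using one variable per project keeps the "separate keys for different projects" practice easy to enforce, and rotating a key becomes a configuration change rather than a code change.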
B.AI will continue improving API Key security management to provide developers with a safer and more stable API experience.
If you are using a legacy API Key, please create a new API Key and complete the migration during the compatibility period.
Why do I need to replace my API Key?
This update is part of the B.AI API Key security architecture upgrade. New API Keys use a more secure generation and management mechanism, helping improve account security and API service stability.
Will my old API Key stop working immediately?
No. Legacy API Keys remain supported and can continue to be used during a 30-day compatibility period that starts with the official migration notice. Please follow the official notice for the exact cutoff date.
What happens if I do not replace it?
After the compatibility period ends, legacy API Keys will no longer be supported for API calls. To avoid service interruptions, create and switch to a new API Key before the cutoff date.
Will replacing the API Key affect my production service?
It should not affect normal calls as long as you complete the replacement during the compatibility period. We recommend creating a new API Key in advance, verifying it in a test environment, and then replacing it in production.
Does this mean my API Key was leaked?
No. This is a platform API Key security architecture upgrade and does not mean your current API Key has been compromised.
Do I need to change request URLs or parameters?
No. In most cases, you only need to replace the API Key value used in Authorization: Bearer <token> or x-api-key. The Base URL and API paths remain unchanged.
Where can I create a new API Key?
Create a new API Key from the API Key management page in B.AI. Keep the new Key secure after creation, and update your application configuration or secret manager accordingly.
Endpoints
1. List Models
GET /v1/models
List available models.
Auth: Bearer Token or API Key (x-api-key)
Response 200:
{
"object": "list",
"success": true,
"data": [
{
"id": "gpt-5.2",
"object": "model",
"created": 1626777600,
"owned_by": "openai",
"supported_endpoint_types": ["openai", "anthropic"]
},
{
"id": "claude-sonnet-4.6",
"object": "model",
"created": 1626777600,
"owned_by": "anthropic",
"supported_endpoint_types": ["openai", "anthropic"]
}
]
}
Model ID aliases: For Claude models, the catalog may return dot-version aliases such as claude-sonnet-4.6. B.AI API requests and Claude Code both accept the hyphenated aliases used in this documentation, such as claude-sonnet-4-6. For Claude Code configuration, prefer the hyphenated alias.
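The alias note above can be applied mechanically. The helper below is a sketch that assumes the only difference between the two forms is a dot between version digits; the documentation shows just the claude-sonnet-4.6 / claude-sonnet-4-6 pair, so treat the general rule as an assumption.

```python
import re

# A dot that sits between two digits, e.g. the "." in "4.6".
_DOT_BETWEEN_DIGITS = re.compile(r"(?<=\d)\.(?=\d)")


def to_hyphenated_alias(model_id: str) -> str:
    """Convert a Claude dot-version alias to its hyphenated form.

    Non-Claude IDs such as "gpt-5.2" are returned unchanged.
    """
    if not model_id.startswith("claude-"):
        return model_id
    return _DOT_BETWEEN_DIGITS.sub("-", model_id)
```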
| Status | Description |
|---|---|
| 200 | Success - list of models |
| 400 | Bad Request - invalid parameters or malformed body |
| 401 | Unauthorized - invalid or missing authentication |
| 403 | Forbidden - access denied, insufficient quota, or banned |
| 429 | Too Many Requests - rate limit exceeded |
| 500 | Internal Server Error |
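As a rough illustration of consuming the List Models response, the sketch below filters the `data` array by `supported_endpoint_types` and `owned_by`. The function name and its keyword arguments are hypothetical; only the field names come from the response shown above.

```python
def model_ids(listing: dict, endpoint_type=None, owned_by=None) -> list:
    """Return model IDs from a /v1/models response, optionally filtered."""
    ids = []
    for model in listing.get("data", []):
        if endpoint_type and endpoint_type not in model.get("supported_endpoint_types", []):
            continue
        if owned_by and model.get("owned_by") != owned_by:
            continue
        ids.append(model["id"])
    return ids


# Trimmed copy of the documented response body.
SAMPLE = {
    "object": "list",
    "success": True,
    "data": [
        {"id": "gpt-5.2", "object": "model", "owned_by": "openai",
         "supported_endpoint_types": ["openai", "anthropic"]},
        {"id": "claude-sonnet-4.6", "object": "model", "owned_by": "anthropic",
         "supported_endpoint_types": ["openai", "anthropic"]},
    ],
}
```

For example, `model_ids(SAMPLE, owned_by="anthropic")` would select only the Claude entry from the sample.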
2. Chat Completions (OpenAI Compatible)
POST /v1/chat/completions
Accepts a list of messages and returns a model-generated response. Supports both single-turn and multi-turn conversations. Responses can be streamed (SSE) or returned as a single JSON object.
Auth: Bearer Token or API Key (x-api-key)
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | ID of the model to use (e.g. gpt-5.2). Claude models also accept hyphenated aliases such as claude-sonnet-4-6. |
| messages | array | Yes | List of messages in the conversation. See ChatMessage. |
| stream | boolean | No | If true, partial message deltas will be sent as server-sent events. Default false. |
| max_tokens | integer | No | Maximum number of tokens that can be generated in the completion. |
| temperature | number | No | Sampling temperature between 0 and 2. Higher = more random. Default 1. |
| top_p | number | No | Nucleus sampling: consider tokens with top_p probability mass. Default 1. |
| stop | string or string[] | No | Up to 4 sequences where the API will stop generating. |
| n | integer | No | How many chat completion choices to generate. Default 1. |
| frequency_penalty | number | No | -2.0 to 2.0. Penalize repeated tokens. Default 0. |
| presence_penalty | number | No | -2.0 to 2.0. Penalize tokens that appear in the text so far. Default 0. |
| seed | integer | No | Random seed for deterministic sampling (if supported by model). |
| response_format | object | No | Specify output format: { "type": "text" } or { "type": "json_object" } or json_schema. |
| tools | array | No | List of tools the model may call. See ChatTool. |
| tool_choice | string or object | No | "auto", "none", "required", or { "type": "function", "function": { "name": "..." } }. |
| user | string | No | Optional end-user identifier for abuse monitoring. |
| web_search_options | object | No | Enables web search for supported models. See WebSearchOptions. |
Request Example
{
"model": "gpt-5.2",
"messages": [
{ "role": "system", "content": "You are a helpful assistant." },
{ "role": "user", "content": "Hello" }
],
"stream": false,
"max_tokens": 1024,
"temperature": 1
}
Response (Non-stream)
{
"id": "chatcmpl-xxx",
"object": "chat.completion",
"created": 1677652288,
"model": "gpt-5.2",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you?",
"refusal": null,
"annotations": []
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 12,
"completion_tokens": 8,
"total_tokens": 20,
"prompt_tokens_details": { "cached_tokens": 0, "audio_tokens": 0 },
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
}
}
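A non-streaming call can be sketched with the Python standard library alone. The helper names below are hypothetical; the URL, headers, and field paths follow the request and response shown above. `send` performs the actual network call and is defined but not invoked here.

```python
import json
import urllib.request

BASE_URL = "https://api.b.ai"


def build_chat_request(api_key: str, model: str, messages: list, **options) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/chat/completions request."""
    body = {"model": model, "messages": messages, **options}
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body (raises HTTPError on 4xx/5xx)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def first_choice_text(response: dict) -> str:
    """Extract choices[0].message.content from a non-stream response."""
    return response["choices"][0]["message"]["content"]
```

Typical use would be `first_choice_text(send(build_chat_request(key, "gpt-5.2", messages, max_tokens=1024)))`.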
Response (Stream)
Each SSE chunk has object: "chat.completion.chunk" with choices[].delta.content containing incremental text. The final chunk includes usage and finish_reason.
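Accumulating the streamed text might look like the sketch below. It assumes the wire format follows the common OpenAI-style convention of `data: <json>` lines ending with a `data: [DONE]` sentinel; this page only specifies the chunk fields, so treat the framing as an assumption. The function names are hypothetical.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded chunk objects from raw SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Concatenate choices[0].delta.content and return (text, finish_reason)."""
    parts, finish_reason = [], None
    for chunk in iter_chunks(lines):
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue  # only follow the first choice in this sketch
            delta = choice.get("delta") or {}
            if delta.get("content"):
                parts.append(delta["content"])
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason
```

`lines` can be any iterable of decoded text lines, for example the lines of an HTTP response body read incrementally.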
| Status | Description |
|---|---|
| 200 | Success |
| 400 | Bad Request - invalid parameters, malformed body, or invalid request |
| 401 | Unauthorized - invalid or missing authentication |
| 403 | Forbidden - access denied, insufficient quota, or model access restricted |
| 429 | Too Many Requests - rate limit exceeded |
| 500 | Internal Server Error |
| 502 | Bad Gateway - upstream service error |
| 503 | Service Unavailable - overloaded or no available channel |
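The status table above suggests a simple split between errors worth retrying and errors that need a fix on the caller's side. The policy below is an assumption for illustration only: the documentation does not prescribe a retry strategy, and whether 500 is safe to retry depends on your workload.

```python
# Assumed policy: rate limits and server-side failures are transient.
RETRYABLE_STATUSES = frozenset({429, 500, 502, 503})


def should_retry(status: int) -> bool:
    """Return True for statuses this sketch treats as transient."""
    return status in RETRYABLE_STATUSES


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff in seconds for a zero-based attempt number."""
    return min(cap, base * (2 ** attempt))
```

Statuses 400, 401, and 403 are left out deliberately, since repeating the same request with the same body or key would be expected to fail again.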
3. Messages (Claude Compatible)
POST /v1/messages
Accepts a list of messages and returns a model-generated response. Supports both single-turn and multi-turn conversations. Authenticate via x-api-key header or Bearer token. Responses can be streamed (SSE) or returned as a single JSON object.
Auth: API Key (x-api-key) or Bearer Token
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | ID of the model. Claude models accept the hyphenated aliases shown here (e.g. claude-sonnet-4-6, claude-opus-4-6, claude-haiku-4-5). If /v1/models returns a dot alias such as claude-sonnet-4.6, you can use the matching hyphenated alias in API requests and Claude Code. |
| max_tokens | integer | Yes | Maximum number of tokens to generate. Different models have different maximum values. |
| messages | array | Yes | Input messages. Alternating user/assistant turns. Limit: 100,000 messages. See MessagesMessageItem. |
| system | string or array | No | System prompt. Can be a plain string or an array of text blocks (for cache_control). |
| stream | boolean | No | Whether to stream the response using SSE. Default false. |
| temperature | number | No | Randomness (0.0 - 1.0). Use ~0.0 for analytical tasks, ~1.0 for creative tasks. Default 1. |
| top_p | number | No | Nucleus sampling. Default 1. |
| top_k | integer | No | Only sample from the top K options. Default disabled. |
| stop_sequences | string[] | No | Custom text sequences that cause the model to stop generating. |
| metadata | object | No | Request metadata. Supports user_id (opaque identifier). |
| thinking | object | No | Extended thinking config. See ThinkingConfig. |
| tools | array | No | Tool definitions the model may use. See Tool. |
| tool_choice | object | No | How the model should use tools: auto, any, tool, or none. |
Request Example
{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"messages": [
{ "role": "user", "content": "Hello, Claude!" }
],
"system": "You are a helpful assistant.",
"temperature": 1.0
}
Response (Non-stream)
{
"id": "chatcmpl-xxx",
"type": "message",
"role": "assistant",
"content": [
{ "type": "text", "text": "Hello! How can I help you?" }
],
"stop_reason": "end_turn",
"model": "claude-sonnet-4-6",
"usage": {
"input_tokens": 4,
"cache_creation_input_tokens": 0,
"cache_read_input_tokens": 0,
"output_tokens": 12,
"claude_cache_creation_5_m_tokens": 0,
"claude_cache_creation_1_h_tokens": 0
}
}
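The request and response shapes above can be handled with two small helpers. This is a sketch with hypothetical function names; the required fields (`model`, `max_tokens`, `messages`) and the rule that `system` is a top-level parameter come from the tables in this section.

```python
def build_messages_body(model: str, max_tokens: int, messages: list,
                        system=None, **options) -> dict:
    """Assemble a POST /v1/messages request body."""
    if max_tokens < 1:
        raise ValueError("max_tokens is required and must be positive")
    for message in messages:
        if message.get("role") not in ("user", "assistant"):
            raise ValueError("roles must be 'user' or 'assistant'; "
                             "pass the system prompt via the top-level 'system' field")
    body = {"model": model, "max_tokens": max_tokens, "messages": list(messages)}
    if system is not None:
        body["system"] = system
    body.update(options)  # e.g. temperature, stop_sequences, thinking
    return body


def message_text(response: dict) -> str:
    """Join the text of every 'text' block in a non-stream response."""
    return "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )
```

Unlike Chat Completions, the reply text lives in a list of content blocks, so non-text blocks such as tool_use are skipped rather than concatenated.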
Response (Stream - SSE Events)
Stream responses emit the following event types:
| Event Type | Description | Key Fields |
|---|---|---|
| message_start | Initial message metadata | message (id, model, role, usage) |
| content_block_start | New content block begins | index, content_block (type, text) |
| content_block_delta | Incremental content | index, delta (type: text_delta, text) |
| content_block_stop | Content block ends | index |
| message_stop | Message complete | - |
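A consumer for these events could be sketched as follows. It assumes standard SSE framing, where each event is an `event: <type>` line followed by a `data: <json>` line and a blank line; that framing is an assumption, since the table above lists only the event types and key fields. The function names are hypothetical.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_events(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], dict]]:
    """Group raw SSE lines into (event_type, payload) pairs."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and (event is not None or data):
            yield event, (json.loads("\n".join(data)) if data else {})
            event, data = None, []
    if event is not None or data:  # stream ended without a trailing blank line
        yield event, (json.loads("\n".join(data)) if data else {})


def stream_text(lines: Iterable[str]) -> str:
    """Concatenate text_delta fragments until message_stop."""
    parts = []
    for event, payload in iter_events(lines):
        if event == "content_block_delta":
            delta = payload.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
        elif event == "message_stop":
            break
    return "".join(parts)
```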
| Status | Description |
|---|---|
| 200 | Success |
| 400 | Bad Request - invalid parameters, malformed body, or invalid request |
| 401 | Unauthorized - invalid or missing API key |
| 403 | Forbidden - access denied, insufficient quota, or model access restricted |
| 429 | Too Many Requests - rate limit exceeded |
| 500 | Internal Server Error |
| 502 | Bad Gateway - upstream service error |
| 503 | Service Unavailable - overloaded or no available channel |
Data Models
ChatMessage
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | "system", "user", "assistant", or "tool" |
| content | string | Yes | Message content. For tool role, the result of the tool call. |
| name | string | No | Optional name for the message author. |
| tool_call_id | string | No | When role is "tool", the ID of the tool call this result is for. |
| tool_calls | array | No | When role is "assistant" and the model called tools. Array of { id, type, function: { name, arguments } }. |
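The `tool_calls` and `tool_call_id` fields above pair up in a tool-calling round trip: the assistant message lists calls, and each result goes back as a "tool" message. The sketch below assumes `function.arguments` is a JSON-encoded string, as in the OpenAI convention; the table does not state its encoding, and the `handlers` mapping is a hypothetical registry of local Python functions.

```python
import json


def tool_messages(assistant_message: dict, handlers: dict) -> list:
    """Run each requested tool and build the matching role="tool" messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        # Assumption: arguments arrive as a JSON string, e.g. '{"location": "Paris"}'.
        arguments = json.loads(function.get("arguments") or "{}")
        output = handlers[function["name"]](**arguments)
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],  # links the result to its call
            "content": output if isinstance(output, str) else json.dumps(output),
        })
    return results
```

The returned messages would be appended to the conversation after the assistant message, and the whole list sent in the next request.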
MessagesMessageItem
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | "user" or "assistant" (no "system" - use top-level system parameter). |
| content | string or array | Yes | Text string or array of content blocks (text, image, tool_use, tool_result). |
Content Block Types (Messages API)
TextBlockParam
{ "type": "text", "text": "Hello, Claude!", "cache_control": { "type": "ephemeral" } }
ImageBlockParam
Base64 source:
{
"type": "image",
"source": {
"type": "base64",
"media_type": "image/jpeg",
"data": "/9j/4AAQSkZJRg..."
}
}
URL source:
{
"type": "image",
"source": {
"type": "url",
"url": "https://example.com/image.jpg"
}
}
Supported media types: image/jpeg, image/png, image/gif, image/webp
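Building either image block from Python might look like the sketch below. The helper names are hypothetical; the block shapes and the four media types are taken from the examples above.

```python
import base64

SUPPORTED_MEDIA_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def image_block_from_bytes(data: bytes, media_type: str) -> dict:
    """Wrap raw image bytes in a base64-source image block."""
    if media_type not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"unsupported media type: {media_type}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(data).decode("ascii"),
        },
    }


def image_block_from_url(url: str) -> dict:
    """Reference an image by URL instead of embedding it."""
    return {"type": "image", "source": {"type": "url", "url": url}}
```

Either block can be placed in the `content` array of a user message alongside text blocks.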
ToolUseBlockParam (from assistant)
{
"type": "tool_use",
"id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV",
"name": "get_stock_price",
"input": { "ticker": "AAPL" }
}
ToolResultBlockParam (from user)
{
"type": "tool_result",
"tool_use_id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV",
"content": "259.75 USD",
"is_error": false
}
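The two blocks above form a pair: each tool_use block from the assistant is answered by a tool_result block in the next user message, linked by `tool_use_id`. A minimal sketch, with a hypothetical `handlers` registry of local functions, could be:

```python
def tool_result_message(assistant_content: list, handlers: dict) -> dict:
    """Answer every tool_use block with a tool_result block in one user message."""
    blocks = []
    for block in assistant_content:
        if block.get("type") != "tool_use":
            continue  # text and other block types need no reply
        try:
            output = handlers[block["name"]](**block["input"])
            blocks.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": str(output),
                "is_error": False,
            })
        except Exception as exc:  # report the failure to the model instead of crashing
            blocks.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": f"{type(exc).__name__}: {exc}",
                "is_error": True,
            })
    return {"role": "user", "content": blocks}
```

Reporting failures with `is_error: true` lets the model see that the tool did not succeed; how it reacts is up to the model.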
ThinkingConfig
Enable extended thinking to let Claude show its reasoning process.
Enabled:
{ "type": "enabled", "budget_tokens": 1024 }
budget_tokens: Must be >= 1024 and less than max_tokens.
Disabled:
{ "type": "disabled" }
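The budget constraint can be checked on the client before sending a request. The helper below is a sketch with a hypothetical name; it encodes only the two rules stated above (at least 1024, and strictly less than `max_tokens`).

```python
from typing import Optional


def thinking_config(budget_tokens: Optional[int], max_tokens: int) -> dict:
    """Return a ThinkingConfig object, or the disabled form when no budget is given."""
    if budget_tokens is None:
        return {"type": "disabled"}
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be >= 1024")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be less than max_tokens")
    return {"type": "enabled", "budget_tokens": budget_tokens}
```

For instance, a request with `max_tokens` of 1024 cannot enable thinking at all, because no budget satisfies both rules.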
Tool (Anthropic)
{
"name": "get_stock_price",
"description": "Get the current stock price for a given ticker symbol.",
"input_schema": {
"type": "object",
"properties": {
"ticker": { "type": "string" }
},
"required": ["ticker"]
}
}
ToolChoice (Anthropic)
| Type | Description |
|---|---|
| { "type": "auto" } | Model decides whether to use tools. Supports disable_parallel_tool_use. |
| { "type": "any" } | Model will use any available tool. Supports disable_parallel_tool_use. |
| { "type": "tool", "name": "..." } | Model will use the specified tool. Supports disable_parallel_tool_use. |
| { "type": "none" } | Model will not use tools. |
ChatTool
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": { "type": "string" }
},
"required": ["location"]
}
}
}
WebSearchOptions
| Field | Type | Description |
|---|---|---|
| search_context_size | string | "low", "medium", or "high" - how much context window for web search results. |
| user_location | object | Approximate user location (country ISO 3166-1 alpha-2, city, region, timezone). |
ChatResponseFormat
| Field | Type | Description |
|---|---|---|
| type | string | "text", "json_object", or "json_schema" |
| json_schema | object | When type is json_schema, optional schema for the output. |
Error Response
All error responses follow this format:
{
"error": {
"message": "Error message",
"type": "invalid_request_error",
"param": null,
"code": null
}
}
| Field | Type | Description |
|---|---|---|
| message | string | Error message |
| type | string | Error type (e.g. invalid_request_error) |
| param | string or null | Related parameter |
| code | string or null | Error code |
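Client code can turn this error envelope into an exception. The sketch below uses hypothetical names (`ApiError`, `parse_error`) and falls back to the raw body when the response is not the documented JSON shape, which may happen if an intermediary returns its own error page.

```python
import json


class ApiError(Exception):
    """Carries the fields of the documented error object."""

    def __init__(self, status, message, error_type=None, param=None, code=None):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message
        self.type = error_type
        self.param = param
        self.code = code


def parse_error(status: int, body: str) -> ApiError:
    """Build an ApiError from an HTTP status and response body text."""
    try:
        error = json.loads(body).get("error")
    except (ValueError, AttributeError):
        error = None  # body was not JSON, or not a JSON object
    if not isinstance(error, dict):
        error = {}
    message = error.get("message") or body[:200] or "unknown error"
    return ApiError(status, message, error.get("type"), error.get("param"), error.get("code"))
```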