
AI API

Chat completion. Auth: Authorization: Bearer <token> or x-api-key: <your-api-key>. Non-stream: JSON with choices[].message.content. Stream: SSE chunks with choices[].delta.content.

  • Version: 1.0
  • Base URL: https://api.b.ai
  • OpenAPI: 3.1.0

Authentication

Bearer Token

  • Type: HTTP Bearer token
  • Header: Authorization: Bearer <token>
  • Format: Use the same API key-style secret issued by the platform, for example sk-xxx
  • Example: Bearer sk-xxx

API Key

  • Type: API Key
  • Header: x-api-key: <your-api-key>
  • Note: The Chat Completions and Messages endpoints both accept either x-api-key or Authorization: Bearer <token>. In practice, both use the same platform-issued secret.
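
As a minimal sketch of the two header styles described above, the helper below returns whichever header a client prefers. The function name `auth_headers` and its `style` argument are illustrative and not part of the API.

```python
def auth_headers(api_key: str, style: str = "bearer") -> dict:
    """Build the auth header for one of the two documented styles.

    `style` is a hypothetical switch used only in this sketch:
    "bearer" -> Authorization: Bearer <token>, "x-api-key" -> x-api-key.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    if style == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if style == "x-api-key":
        return {"x-api-key": api_key}
    raise ValueError(f"unknown auth style: {style!r}")
```

Both styles carry the same platform-issued secret, so switching between them should only change the header name and value format.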

API Key security best practices

An API Key is an important credential for accessing API services. To protect your account and projects, we recommend the following practices:

  1. Do not expose API Keys in public code repositories, frontend pages, or public documentation.
  2. Use separate API Keys for different projects to reduce the impact if one key is leaked.
  3. Rotate API Keys regularly and avoid using the same key for an extended period.
  4. If you suspect an API Key has been leaked, delete the old key promptly and create a new one.
  5. For team collaboration, establish clear rules for API Key management and permission usage.

B.AI will continue improving API Key security management to provide developers with a safer and more stable API experience.
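
One common way to follow the first practice is to read the key from the environment instead of hard-coding it. The sketch below assumes a variable named `BAI_API_KEY`; that name, and the helper names, are hypothetical and not defined by the platform.

```python
import os


def load_api_key(env_var: str = "BAI_API_KEY") -> str:
    """Read the API key from an environment variable (name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key


def mask_key(key: str) -> str:
    """Return a form that is safe to log: first 3 and last 4 characters only."""
    if len(key) <= 8:
        return "*" * len(key)
    return f"{key[:3]}...{key[-4:]}"
```

Logging only the masked form makes it easier to tell keys apart during rotation without exposing the secret.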

API Key security upgrade and migration FAQ

If you are using a legacy API Key, please create a new API Key and complete the migration during the compatibility period.

Why do I need to replace my API Key?

This update is part of the B.AI API Key security architecture upgrade. New API Keys use a more secure generation and management mechanism, helping improve account security and API service stability.

Will my old API Key stop working immediately?

No. Legacy API Keys remain supported for a 30-day compatibility period that starts with the official migration notice. Please follow the official notice for the exact cutoff date.

What happens if I do not replace it?

After the compatibility period ends, legacy API Keys will no longer be supported for API calls. To avoid service interruptions, create and switch to a new API Key before the cutoff date.

Will replacing the API Key affect my production service?

It should not affect normal calls as long as you complete the replacement during the compatibility period. We recommend creating a new API Key in advance, verifying it in a test environment, and then replacing it in production.

Does this mean my API Key was leaked?

No. This is a platform API Key security architecture upgrade and does not mean your current API Key has been compromised.

Do I need to change request URLs or parameters?

No. In most cases, you only need to replace the API Key value used in Authorization: Bearer <token> or x-api-key. The Base URL and API paths remain unchanged.

Where can I create a new API Key?

Create a new API Key from the API Key management page in B.AI. Keep the new Key secure after creation, and update your application configuration or secret manager accordingly.


Endpoints

1. List Models

GET /v1/models

List available models.

Auth: Bearer Token or API Key (x-api-key)

Response 200:

{
  "object": "list",
  "success": true,
  "data": [
    {
      "id": "gpt-5.2",
      "object": "model",
      "created": 1626777600,
      "owned_by": "openai",
      "supported_endpoint_types": ["openai", "anthropic"]
    },
    {
      "id": "claude-sonnet-4.6",
      "object": "model",
      "created": 1626777600,
      "owned_by": "anthropic",
      "supported_endpoint_types": ["openai", "anthropic"]
    }
  ]
}

Model ID aliases: For Claude models, the catalog may return dot-version aliases such as claude-sonnet-4.6. B.AI API requests and Claude Code both accept the hyphenated aliases used in this documentation, such as claude-sonnet-4-6. For Claude Code configuration, prefer the hyphenated alias.
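
The alias rule above can be sketched as a small normalizer that rewrites a trailing dot version into the hyphenated form. It assumes the version is always the final `<major>.<minor>` part of a Claude model ID, which matches the examples here but is not stated as a general rule; `to_hyphenated_alias` is a hypothetical helper.

```python
import re

# Matches a trailing "<major>.<minor>" version such as "4.6".
_DOT_VERSION = re.compile(r"(\d+)\.(\d+)$")


def to_hyphenated_alias(model_id: str) -> str:
    """Convert e.g. claude-sonnet-4.6 to claude-sonnet-4-6.

    Non-Claude IDs (e.g. gpt-5.2) are returned unchanged.
    """
    if not model_id.startswith("claude-"):
        return model_id
    return _DOT_VERSION.sub(r"\1-\2", model_id)
```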

| Status | Description |
| --- | --- |
| 200 | Success - list of models |
| 400 | Bad Request - invalid parameters or malformed body |
| 401 | Unauthorized - invalid or missing authentication |
| 403 | Forbidden - access denied, insufficient quota, or banned |
| 429 | Too Many Requests - rate limit exceeded |
| 500 | Internal Server Error |
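
A minimal sketch of calling this endpoint with the Python standard library is shown below. The helper names are illustrative; the URL, header, and `data[].id` field come from the reference above.

```python
import json
import urllib.request

BASE_URL = "https://api.b.ai"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Prepare GET /v1/models with Bearer authentication."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(body: dict) -> list:
    """Extract the model IDs from a parsed list response."""
    return [model["id"] for model in body.get("data", [])]


def list_models(api_key: str) -> list:
    """Send the request and return the model IDs (performs a network call)."""
    with urllib.request.urlopen(build_models_request(api_key), timeout=30) as resp:
        return model_ids(json.load(resp))
```

Non-2xx statuses raise `urllib.error.HTTPError` in `list_models`, so callers may want to catch it and inspect the error body.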

2. Chat Completions (OpenAI Compatible)

POST /v1/chat/completions

Accepts a list of messages and returns a model-generated response. Supports both single-turn and multi-turn conversations. Responses can be streamed (SSE) or returned as a single JSON object.

Auth: Bearer Token or API Key (x-api-key)

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | ID of the model to use (e.g. gpt-5.2). Claude models also accept hyphenated aliases such as claude-sonnet-4-6. |
| messages | array | Yes | List of messages in the conversation. See ChatMessage. |
| stream | boolean | No | If true, partial message deltas are sent as server-sent events. Default false. |
| max_tokens | integer | No | Maximum number of tokens that can be generated in the completion. |
| temperature | number | No | Sampling temperature between 0 and 2. Higher = more random. Default 1. |
| top_p | number | No | Nucleus sampling: consider tokens with top_p probability mass. Default 1. |
| stop | string or string[] | No | Up to 4 sequences where the API will stop generating. |
| n | integer | No | How many chat completion choices to generate. Default 1. |
| frequency_penalty | number | No | -2.0 to 2.0. Penalize repeated tokens. Default 0. |
| presence_penalty | number | No | -2.0 to 2.0. Penalize tokens that appear in the text so far. Default 0. |
| seed | integer | No | Random seed for deterministic sampling (if supported by model). |
| response_format | object | No | Specify output format: { "type": "text" }, { "type": "json_object" }, or json_schema. See ChatResponseFormat. |
| tools | array | No | List of tools the model may call. See ChatTool. |
| tool_choice | string or object | No | "auto", "none", "required", or { "type": "function", "function": { "name": "..." } }. |
| user | string | No | Optional end-user identifier for abuse monitoring. |
| web_search_options | object | No | Enables web search for supported models. See WebSearchOptions. |

Request Example

{
  "model": "gpt-5.2",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello" }
  ],
  "stream": false,
  "max_tokens": 1024,
  "temperature": 1
}
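
For illustration, the request body above could be sent from Python roughly as follows. This is a sketch using only the standard library; `build_chat_request` is a hypothetical helper, and optional parameters are passed through as keyword arguments without validation.

```python
import json
import urllib.request


def build_chat_request(api_key: str, model: str, messages: list, **options) -> urllib.request.Request:
    """Prepare POST /v1/chat/completions with a JSON body.

    `options` carries optional parameters such as max_tokens or temperature.
    """
    payload = {"model": model, "messages": messages, **options}
    return urllib.request.Request(
        "https://api.b.ai/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` performs the call; the response body can then be parsed with `json.load`.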

Response (Non-stream)

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-5.2",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you?",
        "refusal": null,
        "annotations": []
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20,
    "prompt_tokens_details": { "cached_tokens": 0, "audio_tokens": 0 },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  }
}
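
To show how the non-stream response feeds a multi-turn conversation, the sketch below reads `choices[0].message.content` and appends it to the history before the next user turn. Both helper names are illustrative.

```python
def first_reply(response: dict) -> str:
    """Return the text of the first choice, or "" if content is null."""
    return response["choices"][0]["message"]["content"] or ""


def append_turn(messages: list, response: dict, next_user_text: str) -> list:
    """Return a new history with the assistant reply and the next user message."""
    return messages + [
        {"role": "assistant", "content": first_reply(response)},
        {"role": "user", "content": next_user_text},
    ]
```

The returned list can be sent as `messages` in the next request, which is how context carries across turns.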

Response (Stream)

Each SSE chunk has object: "chat.completion.chunk" with choices[].delta.content containing incremental text. The final chunk includes usage and finish_reason.
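
A minimal sketch of consuming such a stream is shown below. It assumes each chunk arrives on a `data:` line containing JSON and that the stream may end with a `data: [DONE]` line, following the usual OpenAI-style convention; this reference does not confirm the terminator, so treat it as an assumption.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream_text(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Join choices[].delta.content across chunks and capture usage if present."""
    parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed terminator, not stated in this reference
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                parts.append(text)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), usage
```

In practice `lines` would be the decoded lines of the HTTP response body; any iterable of strings works.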

| Status | Description |
| --- | --- |
| 200 | Success |
| 400 | Bad Request - invalid parameters, malformed body, or invalid request |
| 401 | Unauthorized - invalid or missing authentication |
| 403 | Forbidden - access denied, insufficient quota, or model access restricted |
| 429 | Too Many Requests - rate limit exceeded |
| 500 | Internal Server Error |
| 502 | Bad Gateway - upstream service error |
| 503 | Service Unavailable - overloaded or no available channel |
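
Clients often retry the transient statuses in this table with exponential backoff. The sketch below treats 429, 500, 502, and 503 as retryable; that choice, the delays, and the `send` callable are assumptions for illustration, not documented platform behavior.

```python
import time

# Assumed to be transient; 400/401/403 are not retried because a retry
# with the same request is unlikely to succeed.
RETRYABLE = {429, 500, 502, 503}


def call_with_retry(send, max_attempts: int = 4, base_delay: float = 1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical callable returning (status_code, parsed_body).
    Delays double after each failed attempt: base_delay, 2x, 4x, ...
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts:
            return status, body
        sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter exists so the backoff can be replaced in tests; production code would leave the default.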

3. Messages (Claude Compatible)

POST /v1/messages

Accepts a list of messages and returns a model-generated response. Supports both single-turn and multi-turn conversations. Authenticate via x-api-key header or Bearer token. Responses can be streamed (SSE) or returned as a single JSON object.

Auth: API Key (x-api-key) or Bearer Token

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | ID of the model. Claude models accept the hyphenated aliases shown here (e.g. claude-sonnet-4-6, claude-opus-4-6, claude-haiku-4-5). If /v1/models returns a dot alias such as claude-sonnet-4.6, you can use the matching hyphenated alias in API requests and Claude Code. |
| max_tokens | integer | Yes | Maximum number of tokens to generate. Different models have different maximum values. |
| messages | array | Yes | Input messages. Alternating user/assistant turns. Limit: 100,000 messages. See MessagesMessageItem. |
| system | string or array | No | System prompt. Can be a plain string or an array of text blocks (for cache_control). |
| stream | boolean | No | Whether to stream the response using SSE. Default false. |
| temperature | number | No | Randomness (0.0 - 1.0). Use ~0.0 for analytical tasks, ~1.0 for creative tasks. Default 1. |
| top_p | number | No | Nucleus sampling. Default 1. |
| top_k | integer | No | Only sample from the top K options. Default disabled. |
| stop_sequences | string[] | No | Custom text sequences that cause the model to stop generating. |
| metadata | object | No | Request metadata. Supports user_id (opaque identifier). |
| thinking | object | No | Extended thinking config. See ThinkingConfig. |
| tools | array | No | Tool definitions the model may use. See Tool. |
| tool_choice | object | No | How the model should use tools: auto, any, tool, or none. |

Request Example

{
  "model": "claude-sonnet-4-6",
  "max_tokens": 1024,
  "messages": [
    { "role": "user", "content": "Hello, Claude!" }
  ],
  "system": "You are a helpful assistant.",
  "temperature": 1.0
}

Response (Non-stream)

{
  "id": "chatcmpl-xxx",
  "type": "message",
  "role": "assistant",
  "content": [
    { "type": "text", "text": "Hello! How can I help you?" }
  ],
  "stop_reason": "end_turn",
  "model": "claude-sonnet-4-6",
  "usage": {
    "input_tokens": 4,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0,
    "output_tokens": 12,
    "claude_cache_creation_5_m_tokens": 0,
    "claude_cache_creation_1_h_tokens": 0
  }
}
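
The request and response shapes above can be exercised with a sketch like the following, which authenticates with `x-api-key` and concatenates the text blocks of the reply. Both helper names are illustrative.

```python
import json
import urllib.request


def build_messages_request(api_key: str, model: str, messages: list,
                           max_tokens: int, system=None) -> urllib.request.Request:
    """Prepare POST /v1/messages; max_tokens is required by this endpoint."""
    payload = {"model": model, "max_tokens": max_tokens, "messages": messages}
    if system is not None:
        payload["system"] = system  # top-level field, not a message role
    return urllib.request.Request(
        "https://api.b.ai/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def text_of(response: dict) -> str:
    """Concatenate the text of all text blocks, skipping other block types."""
    return "".join(
        block["text"]
        for block in response.get("content", [])
        if block.get("type") == "text"
    )
```

Unlike Chat Completions, the reply text lives in a list of content blocks, so a response that mixes text and tool_use blocks needs this kind of filtering.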

Response (Stream - SSE Events)

Stream responses emit the following event types:

| Event Type | Description | Key Fields |
| --- | --- | --- |
| message_start | Initial message metadata | message (id, model, role, usage) |
| content_block_start | New content block begins | index, content_block (type, text) |
| content_block_delta | Incremental content | index, delta (type: text_delta, text) |
| content_block_stop | Content block ends | index |
| message_stop | Message complete | - |
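
The event types listed above can be consumed with a small state machine. The sketch below assumes standard SSE framing, with the event name on an `event:` line followed by a JSON payload on a `data:` line; the framing details are an assumption, since this reference only lists the event types and key fields.

```python
import json
from typing import Iterable


def collect_message_text(lines: Iterable[str]) -> str:
    """Rebuild the reply text from content_block_delta events, ordered by block index."""
    event = None
    blocks = {}  # block index -> list of text fragments
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            if event == "message_stop":
                break
            data = json.loads(line[len("data:"):].strip())
            if event == "content_block_delta":
                delta = data.get("delta") or {}
                if delta.get("type") == "text_delta":
                    blocks.setdefault(data.get("index", 0), []).append(delta.get("text", ""))
    return "".join("".join(blocks[i]) for i in sorted(blocks))
```

Only `text_delta` fragments are collected here; other delta types are ignored rather than guessed at.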

| Status | Description |
| --- | --- |
| 200 | Success |
| 400 | Bad Request - invalid parameters, malformed body, or invalid request |
| 401 | Unauthorized - invalid or missing API key |
| 403 | Forbidden - access denied, insufficient quota, or model access restricted |
| 429 | Too Many Requests - rate limit exceeded |
| 500 | Internal Server Error |
| 502 | Bad Gateway - upstream service error |
| 503 | Service Unavailable - overloaded or no available channel |

Data Models

ChatMessage

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| role | string | Yes | "system", "user", "assistant", or "tool" |
| content | string | Yes | Message content. For tool role, the result of the tool call. |
| name | string | No | Optional name for the message author. |
| tool_call_id | string | No | When role is "tool", the ID of the tool call this result is for. |
| tool_calls | array | No | When role is "assistant" and the model called tools. Array of { id, type, function: { name, arguments } }. |

MessagesMessageItem

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| role | string | Yes | "user" or "assistant" (no "system" - use top-level system parameter). |
| content | string or array | Yes | Text string or array of content blocks (text, image, tool_use, tool_result). |

Content Block Types (Messages API)

TextBlockParam

{ "type": "text", "text": "Hello, Claude!", "cache_control": { "type": "ephemeral" } }

ImageBlockParam

Base64 source:

{
  "type": "image",
  "source": {
    "type": "base64",
    "media_type": "image/jpeg",
    "data": "/9j/4AAQSkZJRg..."
  }
}

URL source:

{
  "type": "image",
  "source": {
    "type": "url",
    "url": "https://example.com/image.jpg"
  }
}

Supported media types: image/jpeg, image/png, image/gif, image/webp
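
As a sketch of producing the base64 form from raw bytes, the helper below encodes the data and rejects media types outside the supported list. `image_block_from_bytes` is a hypothetical name.

```python
import base64

SUPPORTED_MEDIA_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def image_block_from_bytes(data: bytes, media_type: str) -> dict:
    """Build an ImageBlockParam with a base64 source."""
    if media_type not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"unsupported media type: {media_type}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(data).decode("ascii"),
        },
    }
```

The resulting dict can be placed in the `content` array of a user message alongside text blocks.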

ToolUseBlockParam (from assistant)

{
  "type": "tool_use",
  "id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV",
  "name": "get_stock_price",
  "input": { "ticker": "AAPL" }
}

ToolResultBlockParam (from user)

{
  "type": "tool_result",
  "tool_use_id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV",
  "content": "259.75 USD",
  "is_error": false
}
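
The pairing of tool_use and tool_result blocks can be sketched as follows: for each tool_use block in an assistant reply, run a local handler and return the output in a user message whose tool_use_id matches. The `handlers` mapping and function name are assumptions made for this example.

```python
def tool_result_message(assistant_content: list, handlers: dict) -> dict:
    """Build the user message that answers every tool_use block.

    `handlers` maps a tool name to a local Python callable (hypothetical).
    Failures are reported back to the model with is_error set to true.
    """
    results = []
    for block in assistant_content:
        if block.get("type") != "tool_use":
            continue
        try:
            output = handlers[block["name"]](**block["input"])
            is_error = False
        except Exception as exc:  # report any handler failure to the model
            output, is_error = str(exc), True
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": str(output),
            "is_error": is_error,
        })
    return {"role": "user", "content": results}
```

Appending the assistant message and this user message to `messages` and calling the endpoint again lets the model continue with the tool output.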

ThinkingConfig

Enable extended thinking to let Claude show its reasoning process.

Enabled:

{ "type": "enabled", "budget_tokens": 1024 }
  • budget_tokens: Must be >= 1024 and less than max_tokens.

Disabled:

{ "type": "disabled" }
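
The budget_tokens constraint can be checked on the client before sending a request. This is a minimal sketch; `thinking_config` is a hypothetical helper that applies only the two bounds stated above.

```python
def thinking_config(budget_tokens: int, max_tokens: int) -> dict:
    """Return an enabled ThinkingConfig after validating the documented bounds."""
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be >= 1024")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be less than max_tokens")
    return {"type": "enabled", "budget_tokens": budget_tokens}
```

For example, `thinking_config(1024, 2048)` is accepted, while a budget equal to `max_tokens` is rejected.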

Tool (Anthropic)

{
  "name": "get_stock_price",
  "description": "Get the current stock price for a given ticker symbol.",
  "input_schema": {
    "type": "object",
    "properties": {
      "ticker": { "type": "string" }
    },
    "required": ["ticker"]
  }
}

ToolChoice (Anthropic)

| Type | Description |
| --- | --- |
| { "type": "auto" } | Model decides whether to use tools. Supports disable_parallel_tool_use. |
| { "type": "any" } | Model will use any available tool. Supports disable_parallel_tool_use. |
| { "type": "tool", "name": "..." } | Model will use the specified tool. Supports disable_parallel_tool_use. |
| { "type": "none" } | Model will not use tools. |

ChatTool

{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get weather for a location",
    "parameters": {
      "type": "object",
      "properties": {
        "location": { "type": "string" }
      },
      "required": ["location"]
    }
  }
}
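
When the model calls a tool like this, the assistant message carries `tool_calls`, and each result goes back as a message with role "tool". The sketch below assumes `function.arguments` is a JSON-encoded string, as in the OpenAI convention; this reference does not state the encoding, and the `handlers` mapping is hypothetical.

```python
import json


def tool_messages(assistant_message: dict, handlers: dict) -> list:
    """Run each requested function and build the matching "tool" messages."""
    replies = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        # Assumption: arguments arrive as a JSON string such as '{"location": "Paris"}'.
        arguments = json.loads(function["arguments"] or "{}")
        result = handlers[function["name"]](**arguments)
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": str(result),
        })
    return replies
```

The assistant message and these tool messages are then appended to `messages` for the follow-up request.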

WebSearchOptions

| Field | Type | Description |
| --- | --- | --- |
| search_context_size | string | "low", "medium", or "high" - how much context window for web search results. |
| user_location | object | Approximate user location (country ISO 3166-1 alpha-2, city, region, timezone). |

ChatResponseFormat

| Field | Type | Description |
| --- | --- | --- |
| type | string | "text", "json_object", or "json_schema" |
| json_schema | object | When type is json_schema, optional schema for the output. |
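
As an illustration of the json_object type, the sketch below builds a Chat Completions payload that requests JSON output and parses the returned content. The helper names are hypothetical, and the sketch does not cover json_schema because its exact shape is not specified here.

```python
import json


def json_mode_payload(model: str, messages: list) -> dict:
    """Request body asking the model to return a JSON object."""
    return {
        "model": model,
        "messages": messages,
        "response_format": {"type": "json_object"},
    }


def parse_json_reply(response: dict) -> dict:
    """Decode choices[0].message.content as JSON; raises ValueError if it is not JSON."""
    return json.loads(response["choices"][0]["message"]["content"])
```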

Error Response

All error responses follow this format:

{
  "error": {
    "message": "Error message",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}

| Field | Type | Description |
| --- | --- | --- |
| message | string | Error message |
| type | string | Error type (e.g. invalid_request_error) |
| param | string or null | Related parameter |
| code | string or null | Error code |
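
A client can map this error format onto an exception type, as in the sketch below. `ApiError` and `error_from_body` are hypothetical names, and the fallback values cover bodies that do not follow the format (for example, a gateway error page).

```python
class ApiError(Exception):
    """Carries the HTTP status plus the fields of the error object."""

    def __init__(self, status, message, type_, param=None, code=None):
        super().__init__(f"{status} {type_}: {message}")
        self.status = status
        self.message = message
        self.type = type_
        self.param = param
        self.code = code


def error_from_body(status: int, body) -> ApiError:
    """Build an ApiError from a parsed response body, tolerating missing fields."""
    error = body.get("error") if isinstance(body, dict) else None
    error = error or {}
    return ApiError(
        status,
        error.get("message", "Unknown error"),
        error.get("type", "unknown_error"),
        error.get("param"),
        error.get("code"),
    )
```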