AI API

Chat completion API. Authenticate with Authorization: Bearer <token> or x-api-key: <your-api-key>. Non-streaming responses are JSON with the generated text in choices[].message.content. Streaming responses are SSE chunks with incremental text in choices[].delta.content.

Version: 1.0
Base URL: https://api.b.ai
OpenAPI: 3.1.0

Authentication

Bearer Token

Type: HTTP Bearer token
Header: Authorization: Bearer <token>
Format: Use the same API key-style secret issued by the platform, for example sk-xxx
Example: Bearer sk-xxx

API Key

Type: API Key
Header: x-api-key: <your-api-key>
Note: The Chat Completions and Messages endpoints both accept either x-api-key or Authorization: Bearer <token>. In practice, both use the same platform-issued secret.
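The two header styles described above can be sketched as a small helper. This is a minimal illustration, not an official client; the function name `auth_headers` and the `scheme` argument are hypothetical.

```python
from typing import Dict


def auth_headers(api_key: str, scheme: str = "bearer") -> Dict[str, str]:
    """Return the auth header for one of the two documented schemes.

    `scheme` is a hypothetical switch: "bearer" or "x-api-key".
    Both schemes carry the same platform-issued secret.
    """
    if scheme == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if scheme == "x-api-key":
        return {"x-api-key": api_key}
    raise ValueError(f"unknown auth scheme: {scheme!r}")


bearer = auth_headers("sk-xxx")
api_key_style = auth_headers("sk-xxx", scheme="x-api-key")
```

Either dictionary can be merged into the headers of a request to the endpoints below.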

API Key security best practices

An API Key is an important credential for accessing API services. To protect your account and projects, we recommend the following practices:

Do not expose API Keys in public code repositories, frontend pages, or public documentation.
Use separate API Keys for different projects to reduce the impact if one key is leaked.
Rotate API Keys regularly and avoid using the same key for an extended period.
If you suspect an API Key has been leaked, delete the old key promptly and create a new one.
For team collaboration, establish clear rules for API Key management and permission usage.

B.AI will continue improving API Key security management to provide developers with a safer and more stable API experience.
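One way to keep a key out of source code, in line with the first practice above, is to read it from the environment at startup. The sketch below assumes an environment variable named `BAI_API_KEY`; that name is hypothetical and not defined by the platform.

```python
import os


def load_api_key(env_var: str = "BAI_API_KEY") -> str:
    """Read the API key from an environment variable (name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early rather than sending unauthenticated requests.
        raise RuntimeError(f"{env_var} is not set; create a key and export it first")
    return key
```

Using a different variable per project also makes it easier to keep separate keys for separate projects.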

API Key security upgrade and migration FAQ

If you are using a legacy API Key, please create a new API Key and complete the migration during the compatibility period.

Why do I need to replace my API Key?

This update is part of the B.AI API Key security architecture upgrade. New API Keys use a more secure generation and management mechanism, helping improve account security and API service stability.

Will my old API Key stop working immediately?

No. Legacy API Keys remain usable for a 30-day compatibility period that starts with the official migration notice. Please follow the official notice for the exact cutoff date.

What happens if I do not replace it?

After the compatibility period ends, legacy API Keys will no longer be supported for API calls. To avoid service interruptions, create and switch to a new API Key before the cutoff date.

Will replacing the API Key affect my production service?

It should not affect normal calls as long as you complete the replacement during the compatibility period. We recommend creating a new API Key in advance, verifying it in a test environment, and then replacing it in production.

Does this mean my API Key was leaked?

No. This is a platform API Key security architecture upgrade and does not mean your current API Key has been compromised.

Do I need to change request URLs or parameters?

No. In most cases, you only need to replace the API Key value used in Authorization: Bearer <token> or x-api-key. The Base URL and API paths remain unchanged.

Where can I create a new API Key?

Create a new API Key from the API Key management page in B.AI. Keep the new Key secure after creation, and update your application configuration or secret manager accordingly.

Endpoints

1. List Models

GET /v1/models

List available models.

Auth: Bearer Token or API Key (x-api-key)

Response 200:

{
  "object": "list",
  "success": true,
  "data": [
    {
      "id": "gpt-5.2",
      "object": "model",
      "created": 1626777600,
      "owned_by": "openai",
      "supported_endpoint_types": ["openai", "anthropic"]
    },
    {
      "id": "claude-sonnet-4.6",
      "object": "model",
      "created": 1626777600,
      "owned_by": "anthropic",
      "supported_endpoint_types": ["openai", "anthropic"]
    }
  ]
}

Model ID aliases: For Claude models, the catalog may return dot-version aliases such as claude-sonnet-4.6. B.AI API requests and Claude Code both accept the hyphenated aliases used in this documentation, such as claude-sonnet-4-6. For Claude Code configuration, prefer the hyphenated alias.
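The alias rule above can be applied mechanically when reading the catalog. The following is a sketch under the assumption that the only difference between the two alias forms is a dot versus a hyphen between version digits; the helper names are hypothetical.

```python
import re
from typing import List


def to_hyphenated_alias(model_id: str) -> str:
    """Turn a dot-version Claude alias into its hyphenated form.

    Non-Claude IDs (for example gpt-5.2) are returned unchanged.
    """
    if not model_id.startswith("claude-"):
        return model_id
    # Replace only dots that sit between two digits.
    return re.sub(r"(?<=\d)\.(?=\d)", "-", model_id)


def models_for_endpoint(catalog: dict, endpoint_type: str) -> List[str]:
    """List model IDs from a /v1/models response that support an endpoint type."""
    return [
        to_hyphenated_alias(model["id"])
        for model in catalog.get("data", [])
        if endpoint_type in model.get("supported_endpoint_types", [])
    ]


catalog = {
    "object": "list",
    "data": [
        {"id": "gpt-5.2", "supported_endpoint_types": ["openai", "anthropic"]},
        {"id": "claude-sonnet-4.6", "supported_endpoint_types": ["openai", "anthropic"]},
    ],
}
anthropic_models = models_for_endpoint(catalog, "anthropic")
```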

Status	Description
200	Success - list of models
400	Bad Request - invalid parameters or malformed body
401	Unauthorized - invalid or missing authentication
403	Forbidden - access denied, insufficient quota, or banned
429	Too Many Requests - rate limit exceeded
500	Internal Server Error

2. Chat Completions (OpenAI Compatible)

POST /v1/chat/completions

Accepts a list of messages and returns a model-generated response. Supports both single-turn and multi-turn conversations. Responses can be streamed (SSE) or returned as a single JSON object.

Auth: Bearer Token or API Key (x-api-key)

Request Body

Parameter	Type	Required	Description
`model`	string	Yes	ID of the model to use (e.g. `gpt-5.2`). Claude models also accept hyphenated aliases such as `claude-sonnet-4-6`.
`messages`	array	Yes	List of messages in the conversation. See ChatMessage.
`stream`	boolean	No	If true, partial message deltas will be sent as server-sent events. Default `false`.
`max_tokens`	integer	No	Maximum number of tokens that can be generated in the completion.
`temperature`	number	No	Sampling temperature between 0 and 2. Higher = more random. Default `1`.
`top_p`	number	No	Nucleus sampling: only the tokens that make up the top `top_p` probability mass are considered. Default `1`.
`stop`	string \| string[]	No	Up to 4 sequences where the API will stop generating.
`n`	integer	No	How many chat completion choices to generate. Default `1`.
`frequency_penalty`	number	No	-2.0 to 2.0. Penalize repeated tokens. Default `0`.
`presence_penalty`	number	No	-2.0 to 2.0. Penalize tokens that appear in the text so far. Default `0`.
`seed`	integer	No	Random seed for deterministic sampling (if supported by model).
`response_format`	object	No	Output format: `{ "type": "text" }`, `{ "type": "json_object" }`, or `{ "type": "json_schema", "json_schema": { ... } }`. See ChatResponseFormat.
`tools`	array	No	List of tools the model may call. See ChatTool.
`tool_choice`	string \| object	No	`"auto"`, `"none"`, `"required"`, or `{ "type": "function", "function": { "name": "..." } }`.
`user`	string	No	Optional end-user identifier for abuse monitoring.
`web_search_options`	object	No	Enables web search for supported models. See WebSearchOptions.

Request Example

{
  "model": "gpt-5.2",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello" }
  ],
  "stream": false,
  "max_tokens": 1024,
  "temperature": 1
}

Response (Non-stream)

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-5.2",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you?",
        "refusal": null,
        "annotations": []
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20,
    "prompt_tokens_details": { "cached_tokens": 0, "audio_tokens": 0 },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  }
}
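A non-streaming call and the lookup of the generated text could be sketched with the Python standard library as follows. The `Content-Type: application/json` header and the helper names are assumptions; the path, Base URL, and field names come from this page.

```python
import json
import urllib.request

BASE_URL = "https://api.b.ai"


def build_chat_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST /v1/chat/completions request (nothing is sent here)."""
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed; not stated on this page
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def first_choice_text(response: dict) -> str:
    """Return choices[0].message.content, or an empty string if it is null."""
    return response["choices"][0]["message"]["content"] or ""


payload = {
    "model": "gpt-5.2",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": False,
    "max_tokens": 1024,
}
request = build_chat_request("sk-xxx", payload)
```

Calling `first_choice_text(send(request))` with a real key would perform the request and return the assistant text.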

Response (Stream)

Each SSE chunk has object: "chat.completion.chunk" with choices[].delta.content containing incremental text. The final chunk includes usage and finish_reason.
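Reassembling the streamed text from the chunks described above might look like the sketch below. It assumes the usual SSE framing of `data: <json>` lines and an OpenAI-style `data: [DONE]` terminator; neither detail is confirmed on this page, and the sample lines are invented for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from SSE lines, skipping non-data lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream marker
            break
        yield json.loads(data)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate choices[].delta.content across all chunks."""
    parts = []
    for chunk in iter_chunks(lines):
        for choice in chunk.get("choices", []):
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                parts.append(piece)
    return "".join(parts)


sample_stream = [
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}], "usage": {"total_tokens": 20}}',
    "",
    "data: [DONE]",
]
streamed_text = collect_text(sample_stream)
```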

Status	Description
200	Success
400	Bad Request - invalid parameters, malformed body, or invalid request
401	Unauthorized - invalid or missing authentication
403	Forbidden - access denied, insufficient quota, or model access restricted
429	Too Many Requests - rate limit exceeded
500	Internal Server Error
502	Bad Gateway - upstream service error
503	Service Unavailable - overloaded or no available channel

3. Messages (Claude Compatible)

POST /v1/messages

Accepts a list of messages and returns a model-generated response. Supports both single-turn and multi-turn conversations. Authenticate via x-api-key header or Bearer token. Responses can be streamed (SSE) or returned as a single JSON object.

Auth: API Key (x-api-key) or Bearer Token

Request Body

Parameter	Type	Required	Description
`model`	string	Yes	ID of the model. Claude models accept the hyphenated aliases shown here (e.g. `claude-sonnet-4-6`, `claude-opus-4-6`, `claude-haiku-4-5`). If `/v1/models` returns a dot alias such as `claude-sonnet-4.6`, you can use the matching hyphenated alias in API requests and Claude Code.
`max_tokens`	integer	Yes	Maximum number of tokens to generate. Different models have different maximum values.
`messages`	array	Yes	Input messages. Alternating user/assistant turns. Limit: 100,000 messages. See MessagesMessageItem.
`system`	string \| array	No	System prompt. Can be a plain string or an array of text blocks (for `cache_control`).
`stream`	boolean	No	Whether to stream the response using SSE. Default `false`.
`temperature`	number	No	Randomness (0.0 - 1.0). Use ~0.0 for analytical tasks, ~1.0 for creative tasks. Default `1`.
`top_p`	number	No	Nucleus sampling. Default `1`.
`top_k`	integer	No	Only sample from the top K options. Default disabled.
`stop_sequences`	string[]	No	Custom text sequences that cause the model to stop generating.
`metadata`	object	No	Request metadata. Supports `user_id` (opaque identifier).
`thinking`	object	No	Extended thinking config. See ThinkingConfig.
`tools`	array	No	Tool definitions the model may use. See Tool.
`tool_choice`	object	No	How the model should use tools: `auto`, `any`, `tool`, or `none`.

Request Example

{
  "model": "claude-sonnet-4-6",
  "max_tokens": 1024,
  "messages": [
    { "role": "user", "content": "Hello, Claude!" }
  ],
  "system": "You are a helpful assistant.",
  "temperature": 1.0
}

Response (Non-stream)

{
  "id": "chatcmpl-xxx",
  "type": "message",
  "role": "assistant",
  "content": [
    { "type": "text", "text": "Hello! How can I help you?" }
  ],
  "stop_reason": "end_turn",
  "model": "claude-sonnet-4-6",
  "usage": {
    "input_tokens": 4,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0,
    "output_tokens": 12,
    "claude_cache_creation_5_m_tokens": 0,
    "claude_cache_creation_1_h_tokens": 0
  }
}
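Because `content` in the response above is an array of blocks rather than a single string, reading the reply means joining the text blocks. A minimal sketch, with a hypothetical helper name:

```python
def message_text(response: dict) -> str:
    """Join the text of every block with type "text" in a Messages response."""
    return "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


sample_message = {
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! How can I help you?"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 4, "output_tokens": 12},
}
reply = message_text(sample_message)
```

Blocks of other types, such as `tool_use`, are skipped by this helper and need separate handling.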

Response (Stream - SSE Events)

Stream responses emit the following event types:

Event Type	Description	Key Fields
`message_start`	Initial message metadata	`message` (id, model, role, usage)
`content_block_start`	New content block begins	`index`, `content_block` (type, text)
`content_block_delta`	Incremental content	`index`, `delta` (type: `text_delta`, text)
`content_block_stop`	Content block ends	`index`
`message_stop`	Message complete	-
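The event sequence in the table above can be folded back into text as sketched here. The sketch assumes each event arrives as a `data: <json>` line whose JSON carries the event name in a `type` field, as in Anthropic's own stream format; this page does not confirm the wire framing, and the sample lines are invented.

```python
import json
from typing import Dict, Iterable


def collect_message_text(lines: Iterable[str]) -> str:
    """Accumulate text per content block index, then join blocks in order."""
    blocks: Dict[int, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skips "event:" lines and blank separators
        event = json.loads(line[len("data:"):].strip())
        kind = event.get("type")
        if kind == "content_block_start":
            blocks[event["index"]] = (event.get("content_block") or {}).get("text", "")
        elif kind == "content_block_delta":
            delta = event.get("delta") or {}
            if delta.get("type") == "text_delta":
                index = event["index"]
                blocks[index] = blocks.get(index, "") + delta.get("text", "")
        elif kind == "message_stop":
            break
    return "".join(blocks[i] for i in sorted(blocks))


sample_events = [
    "event: message_start",
    'data: {"type": "message_start", "message": {"id": "x", "role": "assistant"}}',
    "event: content_block_start",
    'data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}',
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hel"}}',
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "lo!"}}',
    "event: content_block_stop",
    'data: {"type": "content_block_stop", "index": 0}',
    "event: message_stop",
    'data: {"type": "message_stop"}',
]
streamed_message = collect_message_text(sample_events)
```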

Status	Description
200	Success
400	Bad Request - invalid parameters, malformed body, or invalid request
401	Unauthorized - invalid or missing API key
403	Forbidden - access denied, insufficient quota, or model access restricted
429	Too Many Requests - rate limit exceeded
500	Internal Server Error
502	Bad Gateway - upstream service error
503	Service Unavailable - overloaded or no available channel

Data Models

ChatMessage

Field	Type	Required	Description
`role`	string	Yes	`"system"`, `"user"`, `"assistant"`, or `"tool"`
`content`	string	Yes	Message content. For tool role, the result of the tool call.
`name`	string	No	Optional name for the message author.
`tool_call_id`	string	No	When role is `"tool"`, the ID of the tool call this result is for.
`tool_calls`	array	No	When role is `"assistant"` and the model called tools. Array of `{ id, type, function: { name, arguments } }`.

MessagesMessageItem

Field	Type	Required	Description
`role`	string	Yes	`"user"` or `"assistant"` (no `"system"` - use top-level `system` parameter).
`content`	string \| array	Yes	Text string or array of content blocks (text, image, tool_use, tool_result).

Content Block Types (Messages API)

TextBlockParam

{ "type": "text", "text": "Hello, Claude!", "cache_control": { "type": "ephemeral" } }

ImageBlockParam

Base64 source:

{
  "type": "image",
  "source": {
    "type": "base64",
    "media_type": "image/jpeg",
    "data": "/9j/4AAQSkZJRg..."
  }
}

URL source:

{
  "type": "image",
  "source": {
    "type": "url",
    "url": "https://example.com/image.jpg"
  }
}

Supported media types: image/jpeg, image/png, image/gif, image/webp
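A base64 image block like the one above can be built from raw bytes as in this sketch. The helper name is hypothetical; the block shape and the list of media types are taken from this section.

```python
import base64

SUPPORTED_MEDIA_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def image_block_from_bytes(data: bytes, media_type: str) -> dict:
    """Wrap raw image bytes in an ImageBlockParam with a base64 source."""
    if media_type not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"unsupported media type: {media_type}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(data).decode("ascii"),
        },
    }


# The first three bytes of a JPEG file, used here only as sample input.
block = image_block_from_bytes(b"\xff\xd8\xff", "image/jpeg")
```

The resulting dictionary goes into the `content` array of a user message alongside any text blocks.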

ToolUseBlockParam (from assistant)

{
  "type": "tool_use",
  "id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV",
  "name": "get_stock_price",
  "input": { "ticker": "AAPL" }
}

ToolResultBlockParam (from user)

{
  "type": "tool_result",
  "tool_use_id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV",
  "content": "259.75 USD",
  "is_error": false
}
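The pairing of `tool_use` and `tool_result` blocks shown above can be sketched as a function that runs local handlers and builds the follow-up user message. The handler registry and the `get_stock_price` implementation are hypothetical stand-ins.

```python
from typing import Callable, Dict, List


def tool_result_message(assistant_content: List[dict],
                        handlers: Dict[str, Callable[..., object]]) -> dict:
    """Run each tool_use block and return a user message of tool_result blocks."""
    results = []
    for block in assistant_content:
        if block.get("type") != "tool_use":
            continue
        try:
            handler = handlers[block["name"]]
            output, is_error = str(handler(**block["input"])), False
        except Exception as exc:  # report failures back to the model
            output, is_error = f"{type(exc).__name__}: {exc}", True
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # must match the tool_use id
            "content": output,
            "is_error": is_error,
        })
    return {"role": "user", "content": results}


def get_stock_price(ticker: str) -> str:
    """Hypothetical tool that returns a fixed price for illustration."""
    return "259.75 USD"


assistant_content = [
    {"type": "tool_use", "id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV",
     "name": "get_stock_price", "input": {"ticker": "AAPL"}},
]
follow_up = tool_result_message(assistant_content, {"get_stock_price": get_stock_price})
```

Appending the assistant message and then `follow_up` to `messages` continues the conversation with the tool output.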

ThinkingConfig

Enable extended thinking to let Claude show its reasoning process.

Enabled:

{ "type": "enabled", "budget_tokens": 1024 }

budget_tokens: Must be >= 1024 and less than max_tokens.

Disabled:

{ "type": "disabled" }
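The `budget_tokens` constraint above can be checked on the client before a request is sent. A minimal sketch with a hypothetical helper name:

```python
def thinking_config(budget_tokens: int, max_tokens: int) -> dict:
    """Return an enabled ThinkingConfig after checking the documented bounds."""
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be >= 1024")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be less than max_tokens")
    return {"type": "enabled", "budget_tokens": budget_tokens}


config = thinking_config(budget_tokens=1024, max_tokens=4096)
```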

Tool (Anthropic)

{
  "name": "get_stock_price",
  "description": "Get the current stock price for a given ticker symbol.",
  "input_schema": {
    "type": "object",
    "properties": {
      "ticker": { "type": "string" }
    },
    "required": ["ticker"]
  }
}

ToolChoice (Anthropic)

Type	Description
`{ "type": "auto" }`	Model decides whether to use tools. Supports `disable_parallel_tool_use`.
`{ "type": "any" }`	Model will use any available tool. Supports `disable_parallel_tool_use`.
`{ "type": "tool", "name": "..." }`	Model will use the specified tool. Supports `disable_parallel_tool_use`.
`{ "type": "none" }`	Model will not use tools.

ChatTool

{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get weather for a location",
    "parameters": {
      "type": "object",
      "properties": {
        "location": { "type": "string" }
      },
      "required": ["location"]
    }
  }
}
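On the Chat Completions side, a tool call comes back in `tool_calls` on the assistant message and is answered with a `tool` role message, as described under ChatMessage. The sketch below assumes `function.arguments` is a JSON-encoded string, as in the OpenAI format; this page does not state the encoding, so the code also accepts an already-decoded object. The handler and the sample call are hypothetical.

```python
import json
from typing import Callable, Dict, List


def tool_messages(assistant_message: dict,
                  handlers: Dict[str, Callable[..., object]]) -> List[dict]:
    """Run each requested function and build the matching tool-role messages."""
    replies = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        arguments = function["arguments"]
        if isinstance(arguments, str):  # assumed JSON-encoded string
            arguments = json.loads(arguments or "{}")
        result = handlers[function["name"]](**arguments)
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": str(result),
        })
    return replies


def get_weather(location: str) -> str:
    """Hypothetical tool implementation."""
    return f"Sunny in {location}"


assistant_message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"location\": \"Paris\"}"},
    }],
}
tool_replies = tool_messages(assistant_message, {"get_weather": get_weather})
```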

WebSearchOptions

Field	Type	Description
`search_context_size`	string	`"low"`, `"medium"`, or `"high"` - how much of the context window to use for web search results.
`user_location`	object	Approximate user location (country ISO 3166-1 alpha-2, city, region, timezone).

ChatResponseFormat

Field	Type	Description
`type`	string	`"text"`, `"json_object"`, or `"json_schema"`
`json_schema`	object	Optional schema for the output. Applies only when `type` is `"json_schema"`.

Error Response

All error responses follow this format:

{
  "error": {
    "message": "Error message",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}

Field	Type	Description
`message`	string	Error message
`type`	string	Error type (e.g. `invalid_request_error`)
`param`	string \| null	Related parameter
`code`	string \| null	Error code
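The error envelope above can be turned into a typed exception as sketched below. Treating 429, 500, 502, and 503 as retryable is an assumption made for illustration, not a documented policy, and the class and function names are hypothetical.

```python
import json
from typing import Optional

# Assumed retry policy; this page does not define one.
RETRYABLE_STATUSES = {429, 500, 502, 503}


class ApiError(Exception):
    """Carries the fields of the documented error object plus the HTTP status."""

    def __init__(self, status: int, message: str, type_: str,
                 param: Optional[str], code: Optional[str]) -> None:
        super().__init__(f"{status} {type_}: {message}")
        self.status = status
        self.message = message
        self.type = type_
        self.param = param
        self.code = code

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES


def parse_error(status: int, body: str) -> ApiError:
    """Decode an error body, falling back to the raw text if it is not JSON."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        err = {}
    return ApiError(status, err.get("message", body), err.get("type", "unknown"),
                    err.get("param"), err.get("code"))


sample_body = '{"error": {"message": "Error message", "type": "invalid_request_error", "param": null, "code": null}}'
rate_limited = parse_error(429, sample_body)
```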
