MiniMax M3

Overview

MiniMax M3 is MiniMax's next-generation large language model in the M series. It introduces MiniMax Sparse Attention (MSA), a sub-quadratic attention mechanism designed to improve inference efficiency for million-token context workloads while maintaining output quality comparable to the M2 series. Availability depends on the B.AI model catalog and rollout status.

Key Features

MiniMax Sparse Attention (MSA): Uses a two-stage GQA-based sparse attention architecture. A lightweight index branch selects relevant KV blocks, while the sparse branch computes attention only on those selected blocks to reduce attention cost.
Million-Token Context: Designed for 1M-token context scenarios, with preview materials reporting substantially faster decoding and prefill compared with the M2 series at long context lengths.
Multimodal Support: Where available, image input extends the M series beyond text-only workflows; support depends on platform rollout and model configuration.
Agent & Office Focus: Optimized for document understanding, spreadsheet processing, presentation generation, and enterprise agent workflows.
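
The two-stage idea behind MSA described above can be illustrated with a toy sketch. This is not MiniMax's implementation: it uses a single query and a single head (no GQA grouping), and the block scoring rule (query dotted with the mean key of each block), the `block_size` and `top_k` parameters, and the function name are all assumptions chosen for illustration.

```python
import math


def block_sparse_attention(q, keys, values, block_size=2, top_k=1):
    """Toy single-query, single-head block-sparse attention (illustrative only)."""
    n = len(keys)
    # Split token positions into contiguous KV blocks.
    blocks = [list(range(s, min(s + block_size, n))) for s in range(0, n, block_size)]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Stage 1 (index branch, assumed scoring rule): score each block by
    # the dot product of the query with the block's mean key.
    def block_score(idx):
        mean_key = [sum(keys[i][d] for i in idx) / len(idx) for d in range(len(q))]
        return dot(q, mean_key)

    ranked = sorted(range(len(blocks)), key=lambda b: block_score(blocks[b]), reverse=True)
    selected = sorted(ranked[:top_k])

    # Stage 2 (sparse branch): softmax attention over tokens in selected blocks only.
    positions = [i for b in selected for i in blocks[b]]
    scale = 1.0 / math.sqrt(len(q))
    logits = [dot(q, keys[i]) * scale for i in positions]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    out = [
        sum(w / total * values[i][d] for w, i in zip(weights, positions))
        for d in range(len(values[0]))
    ]
    return out, selected
```

In this sketch the second stage touches only `top_k * block_size` tokens instead of all `n`, which is where the reduction in attention cost comes from. Setting `top_k` to the total number of blocks recovers ordinary dense attention.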

Best Use Cases

Ultra-Long Context Processing: Multi-document analysis, long conversation history, and large codebase understanding.
AI Agent Deployment: Autonomous task execution that combines tool use, reasoning, and long-context state.
Office Automation: Intelligent processing and generation of documents, spreadsheets, and presentations.

Capabilities and Limitations

Capability	Description
Reasoning	Designed for frontier-tier reasoning and agent workflows; final benchmark details may vary
Creative Writing	Not specified
Multimodal	Text workflows; image-input support depends on platform rollout and model configuration
Response Speed	Designed for faster long-context prefill and decoding than the M2 series
Context Window	1,000,000 tokens
Max Output	Not specified
Tool Use	Designed for function calling and agent workflow support
Multilingual	Not specified

Known Limitations

Final availability, model limits, and benchmark scores may change during rollout.
Some capabilities described in preview materials may depend on platform support and model configuration.

Credits Usage

Model	Input (Credits/Token)	Cache Write (Credits/Token)	Cache Read (Credits/Token)	Output (Credits/Token)	Web Search (Credits/Use)	Billing Notes
MiniMax M3	`0.60`	`0.75`	`0.12`	`2.40`	`-`	-
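
As a rough illustration of how the rates above combine, the following sketch estimates the credits for one request. It takes the units exactly as labeled in the table (credits per token) and assumes the four token categories are billed separately and summed; the function name and the usage fields are hypothetical, not part of any platform API.

```python
# Rates copied from the table above, in the units the table states.
RATES = {
    "input": 0.60,
    "cache_write": 0.75,
    "cache_read": 0.12,
    "output": 2.40,
}


def estimate_credits(input_tokens=0, cache_write_tokens=0, cache_read_tokens=0, output_tokens=0):
    """Estimate credits for one request, assuming simple additive billing."""
    usage = {
        "input": input_tokens,
        "cache_write": cache_write_tokens,
        "cache_read": cache_read_tokens,
        "output": output_tokens,
    }
    return sum(RATES[kind] * usage[kind] for kind in RATES)
```

Under these assumptions, a request with 1,000 input tokens and 100 output tokens works out to 1,000 × 0.60 + 100 × 2.40 = 840 credits. Reading the same 1,000 tokens from cache instead would contribute 1,000 × 0.12 = 120 credits for that portion.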
