Kimi K2 Thinking is an open-source model that operates as a "thinking agent": it reasons step by step while calling tools, and achieves state-of-the-art performance on various benchmarks. It can execute 200 to 300 sequential tool calls without human intervention, which lets it solve complex problems across a wide range of tasks. The model uses Quantization-Aware Training (QAT) to support INT4 inference, which provides a roughly 2x improvement in generation speed.
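As a rough illustration of what a long run of sequential tool calls looks like from the caller's side, the sketch below drives a generic reason-then-act loop with a hard cap on tool calls. It is not the model's or any SDK's actual interface: `run_agent`, `next_step`, `scripted_model`, the step tuple format, and the `MAX_TOOL_CALLS` value are all hypothetical names and choices made for this example, with a scripted stand-in playing the role of the model.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical cap, chosen to match the upper end of the range quoted above.
MAX_TOOL_CALLS = 300

# A step is either ("tool", tool_name, argument) or ("final", answer, None).
Step = Tuple[str, str, Optional[str]]


def run_agent(
    next_step: Callable[[List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_tool_calls: int = MAX_TOOL_CALLS,
) -> Tuple[str, int]:
    """Loop until the model gives a final answer or the tool-call cap is hit.

    Returns the final answer and the number of tool calls that were made.
    """
    history: List[str] = []
    for calls_made in range(max_tool_calls + 1):
        kind, name_or_answer, arg = next_step(history)
        if kind == "final":
            return name_or_answer, calls_made
        if calls_made == max_tool_calls:
            break  # the model wants another tool call, but the budget is spent
        result = tools[name_or_answer](arg or "")
        # Feed the tool result back so the next step can build on it.
        history.append(f"{name_or_answer}({arg}) -> {result}")
    raise RuntimeError(f"no final answer within {max_tool_calls} tool calls")


def scripted_model(history: List[str]) -> Step:
    """Stand-in for the model: asks for three tool calls, then answers."""
    if len(history) < 3:
        return ("tool", "add_one", str(len(history)))
    return ("final", history[-1], None)


tools = {"add_one": lambda s: str(int(s) + 1)}
answer, calls = run_agent(scripted_model, tools)
```

In a real integration, `next_step` would wrap a request to the model and parse its tool-call response; the cap protects against a loop that never reaches a final answer.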
Model ID: kimi-k2-thinking-maas
- Inputs: Text, Documents
- Outputs: Text
- Supported
- Not supported
- Supported
- Not supported
- Kimi K2 Thinking
  - Launch stage: GA
  - Release date: Nov 13, 2025
Model availability
- United States
- global
ML processing
- United States
- Multi-region
global:
- Max output: 262144
- Context length: 262144
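The two limits above interact when sizing a request. The sketch below shows one way to pick a safe output budget. It assumes both limits are counted in tokens and that the prompt and the generated output share the same context window; neither assumption is stated on this page. The helper name `output_budget` is hypothetical.

```python
# Limits copied from the list above (assumed to be token counts).
CONTEXT_LENGTH = 262144
MAX_OUTPUT = 262144


def output_budget(prompt_tokens: int, requested: int) -> int:
    """Return the largest output size that fits, capped at the request.

    Assumes prompt and output tokens are drawn from one shared window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_LENGTH - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt already fills the context window")
    return min(requested, MAX_OUTPUT, remaining)


# A 200,000-token prompt leaves room for at most 62,144 output tokens.
budget = output_budget(prompt_tokens=200_000, requested=100_000)
```

Because the maximum output equals the context length here, the remaining window is the binding constraint for any non-empty prompt under these assumptions.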

