Qwen3.5 397B A17B

Type: Generation
Capabilities: vision, reasoning

Overview

Meet Qwen3.5-397B-A17B - released Feb 2026, it is Qwen's most powerful model, delivering performance similar to GPT-5.2 and Claude Opus 4.5 on challenging tasks including advanced reasoning, mathematics, and complex code generation. Offers frontier-level capabilities at a fraction of the cost. Best for:

Tasks requiring maximum intelligence
Complex analysis
Sophisticated coding projects
Scenarios where quality justifies the additional cost over smaller models

Max New Tokens: 16384

Max Total Tokens: 262144

Sampling Parameters:

We have set the default sampling parameters using the recommended values set out by the Qwen team:

We suggest using Temperature=0.7, TopP=0.8, TopK=20, and MinP=0.

For supported frameworks, you can adjust the presence_penalty parameter between 0 and 2 to reduce endless repetitions. However, using a higher value may occasionally result in language mixing and a slight decrease in model performance.

We use a default presence_penalty of 1.5 to bias the model against endless repetitions, if you still notice this behaviour try increasing the presence_penalty.

You can adjust these on a per-request basis by setting the sampling parameters in the request body.

Thinking Mode:

This model reasons step-by-step before responding by default. To disable thinking, include the following in your request body: "chat_template_kwargs": {"enable_thinking": false}

This model does not support graduated thinking levels. Parameters such as reasoning_effort are not supported and will have no effect.

Pricing

Priority	Input Tokens (per 1M)	Output Tokens (per 1M)
Realtime¹	$0.60	$3.60
Async	$0.30	$1.80
Batch (24h)	$0.15	$1.20

Playground

Open this model in the Playground.

Realtime availability is limited. Doubleword is primarily a batch API. ↩

Qwen3.5 397B A17B

Overview

Pricing

Playground

Footnotes