Model Name
Qwen/Qwen3.5-35B-A3B-FP8Qwen3.5 35B A3B
- Type: Generation
- Capabilities:
vision,reasoning
Overview
Qwen3.5-35B-A3B is a high-intelligence, mid-sized model that hits a very compelling price/performance point for async workloads. In Qwen's published benchmarks, this model outperformed GPT-5-mini, GPT-OSS-120B, and Claude Sonnet 4.5.
Thinking Mode:
This model reasons step-by-step before responding by default. To disable thinking, include the following in your request body: "chat_template_kwargs": {"enable_thinking": false}
This model does not support graduated thinking levels. Parameters such as reasoning_effort are not supported and will have no effect.
Pricing
| Priority | Input Tokens (per 1M) | Output Tokens (per 1M) |
|---|---|---|
| Realtime1 | $0.25 | $2.00 |
| High (1h) | $0.07 | $0.30 |
| Standard (24h) | $0.05 | $0.20 |
Playground
Open this model in the Playground.
Footnotes
-
Realtime availability is limited. Doubleword is primarily a batch API. ↩