Model Name
Qwen/Qwen3.5-35B-A3B-FP8-dotjsonQwen3.5 35B A3B dottxt
- Type: Generation
- Capabilities:
reasoning,enhanced_structured_generation,vision
Overview
Qwen3.5-35B-A3B is a high-intelligence, mid-sized model that hits a very compelling price/performance point for async workloads. In Qwen's published benchmarks, this model outperformed GPT-5-mini, GPT-OSS-120B, and Claude Sonnet 4.5.
Thinking Mode:
This model reasons step-by-step before responding by default. To disable thinking, include the following in your request body: "chat_template_kwargs": {"enable_thinking": false}
This model does not support graduated thinking levels. Parameters such as reasoning_effort are not supported and will have no effect.
Pricing
| Priority | Input Tokens (per 1M) | Output Tokens (per 1M) |
|---|---|---|
| Realtime1 | $0.50 | $3.00 |
| High (1h) | $0.14 | $0.60 |
| Standard (24h) | $0.10 | $0.40 |
Playground
Open this model in the Playground.
Footnotes
-
Realtime availability is limited. Doubleword is primarily a batch API. ↩