Model Name
Qwen/Qwen3.5-9BQwen3.5 9B
- Type: Generation
- Capabilities:
vision,reasoning
Overview
Qwen3.5-9B is a compact 9B parameter reasoning model with a 262K token native context length, designed for strong reasoning performance while remaining extremely cost-efficient. Despite its small size, it performs remarkably well on complex tasks and in Qwen's benchmarks outperformed the, much larger, GPT-OSS-120 model.
Thinking Mode:
This model reasons step-by-step before responding by default. To disable thinking, include the following in your request body: "chat_template_kwargs": {"enable_thinking": false}
This model does not support graduated thinking levels. Parameters such as reasoning_effort are not supported and will have no effect.
Pricing
| Priority | Input Tokens (per 1M) | Output Tokens (per 1M) |
|---|---|---|
| Realtime1 | $0.08 | $0.70 |
| High (1h) | $0.04 | $0.35 |
| Standard (24h) | $0.03 | $0.29 |
Playground
Open this model in the Playground.
Footnotes
-
Realtime availability is limited. Doubleword is primarily a batch API. ↩