DoublewordDoubleword

Model Name

Qwen/Qwen3.5-9B

Qwen3.5 9B

  • Type: Generation
  • Capabilities: vision, reasoning

Overview

Qwen3.5-9B is a compact 9B parameter reasoning model with a 262K token native context length, designed for strong reasoning performance while remaining extremely cost-efficient. Despite its small size, it performs remarkably well on complex tasks and in Qwen's benchmarks outperformed the, much larger, GPT-OSS-120 model.


Thinking Mode:

This model reasons step-by-step before responding by default. To disable thinking, include the following in your request body: "chat_template_kwargs": {"enable_thinking": false}

This model does not support graduated thinking levels. Parameters such as reasoning_effort are not supported and will have no effect.

Pricing

PriorityInput Tokens (per 1M)Output Tokens (per 1M)
Realtime1$0.08$0.70
High (1h)$0.04$0.35
Standard (24h)$0.03$0.29

Playground

Open this model in the Playground.

Footnotes

  1. Realtime availability is limited. Doubleword is primarily a batch API.