DoublewordDoubleword

Model Name

Qwen/Qwen3-VL-235B-A22B-Instruct-FP8

Qwen3 VL 235B A22B Instruct

  • Type: Generation
  • Capabilities: vision

Overview

Meet Qwen3-VL-235B - delivering performance similar to GPT-5 Chat and Claude 4 Opus Thinking on challenging tasks including advanced reasoning, mathematics, and complex code generation. Offers frontier-level capabilities at a fraction of the cost. Best for:

  • Tasks requiring maximum intelligence
  • Complex analysis
  • Sophisticated coding projects
  • Scenarios where quality justifies the additional cost over smaller models

Max New Tokens: 16384

Max Total Tokens: 262144

Sampling Parameters:

We have set the default sampling parameters using the recommended values set out by the Qwen team:


We suggest using Temperature=0.7, TopP=0.8, TopK=20, and MinP=0.

For supported frameworks, you can adjust the presence_penalty parameter between 0 and 2 to reduce endless repetitions. However, using a higher value may occasionally result in language mixing and a slight decrease in model performance.


We use a default presence_penalty of 1.5 to bias the model against endless repetitions, if you still notice this behaviour try increasing the presence_penalty.

You can adjust these on a per-request basis by setting the sampling parameters in the request body.

Pricing

PriorityInput Tokens (per 1M)Output Tokens (per 1M)
Realtime1$0.60$1.20
High (1h)$0.15$0.55
Standard (24h)$0.10$0.40

Playground

Open this model in the Playground.

Footnotes

  1. Realtime availability is limited. Doubleword is primarily a batch API.