Doubleword

Model Pricing

Doubleword Batch API is priced per model based on token usage. Costs are calculated separately for input tokens (the content you send) and output tokens (the content generated by the model).

The table below outlines pricing for the models we currently have available. If you are interested in pricing for a model not listed below, please reach out to support@doubleword.ai.

| Model Name | SLA | Input Tokens (per 1M) | Output Tokens (per 1M) |
| --- | --- | --- | --- |
| Qwen/Qwen3-VL-30B-A3B-Instruct-FP8 | Realtime¹ | $0.16 | $0.80 |
| Qwen/Qwen3-VL-30B-A3B-Instruct-FP8 | 1hr | $0.07 | $0.30 |
| Qwen/Qwen3-VL-30B-A3B-Instruct-FP8 | 24hr | $0.05 | $0.20 |
| Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 | Realtime¹ | $0.60 | $1.20 |
| Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 | 1hr | $0.15 | $0.55 |
| Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 | 24hr | $0.10 | $0.40 |

If you'd like to estimate the cost of your job, please upload your file in the Doubleword Console to view a cost estimate prior to submitting a batch.
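For a rough back-of-the-envelope check before uploading, a batch cost can also be computed directly from the per-1M-token rates in the table above. The sketch below is illustrative only (the console estimate is authoritative); the rates shown are the 24hr-SLA prices from the table, and the token counts are example values:

```python
# Cost estimate from the per-1M-token rates in the pricing table above.
# (model, SLA) -> (input $/1M tokens, output $/1M tokens); 24hr tier shown.
PRICES = {
    ("Qwen/Qwen3-VL-30B-A3B-Instruct-FP8", "24hr"): (0.05, 0.20),
    ("Qwen/Qwen3-VL-235B-A22B-Instruct-FP8", "24hr"): (0.10, 0.40),
}

def estimate_cost(model: str, sla: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a batch job."""
    input_rate, output_rate = PRICES[(model, sla)]
    return (input_tokens / 1_000_000) * input_rate + (output_tokens / 1_000_000) * output_rate

# Example: 2M input tokens and 500k output tokens on the 30B model at the 24hr SLA.
cost = estimate_cost("Qwen/Qwen3-VL-30B-A3B-Instruct-FP8", "24hr", 2_000_000, 500_000)
print(f"${cost:.2f}")  # $0.20
```

Actual billing is based on the tokens your job consumes, so treat this as an upper-bound sketch driven by your own token-count estimates.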

Note

SLA indicates the maximum processing time for batch requests. Actual processing times are typically faster than the stated SLA.

Model Details

Qwen/Qwen3-VL-30B-A3B-Instruct-FP8


Meet Qwen3-VL-30B, the smaller model of the Qwen3-VL family, delivering performance comparable to GPT-4.1-mini and Claude Sonnet 4. This highly capable mid-size model is suited to cost-constrained tasks and those requiring high token volumes, and excels at reasoning, coding, and structured output generation.

Best for:

  • Production workloads requiring strong performance without frontier model costs
  • Complex reasoning tasks
  • Code generation

Qwen/Qwen3-VL-235B-A22B-Instruct-FP8


Meet Qwen3-VL-235B - our most powerful model, delivering performance comparable to GPT-5 Chat and Claude 4 Opus Thinking on challenging tasks including advanced reasoning, mathematics, and complex code generation. It offers frontier-level capabilities at a fraction of the cost.

Best for:

  • Tasks requiring maximum intelligence
  • Complex analysis
  • Sophisticated coding projects
  • Scenarios where quality justifies the additional cost over smaller models

Max New Tokens: 16384

Max Total Tokens: 262144

Sampling Parameters:

We have set the default sampling parameters using the recommended values set out by the Qwen team:


We suggest using Temperature=0.7, TopP=0.8, TopK=20, and MinP=0.

For supported frameworks, you can adjust the presence_penalty parameter between 0 and 2 to reduce endless repetitions. However, using a higher value may occasionally result in language mixing and a slight decrease in model performance.


We use a default presence_penalty of 1.5 to bias the model against endless repetitions; if you still notice this behaviour, try increasing presence_penalty.

You can adjust these on a per-request basis by setting the sampling parameters in the request body.
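As a minimal sketch of such an override, the request body below sets the Qwen-recommended sampling values explicitly. The field names follow the common OpenAI-style chat-completions schema; whether top_k and min_p are accepted depends on your serving framework, so treat those fields as assumptions:

```python
import json

# Sketch of a request body overriding the default sampling parameters.
# Field names assume an OpenAI-style chat-completions schema; top_k and
# min_p support varies by framework (assumption).
request_body = {
    "model": "Qwen/Qwen3-VL-235B-A22B-Instruct-FP8",
    "messages": [{"role": "user", "content": "Summarise this document."}],
    # Qwen-recommended sampling defaults, overridable per request:
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "min_p": 0,
    # Raise above the default 1.5 if you still see endless repetition:
    "presence_penalty": 1.5,
}

print(json.dumps(request_body, indent=2))
```

Any parameter you omit falls back to the defaults described above.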

Footnotes

  1. Realtime availability is limited. Doubleword is primarily a batch API.