Doubleword

Model Name

google/gemma-4-31B-it

Gemma 4 31B IT

  • Type: Generation
  • Capabilities: vision, reasoning, enhanced_structured_generation

Overview

Gemma 4 31B is Google DeepMind’s most capable open model, built for advanced reasoning, coding, and multimodal understanding. It sits in the same general tier as Claude 4.5 Haiku and NVIDIA Nemotron 3 Super. Key features:

  • Native function calling and structured JSON output for agentic workflows
  • Strong image and video understanding for tasks like OCR and chart analysis
  • 256K context for long documents and repositories
  • Support for 140+ languages


Thinking Mode

To enable reasoning, include the following in your request body: "chat_template_kwargs": {"enable_thinking": true}
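
As a minimal sketch of where this field sits, the Python below assembles a request body with the flag at the top level, next to "messages". The overall body shape ("model" plus "messages") is assumed to follow the OpenAI-style chat format implied by the examples on this page. `build_request` is a hypothetical helper, and the flag value is passed through unchanged.

```python
import json


def build_request(prompt: str, enable_thinking: bool) -> dict:
    """Build a chat request body carrying the thinking flag.

    Assumption: the body follows an OpenAI-style chat layout; only the
    chat_template_kwargs field is taken from this page.
    """
    return {
        "model": "google/gemma-4-31B-it",
        "messages": [{"role": "user", "content": prompt}],
        # Top-level field from this page; the value is passed through as given.
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
    }


if __name__ == "__main__":
    body = build_request("Is 221 a prime number?", enable_thinking=True)
    print(json.dumps(body, indent=2))
```

Keeping the flag as a function parameter makes it easy to switch reasoning on or off per request without editing the body by hand.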

———

Multimodal Input

Gemma 4 supports multimodal input, so you can send images or videos together with text in a single request.

Image Example

"messages": [
  {
    "role": "user",
    "content": [
      {
        "type": "image_url",
        "image_url": {
          "url": "https://example.com/image.jpg"
        }
      },
      {
        "type": "text",
        "text": "Describe this image."
      }
    ]
  }
]
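
The fragment above shows only the "messages" array. A sketch of how it might be wrapped into a full request body follows. The helper names `image_message` and `image_request` are hypothetical, and placing "model" beside "messages" is an assumption based on OpenAI-style chat APIs, not something this page states.

```python
import json


def image_message(image_url: str, prompt: str) -> dict:
    """Return one user message with an image part followed by a text part."""
    return {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_url}},
            {"type": "text", "text": prompt},
        ],
    }


def image_request(image_url: str, prompt: str) -> dict:
    # Assumption: "model" sits beside "messages", as in OpenAI-style chat bodies.
    return {
        "model": "google/gemma-4-31B-it",
        "messages": [image_message(image_url, prompt)],
    }


if __name__ == "__main__":
    body = image_request("https://example.com/image.jpg", "Describe this image.")
    print(json.dumps(body, indent=2))
```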

Video Example

"messages": [
  {
    "role": "user",
    "content": [
      {
        "type": "video_url",
        "video_url": {
          "url": "https://example.com/sample_video.mp4"
        }
      },
      {
        "type": "text",
        "text": "Summarize what happens in this video."
      }
    ]
  }
]
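
A common slip when mixing media types is a content part whose "type" and payload key disagree. The sketch below checks a message's content list for that before sending. It assumes that a part of type "X_url" carries its URL under a key also named "X_url", mirroring the image example on this page; `mismatched_parts` is a hypothetical helper.

```python
def mismatched_parts(content: list[dict]) -> list[int]:
    """Return indexes of media parts whose payload key differs from "type".

    Assumption: a part of type "X_url" stores {"url": ...} under a key
    that is also named "X_url".
    """
    bad = []
    for index, part in enumerate(content):
        kind = part.get("type", "")
        if not kind.endswith("_url"):
            continue  # Text parts and other non-URL parts are not checked.
        payload = part.get(kind)
        if not isinstance(payload, dict) or "url" not in payload:
            bad.append(index)
    return bad


if __name__ == "__main__":
    video_part = {
        "type": "video_url",
        "video_url": {"url": "https://example.com/sample_video.mp4"},
    }
    text_part = {"type": "text", "text": "Summarize what happens in this video."}
    print(mismatched_parts([video_part, text_part]))  # prints []
```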

Pricing

Priority         Input Tokens (per 1M)   Output Tokens (per 1M)
Realtime [1]     $0.14                   $0.40
High (1h)        $0.11                   $0.30
Standard (24h)   $0.07                   $0.20
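
To make the rates concrete, here is a small cost estimator built from the figures above. It assumes billing is strictly linear in token count, with no minimums, rounding rules, or discounts, none of which this page specifies. `PRICES` and `estimate_cost` are hypothetical names.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table: (input, output).
PRICES = {
    "realtime": (Decimal("0.14"), Decimal("0.40")),
    "high": (Decimal("0.11"), Decimal("0.30")),
    "standard": (Decimal("0.07"), Decimal("0.20")),
}
MILLION = Decimal(1_000_000)


def estimate_cost(priority: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a job, assuming simple linear pricing."""
    input_price, output_price = PRICES[priority]
    return (input_price * input_tokens + output_price * output_tokens) / MILLION


if __name__ == "__main__":
    # 2M input tokens and 500K output tokens at the Standard (24h) tier:
    # 2 * $0.07 + 0.5 * $0.20 = $0.24
    cost = estimate_cost("standard", 2_000_000, 500_000)
    print(f"${cost:.2f}")  # prints $0.24
```

The same workload comes to $0.37 at High (1h) and $0.48 at Realtime, so under these assumptions the 24-hour tier costs half as much as Realtime.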

Playground

Open this model in the Playground.

Footnotes

  1. Realtime availability is limited. Doubleword is primarily a batch API.