Doubleword

Autobatcher

What is autobatcher?

autobatcher is a Python and TypeScript client that automatically converts your individual API calls into batched requests. You write normal async code against the familiar OpenAI interface, and autobatcher handles the batching behind the scenes, grouping requests by timing and volume so you get batch pricing without changing your code.

autobatcher provides two clients:

  • AsyncOpenAI — for async inference. Requests are prioritised ahead of batch traffic but are not real-time. Ideal for agentic workflows, background jobs, and development.
  • BatchOpenAI — for batch inference. Designed for bulk workloads with less time pressure, offering the best price.

Both are drop-in replacements for the OpenAI client. They pass isinstance checks and provide full access to non-batched endpoints (models, files, etc.) out of the box.
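The drop-in behaviour follows from a standard pattern: subclass the original client, override only the batched endpoints, and inherit everything else. The sketch below is illustrative, not autobatcher's actual source; the `BaseAsyncClient` and `BatchingClient` names are stand-ins invented for the example.

```python
# Illustrative sketch (not autobatcher's real implementation): a drop-in
# replacement passes isinstance checks by subclassing the original client,
# and inherits every non-batched endpoint automatically.

class BaseAsyncClient:
    """Stand-in for the upstream OpenAI async client."""
    def models_list(self):
        return ["model-a", "model-b"]  # a non-batched endpoint

class BatchingClient(BaseAsyncClient):
    """Overrides only the endpoints it batches; all else is inherited."""
    def chat_create(self, **request):
        return {"queued": request}  # batching logic would live here

client = BatchingClient()
print(isinstance(client, BaseAsyncClient))  # True: passes isinstance checks
print(client.models_list())                 # inherited pass-through endpoint
```

Because the wrapper is a true subclass, code that type-checks against the upstream client (or calls endpoints the wrapper never touches) keeps working unmodified.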

Installation

pip install autobatcher
npm install autobatcher

Quick Start

from autobatcher import AsyncOpenAI

client = AsyncOpenAI(
    api_key="{{apiKey}}",
    base_url="https://api.doubleword.ai/v1"
)

response = await client.chat.completions.create(
    model="{{selectedModel.id}}",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
)

print(response.choices[0].message.content)

import { AsyncOpenAI } from "autobatcher";

const client = new AsyncOpenAI({
  apiKey: "{{apiKey}}",
  baseURL: "https://api.doubleword.ai/v1"
});

const response = await client.chat.completions.create({
  model: "{{selectedModel.id}}",
  messages: [{ role: "user", content: "Explain quantum computing" }],
});

console.log(response.choices[0].message.content);

Swap AsyncOpenAI for BatchOpenAI when you have bulk workloads with less time pressure:

from autobatcher import BatchOpenAI

client = BatchOpenAI(
    api_key="{{apiKey}}",
    base_url="https://api.doubleword.ai/v1"
)

response = await client.chat.completions.create(
    model="{{selectedModel.id}}",
    messages=[{"role": "user", "content": "Summarize this document..."}],
)

import { BatchOpenAI } from "autobatcher";

const client = new BatchOpenAI({
  apiKey: "{{apiKey}}",
  baseURL: "https://api.doubleword.ai/v1"
});

const response = await client.chat.completions.create({
  model: "{{selectedModel.id}}",
  messages: [{ role: "user", content: "Summarize this document..." }],
});

How It Works

  1. Requests are collected until a configurable time window elapses or a maximum batch size is reached
  2. When either trigger fires, the pending requests are submitted together as a single batch
  3. Results are polled and returned to waiting callers as they complete
  4. Your code sees normal ChatCompletion responses
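The steps above can be sketched as a small in-process micro-batcher. This is a simplified model of the mechanism, not autobatcher's code; the `MicroBatcher` class and its `submit` method are names invented for the example, and the "batch submission" is simulated by resolving futures locally.

```python
# Minimal sketch of the collect/flush loop: callers await individual
# futures, and a flush resolves a whole batch at once.
import asyncio

class MicroBatcher:
    def __init__(self, batch_size=3, batch_window_seconds=0.05):
        self.batch_size = batch_size
        self.batch_window = batch_window_seconds
        self.pending = []  # (request, future) pairs awaiting flush

    async def submit(self, request):
        loop = asyncio.get_running_loop()
        fut = loop.create_future()
        self.pending.append((request, fut))
        if len(self.pending) >= self.batch_size:
            self._flush()  # size trigger: batch is full, flush now
        else:
            # time trigger: flush whatever is pending when the window closes
            loop.call_later(self.batch_window, self._flush)
        return await fut   # each caller sees a normal per-request response

    def _flush(self):
        batch, self.pending = self.pending, []
        # A real client would submit `batch` upstream and poll for results;
        # here we resolve each waiting caller immediately to show the shape.
        for request, fut in batch:
            if not fut.done():
                fut.set_result({"echo": request, "batch_size": len(batch)})

async def demo():
    batcher = MicroBatcher(batch_size=3)
    # three concurrent calls are grouped into a single batch of 3
    return await asyncio.gather(
        batcher.submit("a"), batcher.submit("b"), batcher.submit("c"))

print(asyncio.run(demo()))
```

The key property is that batching is invisible to callers: each `await submit(...)` resolves with its own response, regardless of which batch carried it.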

Configuration

Both AsyncOpenAI and BatchOpenAI accept the same configuration options:

Option                                      | Default | Description
batch_size / batchSize                      | 1000    | Maximum requests per batch before auto-flush
batch_window_seconds / batchWindowSeconds   | 10      | Seconds to wait before flushing a partial batch
poll_interval_seconds / pollIntervalSeconds | 5       | Seconds between poll ticks while waiting for batch completion
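To make the interaction between the two flush triggers concrete, here is a hedged sketch of the decision they describe. The `should_flush` helper is invented for illustration and is not part of autobatcher's public API.

```python
# Illustrative helper (not autobatcher's API): a batch is flushed either
# when it is full, or when a partial batch has waited out the window.
def should_flush(pending_count, seconds_since_first, *,
                 batch_size=1000, batch_window_seconds=10.0):
    if pending_count >= batch_size:
        return True   # size trigger: batch is full
    # time trigger: a non-empty partial batch has waited long enough
    return pending_count > 0 and seconds_since_first >= batch_window_seconds

print(should_flush(1000, 0.1))  # True: batch_size reached immediately
print(should_flush(5, 10.0))    # True: window elapsed on a partial batch
print(should_flush(5, 2.0))     # False: still collecting
```

Whichever trigger fires first wins, so under heavy traffic batches fill to `batch_size`, while under light traffic no request waits longer than `batch_window_seconds` before submission.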

Supported Endpoints

  • client.chat.completions.create() — Chat completions
  • client.embeddings.create() — Embeddings

All other OpenAI client methods (e.g., client.models.list(), client.files.create()) pass through unchanged to the underlying API.