Add batch inference pydo and dots examples #1168

Merged
SSharma-10 merged 3 commits into main from batch_inf_examples
May 5, 2026

Conversation

@SSharma-10

No description provided.

@@ -0,0 +1,16 @@
lang: Python

Blocker — request body doesn't match batch_create_request.yml.

input_file_id= should be file_id= (line 8 of the example).
Missing required provider (e.g. "openai").
Missing required request_id — it's the idempotency key. Add import uuid and pass request_id=str(uuid.uuid4()).
Suggested:

batch = client.batches.create(
    body={
        "file_id": os.environ["BATCH_INPUT_FILE_ID"],
        "provider": "openai",
        "endpoint": "/v1/chat/completions",
        "completion_window": "24h",
        "request_id": str(uuid.uuid4()),
    }
)

print("batch_id:", batch.get("batch_id"))

Also batch.get("id") → batch.get("batch_id") per batch.yml:12.
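Because request_id is the idempotency key, it only helps if a retry resends the same value. A minimal sketch of that, using only the field names quoted above; build_batch_body is a hypothetical helper for illustration and is not part of pydo, and "file-123" is a placeholder ID:

```python
import uuid


def build_batch_body(file_id, provider="openai", request_id=None):
    """Build a batch-create body using the field names quoted in this review.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    return {
        "file_id": file_id,
        "provider": provider,
        "endpoint": "/v1/chat/completions",
        "completion_window": "24h",
        # Generate the idempotency key once; callers reuse this body on retry.
        "request_id": request_id or str(uuid.uuid4()),
    }


body = build_batch_body("file-123")  # placeholder file ID
retry_body = dict(body)  # a retry resends the same request_id
print(body["request_id"] == retry_body["request_id"])
```

Generating the UUID inside the create call on every attempt would defeat the purpose, since each retry would then look like a new request.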

@@ -0,0 +1,44 @@
lang: Python

Blocker — wrong endpoint and wrong response shape.
The spec endpoint is POST /v1/batches/files, which returns { file_id, upload_url, expires_at } per batch_file_create_response.yml. The example instead calls client.files.create(file=input_path, purpose="batch") (OpenAI Files-style: send the bytes + a purpose) and reads uploaded.filename / uploaded.bytes — none of those exist on this response, and purpose isn't on the request schema.

Mirror the dots version: call the batch-files create method with file_name=... and print file_id / upload_url. The actual JSONL bytes belong in inference_upload_batch_file.yml, not here.
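To make the shape mismatch concrete, here is a small sketch that checks a response dict against the three fields quoted above from batch_file_create_response.yml. The helper name and all sample values are made up for illustration; only the key names come from this review.

```python
EXPECTED_FIELDS = {"file_id", "upload_url", "expires_at"}


def missing_fields(response):
    """Return the expected fields that are absent from a response dict."""
    return sorted(EXPECTED_FIELDS - set(response))


# Placeholder values; only the key names come from the review comment.
reserved = {
    "file_id": "file-123",
    "upload_url": "https://example.com/upload?sig=placeholder",
    "expires_at": "2026-05-05T00:00:00Z",
}

print(missing_fields(reserved))           # []
print(reserved.get("filename"))           # None: not part of this response
print(missing_fields({"filename": "x"}))  # all three expected fields are missing
```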

@@ -0,0 +1,30 @@
lang: Python

Misleading lead comment. Lines 1–4 claim client.files.create() "performs both steps for you, prefer it" — that contradicts your create_batch_file example, which only reserves the intent. Drop the comment or rewrite it to say "step 1 reserves file_id+upload_url (see create_batch_file); this example PUTs the bytes."

PUT logic itself looks fine. Minor: avoid printing upload_url or anything derived from it.
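For reference, a rough sketch of what "this example PUTs the bytes" plus the logging nit could look like with only the standard library. Both helper names are hypothetical, the URL and payload are placeholders, and whether the real upload_url needs extra headers (for example a Content-Type) is not stated in this thread, so treat that part as unknown.

```python
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def redact_presigned(url):
    """Drop the query string and fragment so signature parameters never reach logs."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def build_put_request(upload_url, payload):
    """Prepare (but do not send) the PUT that uploads the JSONL bytes."""
    return urllib.request.Request(upload_url, data=payload, method="PUT")


# Placeholder URL and payload; a real upload_url comes from step 1.
upload_url = "https://example.com/bucket/input.jsonl?X-Signature=placeholder"
req = build_put_request(upload_url, b'{"placeholder": true}\n')
print(req.get_method(), redact_presigned(req.full_url))
# To actually send it: urllib.request.urlopen(req, timeout=60)
```

Logging only the redacted form keeps the example useful for debugging without echoing anything taken from the URL's query string.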

client = Client(token=os.environ.get("DIGITALOCEAN_TOKEN"))

batch = client.batches.retrieve(os.environ["BATCH_ID"])

Blocker — batch.get("id") is always None. Per batch.yml, the field is batch_id. Change to batch.get("batch_id").
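The reason this slips through unnoticed is that dict.get returns None for a missing key instead of raising. A tiny illustration, assuming the response behaves like a plain dict (the payload below is a placeholder; only the key name batch_id comes from batch.yml as cited above):

```python
# Placeholder payload; only the key name batch_id comes from the review.
batch = {"batch_id": "batch-123"}

print(batch.get("id"))        # None: no error, just a silent miss
print(batch.get("batch_id"))  # batch-123

# Indexing instead of .get() turns the same typo into a loud failure.
try:
    batch["id"]
except KeyError as exc:
    print("KeyError:", exc)
```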

@@ -0,0 +1,25 @@
lang: Python

Blocker — wrong field name. Line reads links["output_file_id"], but batch_results_response.yml returns output_file_url (a short-lived presigned URL). The endpoint does not return an output file ID.

The follow-up client.files.content(...) call also doesn't compose: you GET the presigned URL with requests.get, you don't pass it through the SDK. Rewrite as:

import requests
from pathlib import Path

links = client.batches.results.retrieve(batch_id)
if not links.get("result_available"):
    print("results not ready yet")
    raise SystemExit(0)
resp = requests.get(links["output_file_url"], timeout=60)
resp.raise_for_status()
Path("batch_output.jsonl").write_bytes(resp.content)

resp = client.batches.list(limit=20)
for b in resp.get("data") or []:
    print(f"{b.get('id'):40} {b.get('status'):12} {b.get('created_at')}")

Field name. Per batch.yml, use b.get('batch_id'), not b.get('id'). Otherwise the iteration shape (resp.get("data"), has_more, last_id) matches batch_list_response.yml.
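If the example ever wants to go past the first page, the has_more / last_id fields suggest a cursor loop along these lines. This is a sketch under assumptions: list_page is a hypothetical callable standing in for the SDK call, and passing last_id back as an `after` argument is a guess at the cursor parameter, not something confirmed in this thread.

```python
def iter_batches(list_page, limit=20):
    """Yield every batch by following has_more / last_id across pages.

    list_page is a hypothetical callable taking (limit, after) and returning
    a dict shaped like the list response described in the review.
    """
    after = None
    while True:
        page = list_page(limit=limit, after=after)
        yield from page.get("data") or []
        # Stop when the server reports no more pages or gives no cursor.
        if not page.get("has_more") or not page.get("last_id"):
            break
        after = page["last_id"]


# In-memory stand-in for the API so the sketch runs without a network call.
_PAGES = {
    None: {"data": [{"batch_id": "b1"}, {"batch_id": "b2"}], "has_more": True, "last_id": "b2"},
    "b2": {"data": [{"batch_id": "b3"}], "has_more": False, "last_id": "b3"},
}


def fake_list_page(limit, after):
    return _PAGES[after]


ids = [b["batch_id"] for b in iter_batches(fake_list_page)]
print(ids)  # ['b1', 'b2', 'b3']
```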

@@ -0,0 +1,13 @@
lang: Python

Two blockers.

result.get("id") → result.get("batch_id").
result.get("cancel_requested_at") doesn't exist on batch.yml. Use cancelled_at (or print status only — the cancel response is the full batch and the user mostly cares that status is cancelling / cancelled).
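One possible shape for the corrected output, printing status first and cancelled_at only when it is set. The helper is hypothetical and the payloads are placeholders; the key names and the two status values come from the comment above, and the idea that cancelled_at is empty while the batch is still cancelling is an assumption.

```python
def summarize_cancel(result):
    """Format the fields a caller usually wants from a cancel response."""
    line = f"batch_id: {result.get('batch_id')}  status: {result.get('status')}"
    # Assumption: cancelled_at is unset while the batch is still cancelling.
    if result.get("cancelled_at"):
        line += f"  cancelled_at: {result['cancelled_at']}"
    return line


# Placeholder payloads; key names and status values come from the review.
print(summarize_cancel({"batch_id": "batch-123", "status": "cancelling"}))
print(summarize_cancel({
    "batch_id": "batch-123",
    "status": "cancelled",
    "cancelled_at": "2026-05-05T00:00:00Z",
}))
```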

request_id: randomUUID(),
});

console.log("batch_id:", batch.batch_id ?? batch.id);

Field name. batch.batch_id ?? batch.id — drop the ?? batch.id; per spec there's no id. Just batch.batch_id.

Otherwise the request body matches the schema.

@@ -0,0 +1,14 @@
lang: JavaScript

Looks right against batch_file_create_response.yml. One nit: client.files.create(...) reads like OpenAI-Files; if the SDK actually exposes this as client.batches.files.create(...) (the URL is /v1/batches/files), prefer that name for clarity.

@@ -0,0 +1,32 @@
lang: JavaScript

Combines step 1 (reserve intent) and step 2 (PUT bytes) into one snippet. That's fine but it duplicates create_batch_file. Consider trimming step 1 here so each example documents one endpoint, matching the curl pair.


const batch = await client.batches.retrieve(process.env.BATCH_ID);

console.log("batch_id: ", batch.id);

Field name. batch.id → batch.batch_id.


// client.files.content resolves the result envelope and follows the
// presigned URL for you, returning the raw fetch Response.
const resp = await client.files.content(batchId);

Likely wrong API call. client.files.content(batchId) passes a batch id to a files helper. The endpoint GET /v1/batches/{batch_id}/results returns presigned URLs in batch_results_response.yml; you then fetch(output_file_url). Should be:
const links = await client.batches.results.retrieve(batchId);
if (!links.result_available) { console.log("not ready"); return; }
const resp = await fetch(links.output_file_url);

@@ -0,0 +1,17 @@
lang: JavaScript

Blocker — wrong pagination shape. Uses page.edges.map(e => e.node) (Relay-style), but batch_list_response.yml is { object, data, has_more, first_id, last_id }. Should be:

for (const b of page.data ?? []) {
  console.log(`${b.batch_id}\t${b.status}\t${b.created_at}`);
}
console.log("has_more:", page.has_more, "last_id:", page.last_id);

@@ -0,0 +1,13 @@
lang: JavaScript
source: |-

Two issues.

result.id → result.batch_id.
result.cancel_requested_at doesn't exist; use cancelled_at or just print status.

@Bala-Nallamilli left a comment

LGTM
