[bot] OpenAI: inline moderation results not captured in Chat Completions spans when moderation= parameter is used #505

@braintrust-bot

Summary

OpenAI SDK v2.41.0 (June 3, 2026) added inline moderation for both Chat Completions and the Responses API via a moderation={"model": "omni-moderation-latest"} request parameter. When this parameter is used, the API runs the omni-moderation-latest safety classifier on both input and output as part of the same inference call, and returns moderation scores in the response alongside the generated content. The current Braintrust OpenAI integration does not capture these moderation results in any span.

What is missing

Chat Completions

When moderation={"model": "omni-moderation-latest"} is passed to chat.completions.create(), the ChatCompletion response object gains a moderation field containing:

  • moderation.input — safety classification of the prompt (categories, scores, flagged bool)
  • moderation.output — safety classification of the generated content (same structure)
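For illustration only, the structure described above might serialize to something like the dict below. The key names (`flagged`, `categories`, `category_scores`) are assumed by analogy with the standalone moderation API and are not confirmed for the inline field; `flagged_sides` is a hypothetical helper, not part of the integration.

```python
from typing import Any, Dict

# Hypothetical serialized shape of the inline moderation field.
# Key names are assumptions, not a confirmed API contract.
example_moderation: Dict[str, Any] = {
    "input": {
        "flagged": False,
        "categories": {"violence": False},
        "category_scores": {"violence": 0.01},
    },
    "output": {
        "flagged": True,
        "categories": {"violence": True},
        "category_scores": {"violence": 0.93},
    },
}


def flagged_sides(moderation: Dict[str, Any]) -> Dict[str, bool]:
    """Report whether the input and output sides were flagged.

    Missing or null sides are treated as not flagged.
    """
    return {
        side: bool((moderation.get(side) or {}).get("flagged", False))
        for side in ("input", "output")
    }
```

A summary like this is the kind of value a span could surface so that flagged generations are easy to filter on.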

The current tracing code (ChatCompletionWrapper.create in py/src/braintrust/integrations/openai/tracing.py) only logs choices and usage from the response:

span.log(
    metrics=metrics,
    output=_process_attachments_in_chat_output(log_response["choices"], audio_format=audio_format),
)

The moderation key in log_response is never referenced and its content is silently dropped. Users who enable inline moderation get no visibility into the safety evaluation results in Braintrust.
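One possible shape for a fix, as a minimal sketch: read the moderation key from the serialized response and forward it as span metadata. The helper names `moderation_metadata` and `build_log_kwargs` are hypothetical, and the sketch assumes `log_response` is a plain dict and that `span.log` accepts a `metadata` keyword. The attachment processing performed by the real code is omitted here.

```python
from typing import Any, Dict, Optional


def moderation_metadata(log_response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: return inline moderation results as span metadata.

    Returns None when the response carries no moderation field, so callers
    that did not pass moderation= log exactly what they log today.
    """
    moderation = log_response.get("moderation")
    if not moderation:
        return None
    return {"moderation": moderation}


def build_log_kwargs(log_response: Dict[str, Any], metrics: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble keyword arguments for span.log(**kwargs) (sketch only)."""
    kwargs: Dict[str, Any] = {"metrics": metrics, "output": log_response["choices"]}
    extra = moderation_metadata(log_response)
    if extra is not None:
        kwargs["metadata"] = extra
    return kwargs
```

Whether moderation belongs in metadata, in a dedicated field, or partly in metrics (for example a flagged indicator) is a design decision this sketch does not settle.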

Responses API

For responses.create(), the non-streaming path routes through ResponseWrapper._parse_event_from_result(), which passes non-output/non-usage keys to metadata:

metadata = {k: v for k, v in result.items() if k not in ["output", "usage"]}

A moderation key in the serialized response dict may appear here, but only if _try_to_dict fully and correctly serializes the nested Pydantic model. The streaming path (_postprocess_streaming_results) explicitly only collects output and usage metrics — moderation results from response.completed events are not extracted.
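For the streaming path, extraction could look roughly like the sketch below. It assumes the events have already been converted to dicts and that the final `response.completed` event carries the full response, including a `moderation` key, under `response`. The function name `moderation_from_events` is hypothetical.

```python
from typing import Any, Dict, Iterable, Optional


def moderation_from_events(events: Iterable[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: find inline moderation results in streamed events.

    Only response.completed events are inspected. If several carry a
    moderation payload, the last one wins.
    """
    found: Optional[Dict[str, Any]] = None
    for event in events:
        if event.get("type") != "response.completed":
            continue
        response = event.get("response") or {}
        if response.get("moderation") is not None:
            found = response["moderation"]
    return found
```

The result could then be merged into the same metadata dict the non-streaming path builds, so both paths expose moderation under one key.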

Current instrumentation coverage

| OpenAI surface | Inline moderation captured? |
| --- | --- |
| chat.completions.create() (non-streaming) | No: only choices + usage are logged |
| chat.completions.create(stream=True) | No: streaming accumulation ignores the moderation field |
| responses.create() (non-streaming) | Partially: may appear in metadata passthrough, not verified |
| responses.create(stream=True) | No: streaming path only collects output + usage metrics |

Relationship to existing moderation coverage

The integration already has a ModerationsPatcher for standalone client.moderations.create() calls (py/src/braintrust/integrations/openai/patchers.py). This covers the explicit, separate moderation API. The inline moderation feature is different: it runs the safety model as part of the generation call and returns results in the same response object. Neither the ModerationsPatcher nor any existing code handles this inline path.

Braintrust docs status

not_found: The OpenAI integration page documents chat completions, streaming, structured outputs, and function calling. It does not mention the moderation= parameter or inline safety scoring.

Upstream sources

  • OpenAI Python SDK CHANGELOG — v2.41.0 (2026-06-03): api: responses.moderation and chat_completions.moderation
  • OpenAI moderation guide: https://developers.openai.com/api/docs/guides/moderation (covers both inline and standalone moderation)
  • openai/resources/chat/completions/completions.py: create() accepts a moderation parameter and returns a ChatCompletion with a moderation field
  • openai/resources/responses/responses.py: create() accepts a moderation parameter and returns a Response with a moderation field

Local files inspected

  • py/src/braintrust/integrations/openai/tracing.py:
    • ChatCompletionWrapper.create (around line 455) — logs only choices + usage, no reference to moderation
    • ResponseWrapper._parse_event_from_result (line 1112) — passes non-output/non-usage keys to metadata; may capture moderation but unverified
    • ResponseWrapper._postprocess_streaming_results (line 1132) — only extracts output items and usage metrics; no moderation extraction
    • _moderation_create_wrapper (line 317) — existing standalone moderation handler, does not cover inline moderation
  • py/src/braintrust/integrations/openai/patchers.py: ModerationsPatcher covers client.moderations.create() only; no inline moderation handling
  • py/src/braintrust/integrations/openai/test_openai.py — no tests for moderation= parameter in chat completions or responses
