[bot] Anthropic Messages API: `usage.output_tokens_details.thinking_tokens` not captured in span metrics #175

## Summary

Version 1.44.0 of the Anthropic Ruby SDK added `usage.output_tokens_details` to Messages API responses. This nested object contains `thinking_tokens`, the number of output tokens consumed by extended thinking. The Braintrust Ruby SDK does not capture this field, so users who enable extended thinking via `anthropic.messages.create(thinking: {...}, ...)` or via the beta Messages API have no visibility into their thinking token consumption.

This is distinct from issue #164 (RubyLLM extended thinking), which covers the `ruby_llm` gem. This issue covers the direct `anthropic` gem instrumentation.

## What is missing

The Anthropic Messages API now returns:

```json
"usage": {
  "input_tokens": 2095,
  "output_tokens": 503,
  "cache_creation_input_tokens": 2051,
  "cache_read_input_tokens": 2051,
  "output_tokens_details": {
    "thinking_tokens": 312
  }
}
```

`output_tokens_details.thinking_tokens` is the count of tokens the model generated as internal reasoning (always ≤ `output_tokens`). Capturing it allows users to:
- Attribute cost to extended thinking vs. standard output
- Diagnose cases where reasoning dominates total output tokens
- Compare thinking token spend across requests
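
As a rough illustration of the first two uses, the sketch below splits `output_tokens` into thinking and visible output using the sample payload above. `thinking_breakdown` is a hypothetical helper, not part of the Braintrust or Anthropic SDKs, and it assumes the usage object has already been converted to a Hash with symbol keys.

```ruby
# Hypothetical helper (not an SDK method): splits output tokens into
# thinking vs. visible output. Assumes `usage` is a Hash with symbol keys
# mirroring the JSON payload shown above.
def thinking_breakdown(usage)
  output = usage[:output_tokens].to_i
  # dig returns nil when output_tokens_details is absent; nil.to_i is 0.
  thinking = usage.dig(:output_tokens_details, :thinking_tokens).to_i

  {
    thinking_tokens: thinking,
    visible_output_tokens: output - thinking,
    thinking_share: output.zero? ? 0.0 : thinking.fdiv(output)
  }
end

usage = {
  input_tokens: 2095,
  output_tokens: 503,
  output_tokens_details: {thinking_tokens: 312}
}

breakdown = thinking_breakdown(usage)
puts breakdown
```

With the sample values, 312 of the 503 output tokens are thinking tokens, leaving 191 for visible output.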

### Why it is dropped today

`Common.parse_usage_tokens` in `lib/braintrust/contrib/anthropic/instrumentation/common.rb` iterates over the top-level usage hash and skips any value that is not `Numeric`:

```ruby
usage_hash.each do |key, value|
  next unless value.is_a?(Numeric)   # ← skips output_tokens_details (a Hash)
  ...
end
```

`output_tokens_details` maps to `{thinking_tokens: 312}`, which fails the `Numeric` check and is silently dropped. No field in the existing `field_map` covers it.

The same gap applies to both the stable Messages API instrumentation (`messages.rb`) and the beta Messages API instrumentation (`beta_messages.rb`), since both delegate to `Common.parse_usage_tokens`.
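
One possible shape for a fix is sketched below. It is a standalone approximation, not the actual `parse_usage_tokens` implementation: the `FIELD_MAP` entries, the `NESTED_FIELD_MAP` constant, the `parse_usage_tokens_sketch` name, and the `completion_reasoning_tokens` metric name are all assumptions chosen for illustration, and the real method's field map and return shape may differ.

```ruby
# Sketch only: not the SDK's real code. All metric names are assumptions.
FIELD_MAP = {
  "input_tokens" => "prompt_tokens",
  "output_tokens" => "completion_tokens"
}.freeze

# Hypothetical map for nested detail objects:
# [parent key, child key] => metric name.
NESTED_FIELD_MAP = {
  ["output_tokens_details", "thinking_tokens"] => "completion_reasoning_tokens"
}.freeze

def parse_usage_tokens_sketch(usage)
  usage_hash = usage.to_h.transform_keys(&:to_s)
  metrics = {}

  # Existing behavior: only top-level numeric values are recorded.
  usage_hash.each do |key, value|
    next unless value.is_a?(Numeric)
    metrics[FIELD_MAP.fetch(key, key)] = value.to_i
  end

  # Added step: look inside known nested objects instead of dropping them.
  NESTED_FIELD_MAP.each do |(parent, child), metric|
    details = usage_hash[parent]
    next unless details.respond_to?(:to_h)
    count = details.to_h.transform_keys(&:to_s)[child]
    metrics[metric] = count.to_i if count.is_a?(Numeric)
  end

  metrics
end

sample_usage = {
  input_tokens: 2095,
  output_tokens: 503,
  output_tokens_details: {thinking_tokens: 312}
}

metrics = parse_usage_tokens_sketch(sample_usage)
puts metrics
```

Using an explicit map of nested keys, rather than recursively flattening every Hash, keeps the emitted metric names predictable if Anthropic adds further detail objects later. Because both `messages.rb` and `beta_messages.rb` delegate to the shared method, a change along these lines would presumably cover both code paths.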

## Braintrust docs status

**`not_found`**: The Braintrust Anthropic integration docs at `https://www.braintrust.dev/docs/providers/anthropic` list `prompt_tokens`, `completion_tokens`, and cache metrics as captured but do not mention thinking tokens or `output_tokens_details`.

## Upstream sources

- Anthropic Messages API reference — `usage.output_tokens_details.thinking_tokens` field: https://platform.claude.com/docs/en/api/messages
- Anthropic Ruby SDK changelog — v1.44.0 added `output_tokens_details` and mid-conversation usage details: https://github.com/anthropics/anthropic-sdk-ruby/blob/main/CHANGELOG.md

## Local files inspected

- `lib/braintrust/contrib/anthropic/instrumentation/common.rb` — `parse_usage_tokens` method (lines 14–48): 4-field `field_map`; `Numeric` guard silently drops nested objects like `output_tokens_details`
- `lib/braintrust/contrib/anthropic/instrumentation/messages.rb` — `set_metrics` (line 132) calls `Common.parse_usage_tokens`; also captures streaming output via `finalize_stream_span`
- `lib/braintrust/contrib/anthropic/instrumentation/beta_messages.rb` — same `parse_usage_tokens` call pattern
