feat(model): enable Anthropic prompt caching by default #404

Merged

OisinKyne merged 2 commits into main from claude/competitive-analysis-franklin-NV0xw on May 1, 2026

Conversation

@bussyjd (Collaborator) commented May 1, 2026

Summary

  • Attach cache_control_injection_points: [{location: message, role: system}] to every Anthropic model_list entry we emit for LiteLLM, covering both the anthropic/* wildcard and explicit per-model entries (see the sketch after this list).
  • LiteLLM auto-injects cache_control: {type: ephemeral} on the system message of every request to those entries, the canonical "prompt caching by default" pattern.
  • Non-anthropic providers (openai, ollama, custom) are intentionally untouched — cache_control is Anthropic-specific and the paid/* route proxies to arbitrary upstreams via the buyer sidecar.
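
A minimal sketch of what the generator change amounts to, assuming the config types look roughly like this. Only the `CacheControlInjectionPoints` field and the `{Location: "message", Role: "system"}` shape are taken from this PR; the type and function names (`InjectionPoint`, `ModelEntry`, `buildModelEntry`) are illustrative, not the repo's actual API:

```go
package model

import "strings"

// InjectionPoint mirrors one entry of LiteLLM's cache_control_injection_points.
type InjectionPoint struct {
	Location string `yaml:"location"` // "message": inject on a chat message
	Role     string `yaml:"role"`     // which role's message gets cache_control
}

// LiteLLMParams is the litellm_params block of one model_list entry.
type LiteLLMParams struct {
	Model                       string           `yaml:"model"`
	CacheControlInjectionPoints []InjectionPoint `yaml:"cache_control_injection_points,omitempty"`
}

// ModelEntry is one model_list entry in the emitted LiteLLM config.
type ModelEntry struct {
	ModelName     string        `yaml:"model_name"`
	LiteLLMParams LiteLLMParams `yaml:"litellm_params"`
}

// buildModelEntry attaches the system-message cache breakpoint to every
// Anthropic entry (wildcard anthropic/* or explicit per-model) and leaves
// all other providers untouched.
func buildModelEntry(name, upstream string) ModelEntry {
	e := ModelEntry{ModelName: name, LiteLLMParams: LiteLLMParams{Model: upstream}}
	if strings.HasPrefix(upstream, "anthropic/") {
		e.LiteLLMParams.CacheControlInjectionPoints = []InjectionPoint{
			{Location: "message", Role: "system"},
		}
	}
	return e
}
```

Marshalled to YAML, an Anthropic entry then carries cache_control_injection_points: [{location: message, role: system}], which is exactly what the manual cluster check in the test plan looks for.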

Why

Token cost on Anthropic models is dominated by repeated system / tool / RAG prefixes. Enabling prompt caching by default cuts that cost without any per-call changes on the agent or skill side. Closes a parity gap with Franklin (which advertises Anthropic prompt caching enabled by default).

Test plan

  • go build ./...
  • go test ./internal/model/
  • New subtests in TestBuildModelEntries (sketched after this list):
    • anthropic_entries_inject_system-message_cache_breakpoint — both wildcard and explicit anthropic entries carry exactly one {Location: "message", Role: "system"} injection point.
    • non-anthropic_entries_do_not_inject_cache_control — openai and ollama entries have empty CacheControlInjectionPoints.
  • Manual cluster check: run obol model setup anthropic, then kubectl get configmap litellm-config -n llm -o jsonpath='{.data.config\.yaml}'; entries for anthropic/* and any explicit claude models should show cache_control_injection_points.
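
A hedged sketch of what those subtests assert, reusing the illustrative types and buildModelEntry helper from the sketch above; only the test and subtest names come from this PR, and the model names passed in are placeholders:

```go
package model

import "testing"

func TestBuildModelEntries(t *testing.T) {
	t.Run("anthropic_entries_inject_system-message_cache_breakpoint", func(t *testing.T) {
		for _, e := range []ModelEntry{
			buildModelEntry("anthropic/*", "anthropic/*"),               // wildcard entry
			buildModelEntry("claude-sonnet", "anthropic/claude-sonnet"), // illustrative explicit entry
		} {
			pts := e.LiteLLMParams.CacheControlInjectionPoints
			if len(pts) != 1 || pts[0] != (InjectionPoint{Location: "message", Role: "system"}) {
				t.Fatalf("%s: want exactly one system-message injection point, got %+v", e.ModelName, pts)
			}
		}
	})

	t.Run("non-anthropic_entries_do_not_inject_cache_control", func(t *testing.T) {
		for _, e := range []ModelEntry{
			buildModelEntry("gpt-4o", "openai/gpt-4o"),
			buildModelEntry("llama3", "ollama/llama3"),
		} {
			if n := len(e.LiteLLMParams.CacheControlInjectionPoints); n != 0 {
				t.Fatalf("%s: want no injection points, got %d", e.ModelName, n)
			}
		}
	})
}
```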

https://claude.ai/code/session_012FTDF8ofWWCLwU8GEN2SFh


Generated by Claude Code

Add cache_control_injection_points to every Anthropic model_list entry
LiteLLM emits (both the anthropic/* wildcard and explicit per-model
entries). Pinning the system message as the cache breakpoint makes
LiteLLM auto-attach cache_control: {type: ephemeral} on every request
to an Anthropic upstream, the canonical "prompt caching by default"
pattern. Non-anthropic providers (openai, ollama, custom) are
unaffected.

https://claude.ai/code/session_012FTDF8ofWWCLwU8GEN2SFh
@OisinKyne OisinKyne enabled auto-merge (rebase) May 1, 2026 11:33
@OisinKyne OisinKyne merged commit 2bcc1d1 into main May 1, 2026
5 checks passed
@OisinKyne OisinKyne deleted the claude/competitive-analysis-franklin-NV0xw branch May 1, 2026 16:12