docs: update evaluation and compaction documentation #3044

Merged
dgageot merged 2 commits into main from docs/auto-update on Jun 10, 2026
@aheritier

Contributor

Documentation updates

This PR updates documentation to reflect two recently merged code changes.

Changes

| Commit | Source PR | What changed |
|---|---|---|
| First commit | #3029 | Document custom base image behavior for --base-image eval flag |
| Second commit | #3042 | Note that compaction budgets scale with context_size for small windows |

Details

docs/features/evaluation/index.md — Added explanation of how the eval harness handles custom base images: the docker-agent binary is injected from docker/docker-agent:edge at build time and the base image's entrypoint is overridden. Users should provide only the runtime environment in their base image.

docs/providers/dmr/index.md (or troubleshooting doc) — Added note that auto-compaction scales summary and keep-tail budgets proportionally to provider_opts.context_size, so small context windows (e.g. 8k local models) no longer lose session history during compaction.

PRs reviewed and found up to date

| Source PR | Reason |
|---|---|
| #3036 | Docs shipped in the same PR (secrets guide updated) |
| #3032 | Removed catalog server IDs not referenced in docs |
| #3035 | Internal security fix, no user-facing behavior change |
| #3031 | Internal SSRF hardening, allow_private_ips semantics unchanged |
| #3033 | Go dependency bumps |
| #3039 | Go dependency bump |
| #3005 | Go dependency bumps |
| #3038 | Already a docs PR (CHANGELOG) |
| #3028 | Already a docs PR (--session-read-only) |

The eval harness copies the docker-agent binary from docker/docker-agent:edge
into custom base images at build time and overrides their entrypoint with its
own /run.sh wrapper. Users need to know their base image's entrypoint will be
replaced and that the base image should only provide the runtime environment.

Ref: #3029
…windows

After the fix in #3042, the summary and keep-tail token budgets used during
session compaction scale proportionally to provider_opts.context_size instead
of using absolute 16k/20k constants. Small-context-window models (≤ ~16k)
no longer have their history wiped during compaction.

Ref: #3042
@aheritier aheritier requested a review from a team as a code owner June 10, 2026 04:06
@aheritier aheritier added the kind/docs Documentation-only changes label Jun 10, 2026

@docker-agent docker-agent left a comment

Assessment: 🟢 APPROVE

Documentation-only PR. All factual claims were verified against the source code:

  • docs/features/evaluation/index.md — The new ### Custom Base Images section accurately describes the eval harness behavior: the binary is indeed copied from docker/docker-agent:edge in the Dockerfile template, the entrypoint is correctly overridden unconditionally, and the anchor link #custom-base-images resolves correctly.
  • docs/providers/dmr/index.md — The compaction scaling claim is accurate: summaryTokenBudget = min(16000, contextLimit/4) and keepTokenBudget = min(20000, contextLimit/5) both scale proportionally with context_size, confirming small context windows (e.g. 8k) are handled correctly.

No bugs or inaccuracies found in the added documentation.
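
The two budget formulas quoted in the review can be checked with a small worked example. The sketch below only restates `min(16000, contextLimit/4)` and `min(20000, contextLimit/5)`; the function name `compactionBudgets`, the constant names, and the use of integer division are assumptions for illustration and may not match the real source.

```go
package main

import "fmt"

// Absolute caps taken from the formulas quoted in the review.
const (
	maxSummaryTokens = 16000
	maxKeepTokens    = 20000
)

// minInt returns the smaller of two ints (kept explicit so the sketch
// also compiles on Go versions without the built-in min).
func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// compactionBudgets is a HYPOTHETICAL helper that applies the quoted formulas:
// the summary budget is a quarter of the context limit, the keep-tail budget
// a fifth, each capped at its absolute maximum.
func compactionBudgets(contextLimit int) (summary, keep int) {
	summary = minInt(maxSummaryTokens, contextLimit/4)
	keep = minInt(maxKeepTokens, contextLimit/5)
	return summary, keep
}

func main() {
	for _, limit := range []int{8192, 16384, 131072} {
		s, k := compactionBudgets(limit)
		fmt.Printf("context=%d summary=%d keep=%d\n", limit, s, k)
	}
	// Prints:
	// context=8192 summary=2048 keep=1638
	// context=16384 summary=4096 keep=3276
	// context=131072 summary=16000 keep=20000
}
```

With an 8k window the two budgets together take well under half of the context, whereas fixed 16k/20k budgets would each exceed the whole window. Large windows still hit the original caps.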

@aheritier aheritier added area/agent For work that has to do with the general agent loop/agentic features of the app area/providers/docker-model-runner Docker Model Runner (DMR) local inference labels Jun 10, 2026
@dgageot dgageot merged commit 7da2be1 into main Jun 10, 2026
12 checks passed
3 participants