feat(agent): add automatic compaction #82
Merged
hakula139 added a commit that referenced this pull request on May 13, 2026
## Summary

A follow-up sweep after the auto-compaction PR (#82). Fixes the model-swap clamp bug, surfaces auto-compaction failures to the user, tightens the jump-to-bottom overlay so it stops bleeding into chat, removes the hardcoded 25-round tool-loop cap in favor of a configurable opt-in, and rolls in a broader style pass over comments, test names, and section organization.

## Design decisions

- **Clamp the auto-compaction threshold silently on both ends.** A user-configured threshold above the active model's safe trigger now snaps down instead of erroring on swap, mirroring the existing snap-up at the 50K floor. The percent path already worked this way, so the two arms are now symmetric.
- **Centralize the threshold table on the context window, not the model id.** The 200K / 1M window pins live in one `test_thresholds` constants module shared by config, client, and welcome tests; the anchor test derives them from `default_auto_threshold` so a change to the reserve cap or buffer fails one assertion instead of drifting silently across files.
- **Make sink event delivery loggable.** New `AgentSink::emit` logs at `error!` when the channel rejects a one-shot or user-facing event. `_ = sink.send(...)` stays only for per-token streaming where a dropped event is harmless. Auto-compaction failures, the breaker trip, and the `SessionRolled` notification all route through the canonical wrapper as user-visible `Error` events or one-shot signals.
- **Render the jump-to-bottom overlay inside a sized pill.** The transparent right-aligned `Paragraph` form was leaking chat text through. A `Block`-filled `Rect` with 1-cell padding gives the overlay an opaque surface bg, matching how the rest of the chrome paints.
- **Default the per-turn tool-round cap to unbounded.** The hardcoded `MAX_TOOL_ROUNDS = 25` was tripping legitimate multi-hour agentic sessions. The new `[client] max_tool_rounds` (or `OX_MAX_TOOL_ROUNDS`) is `Option<u32>`: `None` runs unbounded; `Some(N)` bails the turn after `N` rounds with the existing runaway-loop error. The cap stays available as a guard against tools stuck in a retry loop, but normal agent behavior no longer needs to fit inside an arbitrary 25-round budget.
- **Drop the antithesis / semicolon over-use in prose comments.** Sweep agent.rs and peers for `;` joiners that read as run-on sentences, "X, not Y" framing that negates a strawman, multi-paragraph contracts that should be one tight paragraph, and `Regression:` / `Pin the X arm of Y:` task-narration leads on tests whose names already say what they cover.
- **Keep the test suite lean over rigidly mirrored.** Where the test reviewer flagged categorical sections in `markdown/render.rs` (Paragraphs, Headings, Lists, ...) or missing `wrap_line_` prefixes in single-function modules, the existing form is more useful and stays. Where a divider was actively misleading (`agent_turn` covering only fixtures, `paused_counter_*` sitting under `update_layout`, `jump_overlay_label` interleaved into `draw_frame` tests), the section is renamed or moved.

## Changes

| File | Description |
| ---- | ----------- |
| `agent.rs` | `agent_turn` takes `max_tool_rounds: Option<u32>` (`None` = unbounded, `Some(n)` = bail after `n` rounds). Module doc, loop, and bail message updated. Replaces the 25-round const-test with paired `with_some_cap_bails_*` and `with_none_cap_runs_unbounded_*` tests. |
| `agent.rs`, `agent/compact_boundary.rs`, `agent/event.rs`, `agent/compaction.rs`, `session/title_generator.rs`, `main.rs` | New `AgentSink::emit`. Auto-compaction failures, breaker trip, and `SessionRolled` route through it as user-visible signals. Comment sweep: drop `;` joiners that read as run-on sentences, tighten verbose docstrings, drop one task-narration lead. |
| `config.rs` | Threshold floor and ceiling now both clamp silently. `threshold_from_tokens` returns `u32`. Anchor test pins the 200K / 1M window thresholds through `default_auto_threshold`. `display_auto_compaction` reads `"at {n} tokens"` instead of `"on at {n} tokens"`. New `max_tool_rounds: Option<u32>` field on `Config` and `ConfigSnapshot`, loaded from `[client] max_tool_rounds` or `OX_MAX_TOOL_ROUNDS`. New `display_max_tool_rounds` helper for `/config`. |
| `config/file.rs` | `ClientConfig` carries `max_tool_rounds: Option<u32>`. Merge and parser tests extended. |
| `client/anthropic.rs` | `stream_message` doc dropped its multi-bullet "Wire shape" form for a tighter contract paragraph. `set_model` tests reuse the new `test_thresholds` constants. New `Client::max_tool_rounds()` getter. |
| `main.rs` | Three `agent_turn` call sites (`AgentLoopTask::handle_submit_prompt`, `bare_repl`, `headless`) pass `client.max_tool_rounds()`. The remaining inline `if let Err = sink.send` / `tracing::error!` on `SessionRolled` folds into `sink.emit`. |
| `slash/config.rs` | `/config` modal adds a "Max Tool Rounds" row showing `unbounded` or the configured value. Height anchor updated. Test assertions match the new `"at {n} tokens"` phrasing. |
| `tui/app.rs` | Jump-to-bottom overlay paints inside a sized pill (`Block` + 1-cell padding) so it no longer bleeds through chat. `render_app` / `rendered_text` / `long_chat_block` hoisted to top-level fixtures since several sections use them. `jump_overlay_label` section moved out of the `draw_frame` interleave. `paint_below_starters_*` and `draw_frame_streaming_*` drop their task-narration leads. |
| `tui/components/chat.rs` | Add the missing `// ── bump_paused_counter ──` test divider. Rename `paused_counter_saturates` to `paused_counter_does_not_overflow_at_u32_max`. |
| `tui/components/welcome.rs`, `tui/theme/loader.rs` | Drop test-body narration leads. |
| `slash/model.rs` | Helper order now follows `resolve_base`'s call sequence (`is_dated_model_id` → `has_dated_suffix` → `is_selectable_known_id` → `candidates` → listing). |
| `slash/matcher.rs` | `rank_by_prefix` moved after the `best_match` helper cluster so the cluster stays contiguous. |
| `slash/registry.rs` | Rename `empty_metadata_offenders_flags_a_synthetic_violator` to a scenario-keyed name. |
| `slash/resume.rs` | Rename `ctrl_d_pushes_*` to `ctrl_d_or_delete_pushes_*` since the body covers both; drop the now-redundant comment. Tighten the `reload_*` comment to the WHY. |
| `slash/status.rs` | Test assertions match the new `"at {n} tokens"` phrasing. |
| `session/state.rs` | Split `commit_compact_*` tests out of the `compact_entries` section. |
| `docs/guide/configuration.md` | Threshold guide now describes the silent clamping behavior on both ends. New `max_tool_rounds` row in the `[client]` table, env-var row, and short prose section explaining the unbounded default and opt-in cap. |

## Test plan

- [x] `cargo fmt --all --check`
- [x] `cargo build`
- [x] `cargo clippy --all-targets -- -D warnings`: zero warnings
- [x] `cargo test`: 2005 tests pass
- [x] `cargo llvm-cov --ignore-filename-regex 'main\.rs'`: 98.63% line coverage
- [x] `pnpm lint`
- [x] `pnpm spellcheck`
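
For readers skimming the commit message, the two behavioral changes above (the two-sided threshold clamp and the optional tool-round cap) can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `clamp_threshold`, `run_tool_rounds`, the `ceiling` parameter, and the error text are hypothetical names chosen for the example. Only the 50K floor and the `Option<u32>` semantics (`None` = unbounded, `Some(n)` = bail after `n` rounds) come from the description.

```rust
/// Floor mentioned in the commit message (the "50K floor").
const THRESHOLD_FLOOR: u32 = 50_000;

/// Hypothetical helper: snap a user-configured threshold into
/// `[THRESHOLD_FLOOR, ceiling]` without returning an error.
/// `ceiling` stands in for the active model's safe trigger; how that
/// value is derived is not shown in the commit message.
/// Assumption: if `ceiling` is below the floor, the ceiling wins.
pub fn clamp_threshold(configured: u32, ceiling: u32) -> u32 {
    configured.max(THRESHOLD_FLOOR).min(ceiling)
}

/// Hypothetical loop shape for the per-turn cap.
/// `step` receives the 1-based round number and returns `true` while
/// the model keeps requesting tools.
pub fn run_tool_rounds(
    max_tool_rounds: Option<u32>,
    mut step: impl FnMut(u64) -> bool,
) -> Result<u64, String> {
    let mut round: u64 = 0;
    loop {
        // `None` skips this check entirely, so the loop is unbounded.
        if let Some(cap) = max_tool_rounds {
            if round >= u64::from(cap) {
                // Placeholder message; the real runaway-loop error text is not shown.
                return Err(format!("tool loop exceeded {cap} rounds"));
            }
        }
        round += 1;
        if !step(round) {
            // The turn finished normally after `round` rounds.
            return Ok(round);
        }
    }
}
```

With a cap of `Some(3)` and a `step` that always asks for another round, the sketch returns the error once three rounds have run. With `None`, it keeps going until `step` reports that no more tool calls are needed.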
## Summary

Adds default-on automatic context compaction on top of `/compact`: streaming usage from completed turns now acts as the trigger signal, and the agent compacts before recording the next user prompt when the active model is near its safe window.

## Design decisions

- … `Entry::Compact`, reset the file tracker, and replace the live transcript through the same boundary helper.
- … `/compact` remains available.

## Changes

- `agent.rs`, `agent/compaction.rs`, `agent/compact_boundary.rs`, `main.rs`
- `config.rs`, `config/file.rs`, `client/anthropic.rs`, `slash/config.rs`, `slash/status.rs`
- `tui/app.rs`, `agent/event.rs`
- `docs/design/agent/auto-compaction.md`, `docs/research/agent/auto-compaction.md`, `docs/guide/configuration.md`, `docs/roadmap.md`, `CLAUDE.md`

## Test plan

- `cargo fmt --all --check`
- `cargo build`
- `cargo clippy --all-targets -- -D warnings`: zero warnings
- `cargo test`: 2001 tests pass
- `cargo llvm-cov --ignore-filename-regex 'main\.rs'`: 98.64% line coverage
- `pnpm lint`
- `pnpm spellcheck`
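
The trigger described in the summary (usage from the last completed turn, checked before the next user prompt is recorded) might look something like the sketch below. Everything here is an assumption made for illustration: the `TurnUsage` struct, the `should_auto_compact` function, the choice to sum input and output tokens, and the use of `None` for a disabled threshold are not taken from the PR.

```rust
/// Hypothetical shape for the token usage reported by one completed turn.
#[derive(Debug, Clone, Copy)]
pub struct TurnUsage {
    pub input_tokens: u32,
    pub output_tokens: u32,
}

/// Hypothetical pre-prompt check: compact only when the last completed
/// turn's usage has reached the configured threshold.
///
/// Assumptions: the context size is approximated as input + output
/// tokens, and `threshold == None` means auto-compaction is turned off.
pub fn should_auto_compact(last_usage: Option<TurnUsage>, threshold: Option<u32>) -> bool {
    match (last_usage, threshold) {
        (Some(usage), Some(limit)) => {
            // Widen to u64 so the sum cannot overflow.
            let used = u64::from(usage.input_tokens) + u64::from(usage.output_tokens);
            used >= u64::from(limit)
        }
        // No completed turn yet, or auto-compaction disabled.
        _ => false,
    }
}
```

A caller would run this check at the top of prompt handling and, when it returns `true`, compact first and then record the new prompt, which matches the ordering the summary describes.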