feat(agent): add automatic compaction#82

Merged: hakula139 merged 13 commits into main from feat/auto-compaction on May 13, 2026
Owner

@hakula139 hakula139 commented May 12, 2026

Summary

Adds default-on automatic context compaction on top of /compact: streaming usage from completed turns now acts as the trigger signal, and the agent compacts before recording the next user prompt when the active model is near its safe window.

Design decisions

  • Use response usage as the trigger signal. Auto-compaction listens to the API usage values already returned during streaming, keeping the trigger tied to actual request pressure.
  • Compact at prompt boundaries. The trigger runs before the next prompt is recorded, so tool results have already been consumed by the assistant before any transcript replacement.
  • Share the manual compaction boundary. Manual and automatic compaction both persist Entry::Compact, reset the file tracker, and replace the live transcript through the same boundary helper.
  • Keep thresholds model-aware. Defaults come from known context windows, percent overrides are capped by the safe trigger, and explicit token thresholds are rejected when they are too low or unsafe for the active model.
  • Bound automatic failures. Automatic compaction logs failures and stops retrying after repeated errors, while manual /compact remains available.
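
The prompt-boundary trigger described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the crate's code: `TurnUsage`, its fields, and `should_compact_before_prompt` are hypothetical names, and summing input and output tokens is only one plausible way to measure request pressure.

```rust
/// Token usage reported by the API for the most recently completed turn.
/// Hypothetical type: the field names are illustrative only.
#[derive(Debug, Clone, Copy, Default)]
pub struct TurnUsage {
    pub input_tokens: u32,
    pub output_tokens: u32,
}

impl TurnUsage {
    /// Assumed pressure measure: input plus output tokens of the last turn.
    pub fn total(&self) -> u32 {
        self.input_tokens.saturating_add(self.output_tokens)
    }
}

/// Decide, before the next user prompt is recorded, whether to compact first.
/// `threshold` is the resolved trigger for the active model; `None` stands
/// for auto-compaction being disabled. With no usage recorded yet (no turn
/// has completed), there is no signal and nothing triggers.
pub fn should_compact_before_prompt(
    last_usage: Option<TurnUsage>,
    threshold: Option<u32>,
) -> bool {
    match (last_usage, threshold) {
        (Some(usage), Some(limit)) => usage.total() >= limit,
        _ => false,
    }
}
```

Because the check runs only at a prompt boundary, any tool results from the previous turn have already been consumed by the time the transcript could be replaced.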

Changes

| File | Description |
| ---- | ----------- |
| agent.rs, agent/compaction.rs, agent/compact_boundary.rs, main.rs | Track streaming usage, compact before prompt recording, preserve queued prompts during summarization, reset the automatic failure breaker on config swaps, and isolate compact-boundary side effects. |
| config.rs, config/file.rs, client/anthropic.rs, slash/config.rs, slash/status.rs | Add model-aware threshold policy, merge threshold modes as one setting, improve percent-threshold diagnostics, re-resolve compaction on model swaps, and display resolved auto-compaction state. |
| tui/app.rs, agent/event.rs | Show automatic compaction as an active compacting state, preserve the triggering prompt after the compact boundary, render stdio compact boundaries, and keep queued prompts from draining early. |
| docs/design/agent/auto-compaction.md, docs/research/agent/auto-compaction.md, docs/guide/configuration.md, docs/roadmap.md, CLAUDE.md | Document the implemented trigger flow, threshold limits, user-facing behavior, and module layout. |
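
The failure breaker mentioned in the first row (stop retrying after repeated errors, reset on config swaps) might look something like the sketch below. The type name, its methods, and the configurable failure limit are all assumptions made for illustration; the PR does not state the actual limit or API.

```rust
/// Hypothetical circuit breaker for automatic compaction failures.
/// Manual `/compact` would bypass this entirely.
#[derive(Debug)]
pub struct AutoCompactBreaker {
    consecutive_failures: u32,
    /// Assumed knob: how many failures in a row trip the breaker.
    max_failures: u32,
}

impl AutoCompactBreaker {
    pub fn new(max_failures: u32) -> Self {
        Self { consecutive_failures: 0, max_failures }
    }

    /// True once automatic compaction should stop retrying.
    pub fn is_tripped(&self) -> bool {
        self.consecutive_failures >= self.max_failures
    }

    /// Record a failed automatic attempt; returns whether the breaker is now tripped.
    pub fn record_failure(&mut self) -> bool {
        self.consecutive_failures = self.consecutive_failures.saturating_add(1);
        self.is_tripped()
    }

    /// A successful compaction clears the failure streak.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
    }

    /// Called on a config swap so a new model or threshold gets a fresh start.
    pub fn reset(&mut self) {
        self.consecutive_failures = 0;
    }
}
```

Counting consecutive failures rather than total failures keeps one transient error from permanently disabling the feature.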

Test plan

  • cargo fmt --all --check
  • cargo build
  • cargo clippy --all-targets -- -D warnings: zero warnings
  • cargo test: 2001 tests pass
  • cargo llvm-cov --ignore-filename-regex 'main\.rs': 98.64% line coverage
  • pnpm lint
  • pnpm spellcheck

@hakula139 added the documentation and enhancement labels May 12, 2026
@hakula139 hakula139 self-assigned this May 12, 2026

codecov Bot commented May 12, 2026

Codecov Report

❌ Patch coverage is 99.73457% with 4 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| ---- | ---- | ---- |
| crates/oxide-code/src/config.rs | 99.39% | 2 Missing ⚠️ |
| crates/oxide-code/src/tui/app.rs | 99.24% | 2 Missing ⚠️ |


@hakula139 hakula139 merged commit 274616e into main May 13, 2026
4 checks passed
@hakula139 hakula139 deleted the feat/auto-compaction branch May 13, 2026 02:08
hakula139 added a commit that referenced this pull request May 13, 2026
## Summary

A follow-up sweep after the auto-compaction PR (#82). Fixes the
model-swap clamp bug, surfaces auto-compaction failures to the user,
tightens the jump-to-bottom overlay so it stops bleeding into chat,
removes the hardcoded 25-round tool-loop cap in favor of a configurable
opt-in, and rolls in a broader style pass over comments, test names, and
section organization.

## Design decisions

- **Clamp the auto-compaction threshold silently on both ends.** A
user-configured threshold above the active model's safe trigger now
snaps down instead of erroring on swap, mirroring the existing snap-up
at the 50K floor. The percent path already worked this way, so the two
arms are now symmetric.
- **Centralize the threshold table on the context window, not the model
id.** The 200K / 1M window pins live in one `test_thresholds` constants
module shared by config, client, and welcome tests; the anchor test
derives them from `default_auto_threshold` so a change to the reserve
cap or buffer fails one assertion instead of drifting silently across
files.
- **Make sink event delivery loggable.** New `AgentSink::emit` logs at
`error!` when the channel rejects a one-shot or user-facing event. `_ =
sink.send(...)` stays only for per-token streaming where a dropped event
is harmless. Auto-compaction failures, the breaker trip, and the
`SessionRolled` notification all route through the canonical wrapper as
user-visible `Error` events or one-shot signals.
- **Render the jump-to-bottom overlay inside a sized pill.** The
transparent right-aligned `Paragraph` form was leaking chat text
through. A `Block`-filled `Rect` with 1-cell padding gives the overlay
an opaque surface bg, matching how the rest of the chrome paints.
- **Default the per-turn tool-round cap to unbounded.** The hardcoded
`MAX_TOOL_ROUNDS = 25` was tripping legitimate multi-hour agentic
sessions. The new `[client] max_tool_rounds` (or `OX_MAX_TOOL_ROUNDS`)
is `Option<u32>`: `None` runs unbounded; `Some(N)` bails the turn after
`N` rounds with the existing runaway-loop error. The cap stays available
as a guard against tools stuck in a retry loop, but normal agent
behavior no longer needs to fit inside an arbitrary 25-round budget.
- **Drop the antithesis / semicolon over-use in prose comments.** Sweep
agent.rs and peers for `;` joiners that read as run-on sentences, "X,
not Y" framing that negates a strawman, multi-paragraph contracts that
should be one tight paragraph, and `Regression:` / `Pin the X arm of Y:`
task-narration leads on tests whose names already say what they cover.
- **Keep the test suite lean over rigidly mirrored.** Where the test
reviewer flagged categorical sections in `markdown/render.rs`
(Paragraphs, Headings, Lists, ...) or missing `wrap_line_` prefixes in
single-function modules, the existing form is more useful and stays.
Where a divider was actively misleading (`agent_turn` covering only
fixtures, `paused_counter_*` sitting under `update_layout`,
`jump_overlay_label` interleaved into `draw_frame` tests), the section
is renamed or moved.
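
As a rough illustration of the symmetric clamp in the first bullet, a minimal sketch might look like this. The function name and signature are hypothetical (the real `threshold_from_tokens` may differ), the way `safe_trigger` is derived from the context window is out of scope here, and letting the ceiling win when it falls below the floor is an assumption.

```rust
/// The "50K floor" mentioned in the commit message.
pub const MIN_THRESHOLD_TOKENS: u32 = 50_000;

/// Clamp a user-configured token threshold silently on both ends.
/// Hypothetical helper: `safe_trigger` is the active model's safe trigger,
/// supplied by the caller.
pub fn clamp_threshold(configured: u32, safe_trigger: u32) -> u32 {
    configured
        // Snap up: values below the floor are raised to it.
        .max(MIN_THRESHOLD_TOKENS)
        // Snap down: values above the safe trigger are lowered to it.
        // Applied last, so the ceiling wins if it is below the floor
        // (an assumption of this sketch).
        .min(safe_trigger)
}
```

With a hypothetical safe trigger of 160,000 tokens, a configured 500,000 resolves to 160,000 and a configured 10,000 resolves to 50,000, so a model swap never has to reject the stored value.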

## Changes

| File | Description |
| ---- | ----------- |
| `agent.rs` | `agent_turn` takes `max_tool_rounds: Option<u32>` (`None` = unbounded, `Some(n)` = bail after `n` rounds). Module doc, loop, and bail message updated. Replaces the 25-round const-test with paired `with_some_cap_bails_*` and `with_none_cap_runs_unbounded_*` tests. |
| `agent.rs`, `agent/compact_boundary.rs`, `agent/event.rs`, `agent/compaction.rs`, `session/title_generator.rs`, `main.rs` | New `AgentSink::emit`. Auto-compaction failures, breaker trip, and `SessionRolled` route through it as user-visible signals. Comment sweep: drop `;` joiners that read as run-on sentences, tighten verbose docstrings, drop one task-narration lead. |
| `config.rs` | Threshold floor and ceiling now both clamp silently. `threshold_from_tokens` returns `u32`. Anchor test pins the 200K / 1M window thresholds through `default_auto_threshold`. `display_auto_compaction` reads `"at {n} tokens"` instead of `"on at {n} tokens"`. New `max_tool_rounds: Option<u32>` field on `Config` and `ConfigSnapshot`, loaded from `[client] max_tool_rounds` or `OX_MAX_TOOL_ROUNDS`. New `display_max_tool_rounds` helper for `/config`. |
| `config/file.rs` | `ClientConfig` carries `max_tool_rounds: Option<u32>`. Merge and parser tests extended. |
| `client/anthropic.rs` | `stream_message` doc dropped its multi-bullet "Wire shape" form for a tighter contract paragraph. `set_model` tests reuse the new `test_thresholds` constants. New `Client::max_tool_rounds()` getter. |
| `main.rs` | Three `agent_turn` call sites (`AgentLoopTask::handle_submit_prompt`, `bare_repl`, `headless`) pass `client.max_tool_rounds()`. The remaining inline `if let Err = sink.send` / `tracing::error!` on `SessionRolled` folds into `sink.emit`. |
| `slash/config.rs` | `/config` modal adds a "Max Tool Rounds" row showing `unbounded` or the configured value. Height anchor updated. Test assertions match the new `"at {n} tokens"` phrasing. |
| `tui/app.rs` | Jump-to-bottom overlay paints inside a sized pill (`Block` + 1-cell padding) so it no longer bleeds through chat. `render_app` / `rendered_text` / `long_chat_block` hoisted to top-level fixtures since several sections use them. `jump_overlay_label` section moved out of the `draw_frame` interleave. `paint_below_starters_*` and `draw_frame_streaming_*` drop their task-narration leads. |
| `tui/components/chat.rs` | Add the missing `// ── bump_paused_counter ──` test divider. Rename `paused_counter_saturates` to `paused_counter_does_not_overflow_at_u32_max`. |
| `tui/components/welcome.rs`, `tui/theme/loader.rs` | Drop test-body narration leads. |
| `slash/model.rs` | Helper order now follows `resolve_base`'s call sequence (`is_dated_model_id` → `has_dated_suffix` → `is_selectable_known_id` → `candidates` → listing). |
| `slash/matcher.rs` | `rank_by_prefix` moved after the `best_match` helper cluster so the cluster stays contiguous. |
| `slash/registry.rs` | Rename `empty_metadata_offenders_flags_a_synthetic_violator` to a scenario-keyed name. |
| `slash/resume.rs` | Rename `ctrl_d_pushes_*` to `ctrl_d_or_delete_pushes_*` since the body covers both; drop the now-redundant comment. Tighten the `reload_*` comment to the WHY. |
| `slash/status.rs` | Test assertions match the new `"at {n} tokens"` phrasing. |
| `session/state.rs` | Split `commit_compact_*` tests out of the `compact_entries` section. |
| `docs/guide/configuration.md` | Threshold guide now describes the silent clamping behavior on both ends. New `max_tool_rounds` row in the `[client]` table, env-var row, and short prose section explaining the unbounded default and opt-in cap. |
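
The `Option<u32>` cap semantics described in the table (`None` runs unbounded, `Some(n)` bails after `n` rounds) can be sketched as a small loop. This is an illustrative stand-in, not the real `agent_turn`: `Round`, `run_turn`, the `step` closure, and the error text are all hypothetical.

```rust
/// Outcome of one hypothetical tool round.
pub enum Round {
    /// The assistant finished without requesting more tools.
    Done,
    /// The assistant requested another tool call.
    NeedsAnotherRound,
}

/// Run tool rounds until the step reports `Done` or the optional cap is hit.
/// Returns the number of rounds executed, or an error message when the cap
/// is reached first. `step` receives the 1-based round number.
pub fn run_turn<F>(max_tool_rounds: Option<u32>, mut step: F) -> Result<u32, String>
where
    F: FnMut(u32) -> Round,
{
    let mut rounds: u32 = 0;
    loop {
        // `None` skips the check entirely, so the loop is unbounded.
        if let Some(cap) = max_tool_rounds {
            if rounds >= cap {
                return Err(format!("tool loop exceeded {cap} rounds"));
            }
        }
        rounds = rounds.saturating_add(1);
        if let Round::Done = step(rounds) {
            return Ok(rounds);
        }
    }
}
```

Modeling the cap as `Option<u32>` rather than a sentinel such as `0` or `u32::MAX` keeps "no limit" distinct from any numeric value a user could configure.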

## Test plan

- [x] `cargo fmt --all --check`
- [x] `cargo build`
- [x] `cargo clippy --all-targets -- -D warnings`: zero warnings
- [x] `cargo test`: 2005 tests pass
- [x] `cargo llvm-cov --ignore-filename-regex 'main\.rs'`: 98.63% line
coverage
- [x] `pnpm lint`
- [x] `pnpm spellcheck`