perf(encoding): #124 Phase 4 — HC speculative tail check + MatcherStorage enum dispatch by polaz · Pull Request #125 · structured-world/structured-zstd

polaz · 2026-05-14T23:20:58Z

Summary

Phase 4 of #111. Three commits land donor-parity hot-path improvements + the supporting backend refactor:

Phase 4.1 (82e85fcf, hardened in c16ca32b): port donor zstd_lazy.c:714 speculative tail check into HcMatcher::hash_chain_candidate. Once best is set, the per-byte common_prefix_len walk is gated on a 4-byte tail compare at tail_off = best.match_len − lit_len − 3 (via checked_sub(lit_len + 3)). The − lit_len term accounts for extend_backwards having added up to lit_len backward bytes to best.match_len; a new candidate can in turn replace B_best of its own length with up to lit_len backward bytes, so the worst-case forward length it needs to outscore best (under the LIFO offset-monotonic walk argument) is best.match_len − lit_len + 1. When lit_len == 0 the formula reduces to best.match_len − 3 (no backward extension possible).
Phase 4.3 (8809bc01): collapse the four parallel driver backend fields (match_generator, Option<DfastMatchGenerator>, Option<RowMatchGenerator>, Option<HcMatchGenerator>) plus active_backend: BackendTag into a single storage: MatcherStorage enum. Eliminates dead-variant allocation across backend switches — the old layout kept drained metadata for every backend ever touched. The active backend is derived directly from the storage variant; no separate runtime tag to drift. Variant docs enumerate the full CompressionLevel → backend mapping (Simple = Uncompressed / Fastest / Level(1) / non-positive levels; Dfast = Default / Level(0,2,3); Row = Level(4); HashChain = Better / Best / Level(5..=22)).
Phase 4 Dfast donor parity (c16ca32b): port donor zstd_double_fast.c post-match rep-0 extension into the Dfast fast-loop and general-loop emit paths. After each emit_candidate, the new helper extend_with_repcode_after_match chains additional zero-literal rep sequences while 4 bytes at the new cursor keep matching cursor − offset_hist[1]. Each iteration emits one rep sequence with the swapped-in offset, advances pos and literals_start, updates offset_hist via the donor offset_2 = offset_1; offset_1 = old_offset_2; swap, inserts every position of the rep match into the long/short hash tables, and skips the full hash probe on every extra match. Uses donor's MINMATCH = 4 for the rep threshold (not DFAST_MIN_MATCH_LEN = 6) — a 4-byte rep is always a net win over re-running the full hash search.

Sub-goals from #124

✅ 4.1 HC speculative tail check — landed; correctness bound was tightened in c16ca32b after Copilot review (the original formula used best.match_len directly and was unaware of extend_backwards backward bytes; new formula subtracts lit_len).
✅ 4.2 matchEndIdx-8 skip — audited; already implemented in main (skip_insert_until_abs wired across BT walker + the gate at the top of collect_optimal_candidates_initialized_body! + bt_update_tree_until_* start-from-skip + apply_limited_update_after_long_match). Issue text outdated, no new work needed.
✅ 4.3 Matcher trait wiring — landed as enum dispatch (MatcherStorage) per the user-approved trait-vs-enum trade-off (cleaner for closed set, zero vtable cost, matches the existing HcBackend precedent inside HcMatchGenerator).
✅ 4.4 Residual strategy_tag sweep — audited. Inner loops of compress_block::<S> → start_matching_strategy::<S> → start_matching_optimal::<S> → build_optimal_plan_impl<S,_,_> are fully generic; no match self.strategy_tag / parse_mode / active_backend inside. The remaining match self.active_backend() calls in the driver are per-frame/per-slice setup paths (eviction, dictionary priming, slice commit), not hot path. self.is_btultra2 reads in per-CPU bt_update_tree_until_* kernels were kept — making them S-generic would multiply the binary by 4× across the target_feature umbrellas for a single register-resident bool read.
✅ Dfast donor parity extension — extend_with_repcode_after_match ports donor's post-match rep-0 chain that the existing Rust Dfast skipped (it re-ran the full hash search at every position even after a primary match into a repetitive run). Closes the biggest hot-path feature gap between Rust Dfast and donor ZSTD_compressBlock_doubleFast_*.

Verification

cargo nextest run --profile ci -p structured-zstd --features hash,std,dict_builder 475/475 pass, 3 skipped (incl. level22_sequences_match_donor_on_corpus_proxy, full cross-validation, roundtrip + parity gates for default / better / best / level22).
cargo test --doc -p structured-zstd --features hash,std,dict_builder 10/10 pass.
cargo clippy -p structured-zstd --features hash,std,dict_builder --all-targets -- -D warnings clean.
Local cargo bench --quick '^compress/default' on x86_64-darwin: every default-level scenario improves relative to the c_ffi baseline-noise floor (c_ffi shifted uniformly +25 % on the same run — criterion's stored baseline ran in a colder state). Most relevant numbers: decodecorpus-z000033 (the 27× C-FFI gap target from feat: encoder architecture rewrite (split monolith, const-generic Strategy, arena allocator) #111) gains −11 % vs c_ffi noise; high-entropy-1m −16 %; low-entropy-1m −8 %.
Linux x86_64-gnu sanity-check (Intel i9-9900K, Fedora 44, rustc 1.94 stable, cargo clean -p between commits): Phase 2 / Phase 3 / Phase 4 HEAD all land within ±1 % of 1.52 ms on compress/level22/low-entropy-1m/matrix/pure_rust — i.e. no level22 regression on a fresh clean build. The 1.527 → 2.483 ms CI alert that fired on the Phase 3 merge commit was not reproducible across two architectures (ARM64 darwin + x86_64-gnu) and appears to be GitHub-runner microarch / VM-noise, not a real regression.

Test plan

CI green across (test × {ubuntu,macos,windows}), cross-i686, Fuzz smoke, msrv.
CI level22_sequences_match_donor_on_corpus_proxy ratio gate green.
CI bench-matrix (18 shards) full numbers in PR comment from the benchmark action — looking specifically for the default-level (compress/default/*) shards to confirm the Dfast post-match rep extension wins.

Closes part of #124.

Summary by CodeRabbit

Refactoring
- Optimized compression matching pipeline with enhanced candidate evaluation techniques
- Improved repeated pattern detection and handling during compression operations
- Restructured matcher backend architecture for better code organization and maintainability
Tests
- Added regression tests validating compression matching behavior

Port donor `zstd_lazy.c:714` (`ZSTD_HcFindBestMatch`) speculative 4-byte tail compare into `HcMatcher::hash_chain_candidate`. Once a `best` candidate is set with `match_len >= 4`, gate the per-byte `common_prefix_len` walk on `concat[i+best_ml-3..best_ml+1]` vs `concat[c+best_ml-3..best_ml+1]`. A mismatch proves the candidate cannot reach `best_ml + 1` and the walk would only confirm `len <= best_ml`. Correctness rests on the monotonic-offset property of LIFO chain walks: each subsequent candidate has `new.offset_bits >= best.offset_bits`, so `better_candidate` (gain = `len*4 - offset_bits`) can return a strict winner only when `new.match_len > best_ml`. Equal-length or shorter candidates can only tie or lose, and `better_candidate` keeps the existing best on ties. Bounds: when `current_idx + best_ml + 1 > history_tail`, the current position cannot reach `best_ml + 1` either, so the candidate cannot win — same skip. Verification: - `cargo nextest run -p structured-zstd --features hash,std,dict_builder` 476/476 pass (incl. `level22_sequences_match_donor_on_corpus_proxy` and the cross-validation roundtrip + parity gates). - `cargo test --doc` 10/10 pass. - `cargo clippy -p structured-zstd --features hash,std,dict_builder --all-targets -- -D warnings` clean. - `cargo bench --quick '^compress/better'` (Lazy backend, the actual HC chain-walker user) on x86_64-darwin: one scenario −5.4 % (small-10k-random, p = 0.05), the rest within ±2 % of stored baseline — `--quick` noise floor swallows whatever steady-state win remains. Full baseline-vs-feature bench will be measured on the CI bench-matrix.

Collapse the driver's four parallel backend fields (`match_generator: MatchGenerator`, `dfast_match_generator: Option<DfastMatchGenerator>`, `row_match_generator: Option<RowMatchGenerator>`, `hc_match_generator: Option<HcMatchGenerator>`) plus the `active_backend: BackendTag` discriminator into a single `storage: MatcherStorage` enum with four variants (`Simple` / `Dfast` / `Row` / `HashChain`). The active backend is now derived directly from the storage variant via `MatcherStorage::backend()` — no separate runtime tag that could drift against the variant. Why - The prior layout kept drained-but-allocated inner structures for every backend the driver had ever touched (the lazy `Option`s + the always-present `match_generator` retained their backing metadata across backend switches). With the enum, the inactive variants are dropped on switch, so the driver carries exactly one backend's state at any time. - Every per-frame / per-slice driver operation (`reset` / `prime_with_dictionary` / `commit_space` / `get_last_space` / `skip_matching_with_hint` / `retire_dictionary_budget` / `trim_after_budget_retire` / `skip_matching_for_dictionary_priming`) used to dispatch on `active_backend` to pick the matching field. The match-on-tag + field-access pair is replaced with `match &mut self.storage`, which fuses dispatch and accessor and gives the borrow checker field-disjoint splits "for free" (manual destructure where the closure captures `vec_pool` / `suffix_pool` alongside the backend borrow). - This is non-hot-path dispatch (per block at most, per frame typically). The hot per-block `compress_block::<S>` already dispatches via `match S::BACKEND` over a `const`, which the compiler const-folds out per monomorphisation — that path is unchanged. Notes - Backend transitions inside `Matcher::reset` drain the old variant's allocations into `vec_pool` / `suffix_pool`, then swap `self.storage` to a freshly constructed variant for the new backend. The old variant is dropped at the swap; with the HashChain → Simple transition the pre-swap `hash_table` / `chain_table` / `hash3_table` zeroing kept from the prior field-shaped impl ensures the BtUltra2 hash3 allocation (`1 << HC3_HASH_LOG`) does not survive the switch. - Three tests went obsolete because they exercised the old lazy-init recovery / drained-but-allocated invariant (`driver_reset_from_row_backend_tolerates_missing_row_matcher` flipped `row_match_generator = None` by hand; `driver_best_to_fastest_releases_oversized_hc_tables` and `…reclaims_row_buffer_pool` inspected the post-switch state of a backend that no longer exists). Replaced with one new test proving the new invariant — buffer pool grows across the HashChain → Simple switch, backend tag flips, no field-shaped inspection of the dropped variant. - Immutable accessors (`simple` / `dfast_matcher` / `row_matcher` / `hc_matcher`) are now `#[cfg(test)]` — production code only reads via the mutable counterparts or pattern-matches `storage` directly. Verification - `cargo nextest run -p structured-zstd --features hash,std,dict_builder` 475/475 pass, 3 skipped (`level22_sequences_match_donor_on_corpus_proxy` + full cross-validation + roundtrip + parity gates). - `cargo test --doc` 10/10 pass. - `cargo clippy -p structured-zstd --features hash,std,dict_builder --all-targets -- -D warnings` clean.

coderabbitai · 2026-05-14T23:21:10Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 27c4a3e0-b1b8-4a3a-a0ea-7a52a013bebb

📥 Commits

Reviewing files that changed from the base of the PR and between c16ca32 and 06dbcd1.

📒 Files selected for processing (3)

zstd/src/encoding/dfast/mod.rs
zstd/src/encoding/hc/mod.rs
zstd/src/encoding/match_generator.rs

📝 Walkthrough

Walkthrough

This PR adds donor-style rep‑0 post-match chaining to the Dfast matcher, a speculative 4‑byte tail-equality check to the HC matcher candidate loop, and refactors MatchGeneratorDriver to hold exactly one active backend in a MatcherStorage enum with pattern-matched accessors and reset/drain semantics.

Changes

Backend Storage Consolidation and HC + Dfast Enhancements

Layer / File(s)	Summary
Dfast post-match rep‑0 chaining `zstd/src/encoding/dfast/mod.rs`	Adds `extend_with_repcode_after_match` and calls it from general and fast loops to opportunistically chain zero-literal rep matches, update `literals_start`, insert rep ranges into hash tables, and advance `pos` without re-probing hashes.
HC Matcher speculative 4-byte tail check `zstd/src/encoding/hc/mod.rs`	After a `best` with `match_len >= 4` exists, candidate evaluation performs an early 4-byte tail-equality check (tail at `best_ml - 3`) and skips candidates that fail before calling `common_prefix_len`.
MatcherStorage enum and backend derivation `zstd/src/encoding/match_generator.rs`	Introduces `MatcherStorage` enum (Simple/Dfast/Row/HashChain) with `backend()` to derive active family, replacing per-backend optional fields plus a separate `active_backend` tag.
Driver initialization and accessor helpers `zstd/src/encoding/match_generator.rs`	`MatchGeneratorDriver::new()` initializes `storage` with Simple; adds `active_backend()`, `simple()/simple_mut()`, `dfast_matcher(_mut)`, `row_matcher(_mut)`, `hc_matcher()` accessors that pattern-match the enum.
Core driver operations using new storage model `zstd/src/encoding/match_generator.rs`	Rewrites dictionary priming, budget retirement, trimming, `reset`, `get_last_space`/`commit_space`, skipping, and dispatch to match on `storage` variants; `reset` drains outgoing variant buffers into pools and clears HC tables when switching away.
Test updates for new storage accessors and invariants `zstd/src/encoding/match_generator.rs`	Updates tests to use `driver.active_backend()` and backend accessors (`simple()/simple_mut()/hc_matcher()`), and to validate storage swaps and pool recycling rather than inspecting removed optional fields or persistent HC internals.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

feat(encoding): #111 Phase 4 — donor fast paths + Matcher trait #124: Directly implements the donor-style HC speculative tail check and backend centralization via a MatcherStorage enum.
feat: encoder architecture rewrite (split monolith, const-generic Strategy, arena allocator) #111: Implements donor-style speculative tail check and Dfast rep‑0 chaining described by the issue.
feat(encoding): #111 Phase 3 — const-generic Strategy dispatch #122: Touches MatchGeneratorDriver backend-dispatch machinery that this PR refactors.

Possibly related PRs

structured-world/structured-zstd#34: Related refactor of MatchGeneratorDriver backend switching and Dfast matcher behavior.
structured-world/structured-zstd#84: Overlapping backend-state and reset/control-flow changes for adding the Row backend.
structured-world/structured-zstd#119: Prior changes to HcMatcher::hash_chain_candidate that this PR further optimizes.

Poem

🐰 I hopped through hashes, quick and spry,
Chained rep matches as I bounded by.
Tails now checked with four‑byte sight,
One enum holds the backends tight.
A rabbit's hop—compression done with delight.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the three main changes: HC speculative tail check, MatcherStorage enum dispatch, and relates to `#124` Phase 4.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/#124-phase4-donor-fastpaths

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-05-14T23:24:11Z

Codecov Report

❌ Patch coverage is 94.72222% with 19 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
zstd/src/encoding/match_generator.rs	94.18%	10 Missing ⚠️
zstd/src/encoding/dfast/mod.rs	93.87%	9 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copilot

Pull request overview

This PR continues the encoder architecture/performance work by adding an HC speculative tail check and replacing parallel matcher backend fields with a single active MatcherStorage enum.

Changes:

Adds a speculative 4-byte tail comparison in HC hash-chain candidate probing.
Collapses matcher backend storage into a single enum variant plus derived backend dispatch.
Updates driver tests and backend-switch assertions for the new storage model.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File	Description
`zstd/src/encoding/match_generator.rs`	Introduces `MatcherStorage` and rewires driver reset, commit, dictionary, and tests around enum-backed storage.
`zstd/src/encoding/hc/mod.rs`	Adds the HC speculative tail gate before running full prefix matching.

Bundles the Dfast donor port and the HC speculative-gate correctness fix from Copilot review on PR #125. Same Phase 4 hot-path family, shares one retest cycle. ## Dfast post-match rep-0 extension Port donor `zstd_double_fast.c` post-match rep chain into the Dfast fast-loop and general-loop emit paths. After each `emit_candidate`, the new helper `extend_with_repcode_after_match` chains additional zero-literal rep sequences while 4 bytes at the new cursor keep matching `cursor - offset_hist[1]`. Each iteration emits one rep sequence with the swapped-in offset, advances `pos` and `literals_start` past the rep match, updates `offset_hist` via `encode_offset_with_history` (the donor `offset_2 = offset_1; offset_1 = old_offset_2;` swap), inserts every position of the rep match into the long/short hash tables so subsequent positions can hit them, and skips the full hash-table probe entirely on every extra match. Uses donor's `MINMATCH = 4` for the rep threshold rather than the stricter `DFAST_MIN_MATCH_LEN = 6` enforced on the main search — donor accepts any 4-byte rep extension and we mirror that because the rep emission carries no offset cost. ## HC speculative tail check — correctness fix Copilot flagged a backward-extension-unaware bound in the gate introduced by 82e85fc. The old code used `best.match_len` directly as the forward-extent reference, but `extend_backwards` may have added up to `lit_len` backward bytes to that total. A new chain candidate can in turn replace `B_best` of its own length with up to `lit_len` backward bytes, so the worst-case forward length it needs to outscore `best` (under the LIFO offset-monotonic walk argument) is `best.match_len - lit_len + 1`. The 4-byte tail probe now anchors at `tail_off = best.match_len - lit_len - 3` (via `checked_sub(lit_len + 3)`), which covers exactly that boundary. When `best.match_len <= lit_len + 3` the worst-case forward target is so close to `current_idx` that the chain-walk cost is already trivial — the gate is skipped via the `checked_sub`. When `lit_len == 0` the formula reduces to the pre-fix `best.match_len - 3` (no backward extension possible). Comment block in `hash_chain_candidate` now enumerates both the backward-extension bound and the offset-monotonicity argument. ## Driver test tightening (Copilot threads #2 / #3 / #4) - `driver_reset_from_row_backend_recycles_row_buffers_into_pool`: assertion changed from `vec_pool.len() >= before_pool` to `>` so the test fails on the regression it is meant to cover. With `before_pool = 0` on a fresh driver the weaker bound passed trivially even if the Row to Simple drain closure never pushed anything back into `vec_pool`. - `row_get_last_space_and_reset_to_fastest_clears_window` renamed to `..._drops_row_variant` with a doc comment explaining why the original `clears_window` assertion is intentionally gone (the variant itself is dropped on swap; no `row_matcher()` to inspect post-switch). Points at the sibling pool-recycling test by name. - `MatcherStorage` variant docs rewritten to enumerate the full `CompressionLevel` to backend mapping that `resolve_level_params` produces for each variant (Simple covers `Uncompressed`, `Fastest`, `Level(1)`, and any non-positive `Level(n) != 0`; Dfast covers `Default`, `Level(0)`, `Level(2)`, `Level(3)`; Row covers `Level(4)`; HashChain covers `Better`, `Best`, and `Level(5..=22)` across Lazy / BtOpt / BtUltra / BtUltra2). The pre-fix doc said "Level 1" / "Levels 2-3" which was misleading for the named-level variants. ## Verification - nextest 475/475 pass, 3 skipped (incl. `level22_sequences_match_donor_on_corpus_proxy` parity gate and the full cross-validation + roundtrip + ratio gates for default / better / best / level22). - doctests 10/10 pass. - clippy --all-targets -D warnings clean. - bench --quick on `^compress/default` (x86_64-darwin): every default-level scenario improves relative to the c_ffi baseline noise floor. Most relevant numbers: `decodecorpus-z000033` (the 27x C-FFI gap target from #111) gains -11 % vs c_ffi noise; `high-entropy-1m` -16 %; `low-entropy-1m` -8 %.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@zstd/src/encoding/dfast/mod.rs`:
- Around line 452-460: The check `rep > pos` incorrectly prevents chaining to
retained history across block boundaries; keep the `rep == 0` early break but
remove the `rep > pos` guard and rely on the `cur_idx.checked_sub(rep)`
(`cur_idx`, `cand_idx`, `current_abs_start`, `history_abs_start`, `offset_hist`)
to detect out-of-range repeats so repeated offsets that point into retained
history are allowed; in short, delete the `rep > pos` condition and let the
existing `checked_sub` None branch handle invalid sources.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 52ac21cd-64af-4ece-ad72-447894484c1f

📥 Commits

Reviewing files that changed from the base of the PR and between 8809bc0 and c16ca32.

📒 Files selected for processing (3)

zstd/src/encoding/dfast/mod.rs
zstd/src/encoding/hc/mod.rs
zstd/src/encoding/match_generator.rs

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

…culative gate Round-2 review fixes from CodeRabbit / Copilot on PR #125. Bundles the behavioural fix called out by both reviewers (block-local `rep > pos` guard wrongly rejected retained-history offsets) plus targeted regression tests for the helpers that previously had only roundtrip coverage. ## Behavioural fix: `extend_with_repcode_after_match` accepts cross-block offsets CodeRabbit (Major / Quick win) and Copilot both flagged that the helper's `if rep == 0 || rep > pos { break; }` guard rejected valid repeat offsets pointing into retained history whenever the source lived past the current block-local cursor. Since `DfastMatchGenerator` keeps a contiguous `live_history()` buffer covering `history_abs_start..history_abs_end`, the only correct bound is "does the candidate land at or after `history_abs_start`" — which `cur_idx.checked_sub(rep)` already enforces (`None` ⇒ rep points before `history_abs_start`, break). Replaced the guard with `if rep == 0 { break; }` and added a comment explaining that `checked_sub` is the authoritative bound. Near block boundaries this restores the donor-parity chain win the helper is meant to recover. ## Regression tests * `dfast::extend_with_repcode_tests::dfast_repcode_extension_emits_zero_literal_rep_on_constant_run` exercises the post-match rep helper directly with a hand-built post-primary-match state: a constant-byte block, `offset_hist[1] = 1`, and an in-block cursor mid-block. Verifies the helper emits a zero-literal Triple with `offset = 1` (donor parity) and advances `pos` / `literals_start` consistently. Going through `start_matching` was unreliable because the primary `best_match` greedily consumes the entire constant run in a single Triple (`offset = 1, match_len = block - 1`), leaving the helper nothing to extend. * `dfast::extend_with_repcode_tests::dfast_repcode_extension_walks_into_retained_history` builds a two-block fixture and probes with `rep = 40` against a block-local cursor at `pos = 5`. The pre-fix `rep > pos` guard would have rejected this; the post-fix code emits the rep correctly. * `hc::hc_tests::hash_chain_candidate_speculative_gate_handles_lit_len_backward_extension` guards the post-fix `tail_off = best.match_len - lit_len - 3` formula with a monotonicity check: for the same probe position and chain content, the returned `match_len` must be non-decreasing in `lit_len` because more backward-extension headroom can only enlarge (or leave unchanged) every candidate's effective length. A regression where the gate skips candidates that backward extension would have rescued surfaces as `len_high_lit < len_low_lit`. ## Doc / comment clarifications (Copilot threads on `MatcherStorage`) * The post-switch state comment in `driver_best_to_fastest_releases_oversized_hc_tables` previously said the manual `hash_table = Vec::new()` assignments release allocations "immediately rather than waiting for `Drop`". That was misleading — assigning `self.storage = MatcherStorage::Simple(...)` would drop the old `HashChain` variant in the same call regardless. The real reason for the up-front reassignment is to cap peak memory during the swap: by releasing the table allocations *before* constructing the replacement variant, the peak resident set is "old data buffers being drained into `vec_pool` + new `MatchGenerator` skeleton" instead of "old tables still resident + new variant under construction". Comment updated to reflect that. ## Verification - nextest 478/478 pass, 3 skipped (478 = 475 + the three new tests). - doctests 10/10 pass. - clippy --all-targets -D warnings clean.

polaz added 2 commits May 15, 2026 01:49

Copilot AI review requested due to automatic review settings May 14, 2026 23:20

Copilot started reviewing on behalf of polaz May 14, 2026 23:21 View session

Copilot AI reviewed May 14, 2026

View reviewed changes

Comment thread zstd/src/encoding/hc/mod.rs Outdated

Comment thread zstd/src/encoding/match_generator.rs Outdated

Comment thread zstd/src/encoding/match_generator.rs

Comment thread zstd/src/encoding/match_generator.rs Outdated

coderabbitai Bot reviewed May 15, 2026

View reviewed changes

Comment thread zstd/src/encoding/dfast/mod.rs

polaz requested a review from Copilot May 15, 2026 00:20

Copilot started reviewing on behalf of polaz May 15, 2026 00:20 View session

Copilot AI reviewed May 15, 2026

View reviewed changes

Comment thread zstd/src/encoding/dfast/mod.rs Outdated

Comment thread zstd/src/encoding/match_generator.rs Outdated

Comment thread zstd/src/encoding/dfast/mod.rs

Comment thread zstd/src/encoding/hc/mod.rs

polaz requested a review from Copilot May 15, 2026 00:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(encoding): #124 Phase 4 — HC speculative tail check + MatcherStorage enum dispatch#125

perf(encoding): #124 Phase 4 — HC speculative tail check + MatcherStorage enum dispatch#125
polaz wants to merge 4 commits into
mainfrom
feat/#124-phase4-donor-fastpaths

polaz commented May 14, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 14, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

Uh oh!

codecov Bot commented May 14, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

polaz commented May 14, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Sub-goals from #124

Verification

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

Uh oh!

codecov Bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

polaz commented May 14, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 14, 2026 •

edited

Loading

codecov Bot commented May 14, 2026 •

edited

Loading