feat: kernel-dependent BTRFS UUID collision resolution (temp_fsuid on >=6.7) #674

Open
bfjelds wants to merge 6 commits into user/bfjelds/mjolnir/acl-cosi-combined from user/bfjelds/mjolnir/acl-cosi-temp-fsuid

bfjelds (Member) commented Jun 4, 2026

Summary

Adds a kernel-version-dependent mount strategy for ACL BTRFS UUID collisions during A/B updates:

  • Kernel >=6.7: Mount the staging device with -o temp_fsuid, which assigns a temporary in-memory UUID and bypasses the BTRFS global UUID registry. This is the preferred solution as it mounts real staging content without needing verity hash verification.
  • Kernel <6.7 (e.g. 6.6.x): Fall back to the existing bind-mount from active /usr, which requires verity hash matching to prove content is identical.

Note: The temp_fsuid code path is aspirational. We believe it will work, but it remains untested in production until Trident A/B update and ACL run on a kernel newer than 6.6.

Changes

crates/osutils/src/uname.rs

  • Added KernelVersion struct with parse() and running() (supports_btrfs_temp_fsuid() was added initially and later moved out of osutils per DR-002)
  • 6 unit tests covering Azure Linux format, 6.7+, garbage input
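
A minimal sketch of what a parse() along these lines could look like, assuming a three-field struct and a release string such as "6.6.57.1-2.azl3". The field names and the exact parsing rules are assumptions for illustration, not the code in this PR.

```rust
/// Hypothetical kernel version type; the field names are assumptions.
/// Derived Ord compares major, then minor, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct KernelVersion {
    pub major: u32,
    pub minor: u32,
    pub patch: u32,
}

impl KernelVersion {
    /// Parse a kernel release string (the kind `uname -r` prints).
    /// Returns None when major or minor is missing or non-numeric.
    pub fn parse(release: &str) -> Option<Self> {
        // Keep only the leading run of digits and dots, dropping
        // distro suffixes such as "-2.azl3".
        let end = release
            .find(|c: char| !(c.is_ascii_digit() || c == '.'))
            .unwrap_or(release.len());
        let mut parts = release[..end].split('.');
        let major = parts.next()?.parse().ok()?;
        let minor = parts.next()?.parse().ok()?;
        // Patch is optional; a missing or empty patch is treated as 0.
        let patch = parts.next().and_then(|p| p.parse().ok()).unwrap_or(0);
        Some(Self { major, minor, patch })
    }
}

fn main() {
    println!("{:?}", KernelVersion::parse("6.6.57.1-2.azl3"));
    println!("{:?}", KernelVersion::parse("6.7.0"));
    println!("{:?}", KernelVersion::parse("garbage"));
}
```

Returning Option rather than an error keeps "could not parse" distinct from "could not run uname", which the caller can then handle separately.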

crates/trident/src/engine/newroot.rs

  • Split monolithic detect_acl_btrfs_uuid_collision into three focused functions:
    • detect_acl_btrfs_uuid_collision - pure UUID collision detection
    • verify_acl_bind_mount_safety - verity hash check (bind-mount path only)
    • resolve_acl_btrfs_uuid_collision - orchestrator that picks strategy based on kernel version
  • New AclBtrfsCollisionResolution enum: TempFsuid vs BindMountActiveUsr
  • Unknown/unparseable kernel version falls back to bind-mount (safe default)
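
The strategy selection described in this list can be sketched as follows. This is a simplified illustration under assumptions: the variants here carry no data, the function name pick_resolution is hypothetical, and KernelVersion is a stand-in for the osutils type. Only the enum and variant names and the 6.7 threshold come from this PR.

```rust
// Stand-in for the osutils type; derived Ord compares fields in order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct KernelVersion {
    major: u32,
    minor: u32,
    patch: u32,
}

// Simplified: the real enum may carry data such as the colliding UUID.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AclBtrfsCollisionResolution {
    TempFsuid,
    BindMountActiveUsr,
}

// Minimum kernel for the temp_fsuid path, per the PR description.
const BTRFS_TEMP_FSUID_MIN_KERNEL: KernelVersion = KernelVersion {
    major: 6,
    minor: 7,
    patch: 0,
};

/// Hypothetical selector: None models an unknown or unparseable
/// kernel version, which falls back to the bind-mount strategy.
fn pick_resolution(kernel: Option<KernelVersion>) -> AclBtrfsCollisionResolution {
    match kernel {
        Some(v) if v >= BTRFS_TEMP_FSUID_MIN_KERNEL => AclBtrfsCollisionResolution::TempFsuid,
        _ => AclBtrfsCollisionResolution::BindMountActiveUsr,
    }
}

fn main() {
    let azl3 = KernelVersion { major: 6, minor: 6, patch: 57 };
    let newer = KernelVersion { major: 6, minor: 8, patch: 0 };
    println!("{:?}", pick_resolution(Some(azl3)));
    println!("{:?}", pick_resolution(Some(newer)));
    println!("{:?}", pick_resolution(None));
}
```

Because the fallback arm is the catch-all, any future failure mode that yields None lands on the bind-mount path rather than the untested one.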

Testing

  • All 6 KernelVersion unit tests pass
  • All 7 ACL duplicate UUID validation tests pass
  • cargo build and cargo fmt --check clean on Linux

bfjelds and others added 6 commits June 5, 2026 12:35
On kernel >=6.7, use mount -o temp_fsuid to mount the staging device
directly, bypassing the BTRFS global UUID registry. This is the
preferred solution as it mounts real staging content without needing
verity hash verification.

On kernel <6.7 (e.g. 6.6.x), fall back to the existing bind-mount
strategy which requires verity hash matching to prove the active and
staging content are identical.

Changes:
- Add KernelVersion parser to osutils/uname.rs with unit tests
- Split detect_acl_btrfs_uuid_collision into collision detection and
  resolution strategy (AclBtrfsCollisionResolution enum)
- Add verify_acl_bind_mount_safety for the bind-mount path
- Mount handler selects strategy based on kernel version

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The temp_fsuid mount path (kernel >=6.7) is aspirational and untested in
production. Gate it behind the enableAzl4 internal parameter so it only
activates when explicitly opted in. When the flag is absent, the
bind-mount fallback is used. There is no special warning or fallback when
a temp_fsuid mount fails: mount errors propagate as-is to surface issues.

The enableAzl4 flag is intentionally broad: it will gate additional
Azure Linux 4 behaviors as they are added.
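
One way to read the gating rule is that temp_fsuid needs both the opt-in flag and a supporting kernel. The sketch below assumes that reading; the function name, the Strategy enum, and the two boolean inputs are hypothetical stand-ins, not the PR's actual signature.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Strategy {
    TempFsuid,
    BindMountActiveUsr,
}

/// Hypothetical gate. `enable_azl4` stands in for the enableAzl4
/// internal parameter (false when absent); `kernel_supports_temp_fsuid`
/// stands in for the kernel >=6.7 check.
fn choose_strategy(enable_azl4: bool, kernel_supports_temp_fsuid: bool) -> Strategy {
    if enable_azl4 && kernel_supports_temp_fsuid {
        Strategy::TempFsuid
    } else {
        // Flag absent or kernel too old: keep the existing behavior.
        Strategy::BindMountActiveUsr
    }
}

fn main() {
    for flag in [false, true] {
        for kernel_ok in [false, true] {
            println!(
                "enableAzl4={flag} kernel_ok={kernel_ok} -> {:?}",
                choose_strategy(flag, kernel_ok)
            );
        }
    }
}
```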

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

DR-002: Move BTRFS temp_fsuid domain knowledge out of osutils. Remove
supports_btrfs_temp_fsuid() from KernelVersion (generic layer) and
define BTRFS_TEMP_FSUID_MIN_KERNEL constant in the consumer (newroot.rs).
KernelVersion now relies on derived Ord for version comparisons.

DR-003: Distinguish uname execution failure from parse failure. The
match on KernelVersion::running() now logs different warnings for Err
(uname command failed) vs Ok(None) (output not parseable).
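
A sketch of the three-way match DR-003 describes, assuming running() returns something shaped like Result<Option<KernelVersion>, io::Error>. The error type, the helper name classify_running, and the warning texts are assumptions; the real code may use a different error type and logging macro.

```rust
use std::io;

// Stand-in for the osutils type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct KernelVersion {
    major: u32,
    minor: u32,
    patch: u32,
}

/// Hypothetical helper: takes the result of a running()-style call and
/// returns the usable version (if any) plus the warning to log.
fn classify_running(
    running: Result<Option<KernelVersion>, io::Error>,
) -> (Option<KernelVersion>, Option<String>) {
    match running {
        Ok(Some(v)) => (Some(v), None),
        // uname ran, but its output was not a recognizable version.
        Ok(None) => (
            None,
            Some("kernel version not parseable; using bind-mount".to_string()),
        ),
        // uname itself could not be executed.
        Err(e) => (
            None,
            Some(format!("uname command failed: {e}; using bind-mount")),
        ),
    }
}

fn main() {
    let failed = Err(io::Error::new(io::ErrorKind::NotFound, "uname not found"));
    for case in [Ok(Some(KernelVersion { major: 6, minor: 8, patch: 0 })), Ok(None), failed] {
        let (version, warning) = classify_running(case);
        println!("version={version:?} warning={warning:?}");
    }
}
```

Both failure cases end on the same fallback, but the distinct messages make it possible to tell a broken environment from an unexpected version format.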

DR-004: Add doc comment explaining why verity hash verification is
intentionally skipped for the temp_fsuid path (it mounts real staging
content, not a bind-mount of active, so no identity assumption to verify).

DR-005: Eliminate double pattern match on AclBtrfsCollisionResolution in
the mount loop. Add collision_uuid() accessor method so the UUID is
extracted once, then dispatch on the resolution variant in a single match.
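
The accessor pattern from DR-005 might look like the sketch below. It assumes each variant carries the colliding UUID as a String in a field named uuid; that layout, the plan_mount helper, and the returned descriptions are illustrative assumptions.

```rust
// Assumed layout: both variants carry the colliding filesystem UUID.
#[derive(Debug)]
enum AclBtrfsCollisionResolution {
    TempFsuid { uuid: String },
    BindMountActiveUsr { uuid: String },
}

impl AclBtrfsCollisionResolution {
    /// Return the colliding UUID regardless of the chosen strategy.
    fn collision_uuid(&self) -> &str {
        match self {
            Self::TempFsuid { uuid } | Self::BindMountActiveUsr { uuid } => uuid.as_str(),
        }
    }
}

/// Hypothetical mount-loop step: extract the UUID once, then dispatch
/// on the variant in a single match.
fn plan_mount(resolution: &AclBtrfsCollisionResolution) -> String {
    let uuid = resolution.collision_uuid();
    match resolution {
        AclBtrfsCollisionResolution::TempFsuid { .. } => {
            format!("mount staging device with temp_fsuid (uuid {uuid})")
        }
        AclBtrfsCollisionResolution::BindMountActiveUsr { .. } => {
            format!("bind-mount active /usr (uuid {uuid})")
        }
    }
}

fn main() {
    let resolution = AclBtrfsCollisionResolution::TempFsuid {
        uuid: "1234-abcd".to_string(),
    };
    println!("{}", plan_mount(&resolution));
}
```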

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

bfjelds force-pushed the user/bfjelds/mjolnir/acl-cosi-temp-fsuid branch from eed9f51 to e4eaf3f June 5, 2026 19:36
bfjelds (Member, Author) commented Jun 5, 2026

/azp run [GITHUB]-trident-pr

@azure-pipelines

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

bfjelds marked this pull request as ready for review June 8, 2026 18:16
bfjelds requested a review from a team as a code owner June 8, 2026 18:16