
feat: add Kubernetes leader election for multi-replica operator #133

Merged
GatewayJ merged 3 commits into rustfs:main from GatewayJ:feature/leader-election on Jun 6, 2026
Conversation

GatewayJ (Member) commented Jun 6, 2026

Type of Change

  • New Feature
  • Bug Fix

Related Issues

  • PR review findings from latest pass on leader election behavior and Helm defaults

Summary of Changes

This PR now includes follow-up fixes that make leader-election behavior safe across replica topologies and deployment methods (Helm and dev manifests), and that keep the CI pre-commit checks green.

1) Leader election race/stop improvements (already in codebase)

  • LeaseLock::update() now holds the cache lock across the whole read/build/update sequence and clears the cache on conflict.
  • The controller stop path now uses cancellation-aware graceful shutdown with a timeout and an abort fallback.
  • on_stopped_leading() is only emitted on final exit, not on every transient renew-loss cycle.
  • Admission validation no longer silently accepts value_from env vars for parity-related env keys; it now fails fast with an explicit error.
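
The first bullet can be illustrated with a small, dependency-free sketch. It is not the crate's actual code: `FakeApi`, the `Lease` struct, and `UpdateError` are hypothetical stand-ins for the Kubernetes API and its optimistic-concurrency check. The point is only that the cache guard is held from the read of the cached version until the result is written back, and that a conflict drops the cache so the next cycle has to re-read.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for a Lease object as stored by the API server.
#[derive(Clone, Debug, PartialEq)]
struct Lease {
    holder: String,
    resource_version: u64,
}

#[derive(Debug, PartialEq)]
enum UpdateError {
    /// The cached resource_version was stale (a 409 in a real cluster).
    Conflict,
    /// Nothing is cached yet, so the caller must read the lease first.
    NotObserved,
}

/// Hypothetical in-memory "API server" with optimistic concurrency.
struct FakeApi {
    stored: Mutex<Lease>,
}

impl FakeApi {
    fn replace(&self, desired: &Lease) -> Result<Lease, UpdateError> {
        let mut stored = self.stored.lock().unwrap();
        if stored.resource_version != desired.resource_version {
            return Err(UpdateError::Conflict);
        }
        let next_version = stored.resource_version + 1;
        *stored = Lease {
            holder: desired.holder.clone(),
            resource_version: next_version,
        };
        Ok(stored.clone())
    }
}

struct LeaseLock<'a> {
    api: &'a FakeApi,
    cache: Mutex<Option<Lease>>,
}

impl<'a> LeaseLock<'a> {
    fn update(&self, holder: &str) -> Result<(), UpdateError> {
        // Hold the cache guard for the whole read/build/update sequence so a
        // concurrent caller cannot interleave with a stale version.
        let mut cache = self.cache.lock().unwrap();
        let cached_version = match &*cache {
            Some(lease) => lease.resource_version,
            None => return Err(UpdateError::NotObserved),
        };
        let desired = Lease {
            holder: holder.to_string(),
            resource_version: cached_version,
        };
        match self.api.replace(&desired) {
            Ok(fresh) => {
                *cache = Some(fresh);
                Ok(())
            }
            Err(UpdateError::Conflict) => {
                // Drop the stale entry so the next cycle re-reads the lease.
                *cache = None;
                Err(UpdateError::Conflict)
            }
            Err(other) => Err(other),
        }
    }
}
```

In this sketch a conflict leaves the cache empty, so a following `update` returns `NotObserved` until the caller reads the lease again.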

2) Helm/dev deployment leader election behavior

  • Fixed default behavior regression: with CLI default leader_elect=false, Helm and dev manifests now explicitly control leader election.
  • Helm chart now injects --leader-elect automatically:
    • enabled by default when operator.replicas > 1
    • disabled when operator.replicas <= 1
    • can be overridden via operator.leaderElect
  • Added operator.leaderElect to deploy/rustfs-operator/values.yaml and README config table.
  • Added examples in deploy/rustfs-operator/README.md for auto-enable and explicit override.
  • deploy/k8s-dev/operator-deployment.yaml explicitly sets --leader-elect=false.
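
The Helm defaulting rule above can be stated as a tiny decision function. This is a model of the rule as described in this PR, not the chart template itself; the function names are hypothetical, and the `--leader-elect=<bool>` argument form is assumed from the dev manifest's `--leader-elect=false`.

```rust
/// Decide whether the operator should run with leader election.
/// `leader_elect_override` models an explicitly set `operator.leaderElect`;
/// `None` means the value was left unset.
fn leader_elect_enabled(replicas: u32, leader_elect_override: Option<bool>) -> bool {
    match leader_elect_override {
        // An explicit setting always wins.
        Some(explicit) => explicit,
        // Otherwise enable only when more than one replica is requested.
        None => replicas > 1,
    }
}

/// Render the container argument (assumed form, see the note above).
fn leader_elect_arg(replicas: u32, leader_elect_override: Option<bool>) -> String {
    format!(
        "--leader-elect={}",
        leader_elect_enabled(replicas, leader_elect_override)
    )
}
```

For example, three replicas with no override yield `--leader-elect=true`, while a single replica yields `--leader-elect=false` unless `operator.leaderElect` is set to true.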

3) Fix make pre-commit e2e formatting blocker

  • e2e/Cargo.toml is explicitly marked as an isolated workspace by adding [workspace].
  • Regenerated e2e/Cargo.lock accordingly.

Verification

  • make pre-commit

Additional Notes

  • The open PR now includes both the functional changes and the deployment docs/examples updates, to avoid split-brain and to clarify behavior.

Implement leader election using Kubernetes Lease resources, aligned with
client-go leaderelection semantics. The implementation is in a standalone
crate (kube-leader-election) with Lock trait abstraction and structured
concurrency.

Key changes:
- crates/leader-election/: new crate with elector, LeaseLock, callbacks
- src/lib.rs: integrate LeaderElector with controller lifecycle
- src/main.rs: add CLI flags (--leader-elect, --leader-elect-lease-name, etc.)
- deploy/: add Lease RBAC and POD_NAME downward API env
- crates/leader-election/README.md: crate-level documentation
- 18 tests (10 unit + 8 integration), all passing
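
As a rough illustration of the lease-based acquisition rule this commit message refers to, here is a minimal sketch under stated assumptions. The trait shape, `InMemoryLock`, and the use of plain `Duration` offsets as timestamps are all hypothetical; the crate's real `Lock` trait may look different. One deliberate simplification: client-go judges expiry against the time it locally observed the record change, whereas this sketch compares against the record's own `renew_time`.

```rust
use std::time::Duration;

/// Minimal record of who holds the lease and until when.
#[derive(Clone, Debug, PartialEq)]
struct LeaderElectionRecord {
    holder_identity: String,
    lease_duration: Duration,
    /// Last renewal, as an offset from an arbitrary epoch.
    renew_time: Duration,
}

/// Hypothetical lock abstraction over the backing Lease resource.
trait Lock {
    fn identity(&self) -> &str;
    fn get(&self) -> Option<LeaderElectionRecord>;
    fn create_or_update(&mut self, record: LeaderElectionRecord);
}

/// In-memory lock used only to exercise the rule below.
struct InMemoryLock {
    identity: String,
    record: Option<LeaderElectionRecord>,
}

impl Lock for InMemoryLock {
    fn identity(&self) -> &str {
        &self.identity
    }
    fn get(&self) -> Option<LeaderElectionRecord> {
        self.record.clone()
    }
    fn create_or_update(&mut self, record: LeaderElectionRecord) {
        self.record = Some(record);
    }
}

/// Returns true if this candidate holds the lease after the call.
fn try_acquire_or_renew<L: Lock>(lock: &mut L, now: Duration, lease_duration: Duration) -> bool {
    if let Some(current) = lock.get() {
        let held_by_other =
            !current.holder_identity.is_empty() && current.holder_identity != lock.identity();
        let still_valid = current.renew_time + current.lease_duration > now;
        // Back off while another holder's lease has not expired.
        if held_by_other && still_valid {
            return false;
        }
    }
    // Lease is free, expired, or already ours: write a fresh record.
    let record = LeaderElectionRecord {
        holder_identity: lock.identity().to_string(),
        lease_duration,
        renew_time: now,
    };
    lock.create_or_update(record);
    true
}
```

A standby calling this before the holder's lease expires gets `false` and leaves the record untouched; once the lease has expired, the same call takes over and later calls act as renewals.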

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f9d969e0ad

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/lib.rs Outdated
Comment thread deploy/rustfs-operator/templates/deployment.yaml
GatewayJ added 2 commits June 6, 2026 09:57
…ATOR_NAMESPACE

- Abort controller_handle when lease is lost to prevent multiple active
  controllers after transient API failure (P1 fix from PR review)
- Move OPERATOR_NAMESPACE env var outside sts.enabled block so leader
  election uses correct namespace even when STS is disabled (P2 fix)

GatewayJ commented Jun 6, 2026

Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3c61aeef7d


Comment thread src/lib.rs
Comment on lines +242 to +244
_ = &mut controller_handle => {
    info!("controller finished");
}


P2: Stop renewing when the controller task exits

When leader election is enabled, if the controller task ever returns or panics (for example because the kube controller stream terminates), this branch lets on_started_leading return while the elector keeps renewing the Lease until renewal fails or the pod is killed. That leaves the elected pod holding leadership but no longer running reconciliation, so standby replicas cannot take over. Treat this branch as fatal/cancel leadership instead of just logging and returning.
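
One way to picture the suggested behavior is as a mapping from "what ended the leading phase" to "what the elector should do next". The sketch below is an assumption-laden model, not the operator's code: the reviewed snippet appears to use a select-style async branch, while this version uses a std channel to stay dependency-free, and the `LeadingEvent` and `Action` names are hypothetical.

```rust
use std::sync::mpsc::Receiver;

/// Events that can end the "leading" phase (hypothetical names).
#[derive(Debug, PartialEq)]
enum LeadingEvent {
    ControllerExited,
    LeaseLost,
    ShutdownRequested,
}

/// What the elector should do in response.
#[derive(Debug, PartialEq)]
enum Action {
    /// The lease is already gone: stop reconciling immediately.
    StopController,
    /// The controller is gone: stop renewing and give up the lease so a
    /// standby replica can take over.
    ReleaseLeadership,
    /// Planned exit: stop the controller, then give up the lease.
    ShutdownGracefully,
}

fn on_leading_event(event: LeadingEvent) -> Action {
    match event {
        LeadingEvent::LeaseLost => Action::StopController,
        // Treated as fatal for leadership instead of only being logged.
        LeadingEvent::ControllerExited => Action::ReleaseLeadership,
        LeadingEvent::ShutdownRequested => Action::ShutdownGracefully,
    }
}

/// Block until the first event arrives and translate it into an action.
fn wait_while_leading(events: &Receiver<LeadingEvent>) -> Action {
    // A closed channel means every sender, including the controller, is
    // gone; handle that the same way as an explicit controller exit.
    let event = events.recv().unwrap_or(LeadingEvent::ControllerExited);
    on_leading_event(event)
}
```

The design choice modeled here is that a controller exit or panic maps to `ReleaseLeadership`, so the elected pod never keeps renewing a lease it is no longer doing work under.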


@GatewayJ GatewayJ added this pull request to the merge queue Jun 6, 2026
Merged via the queue into rustfs:main with commit 60d5ac6 Jun 6, 2026
2 checks passed