feat(helm): support extra volumes/mounts; add external API key mount-path validation #772

Open

JoeDerby wants to merge 16 commits into Cloudzero:develop from JoeDerby:extra-volumes

Conversation

@JoeDerby

Why?

This PR adds first-class support for attaching custom extraVolumes and extraVolumeMounts to CloudZero workloads, while preserving safe API key configuration validation.

It enables external file-based credential flows (for example CSI-driven mounts) without introducing a new feature toggle, and keeps default chart behaviour intact for existing users.

What

Added new chart values:

  • extraVolumes
  • extraVolumeMounts
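
As a sketch of how the two new values might be used together (the secret-provider class name and paths below are illustrative, not taken from the chart):

```yaml
# Hypothetical values.yaml fragment: attach a Secrets Store CSI volume
# to CloudZero workloads via the new chart values.
extraVolumes:
  - name: api-key-csi
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: cloudzero-api-key   # illustrative class name

extraVolumeMounts:
  - name: api-key-csi
    # In external mode, this must line up with
    # serverConfig.containerSecretFilePath (path shown is illustrative).
    mountPath: /etc/cloudzero/secrets/api-key
    readOnly: true
```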

Wired those values into all relevant workloads:

  • agent-deploy
  • agent-daemonset
  • aggregator-deploy
  • config-loader-job
  • webhook-deploy
  • backfill-job

Added reusable Helm helpers for extra volumes/mounts rendering.

Updated docs in helm/README.md and value schema definitions.

Strengthened schema validation:

  • Default mode: exactly one of apiKey or existingSecretName must be set.
  • External mode: if both are null, extraVolumes and extraVolumeMounts must be non-empty.

Added render-time validation guard:

  • In external mode, at least one extraVolumeMount.mountPath must match serverConfig.containerSecretFilePath.
  • This prevents misconfiguration where credentials are mounted but not at the path the app reads.

Added/updated Helm unit tests for:

  • Generic extra volume/mount behaviour (non-CSI-specific fixtures)
  • API key source validation logic
  • External mount-path validation guard
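
The render-time guard described above could look roughly like the helper below. The helper name and the value paths are taken from the PR description; the template body itself is an illustrative sketch, not the PR's actual code:

```
{{/* _helpers.tpl (sketch): in external mode, fail fast when no
     extraVolumeMounts entry matches the path the app reads from. */}}
{{- define "cloudzero.validateExternalApiKeyMountPath" -}}
{{- if and (not .Values.apiKey) (not .Values.existingSecretName) -}}
  {{- $want := .Values.serverConfig.containerSecretFilePath -}}
  {{- $found := false -}}
  {{- range .Values.extraVolumeMounts -}}
    {{- if eq .mountPath $want -}}{{- $found = true -}}{{- end -}}
  {{- end -}}
  {{- if not $found -}}
    {{- fail (printf "external API key mode: no extraVolumeMounts entry has mountPath %q" $want) -}}
  {{- end -}}
{{- end -}}
{{- end -}}
```

Calling `fail` at template time surfaces the misconfiguration during `helm install`/`helm template` rather than as a runtime credential error.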

Users need to mount additional volumes (including but not limited to CSI-backed sources) across CloudZero components without changing default secret behaviour.
At the same time, we need to avoid silent credential misconfigurations. This PR balances flexibility with clear guardrails.

Backward Compatibility

  • Existing apiKey / existingSecretName flows continue to work as before.
  • No required changes for current users unless they choose external mode.
  • External mode is now explicit and validated to fail fast when mount path configuration is incorrect.

How Tested

The full Helm unit test suite passes.
Deployed the chart from this fork and verified the agent works.

Scenarios explicitly verified:

  • extraVolumes/extraVolumeMounts render across deployment, daemonset, and job templates.
  • Default credential validation still enforces valid apiKey/existingSecretName combinations.
  • External mode (apiKey: null, existingSecretName: null) requires non-empty extra volumes/mounts.
  • External mode fails fast if no extraVolumeMount.mountPath matches serverConfig.containerSecretFilePath.
  • External mode succeeds when mount path matches expected credential path.
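
One of the fail-fast scenarios above could be expressed as a helm-unittest case roughly like the following. The value names come from the PR description; the template path and assertion wiring are an illustrative sketch:

```yaml
# Hypothetical helm-unittest suite: external mode with a wrong mountPath
# should fail template rendering.
suite: external api key mount-path validation
templates:
  - templates/agent-deploy.yaml      # illustrative template path
tests:
  - it: fails when no mount matches containerSecretFilePath
    set:
      apiKey: null
      existingSecretName: null
      extraVolumes:
        - name: creds
          emptyDir: {}
      extraVolumeMounts:
        - name: creds
          mountPath: /wrong/path
    asserts:
      - failedTemplate:
          errorPattern: mountPath
```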

@JoeDerby JoeDerby requested a review from a team as a code owner April 22, 2026 08:19
@JoeDerby JoeDerby changed the title Extra volumes feat(helm): support extra volumes/mounts; add external API key mount-path validation Apr 22, 2026
@evan-cz
Contributor

evan-cz commented May 4, 2026

Hey @JoeDerby, thanks for putting this together — we really appreciate the contribution, and especially that you put real thought into the validation and test coverage! Supporting external credential flows is something we want to get right, so I'd like to work through the design with you before we merge.

Design considerations

The core issue is that the PR introduces a very generic mechanism (extraVolumes / extraVolumeMounts) but then makes specific assumptions about what those volumes contain — namely the API key. The schema validation
and the validateExternalApiKeyMountPath helper tie these generic fields directly to credential provisioning, which creates a coupling that could be confusing for users and tricky to maintain going forward.

The other thing I want to be careful about is that extraVolumeMounts gets wired into every workload in the chart — including workloads that don't send data to the CloudZero API. The current apiKeyVolumeMount helper already gates itself on whether a secret source is configured, so volumes aren't mounted where
they aren't needed. With extraVolumeMounts, anything a user puts there would appear in every container across every deployment, daemonset, and job in the chart — which exposes the API key to a lot more surface area than I'd like.

Can existingSecretName work for your use case?

Before we go further, I'd like to understand a bit more about your specific setup. The most common pattern we see for external secret management is using an operator (like the External Secrets Operator or the Secrets Store CSI
Driver's secretObjects sync) to sync from an external vault into a regular Kubernetes Secret, and then referencing that Secret via existingSecretName.

We have quite a few customers running this way today, and we actually have a step-by-step guide for this exact flow with AWS Secrets Manager:

https://github.com/Cloudzero/cloudzero-agent/blob/develop/helm/docs/secret-store-operator-aws-example.md

The short version is:

  1. External Secrets Operator (or equivalent) syncs from your vault into a K8s
    Secret
  2. You set existingSecretName to point at that Secret
  3. The chart mounts it normally — no extra volume configuration needed
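
Sketching that flow (resource names and the Secrets Manager key below are illustrative; the linked guide has the authoritative steps):

```yaml
# Hypothetical ExternalSecret: sync from AWS Secrets Manager into a
# regular Kubernetes Secret that the chart can reference.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: cloudzero-api-key
spec:
  secretStoreRef:
    name: aws-secrets-manager          # illustrative SecretStore name
    kind: ClusterSecretStore
  target:
    name: cloudzero-api-key            # the Secret the chart will mount
  data:
    - secretKey: value
      remoteRef:
        key: prod/cloudzero/api-key    # illustrative Secrets Manager key
```

Then in the chart values, `existingSecretName: cloudzero-api-key` points at the synced Secret.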

If you're using the Secrets Store CSI Driver specifically, it also has a secretObjects sync feature that can create a K8s Secret from a CSI-mounted volume, which would also work with existingSecretName.
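
For the CSI-driver route, that sync feature looks roughly like this (provider, object names, and keys are illustrative):

```yaml
# Hypothetical SecretProviderClass using the AWS provider's secretObjects
# sync; the driver creates the K8s Secret once a pod mounts this class.
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: cloudzero-api-key
spec:
  provider: aws
  parameters:
    objects: |
      - objectName: "prod/cloudzero/api-key"   # illustrative object name
        objectType: "secretsmanager"
  secretObjects:
    - secretName: cloudzero-api-key            # Secret created on first mount
      type: Opaque
      data:
        - objectName: "prod/cloudzero/api-key"
          key: value
```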

Could you share a bit more about your setup? Specifically:

  • Which secret backend are you using (AWS Secrets Manager, Azure Key Vault, HashiCorp Vault, something else)?
  • Is there a policy constraint that prevents you from syncing the secret into a Kubernetes Secret object?
  • Any other context about why the existingSecretName path wouldn't work for you?

Path forward

I want to make sure we land something that works for you. If there's a reason that syncing to a K8s Secret isn't viable for your environment (e.g., a security policy that prohibits K8s Secret objects), that's completely
understandable — and in that case I'd be happy to work with you on a more targeted solution, something like a configurable volume source for the API key mount specifically rather than a blanket extra-volumes mechanism across all workloads.

Either way, once I understand your setup a bit better I'm confident we can find the right approach. If you'd prefer not to go into details about your environment in a public forum, that's totally fine — just let your FAM (or Anderson, I know he has context on this) know and he can connect us on Slack.

Thanks again for taking the time on this!

@JoeDerby
Author

JoeDerby commented May 5, 2026

Thanks for the feedback @evan-cz. We're using AWS Secrets Manager with the Secrets Store CSI Driver. existingSecretName can work for us, and we don't have a policy constraint that prohibits Kubernetes Secret objects.

The reason we wanted to take the direct CSI-mounted file path is that secretObjects sync still requires a pod mounting the CSI volume before the Kubernetes Secret exists. As the Secrets Store CSI docs note, "The secrets will only sync once you start a pod mounting the secrets" (https://secrets-store-csi-driver.sigs.k8s.io/topics/sync-as-kubernetes-secret.html). The synced secret's lifecycle is also coupled to consuming pods. In our case, existingSecretName would still require another pod/process to materialize and maintain that Secret when using CSI secretObjects sync, which is what we were trying to avoid. Given that a pod volume mount is still required to trigger secretObjects sync, the direct CSI-mount approach felt cleaner.

I also agree that the current generic extraVolumes and extraVolumeMounts wiring is too broad. We can split this so API-key-related mounts are scoped only to components that require them, rather than using a global mount mechanism.
