
Remove EOL'd citrine/perf relays + VMs from pod scheduler and profiles#2169

Merged
LoopedBard3 merged 1 commit into aspnet:main from LoopedBard3:loopedbard3/update-eol-relays on May 13, 2026

Conversation

Contributor

@LoopedBard3 LoopedBard3 commented May 5, 2026

The following Service Bus relays (and their underlying VMs) were end-of-lifed and removed from the aspnetrelay connection list in Azure:

  • perflin (asp-perf-lin, pod intel-perflin)
  • perfwin (asp-perf-win)
  • citrinewin (asp-citrine-win, pod intel-win)
  • citrinelin (asp-citrine-lin, pod intel-lin)
  • citrineamd2 (asp-citrine-amd2, pod amd-lin2)

Changes

build/benchmarks_ci_pods.json - drop the four pods whose SUT machines are EOL'd (intel-lin, intel-win, intel-perflin, amd-lin2). Every scenario still has at least one valid pod (gold-lin / gold-win), so no scenarios are removed; their pods arrays are pruned accordingly.
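The config's exact schema isn't reproduced in this PR description; as a purely hypothetical sketch of the shape of the pruned file (machine names and keys other than the pod/scenario names are invented):

```json
{
  "pods": {
    "gold-lin": { "machines": ["asp-gold-lin-app", "asp-gold-lin-load"] },
    "gold-win": { "machines": ["asp-gold-win-app", "asp-gold-win-load"] }
  },
  "scenarios": [
    { "name": "plaintext", "pods": ["gold-lin", "gold-win"] }
  ]
}
```

The intel-lin, intel-win, intel-perflin, and amd-lin2 entries would simply be absent from both the pods map and every scenario's pods list.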

build/ci.profile.yml - remove EOL'd profile defs (intel-lin-*, intel-win-*, intel-perflin-app, amd-lin2-*).
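For illustration, a deleted entry would have had roughly the following shape (field names and port are hypothetical; the real profile schema isn't shown in this PR description):

```yaml
# Hypothetical shape of one removed profile def in build/ci.profile.yml.
profiles:
  intel-lin-app:
    endpoints:
      - http://asp-citrine-lin:5001   # host is EOL'd; relay deleted
```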

build/benchmarks-ci-01.yml, build/benchmarks-ci-02.yml - regenerated via:

python scripts/pod-scheduler/main.py --config build/benchmarks_ci_pods.json --base-name benchmarks-ci --yaml-output build

scenarios/aspnet.profiles.yml, scenarios/aspnet.profiles.standard.yml

  • Remove profiles whose APP endpoint is EOL'd (removed entirely rather than repointed, per request): aspnet-citrine-{lin,win}[-relay], aspnet-citrine-amd2, aspnet-citrine-amd-relay, aspnet-perf-{lin,win}[-relay].

  • Repoint load/db endpoints in profiles whose SUT still works but whose load/db lived on citrineamd2 / asp-citrine-amd2:

    Profile                                    Was                             Now
    aspnet-citrine-arm-lin[-relay]             load = citrineamd2              load = citrineload
    aspnet-citrine-arm-win[-relay] (yml only)  db = citrineamd2                db = citrinedb
    aspnet-siryn-arm-lin[-relay]               load = citrineamd2              load = citrineload
    aspnet-citrine-ampere (yml only)           db & load = asp-citrine-amd2    db = asp-citrine-db, load = asp-citrine-load
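As a sketch of one such repoint (the surrounding keys and endpoint URLs are illustrative, not the actual profile schema):

```yaml
# Hypothetical fragment of scenarios/aspnet.profiles.yml after the change.
aspnet-citrine-arm-lin:
  jobs:
    load:
      endpoints:
        - http://citrineload:5001   # was http://citrineamd2:5001 (EOL'd)
```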

scenarios/proxy.benchmarks.yml, scenarios/proxy.grpc.benchmarks.yml, src/Benchmarks/json.benchmarks.yml, src/BenchmarksApps/BuildPerformance/buildperformance.yml - remove inline profile defs whose APP endpoint is EOL'd (aspnet-citrine-{lin,win,amd}, aspnet-perf-{lin,win}), plus the misplaced aspnet-citrine-lin entry under scenarios: in json.benchmarks.yml.

scenarios/signalr.benchmarks.yml - example comment updated from --profile asp-perf-lin to --profile aspnet-gold-lin so docs reference a living machine.

Validation

  • All 11 touched JSON/YAML files parse cleanly.
  • python -m unittest discover in scripts/pod-scheduler/tests -> 43/43 pass.
  • The regenerated benchmarks-ci-0{1,2}.yml contain only gold-lin / gold-win runs.
  • Net diff: +176 / -1797.
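The parse check in the first bullet can be reproduced with a short stdlib-only sketch; the eleven file paths are not repeated here, and the YAML files were checked the same way with PyYAML's yaml.safe_load, which is omitted to keep this stdlib-only:

```python
import json


def parses_cleanly(path: str) -> bool:
    """Return True if the file at `path` loads as JSON without error.

    The .yml files in this PR were validated analogously with
    yaml.safe_load from PyYAML.
    """
    try:
        with open(path, encoding="utf-8") as f:
            json.load(f)
        return True
    except (OSError, json.JSONDecodeError):
        return False
```

Running this over the touched .json files (and the YAML equivalent over the .yml files) and asserting every result is True matches the validation claimed above.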

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment


Pull request overview

This PR removes end-of-life (EOL) citrine/perf Service Bus relays and their backing VMs from the Benchmarks CI pod scheduler configuration and from scenario/profile definitions, ensuring CI and scenario configs no longer reference dead endpoints.

Changes:

  • Pruned EOL pods (intel-lin/intel-win/intel-perflin/amd-lin2) from build/benchmarks_ci_pods.json and regenerated build/benchmarks-ci-01.yml / build/benchmarks-ci-02.yml accordingly.
  • Removed EOL profile definitions and inline profiles referencing retired hosts from scenario/config YAMLs, and repointed remaining profiles’ load/db endpoints away from EOL machines.
  • Updated a SignalR example comment to reference a still-valid profile.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.

Summary per file:
build/benchmarks_ci_pods.json Removes EOL pods and prunes scenario pod lists to gold-only pods.
build/benchmarks-ci-01.yml Regenerated CI schedule/jobs to eliminate EOL pod runs and dependencies.
build/benchmarks-ci-02.yml Regenerated CI schedule/jobs to eliminate EOL pod runs and dependencies.
build/ci.profile.yml Removes CI profile entries that point at EOL hostnames.
scenarios/aspnet.profiles.yml Drops EOL profiles and repoints remaining profiles’ load/db endpoints off retired machines.
scenarios/aspnet.profiles.standard.yml Drops EOL standard profiles and repoints remaining profiles’ secondary endpoints off retired machines.
scenarios/proxy.benchmarks.yml Removes inline proxy profiles that referenced EOL endpoints.
scenarios/proxy.grpc.benchmarks.yml Removes inline gRPC proxy profiles that referenced EOL endpoints.
scenarios/signalr.benchmarks.yml Updates example comment to a non-EOL profile name.
src/Benchmarks/json.benchmarks.yml Removes misplaced/inline EOL profile definitions.
src/BenchmarksApps/BuildPerformance/buildperformance.yml Removes inline EOL profile definitions.


@LoopedBard3 LoopedBard3 marked this pull request as ready for review May 5, 2026 18:15
@LoopedBard3 LoopedBard3 requested a review from DeagleGross May 13, 2026 16:42
Comment thread on scenarios/aspnet.profiles.standard.yml
@DrewScoggins
Contributor

How have we verified that the updated profiles work and that jobs get dispatched correctly? I want to ensure we don't have any machines used twice in a pod, that kind of thing. Also, we had already stopped running on this hardware, or is this the PR that will actually stop the runs? I ask because I wonder what this is going to do to utilization of machines if we are cutting out this much stuff.

Contributor

@DrewScoggins DrewScoggins left a comment


Approved with feedback.

@LoopedBard3
Contributor Author

> How have we verified that the updated profiles work and that jobs get dispatched correctly? I want to ensure we don't have any machines used twice in a pod, that kind of thing. Also, we had already stopped running on this hardware, or is this the PR that will actually stop the runs? I ask because I wonder what this is going to do to utilization of machines if we are cutting out this much stuff.

  • I verified that the pipeline starts; a full run takes too long, so I will watch the next runs for success.
  • To ensure we don't use a machine twice in a pod, the scheduler statically checks for overlapping machine usage.
  • This hardware has been offline since the beginning of April, and the relays were deleted in mid-April. The runs have been failing since then, so this cleans up the pipelines so that only real errors cause issues.
  • Utilization of the existing machines should be very similar or identical: there are no new tests for any of the pods, and the pod structure is what we were using before. Some profiles were updated to use comparable machines in place of the removed ones, but the changes in build/ci.profile.yml were purely reductive.
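The scheduler's static overlap check isn't shown in this PR; a minimal sketch of that kind of validation, using a hypothetical pod-to-machines mapping rather than the scheduler's actual schema, might look like:

```python
from typing import Dict, List


def overlapping_machines(pods: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each machine claimed by more than one pod to the pods claiming it.

    `pods` maps pod name -> list of machine names; an empty result means
    no machine is used twice across pods.
    """
    seen: Dict[str, List[str]] = {}
    for pod, machines in pods.items():
        for machine in machines:
            seen.setdefault(machine, []).append(pod)
    return {m: owners for m, owners in seen.items() if len(owners) > 1}
```

A scheduler could run this over its config at generation time and fail fast if the result is non-empty.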

@LoopedBard3 LoopedBard3 merged commit 1b94d26 into aspnet:main May 13, 2026
6 checks passed
@LoopedBard3 LoopedBard3 deleted the loopedbard3/update-eol-relays branch May 13, 2026 18:23
LoopedBard3 added a commit to LoopedBard3/Benchmarks that referenced this pull request May 13, 2026
…t#2169)

Upstream PR aspnet#2169 removed the four pods whose SUT machines were
end-of-lifed (intel-lin, intel-win, intel-perflin, amd-lin2), leaving
only gold-lin and gold-win on the on-prem CI fleet. This merge applies
the same prune to our YAML-format pod config and regenerates the
pipeline YAMLs.

Conflict resolution:

* build/benchmarks_ci_pods.json -- upstream modified the (now-deleted)
  JSON config; we kept our deletion. The equivalent edits are applied
  directly to build/benchmarks_ci_pods.yml: pods is trimmed to
  [gold-lin, gold-win] and every scenario's pods: list is pruned to
  match upstream's post-aspnet#2169 contents.
* build/benchmarks-ci-01.yml / -02.yml -- regenerated from the updated
  YAML config. Output matches upstream's regenerated pipelines exactly,
  modulo the embedded regen-command header which (correctly) now
  points at the .yml config instead of the deleted .json.

The Azure and Cobalt pod configs are unaffected by aspnet#2169 -- they don't
reference any of the EOL'd machines -- and their pipeline YAMLs are
unchanged.

All 70 pod-scheduler unit tests still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>