Fix stable CI test harness failures #1986
🦋 Changeset detected
Latest commit: e3a7f8e
The changes in this PR will be included in the next version bump. This PR includes changesets to release 23 packages.
🧪 E2E Test Results
❌ Some tests failed

Failed tests:
- ▲ Vercel Production (1 failed): sveltekit (1 failed)
- 💻 Local Development (1 failed): hono-stable (1 failed)
- 📦 Local Production (1 failed): nitro-stable (1 failed)
- 🌍 Community Worlds (85 failed): mongodb (11 failed), redis (9 failed), turso (65 failed)

Details by category:
- ❌ ▲ Vercel Production
- ❌ 💻 Local Development
- ❌ 📦 Local Production
- ✅ 🐘 Local Postgres
- ✅ 🪟 Windows
- ❌ 🌍 Community Worlds
- ✅ 📋 Other

❌ Some E2E test jobs failed. Check the workflow run for details.
  expect(port).toBe(fastAddr.port);
  // Should complete reasonably quickly (Windows CI can be slow)
- expect(elapsed).toBeLessThan(2000);
+ expect(elapsed).toBeLessThan(5000);
5s is too long - what's actually causing the get port to take so long and can we make this faster?
Yep, that 5s bump was masking the wrong thing. The slow part was Windows netstat / process-port discovery, not the HTTP probe timeout itself.
I changed the test to pass explicit candidatePorts, so it bypasses OS discovery and measures the custom probe timeout directly. The assertion is back to less than 2000ms.
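The bypass described in this reply can be pictured with a minimal sketch. It is not the code under test: `resolveCandidatePorts`, `DiscoverPorts`, and the option shape are hypothetical names chosen for illustration, and only `candidatePorts` comes from the comment above. The idea is that explicit candidates short-circuit the slow OS discovery step, so the test's elapsed time reflects the probe alone.

```typescript
// Hypothetical option shape; only `candidatePorts` is named in the discussion above.
interface PortLookupOptions {
  candidatePorts?: number[];
}

// Stand-in for OS-level discovery (for example netstat on Windows), assumed to be slow.
type DiscoverPorts = () => number[];

// Returns the ports to probe. Explicit candidates skip discovery entirely,
// so a test that passes them only exercises the probe itself.
function resolveCandidatePorts(
  options: PortLookupOptions,
  discoverPorts: DiscoverPorts,
): number[] {
  if (options.candidatePorts && options.candidatePorts.length > 0) {
    return [...options.candidatePorts];
  }
  return discoverPorts();
}

// Usage: count discovery calls to confirm the bypass.
let discoveryCalls = 0;
const slowDiscovery: DiscoverPorts = () => {
  discoveryCalls += 1;
  return [3000];
};

const explicit = resolveCandidatePorts({ candidatePorts: [4123] }, slowDiscovery);
const discovered = resolveCandidatePorts({}, slowDiscovery);
console.log(explicit, discovered, discoveryCalls);
```

In this sketch the discovery function runs only for the second call, which is the property the rewritten test relies on.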
Summary
Fixes CI stability issues observed on the current `stable` branch:
- […] `HEAD`, so health checks no longer need to probe with `POST`.
- […] `step_completed`. The Nuxt failure had `readableStreamWorkflow` complete before stream chunks were persisted, so the client observed an empty stream.
- […] `addTenWorkflow` runs complete in ~5s, then `workflow inspect --withData` was killed by the harness' old 20s subprocess timeout while fetching/decrypting remote run data.
- […] `Date.now()` values returned from replayed workflow code. The Express failure showed `run_started` to `wait_completed` was >10s, but the replayed return value measured only the latter portion of the event stream.
- […] `hook_disposed`, so the runtime correctly produced `hook_conflict`.
- […] `addTenWorkflow` test timeout aligned with observed Vercel prod queue/cold-start latency where the workflow completed after the old 60s test timeout.

Includes a changeset for the touched packages.
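The Express timing item is easier to follow with a small sketch of measuring elapsed time from the event timeline rather than from a value returned by replayed workflow code. The event shape and the `elapsedBetween` helper are hypothetical; the event names `run_started` and `wait_completed` and the +0.5s / +11.3s offsets are the ones reported for the Express run in this PR. The sketch assumes events arrive in chronological order.

```typescript
// Hypothetical event shape: a type name plus a millisecond offset from run creation.
interface TimelineEvent {
  eventType: string;
  createdAtMs: number;
}

// Returns the milliseconds between the first `from` event and the first
// `to` event at or after it, or undefined if either one is missing.
// Assumes `events` is already in chronological order.
function elapsedBetween(
  events: TimelineEvent[],
  from: string,
  to: string,
): number | undefined {
  const start = events.find((e) => e.eventType === from);
  if (!start) return undefined;
  const startMs = start.createdAtMs;
  const end = events.find((e) => e.eventType === to && e.createdAtMs >= startMs);
  return end ? end.createdAtMs - startMs : undefined;
}

// Offsets reported for the Express run: run_started at +0.5s, wait_completed at +11.3s.
const timeline: TimelineEvent[] = [
  { eventType: "run_started", createdAtMs: 500 },
  { eventType: "wait_completed", createdAtMs: 11300 },
];

const sleepMs = elapsedBetween(timeline, "run_started", "wait_completed");
console.log(sleepMs); // 10800
```

The timeline yields 10800ms, above the 10s threshold, whereas the replayed `Date.now()` delta reported for the same run was 5509ms.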
CI notes
- 25894395244 failed only `E2E Vercel Prod Tests (nitro)` plus the aggregate required check. The failed `webhookWorkflow` run `wrun_01KRMJH5E2V02Y9KC4NJ0EHB0P` was `completed`; the event timeline reached `run_completed` at +5.3s, so the remaining failure was in the webhook HTTP response path, not workflow execution.
- 25895081248 fixed Nitro, and all local e2e jobs passed. The only app-specific failure was `E2E Vercel Prod Tests (vite)`, where both failed `addTenWorkflow` runs were already `completed` (`wrun_01KRMKS6SH17MSH902CATGVGYH` and `wrun_01KRMKTGK9AYX2S228DAWD4GHV` reached `run_completed` in ~5s). The failure was the e2e harness killing `workflow inspect --withData` with `SIGTERM` after 20s.
- 25898693068 fixed Vite. Remaining failures were `E2E Vercel Prod Tests (express)` and `E2E Vercel Prod Tests (nitro)`: Express failed `sleepingWorkflow` because replayed `Date.now()` returned a 5509ms delta while the event timeline showed `run_started` at +0.5s and `wait_completed` at +11.3s; Nitro failed `hookDisposeTestWorkflow` because workflow 2 started before workflow 1 had disposed the shared token, producing a correct `hook_conflict`.
- 25902464337 fixed Vite, Express, and Nitro. The remaining failure was `E2E Vercel Prod Tests (nuxt)`: `readableStreamWorkflow` returned an empty stream even though the step/run completed; the step event timeline completed in ~1s while the stream-producing step should take ~10s, showing `step_completed` was racing ahead of return stream serialization.
- e3a7f8e71 is in progress: https://github.com/vercel/workflow/actions/runs/25905802133

Validation
- `fnm exec --using v22.18.0 pnpm --filter @workflow/core build`
- `fnm exec --using v22.18.0 pnpm vitest run packages/core/src/runtime/step-handler.test.ts packages/core/src/writable-stream.test.ts packages/core/src/step/writable-stream.test.ts`
- `fnm exec --using v22.18.0 pnpm turbo run build --filter='@workflow/nitro' --filter='@workflow/builders' --filter='@workflow/core'`
- `fnm exec --using v22.18.0 env NITRO_PRESET=vercel WORKFLOW_PUBLIC_MANIFEST=1 pnpm --dir workbench/nitro-v3 build`
- `fnm exec --using v22.18.0 pnpm exec biome check packages/core/src/runtime/step-handler.ts packages/core/e2e/e2e.test.ts packages/core/e2e/utils.ts` (only pre-existing warnings)
- `fnm exec --using v22.18.0 pnpm exec biome check packages/core/src/runtime/resume-hook.ts packages/core/e2e/e2e.test.ts packages/builders/src/base-builder.ts packages/nitro/src/index.ts` (only pre-existing warnings)
- […] `@workflow/utils` tests, builder package builds, SvelteKit/Astro production builds, and Fastify dev rebuild smoke coverage.
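For the Nitro `hookDisposeTestWorkflow` failure noted in the CI notes, the ordering problem can be illustrated with a toy model of a shared hook token. This is only a sketch and not the runtime's implementation: the `HookTokenRegistry` class, the token and run names, and the `hook_created` result are hypothetical, while `hook_conflict` and `hook_disposed` are the event names used in this PR.

```typescript
// Toy model of a shared hook token: one run holds it at a time.
// `hook_conflict` and `hook_disposed` are event names from this PR;
// `hook_created` and the class itself are hypothetical.
type HookEvent = "hook_created" | "hook_conflict" | "hook_disposed";

class HookTokenRegistry {
  private owners = new Map<string, string>();

  // Claims the token for a run, or reports a conflict if another run holds it.
  register(token: string, runId: string): HookEvent {
    const owner = this.owners.get(token);
    if (owner !== undefined && owner !== runId) return "hook_conflict";
    this.owners.set(token, runId);
    return "hook_created";
  }

  // Releases the token if this run holds it; otherwise does nothing.
  dispose(token: string, runId: string): HookEvent | undefined {
    if (this.owners.get(token) !== runId) return undefined;
    this.owners.delete(token);
    return "hook_disposed";
  }
}

const registry = new HookTokenRegistry();
const first = registry.register("shared-token", "workflow-1");
// Starting workflow 2 before workflow 1 disposes the token conflicts.
const tooEarly = registry.register("shared-token", "workflow-2");
const disposed = registry.dispose("shared-token", "workflow-1");
// Once the token is disposed, workflow 2 can claim it.
const afterDispose = registry.register("shared-token", "workflow-2");
console.log(first, tooEarly, disposed, afterDispose);
```

Under this model the conflict is the expected outcome of starting the second workflow too early, which matches the PR's reading that the runtime behaved correctly and the test ordering needed to change.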