[safe-output-health] Safe Output Health Report - 2026-04-23 #28086
Closed
Replies: 1 comment
-
|
This discussion has been marked as outdated by Safe Output Health Monitor. A newer discussion is available at Discussion #28271. |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
Executive Summary
Overall Status: The safe output infrastructure conditional logic is working correctly. The absence of safe output executions is explained entirely by agent-level failures or agent skips — not by any fault in the safe output jobs themselves.
Safe Output Job Statistics
Why no safe output jobs ran: The
safe_outputsjob condition requires(needs.agent.result != 'skipped') && (needs.detection.result == 'success'). Detection only runs when the agent producesoutput_typesorhas_patch. In all runs this period, either the agent was skipped (20 workflow-logs-only runs) or the agent ran but produced no output (Smoke OpenCode, Smoke Gemini), so detection evaluated tofalseand safe output jobs were correctly gated off.Error Clusters
Cluster 1: Docker Container Health Failure —
awf-api-proxyawf-api-proxycontainer failed its health check after ~10 seconds.awf-squidwas healthy. The container sandbox for the Gemini agent could not start, preventing any agent execution.Cluster 2: Missing API Key — Google Generative AI
GOOGLE_GENERATIVE_AI_API_KEYenvironment variable not available to the OpenCode agent. Process exited with code 0 (silent failure).Root Cause Analysis
Infrastructure Issues
awf-api-proxycontainer health check failure (Smoke Gemini): Theawf-api-proxycontainer, which proxies API traffic for the agent sandbox, failed its Docker health check.awf-squid(proxy) was healthy in 7 seconds;awf-api-proxyerrored at the 10-second mark. This appears to be a transient container startup issue, possibly a flaky health check or a race condition. This is a PR-triggered run on branchcopilot/modify-pre-agent-steps-injection(PR #28082).Smoke Crush hang (run-24837114237): Run was still
in_progressat audit time with limited log data. The agent job wasin_progressbut no agent logs were downloaded. May be a slow-running test or a timeout situation. Also triggered by PR #28082.Configuration Issues
Missing Google API key for OpenCode: The Smoke OpenCode workflow runs OpenCode CLI which uses Google Gemini as its LLM backend but does not have the
GOOGLE_GENERATIVE_AI_API_KEYset. This causes a silent failure (exit code 0) with no agent output.Expected Behavior: Label-Filtered Skips
20 workflow-logs-only runs were triggered by a PR
labeledevent (label:smoke, PR #28082) but each individual workflow requires its own specific activation label (e.g.,water,metal,smoke-opencode, etc.). All correctly evaluatedpre_activation.iftofalseand skipped. This is working as designed.Recommendations
Critical Issues (Immediate Action Required)
Investigate
awf-api-proxycontainer health failureawf-api-proxyimage health check configuration; investigate if the PR branchcopilot/modify-pre-agent-steps-injectionchanges affect container startup behavior; add retry logic or extend health check timeoutConfigure
GOOGLE_GENERATIVE_AI_API_KEYfor Smoke OpenCodeGOOGLE_GENERATIVE_AI_API_KEYsecret to the Smoke OpenCode workflow configuration; verify OpenCode is expected to use Google Gemini (not a local/different backend)Investigate Smoke Crush run hang
in_progressat audit time; no agent log artifacts were availableWork Item Plans
Work Item 1: Diagnose and Fix
awf-api-proxyContainer Startup Failureawf-api-proxycontainer becomes unhealthy when the Smoke Gemini workflow runs on thecopilot/modify-pre-agent-steps-injectionPR branch. This prevents any agent execution and is blocking smoke test validation of the PR.awf-api-proxycontainer starts healthy within expected timeoutawf-api-proxyDocker image configuration between the PR branch and main; check if any recent changes to proxy startup or health check logic are involved; consider adding retry logic for container startup inawfbinaryawf-api-proxyWork Item 2: Fix OpenCode Google API Key Configuration
GOOGLE_GENERATIVE_AI_API_KEYis not configured, causing OpenCode's internal Gemini API calls to fail withAI_LoadAPIKeyError. The process exits with code 0 masking the failure.GOOGLE_GENERATIVE_AI_API_KEYis available in Smoke OpenCode workflow environmentGOOGLE_GENERATIVE_AI_API_KEYas a repository or organization secret and wire it into the Smoke OpenCode workflow's environment; alternatively verify if OpenCode should be configured to use a different LLM backendHistorical Context
This is the first audit for this repository. No prior safe output health data in cache memory.
Metrics and KPIs
Next Steps
awf-api-proxycontainer health failure (PR Run pre-agent-steps before MCP gateway startup #28082, Smoke Gemini)GOOGLE_GENERATIVE_AI_API_KEYfor Smoke OpenCode workflowReferences:
Beta Was this translation helpful? Give feedback.
All reactions