[copilot-token-optimizer] Optimize: Documentation Noob Tester — 60-turn run for 3 pages costs 3M tokens #28294

@github-actions

Target Workflow

docs-noob-tester (Documentation Noob Tester) — selected as the highest-token workflow not optimized in the last 14 days, with 3,009,872 tokens in a single 7-day window.

Analysis Period

  • Window: 2026-04-17 → 2026-04-24 (7 days)
  • Runs audited: 1 (2026-04-24, run §24891429625)
  • Conclusion: ✅ success

Token Profile

| Metric | Value |
| --- | --- |
| Total tokens (1 run) | 3,009,872 |
| Avg tokens / run | 3,009,872 |
| Total input tokens | 2,996,495 |
| Total output tokens | 13,377 |
| Output/Input ratio | 0.004 (extremely low) |
| Cache efficiency | 49.4% |
| Total turns | 60 |
| Avg input tokens / turn | ~49,942 |
| Action minutes | 26 |
| Errors / Warnings | 0 / 0 |
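As a quick cross-check, the derived rows of the table can be reproduced from the raw counts. The sketch below uses only figures from the table; the variable names are illustrative. It also shows that the ~49,942 per-turn figure corresponds to input tokens divided by turns, with the all-token average coming out slightly higher.

```python
# Raw counts copied from the Token Profile table (single audited run).
total_tokens = 3_009_872
input_tokens = 2_996_495
output_tokens = 13_377
turns = 60

# Input and output should add up to the reported total.
tokens_add_up = input_tokens + output_tokens == total_tokens

# Per-turn averages: the table's ~49,942 matches the input-only average.
avg_input_per_turn = input_tokens / turns
avg_total_per_turn = total_tokens / turns

# Output/input ratio, reported as 0.004 in the table.
output_input_ratio = output_tokens / input_tokens

print(f"tokens add up:        {tokens_add_up}")
print(f"avg input per turn:   {avg_input_per_turn:,.0f}")
print(f"avg total per turn:   {avg_total_per_turn:,.0f}")
print(f"output/input ratio:   {output_input_ratio:.3f}")
```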

Behavioral signals: resource_heavy, read_only, agentic_fraction=0.50, execution_style=exploratory

⚠️ The workflow's own prompt says "visit exactly 3 pages", yet 60 LLM turns fired. That is about 20 turns per page, far more than a 3-page read-only test should need.


Ranked Recommendations

🥇 Rec 1 — Move setup to bash pre-steps (save ~600K–900K tokens/run)

Action: Extract the deterministic setup sequence into steps before the agent turn:

  • npm install in /docs
  • Start the Astro dev server (npm run dev &)
  • Wait for readiness: curl --retry 10 --retry-delay 2 (localhost/redacted). Consider adding --retry-connrefused, since curl does not otherwise retry a refused connection.
  • Detect bridge IP and write to /tmp/gh-aw/agent/server-url.txt

Evidence: Currently the agent spends ~10–15 turns on setup (npm install, server start, readiness polling, IP detection). With avg 49,942 input tokens/turn, 12 turns = ~600K tokens.

Risk: Low — these are deterministic shell operations with no decision-making required.
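The estimate in the Evidence line is simple multiplication, sketched here under the assumption (taken from that line) that every setup turn costs the run-wide average of 49,942 input tokens. Setup happens early, when the context is presumably still small, so the real cost of those turns may be lower; treat the result as leaning high. The function name is hypothetical.

```python
AVG_INPUT_TOKENS_PER_TURN = 49_942  # from the Token Profile (input tokens / 60 turns)


def setup_savings(turns_removed: int) -> int:
    """Tokens saved if `turns_removed` average-cost turns no longer run."""
    return turns_removed * AVG_INPUT_TOKENS_PER_TURN


# The report estimates 10-15 setup turns and works the 12-turn case.
estimates = {n: setup_savings(n) for n in (10, 12, 15)}
for n, saved in estimates.items():
    print(f"{n} setup turns -> ~{saved / 1000:.0f}K tokens")

# Turns that would have to disappear to reach the 900K upper bound in the heading.
turns_for_900k = 900_000 / AVG_INPUT_TOKENS_PER_TURN
print(f"900K tokens corresponds to ~{turns_for_900k:.1f} average turns")
```

At this average, the 10–15 turn range spans roughly 500K–750K tokens; reaching the 900K upper bound in the heading would require removing about 18 turns.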


🥈 Rec 2 — Add max-turns: 25 (save up to 1,750K tokens in runaway cases)

Action: Add max-turns: 25 to the frontmatter.

max-turns: 25

Evidence: This is a tightly scoped 3-page read-only test. Even accounting for Playwright interactions, 25 turns is generous. Without a cap, a future confused run could far exceed today's 60 turns. At ~50K tokens/turn, 60→25 turns = ~1.75M tokens saved.

Risk: Very low — the task scope is fixed (3 pages + report).
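The 1.75M figure follows from the number of turns the cap would remove, again assuming each turn costs the observed average. A minimal sketch with hypothetical variable names; it says nothing about whether the task can actually finish within 25 turns, which would need to be confirmed on a real run.

```python
observed_turns = 60
max_turns = 25
avg_input_tokens_per_turn = 49_942   # observed average from the audited run
total_tokens_observed = 3_009_872

# Turns that would not run if the cap had been in place for this run.
turns_cut = observed_turns - max_turns
tokens_saved = turns_cut * avg_input_tokens_per_turn
share_of_run = tokens_saved / total_tokens_observed

print(f"turns cut:    {turns_cut}")
print(f"tokens saved: ~{tokens_saved:,}")
print(f"share of run: {share_of_run:.0%}")
```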


🥉 Rec 3 — Slim the Playwright fallback instructions (save ~50K–100K tokens/run)

Action: The prompt contains ~80 lines of Playwright connectivity guidance (bridge IP detection, fallback curl commands, waitUntil examples, error recovery). Extract this into a new shared import or collapse it to a concise note.

Evidence: The system prompt is resent as input on every turn, so trimming it by ~1,500 tokens saves about 1,500 × 60 turns ≈ 90K tokens per run at the current turn count. These instructions are defensive/conditional and rarely needed.

Risk: Low — this is a documentation improvement, not logic removal.
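Because the system prompt is part of the input on every turn, the saving from trimming it scales with the turn count. The sketch below assumes the ~1,500-token reduction from the Evidence line and compares today's 60 turns with the 25-turn cap proposed in Rec 2; the function name is hypothetical.

```python
def prompt_trim_savings(tokens_trimmed: int, turns: int) -> int:
    """Input tokens saved when the prompt shrinks by `tokens_trimmed` on every turn."""
    return tokens_trimmed * turns


TOKENS_TRIMMED = 1_500  # estimate from the report, not a measured value

savings_at_60_turns = prompt_trim_savings(TOKENS_TRIMMED, 60)
savings_at_25_turns = prompt_trim_savings(TOKENS_TRIMMED, 25)

print(f"60 turns: ~{savings_at_60_turns:,} tokens saved")
print(f"25 turns: ~{savings_at_25_turns:,} tokens saved")
```

If the turn cap is adopted, the same trim saves proportionally less, so the 50K–100K range applies mainly to runs of roughly today's length.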


Rec 4 — Reduce timeout-minutes: 45 → 30 (save ~15 action minutes in worst case)

Action: Lower the timeout from 45 to 30 minutes. The single observed run completed in 25.3 minutes with the current setup; post-optimization (pre-steps reducing agent time) it should fit well within 30 minutes.

Evidence: With setup moved to pre-steps and max-turns capping agent time, 30 minutes provides adequate headroom.

Risk: Low — 5-minute buffer above observed 25.3m run time.


Full run detail: 2026-04-24 run §24891429625

| Field | Value |
| --- | --- |
| Status | completed / success |
| Duration | 25.3 minutes |
| Turns | 60 |
| Total tokens | 3,009,872 |
| Input tokens | 2,996,495 |
| Output tokens | 13,377 |
| Cache read tokens | 2,926,667 |
| Cache write tokens | 0 |
| Cache efficiency | 49.4% |
| Effective tokens | 3,342,670 |
| Action minutes | 26 |
| Errors | 0 |
| Warnings | 0 |
| GitHub API calls | 5 |
| Model | claude-sonnet-4.6 (via copilot) |

Workflow configuration snapshot

Configured tools: playwright, edit, bash: ["*"], mount-as-clis: true
Shared imports: daily-audit-base.md, docs-server-lifecycle.md, keep-it-short.md (+ transitively: daily-audit-discussion.md, observability-otlp.md, reporting.md, noop-reminder.md)
Engine: copilot
timeout-minutes: 45
max-turns: not set
Network: defaults + node
safe-outputs: upload-asset (max 10, image types)

Note: the bash: ["*"] wildcard is present. The playwright and edit tools are both configured, but the read_only + exploratory fingerprint does not by itself show that edit was used (see Caveats).

Estimated Combined Savings

| Scenario | Turns | Tokens saved | % reduction |
| --- | --- | --- | --- |
| Rec 1 (pre-steps only) | ~45 | ~700K | ~23% |
| Rec 1 + 2 (pre-steps + max-turns) | ~25 | ~1,750K | ~58% |
| All 4 recs | ~25 | ~1,800K | ~60% |
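The percentage column can be checked against the single-run total of 3,009,872 tokens. This sketch simply divides each scenario's estimated saving by that total; the dictionary keys are shorthand labels, not names used by the tool.

```python
TOTAL_TOKENS = 3_009_872  # the single audited run

# Estimated savings per scenario, as listed in the table.
scenario_savings = {
    "Rec 1": 700_000,
    "Rec 1 + 2": 1_750_000,
    "All 4 recs": 1_800_000,
}

# Fraction of the observed run that each scenario would save.
reductions = {name: saved / TOTAL_TOKENS for name, saved in scenario_savings.items()}
for name, fraction in reductions.items():
    print(f"{name}: ~{fraction:.0%} reduction")
```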

Caveats

  • Only 1 run was available for analysis — higher confidence would require 3–5 runs.
  • The 49.4% cache efficiency is moderate; better prompt caching could reduce effective tokens further.
  • The edit tool is configured but the run was classified read_only — consider verifying whether edit is ever needed and removing it if not (saves minor per-turn overhead).
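The report does not say how cache efficiency is computed. One formula that reproduces 49.4% from the run detail is cache-read tokens divided by the sum of input and cache-read tokens; this is an inference from the numbers, not a documented definition. For comparison, the sketch also prints cache reads as a share of input tokens alone.

```python
input_tokens = 2_996_495
cache_read_tokens = 2_926_667

# Assumed definition (inferred, not documented): cache reads over input plus cache reads.
assumed_cache_efficiency = cache_read_tokens / (input_tokens + cache_read_tokens)

# Alternative view: cache reads as a share of input tokens alone.
cache_share_of_input = cache_read_tokens / input_tokens

print(f"assumed cache efficiency: {assumed_cache_efficiency:.1%}")
print(f"cache reads / input:      {cache_share_of_input:.1%}")
```

The two readings differ widely, so which definition the tool uses affects how much headroom better prompt caching could actually offer.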

Generated by Copilot Token Usage Optimizer · ● 760.7K ·

  • expires on May 1, 2026, 3:10 PM UTC
