[copilot-token-optimizer] Optimize: Documentation Noob Tester — 60-turn run for 3 pages costs 3M tokens #28294

### Target Workflow
**`docs-noob-tester`** (Documentation Noob Tester) — selected as the highest-token workflow not optimized in the last 14 days, with 3,009,872 tokens in a single 7-day window.

### Analysis Period
- **Window**: 2026-04-17 → 2026-04-24 (7 days)
- **Runs audited**: 1 (2026-04-24, run [§24891429625](https://github.com/github/gh-aw/actions/runs/24891429625))
- **Conclusion**: ✅ success

### Token Profile

| Metric | Value |
|--------|-------|
| Total tokens (1 run) | 3,009,872 |
| Avg tokens / run | 3,009,872 |
| Total input tokens | 2,996,495 |
| Total output tokens | 13,377 |
| Output/Input ratio | 0.004 (extremely low) |
| Cache efficiency | 49.4% |
| Total turns | 60 |
| Avg input tokens / turn | ~49,942 |
| Action minutes | 26 |
| Errors / Warnings | 0 / 0 |

**Behavioral signals**: `resource_heavy`, `read_only`, `agentic_fraction=0.50`, `execution_style=exploratory`
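
The derived rows in the table can be reproduced from the raw counts. The following is a minimal sketch with illustrative variable names (they are not taken from the optimizer itself). It also shows that the ~49,942 figure corresponds to input tokens divided by turns; dividing total tokens by turns gives a slightly higher value.

```python
# Raw counts copied from the Token Profile table above.
input_tokens = 2_996_495
output_tokens = 13_377
turns = 60
pages = 3  # the workflow prompt asks for exactly 3 pages

total_tokens = input_tokens + output_tokens
output_input_ratio = output_tokens / input_tokens
avg_input_per_turn = input_tokens / turns
avg_total_per_turn = total_tokens / turns
turns_per_page = turns / pages

print(f"total tokens:          {total_tokens:,}")
print(f"output/input ratio:    {output_input_ratio:.4f}")
print(f"avg input tokens/turn: {avg_input_per_turn:,.0f}")
print(f"avg total tokens/turn: {avg_total_per_turn:,.0f}")
print(f"turns per page:        {turns_per_page:.0f}")
```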

> ⚠️ The workflow's own prompt says "visit **exactly 3 pages**", yet the run took 60 LLM turns: an average of 20 turns per page.

---

### Ranked Recommendations

#### 🥇 Rec 1 — Move setup to bash pre-steps (save ~600K–900K tokens/run)

**Action**: Extract the deterministic setup sequence into `steps` before the agent turn:
- `npm install` in `/docs`
- Start the Astro dev server (`npm run dev &`)
- Wait for readiness (`curl --retry 10 --retry-delay 2 (localhost/redacted)`)
- Detect bridge IP and write to `/tmp/gh-aw/agent/server-url.txt`

**Evidence**: The agent currently spends ~10–15 turns on setup (npm install, server start, readiness polling, IP detection). At an average of 49,942 input tokens per turn, 12 setup turns cost ~600K tokens.

**Risk**: Low — these are deterministic shell operations with no decision-making required.
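
The evidence above uses a flat per-turn cost. As a rough sketch of that model (the function name `setup_tokens` is hypothetical, and a flat average is a simplification because the input size varies from turn to turn), the cost of the setup turns can be estimated like this:

```python
AVG_INPUT_TOKENS_PER_TURN = 49_942  # average input tokens per turn in the audited run


def setup_tokens(setup_turns: int, avg_per_turn: int = AVG_INPUT_TOKENS_PER_TURN) -> int:
    """Rough input-token cost of the turns the agent spends on setup."""
    return setup_turns * avg_per_turn


# Lower bound, the 12-turn example from the evidence, and the upper bound.
for n in (10, 12, 15):
    print(f"{n} setup turns ~ {setup_tokens(n):,} tokens")
```

Under this flat model, the 12-turn case comes to 599,304 tokens, which matches the ~600K quoted in the evidence.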

---

#### 🥈 Rec 2 — Add `max-turns: 25` (save up to 1,750K tokens in runaway cases)

**Action**: Add `max-turns: 25` to the frontmatter.

```yaml
max-turns: 25
```

**Evidence**: This is a tightly scoped 3-page read-only test. Even accounting for Playwright interactions, 25 turns is generous. Without a cap, a future confused run could far exceed today's 60 turns. At ~50K tokens/turn, cutting 60 turns to 25 removes 35 turns, or ~1.75M tokens.

**Risk**: Very low — the task scope is fixed (3 pages + report).
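
To make the "up to" in the heading concrete, here is a small sketch of the saving a turn cap would produce. The helper `cap_savings` and the rounded 50,000 tokens/turn default are illustrative assumptions. The point is that the cap only saves tokens on runs that would otherwise exceed it.

```python
def cap_savings(observed_turns: int, max_turns: int, tokens_per_turn: int = 50_000) -> int:
    """Tokens avoided when a run of observed_turns is cut off at max_turns."""
    turns_avoided = max(0, observed_turns - max_turns)
    return turns_avoided * tokens_per_turn


print(f"{cap_savings(60, 25):,}")  # the audited 60-turn run capped at 25
print(f"{cap_savings(20, 25):,}")  # a run that already finishes under the cap
```

A run that finishes in 25 turns or fewer sees no saving from the cap, so the 1,750K figure applies to runs like the one audited, not to every run.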

---

#### 🥉 Rec 3 — Slim the Playwright fallback instructions (save ~50K–100K tokens/run)

**Action**: The prompt contains ~80 lines of Playwright connectivity guidance (bridge IP detection, fallback curl commands, `waitUntil` examples, error recovery). Extract this into a new shared import or collapse it to a concise note.

**Evidence**: The prompt is sent as input on every turn, so trimming ~1,500 tokens from it saves ~1,500 × 60 turns ≈ 90K tokens over a run like the one observed. These instructions are defensive/conditional and rarely needed.

**Risk**: Low — this is a documentation improvement, not logic removal.
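
The saving from a shorter prompt scales with the number of turns. The sketch below assumes the prompt is re-sent as input on each turn; `prompt_trim_savings` is a hypothetical helper and the ~1,500-token trim is the estimate quoted above.

```python
def prompt_trim_savings(tokens_removed: int, turns: int) -> int:
    """Tokens saved over a run when every turn's input shrinks by tokens_removed."""
    return tokens_removed * turns


print(f"{prompt_trim_savings(1_500, 60):,}")  # the audited 60-turn run
print(f"{prompt_trim_savings(1_500, 25):,}")  # the same trim under a 25-turn cap
```

If the 25-turn cap from Rec 2 is also in place, the same trim saves proportionally less per run (37,500 tokens by this estimate).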

---

#### Rec 4 — Reduce `timeout-minutes: 45 → 30` (save ~15 action minutes in worst case)

**Action**: Lower the timeout from 45 to 30 minutes. The single observed run completed in 25.3 minutes with the current setup; post-optimization (pre-steps reducing agent time) it should fit well within 30 minutes.

**Evidence**: With setup moved to pre-steps and max-turns capping agent time, 30 minutes provides adequate headroom.

**Risk**: Low — 5-minute buffer above observed 25.3m run time.

---

<details>
<summary>Full run detail — 2026-04-24 run §24891429625</summary>

| Field | Value |
|-------|-------|
| Status | completed / success |
| Duration | 25.3 minutes |
| Turns | 60 |
| Total tokens | 3,009,872 |
| Input tokens | 2,996,495 |
| Output tokens | 13,377 |
| Cache read tokens | 2,926,667 |
| Cache write tokens | 0 |
| Cache efficiency | 49.4% |
| Effective tokens | 3,342,670 |
| Action minutes | 26 |
| Errors | 0 |
| Warnings | 0 |
| GitHub API calls | 5 |
| Model | claude-sonnet-4.6 (via copilot) |

</details>

<details>
<summary>Workflow configuration snapshot</summary>

**Configured tools**: `playwright`, `edit`, `bash: ["*"]`, `mount-as-clis: true`  
**Shared imports**: `daily-audit-base.md`, `docs-server-lifecycle.md`, `keep-it-short.md` (+ transitively: `daily-audit-discussion.md`, `observability-otlp.md`, `reporting.md`, `noop-reminder.md`)  
**Engine**: copilot  
**timeout-minutes**: 45  
**max-turns**: not set  
**Network**: defaults + node  
**safe-outputs**: upload-asset (max 10, image types)

> Note: `bash: ["*"]` wildcard is present. All configured tools (`playwright`, `edit`) were used in the run per the `read_only` + `exploratory` fingerprint.

</details>

### Estimated Combined Savings

| Scenario | Turns | Tokens saved | % reduction |
|----------|-------|-------------|-------------|
| Rec 1 (pre-steps only) | ~45 | ~700K | ~23% |
| Rec 1 + 2 (pre-steps + max-turns) | ~25 | ~1,750K | ~58% |
| All 4 recs | ~25 | ~1,800K | ~60% |
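
The percentages in the table follow from dividing each saving by the 3,009,872-token baseline. A short check, using the rounded savings from the table and an illustrative helper name:

```python
BASELINE_TOKENS = 3_009_872  # total tokens in the audited run

# Rounded "Tokens saved" values from the table above.
scenarios = {
    "Rec 1 (pre-steps only)": 700_000,
    "Rec 1 + 2 (pre-steps + max-turns)": 1_750_000,
    "All 4 recs": 1_800_000,
}


def percent_reduction(saved: int, baseline: int = BASELINE_TOKENS) -> float:
    """Saving expressed as a percentage of the baseline token count."""
    return 100 * saved / baseline


for name, saved in scenarios.items():
    print(f"{name}: ~{percent_reduction(saved):.0f}%")
```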

### Caveats

- Only **1 run** was available for analysis — higher confidence would require 3–5 runs.
- The 49.4% cache efficiency is moderate; better prompt caching could reduce effective tokens further.
- The `edit` tool is configured but the run was classified `read_only` — consider verifying whether `edit` is ever needed and removing it if not (saves minor per-turn overhead).

**References:**
- [§24891429625](https://github.com/github/gh-aw/actions/runs/24891429625) — Documentation Noob Tester, 2026-04-24







> Generated by [Copilot Token Usage Optimizer](https://github.com/github/gh-aw/actions/runs/24896181567/agentic_workflow) · ● 760.7K · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Fcopilot-token-optimizer%22&type=issues)
> - [x] expires on May 1, 2026, 3:10 PM UTC
