Stop committing the 57MB dashboard blob; resolve the published artifact #71

Merged: MaxGhenis merged 1 commit into main from drop-data-blob on Jun 10, 2026

@MaxGhenis

Contributor

Why

Every data refresh committed a fresh 57MB data.json blob — repo history bloat (~290MB .git already), conflict-magnet refresh PRs, and zero gating between "an export produced bytes" and "the site ships them". With the publish/pointer machinery proven in #65 and PolicyBench not yet launched, this is the moment to flip.

What

  • app/src/data.json deleted and gitignored; builds resolve the committed pointer (data.artifact.json) — download, sha256-verify, cache.
  • The snapshot manifest pins the published artifact (published_dashboard_artifact), and the integrity chain is now machine-checked instead of prose: pointer == manifest pin (new test), and the combined committed per-country run exports serialize to bytes hashing to that same pin (replacing the committed-blob equality test, still fully offline). The score-reproduction test runs on the same recombination.
  • A test asserts git ls-files app/src/data.json stays empty.
  • Docs updated (artifacts.md refresh flow, paper.md, results.md).
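The guard that `git ls-files app/src/data.json` stays empty could look roughly like the following. This is a minimal sketch, not the test from this PR: the helper name `tracked_paths` and the assumption that the test runs from the repository root are illustrative.

```python
import subprocess
from pathlib import Path


def tracked_paths(repo_root: Path, path: str) -> list[str]:
    """Return the files git tracks under `path` (empty list if none)."""
    result = subprocess.run(
        ["git", "ls-files", "--", path],
        cwd=repo_root,
        capture_output=True,
        text=True,
        check=True,  # fail loudly if this is not a git checkout
    )
    return [line for line in result.stdout.splitlines() if line]


def test_data_blob_is_not_tracked():
    # Assumption: pytest is invoked from the repository root.
    repo_root = Path.cwd()
    assert tracked_paths(repo_root, "app/src/data.json") == []
```

Because `git ls-files` lists only tracked files, a gitignored local copy of the export does not trip the check; only a re-added blob would.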

Refreshes are a 9-line pointer diff + manifest pin from here. The open refresh branches (#54/#55/#59/#60) will hit modify/delete on data.json when rebasing — resolution is: don't commit the export, run policybench publish-dashboard --tag dashboard-data-<date> and commit the pointer instead.

Verification

357 Python tests pass (16 snapshot-integrity tests incl. the new pin checks), ruff clean; app builds from a clean tree with no blob and no cache (downloads the release asset, verifies, splits, compiles), 50 bun tests and lint pass.

🤖 Generated with Claude Code

…fact

app/src/data.json (57MB per refresh) is deleted and gitignored. Builds
resolve app/src/data.artifact.json: prepare-data downloads the release
asset and verifies its sha256 (the path proven end-to-end in #65).
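The download, verify, and cache steps can be sketched as below. This is a Python illustration under stated assumptions, not the project's prepare-data step: the pointer field names `url` and `sha256`, the cache layout, and the function names are all hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable
from urllib.request import urlopen


def fetch_url(url: str) -> bytes:
    """Download the release asset; tests can swap in an offline fetcher."""
    with urlopen(url) as response:
        return response.read()


def resolve_artifact(
    pointer_path: Path,
    cache_dir: Path,
    fetch: Callable[[str], bytes] = fetch_url,
) -> Path:
    """Return a local, hash-verified copy of the artifact the pointer names."""
    pointer = json.loads(pointer_path.read_text())
    expected = pointer["sha256"]  # hypothetical field name
    cached = cache_dir / f"{expected}.json"

    # Reuse the cached copy only if its bytes still hash to the pin.
    if cached.exists() and hashlib.sha256(cached.read_bytes()).hexdigest() == expected:
        return cached

    payload = fetch(pointer["url"])  # hypothetical field name
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected:
        raise ValueError(f"sha256 mismatch: expected {expected}, got {actual}")

    cache_dir.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(payload)
    return cached
```

Keying the cache file by the expected hash means a pointer update never picks up a stale download, and a corrupted cache entry is re-fetched rather than trusted.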

The snapshot integrity chain survives without the blob, machine-checked
instead of prose: the manifest pins the published artifact's sha256, a
test asserts the committed pointer matches the pin, and the old
committed-blob equality test now recombines the committed per-country
run exports and asserts their serialized bytes hash to the same pin.
The score-reproduction test runs on the same recombination.
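The two links of that chain can be sketched as a single offline check. Only the manifest key `published_dashboard_artifact` comes from this PR; the nested `sha256` field, the pointer layout, the way exports are combined, and the serialization format (sorted keys, compact separators) are assumptions made for illustration.

```python
import hashlib
import json


def serialize(payload: dict) -> bytes:
    """Deterministic JSON bytes (assumed format: sorted keys, compact separators)."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def combine_exports(exports: dict[str, dict]) -> dict:
    """Merge per-country run exports into one payload (assumed layout)."""
    return {"countries": {code: exports[code] for code in sorted(exports)}}


def check_integrity_chain(pointer: dict, manifest: dict, exports: dict[str, dict]) -> None:
    """Raise ValueError if either link of the chain is broken."""
    # The nested "sha256" key is an assumption about the manifest schema.
    pin = manifest["published_dashboard_artifact"]["sha256"]

    # Link 1: the committed pointer names the artifact the manifest pins.
    if pointer["sha256"] != pin:
        raise ValueError("pointer sha256 does not match the manifest pin")

    # Link 2: the committed per-country exports recombine to the pinned bytes.
    digest = hashlib.sha256(serialize(combine_exports(exports))).hexdigest()
    if digest != pin:
        raise ValueError("recombined exports do not hash to the manifest pin")
```

Neither link needs network access, which matches the description of the recombination test as fully offline: everything it reads is committed to the repository.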

Refresh flow from here: export locally, policybench publish-dashboard
--tag dashboard-data-<date>, commit the 9-line pointer plus the
manifest pin. In-flight refresh branches that modify data.json should
adopt that flow on rebase.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@vercel

vercel Bot commented Jun 10, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| policybench-site | Ready | Preview, Comment | Jun 10, 2026 9:01pm |

@MaxGhenis MaxGhenis merged commit 58b6510 into main Jun 10, 2026
6 checks passed
@MaxGhenis MaxGhenis deleted the drop-data-blob branch June 10, 2026 21:01