ci: fix Skin Designer device DB scrape #4865

Merged
shai-almog merged 1 commit into master from fix/skin-designer-device-db-scrape
May 5, 2026

@shai-almog
Collaborator

Summary

  • The scheduled trickle scrape has been hitting a 404 since #4758 (Add website-targeted JavaScript Skin Designer integration) because latest-mobiles.php3 doesn't exist on GSMArena. Every 6h run therefore pulled zero phones, yet still opened PRs (e.g. #4862, Refresh Skin Designer device database) because the JSON envelope churned between runs.
  • Replace the broken trickle path with a weekly run of the verified per-brand walk.
  • Make the script skip writing devices.json when no device records actually changed, so envelope drift alone can't open PRs again.

Changes

  • Workflow: weekly schedule (0 3 * * 1), a single brands scrape (--delay 2.0 --max-pages 12), timeout bumped 25 → 75 min to cover a cold-cache first run. Dropped the mode dispatch input and the trickle/full split.
  • Script: removed the dead --mode latest path (walk_latest, RE_LATEST_LINK, --mode flag). Added _devices_changed() so the file is only rewritten when records actually differ (id-keyed, order-insensitive). Removed fresh_count from the persisted envelope: it is a per-run value and doesn't belong in a committed artifact.
  • README: document the new weekly cadence.
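
The id-keyed, order-insensitive comparison described for _devices_changed() could look roughly like the sketch below. This is an illustration under stated assumptions, not the code merged in this PR: the function name devices_changed, the "name" field, and the sample records are hypothetical, and only the presence of an "id" key on each record is taken from the description above.

```python
def devices_changed(old: list, new: list) -> bool:
    """Return True when the two record lists differ, ignoring list order.

    Assumes every record is a dict with a unique "id" key (hypothetical schema).
    """
    # Index both sides by id so that ordering in the file is irrelevant.
    old_by_id = {record["id"]: record for record in old}
    new_by_id = {record["id"]: record for record in new}
    # Dict equality covers added ids, removed ids, and modified field values.
    return old_by_id != new_by_id


# Hypothetical sample data for illustration only.
old = [{"id": "a", "name": "Phone A"}, {"id": "b", "name": "Phone B"}]
reordered = list(reversed(old))
added = old + [{"id": "c", "name": "Phone C"}]
modified = [{"id": "a", "name": "Phone A (2026)"}, old[1]]

print(devices_changed(old, reordered))  # False
print(devices_changed(old, added))      # True
print(devices_changed(old, modified))   # True
```

Keying by id rather than comparing the lists directly means that a scrape returning the same devices in a different order is treated as no change.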

Test plan

  • python3 -m py_compile build_devices_json.py passes
  • --help shows the simplified flag set (no --mode)
  • Local dry run against the current 5072-device file with empty fresh batch → "No device records changed; leaving file untouched" and exits 0 without rewriting
  • Local dry run with a synthetic new device record → _devices_changed returns true (file would be rewritten)
  • Next scheduled run (Mon 03:00 UTC) walks the curated brands and either no-ops cleanly or opens a PR with real device additions
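
The two local dry runs above can be reproduced in miniature with a write guard along these lines. It is a sketch, assuming a hypothetical envelope with "generated_at" and "devices" keys and a hypothetical helper name write_if_changed; the real build_devices_json.py may structure the file and the guard differently. Only the log message is taken from the test plan.

```python
import json
import tempfile
from pathlib import Path


def write_if_changed(path: Path, devices: list, generated_at: str) -> bool:
    """Rewrite `path` only when device records differ; return True if written."""

    def by_id(records):
        # Order-insensitive view of the records, keyed by their "id" field.
        return {record["id"]: record for record in records}

    if path.exists():
        existing = json.loads(path.read_text(encoding="utf-8")).get("devices", [])
        if by_id(existing) == by_id(devices):
            # Envelope fields such as generated_at are deliberately ignored.
            print("No device records changed; leaving file untouched")
            return False

    envelope = {"generated_at": generated_at, "devices": devices}  # hypothetical layout
    path.write_text(json.dumps(envelope, indent=2) + "\n", encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "devices.json"
    first = write_if_changed(target, [{"id": "a"}], "run-1")   # file created
    before = target.read_text(encoding="utf-8")
    second = write_if_changed(target, [{"id": "a"}], "run-2")  # envelope-only drift
    unchanged = target.read_text(encoding="utf-8") == before
    third = write_if_changed(target, [{"id": "a"}, {"id": "b"}], "run-3")  # new record
```

Because the second call changes only the envelope timestamp, the file bytes stay identical and a scheduled job would have nothing to commit.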

🤖 Generated with Claude Code

The scheduled trickle scrape has been hitting a 404 since #4758 landed
(latest-mobiles.php3 doesn't exist on GSMArena), so every 6h run pulled
zero phones and only opened PRs because the JSON envelope churned
between runs (e.g. #4862). Replace the trickle path with a weekly run
of the verified per-brand walk, and make the script skip writing the
file when no device records actually change so envelope drift can't
spam PRs again.

- Workflow: weekly Mon 03:00 UTC, single brands scrape, 75m timeout
  (cold cache). Drop the mode dispatch input and trickle/full split.
- Script: drop dead --mode latest path (walk_latest, RE_LATEST_LINK,
  --mode flag). Add _devices_changed() so the file is only rewritten
  when actual records are added, removed, or modified, ignoring envelope diffs.
- README: document the new weekly cadence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

github-actions Bot commented May 5, 2026

✅ Continuous Quality Report

Test & Coverage

Static Analysis

  • SpotBugs [Report archive]
    • ByteCodeTranslator: 0 findings (no issues)
    • android: 0 findings (no issues)
    • codenameone-maven-plugin: 0 findings (no issues)
    • core-unittests: 0 findings (no issues)
    • ios: 0 findings (no issues)
  • PMD: 0 findings (no issues) [Report archive]
  • Checkstyle: 0 findings (no issues) [Report archive]

Generated automatically by the PR CI workflow.

@github-actions
Contributor

github-actions Bot commented May 5, 2026

Cloudflare Preview

@shai-almog shai-almog merged commit 1ce9a42 into master May 5, 2026
10 checks passed