manage: extract _fetch_image_info to reduce duplication by ideaship · Pull Request #2217 · osism/python-osism

ideaship · 2026-04-27T14:04:57Z

ImageClusterapi, ImageClusterapiGardener, and ImageOctavia all share
the same marker-fetch and checksum-fetch sequence: fetch a marker
file, parse the date and image filename, construct the image URL,
fetch the .CHECKSUM file, and log each step. This identical block
was repeated verbatim in all three take_action() implementations.

Extract this sequence into a private _fetch_image_info(base_url,
marker_url) helper that returns (date, image_filename, url, checksum).
Callers that need the image filename for version extraction
(ImageClusterapi, ImageClusterapiGardener) unpack it; ImageOctavia
discards it with _.
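A minimal sketch of what such a helper could look like, for readers without the diff at hand. This is not the code from the PR: the `fetch` parameter is a hypothetical addition that lets the example run without network access, and the URL layout (`<base_url>/<image_filename>` plus a `.CHECKSUM` suffix) and the `sha256:` prefix are assumptions inferred from the description.

```python
from typing import Callable, Tuple


def fetch_image_info(
    base_url: str,
    marker_url: str,
    fetch: Callable[[str], str],  # hypothetical: injected text fetcher
) -> Tuple[str, str, str, str]:
    """Return (date, image_filename, url, checksum) for the latest image."""
    # Assumption: the marker body looks like "2026-04-27 <name>.qcow2".
    marker_body = fetch(marker_url)
    date, image_filename = marker_body.strip().split()[:2]

    # Assumption: the image lives directly under base_url.
    url = f"{base_url.rstrip('/')}/{image_filename}"

    # Assumption: the checksum file is <url>.CHECKSUM and its first
    # whitespace-separated token is the SHA-256 digest.
    checksum_body = fetch(f"{url}.CHECKSUM")
    checksum = f"sha256:{checksum_body.split()[0]}"

    return date, image_filename, url, checksum


# Usage with canned responses instead of real HTTP requests.
fake_responses = {
    "https://example.invalid/last-image": "2026-04-27 example-image.qcow2\n",
    "https://example.invalid/example-image.qcow2.CHECKSUM": "ab" * 32
    + "  example-image.qcow2\n",
}
date, image_filename, url, checksum = fetch_image_info(
    "https://example.invalid/",
    "https://example.invalid/last-image",
    fake_responses.__getitem__,
)
```

A caller that does not need the filename, as described for ImageOctavia, would unpack the result as `date, _, url, checksum = ...`.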

ImageGardenlinux is deliberately excluded: it constructs the image
URL directly from a known pattern rather than fetching a marker file,
so it shares only the checksum-fetch half of the pattern and does not
fit this helper without contortion.

AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>


Four `osism manage image` commands (octavia, clusterapi, clusterapi-gardener, gardenlinux) fetch marker and checksum files from nbg1.your-objectstorage.com using bare requests.get() with no error handling and no retry. When the Ceph RGW backend transiently returns an XML S3 error document, the code parses <?xml as the checksum and openstack-image-manager rejects it with 'sha256:<?xml' is not a valid checksum.

Analysis of 295 .CHECKSUM fetch events logged across testbed builds in the window 2025-12-14 – 2026-04-27 shows 84 XML failures (28.5 % of fetches). 94 % of those failures returned in ≤ 2 s (fast canned RGW 503 response); the remaining 6 % returned in 8–60 s. Zero failures were connection-level errors; every failure returned an HTTP response. Successful fetches span 0–53 s (p99 = 9 s).

New module osism/utils/http.py exports fetch_text, which wraps requests.get with:

- Retry on {408, 429} ∪ 5xx (covers the observed RGW 503)
- Retry on non-HTTPError RequestException (connection / DNS / TLS)
- Retry when an optional validate callback rejects the body (guards against HTTP 200 with unexpected content)
- Immediate HTTPError propagation on non-retryable 4xx (404, 403)
- Structured INFO log lines per attempt for observability in Zuul

Default schedule: 3 retries, 2 s / 4 s / 8 s sleeps (14 s budget).

Two validators are added to manage.py:

- _validate_marker: generic YYYY-MM-DD <name>.qcow2 contract; rejects XML bodies without hard-coding any image-name prefix, so production deployments with unfamiliar names pass through to downstream validation rather than burning the retry budget.
- _is_sha256: requires a 64-char lowercase hex first token, matching sha256sum(1) output; accepting uppercase would mask a downstream mismatch rather than surface it.
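The two validator contracts described above can be illustrated with a short sketch. The function names and regular expressions below are assumptions made for illustration, not the code added to manage.py; in particular, checking only the first two whitespace-separated tokens of the marker is a guess at how strict the real validator is.

```python
import re

# Assumption: a marker date is exactly YYYY-MM-DD in ASCII digits.
_DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")
# A SHA-256 digest as printed by sha256sum(1): 64 lowercase hex characters.
_SHA256_RE = re.compile(r"[0-9a-f]{64}")


def validate_marker(body: str) -> bool:
    """Accept bodies whose first two tokens are '<YYYY-MM-DD> <name>.qcow2'."""
    tokens = body.split()
    if len(tokens) < 2:
        return False
    # No image-name prefix is hard-coded; only the date and suffix are checked.
    return (
        _DATE_RE.fullmatch(tokens[0]) is not None
        and tokens[1].endswith(".qcow2")
    )


def is_sha256(body: str) -> bool:
    """Accept bodies whose first token is a 64-char lowercase hex digest."""
    tokens = body.split()
    return bool(tokens) and _SHA256_RE.fullmatch(tokens[0]) is not None
```

With these definitions, an S3 error document such as `<?xml version="1.0"?><Error>...` fails both checks, so a retry helper that takes a validate callback would treat it as a retryable response rather than passing `<?xml` downstream as a checksum.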
All seven requests.get call sites in manage.py are replaced:

- clusterapi: marker + .CHECKSUM (take_action lines 110, 125)
- clusterapi-gardener: marker + .CHECKSUM (take_action lines 229, 245)
- gardenlinux: .sha256 (take_action line 354)
- octavia: marker + .CHECKSUM (take_action lines 440, 451)

The checksum_url_status log line added to octavia in ce844a0 is removed; fetch_text emits the status code on every attempt.

No per-attempt timeout is added. The distributions of slow failures (8–60 s) and slow successes (9–53 s) overlap: a 41 s duration appears as both a failure and a success in the data. No timeout value cleanly separates the two populations without introducing false positives on legitimate slow responses.

34 unit tests across three new files cover the retry helper (test_http.py, 15 tests), the validators (test_manage_validators.py, 15 tests), and the call-site wiring (test_manage_wiring.py, 4 tests).

AI-assisted: Claude Code
Signed-off-by: Roger Luethi <luethi@osism.tech>
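The retry policy that this commit message attributes to fetch_text can be sketched without a network dependency. The code below is an illustration, not the contents of osism/utils/http.py: the `get` and `sleep` parameters, the `FetchError` class, and the use of the built-in `ConnectionError` in place of requests' exception hierarchy are all assumptions, and the per-attempt logging is omitted.

```python
import time
from typing import Callable, Optional, Sequence, Tuple


class FetchError(Exception):
    """Hypothetical error: permanent failure or exhausted retry budget."""


def is_retryable(status: int) -> bool:
    # Retry on {408, 429} and on any 5xx status.
    return status in (408, 429) or 500 <= status <= 599


def fetch_text_sketch(
    url: str,
    get: Callable[[str], Tuple[int, str]],  # hypothetical: returns (status, body)
    validate: Optional[Callable[[str], bool]] = None,
    delays: Sequence[float] = (2.0, 4.0, 8.0),  # 3 retries, 14 s budget
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    last_reason = "no attempts made"
    for attempt in range(len(delays) + 1):
        try:
            status, body = get(url)
        except ConnectionError as exc:
            # Stand-in for connection / DNS / TLS failures.
            last_reason = f"connection error: {exc}"
        else:
            if is_retryable(status):
                last_reason = f"HTTP {status}"
            elif status >= 400:
                # Non-retryable 4xx such as 403 or 404: fail immediately.
                raise FetchError(f"{url}: HTTP {status} (not retried)")
            elif validate is not None and not validate(body):
                # Success status but unexpected content, e.g. an XML error.
                last_reason = "body rejected by validator"
            else:
                return body
        if attempt < len(delays):
            sleep(delays[attempt])
    raise FetchError(
        f"{url}: giving up after {len(delays) + 1} attempts ({last_reason})"
    )
```

Injecting `sleep` keeps the backoff schedule testable without waiting; a caller would pass a checksum or marker validator as `validate` so that a success status with an XML body is retried instead of returned.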


sourcery-ai

Hey - I've left some high level feedback:

In _fetch_image_info, consider validating the number of whitespace-separated fields in the marker body before doing strip().split()[:2] so that a malformed marker produces a clear, custom error instead of an unhandled ValueError from tuple unpacking.
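One way the suggested check could look, as a sketch only: `MarkerFormatError` and `parse_marker` are hypothetical names, and the error wording is illustrative rather than taken from the PR.

```python
from typing import Tuple


class MarkerFormatError(ValueError):
    """Hypothetical error for a marker body that lacks the expected fields."""


def parse_marker(body: str) -> Tuple[str, str]:
    """Split a marker body into (date, image_filename)."""
    fields = body.strip().split()
    if len(fields) < 2:
        # Fail with a descriptive message instead of the generic
        # "not enough values to unpack" ValueError.
        raise MarkerFormatError(
            "expected '<date> <image filename>' in marker, "
            f"got {len(fields)} field(s): {body[:80]!r}"
        )
    date, image_filename = fields[:2]
    return date, image_filename
```

Subclassing `ValueError` keeps any existing `except ValueError` handlers working while giving the log a message that names the malformed marker.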



ideaship added 2 commits April 27, 2026 15:59

ideaship marked this pull request as draft April 27, 2026 14:06

sourcery-ai Bot reviewed Apr 27, 2026


Base automatically changed from checksum-fetch-retry to main May 1, 2026 07:51

ideaship wants to merge 2 commits into main from manage-image-info-refactor
