Skip to content

Plugin Directory: Extract asset image dimensions during plugin import#628

Closed
dd32 wants to merge 18 commits into
WordPress:trunkfrom
dd32:add/claude/extract-asset-resolutions-on-import
Closed

Plugin Directory: Extract asset image dimensions during plugin import#628
dd32 wants to merge 18 commits into
WordPress:trunkfrom
dd32:add/claude/extract-asset-resolutions-on-import

Conversation

@dd32
Copy link
Copy Markdown
Member

@dd32 dd32 commented May 11, 2026

Reads width / height from each plugin asset (screenshots, banners, icons) at import time and stores them alongside the existing per-asset metadata, so callers can read dimensions straight from post meta instead of fetching and decoding the image at request time.

Primarily intended to make #622 easier — that PR currently has to probe screenshot dimensions on every page render. With this in place, the shortcode can just read width / height from the asset record.

What changes

  • cli/class-import.php: export_and_parse_plugin() now runs each asset record through a new Import::enrich_asset_dimensions() helper. The helper reuses cached values when the SVN revision is unchanged, otherwise reads dimensions via getimagesize() (from disk for /trunk/ screenshots that are already exported, or from Template::get_asset_url() for /assets/-located files). Remote fetches use wp_remote_get() with limit_response_size capped at 128 KB, falling back to a full read only when the prefix is not enough.
  • bin/backfill-asset-dimensions.php (new): walks every plugin with non-empty asset meta and fills in missing dimensions. Flags: --plugin=slug, --limit=N, --dry-run. Prints per-100 progress and a summary of failures.

Notes

  • The new width / height keys ride along inside the existing assets_* records — no schema change.
  • SVG and other non-raster formats are left untouched.

Reads width/height from screenshots, banners and icons in the importer and stores them alongside the existing per-asset metadata. Reuses the cached dimensions when an asset SVN revision is unchanged, falling back to a 128 KB Range request before a full download to keep import bandwidth in check. Records that fail to parse are flagged so a later run can audit them without re-fetching every file.

Includes a bin/backfill-asset-dimensions.php script that walks every plugin with non-empty asset metadata and fills in dimensions in bulk.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 11, 2026 07:17
@github-actions
Copy link
Copy Markdown

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props dd32.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

dd32 and others added 2 commits May 11, 2026 17:20
Failures are no longer persisted to post meta. The importer simply omits width/height when extraction fails; the next import retries from scratch and the backfill script still surfaces a list of unparseable files at the end of each run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Range header.

Drops the Range/206 plumbing and the fetch_asset_bytes helper. The 128 KB-then-full retry now lives inline as a two-step loop driven by the limit_response_size arg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds per-asset pixel dimensions (width/height) to Plugin Directory asset metadata at import time, so downstream consumers (themes/templates/API) can read dimensions from post meta instead of fetching/decoding images on demand.

Changes:

  • Loads previously-imported asset meta and reuses cached dimensions when the SVN revision matches.
  • Adds Import::enrich_asset_dimensions() (plus an HTTP fetch helper) to extract raster image dimensions via getimagesize*(), with a Range-first strategy for remote assets and a dimensions_failed sentinel.
  • Introduces a CLI backfill script to populate missing dimensions across existing plugin asset meta.

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.

File Description
wordpress.org/public_html/wp-content/plugins/plugin-directory/cli/class-import.php Caches prior asset records and enriches asset metadata with width/height during import; adds HTTP byte-fetch helper.
wordpress.org/public_html/wp-content/plugins/plugin-directory/bin/backfill-asset-dimensions.php New CLI script to backfill missing asset dimensions across existing plugins, reusing the same enrichment helper.
Comments suppressed due to low confidence (1)

wordpress.org/public_html/wp-content/plugins/plugin-directory/cli/class-import.php:1212

  • When requesting a byte range, this relies solely on the Range header. If the SVN server (or an intermediary) ignores Range and responds 200 with the full body, the “prefix” request can end up downloading the entire image anyway (and potentially downloading it twice if the subsequent full request runs). Consider also setting the HTTP API limit_response_size (and/or verifying Content-Range) when $range > 0 so the prefix fetch is actually capped at the intended size.
	 */
	public static function find_readme_file( $directory ) {
		$files = Filesystem::list_files( $directory, false /* non-recursive */, '!(?:^|/)readme\.(txt|md)$!i' );

		// prioritize readme.txt
		foreach ( $files as $f ) {
			if ( '.txt' == strtolower( substr( $f, -4 ) ) ) {

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

dd32 and others added 2 commits May 11, 2026 17:22
Drops the manual sprintf + add_query_arg in enrich_asset_dimensions and reuses the existing helper, which already honors the assets/trunk location split and the revision cache-buster. The helper now takes the plugin post rather than the slug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 2 comments.

dd32 and others added 2 commits May 11, 2026 17:29
- Trim one extra space in the enrich_asset_dimensions() docblock so the parameter type column aligns correctly.
- Fall through to the remote fetch when getimagesize() on the local export fails (per Copilot review feedback).
- Mark the new CLI backfill script with phpcs:ignoreFile, matching the convention in .github/bin/phpcs-branch.php; CLI scripts intentionally echo unescaped output and shadow WP global names.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switches the prefix read to a proper Range: bytes=0-131071 request so the
server stops sending after the requested bytes, accepting both 200 and
206 responses. limit_response_size stays as a memory backstop when the
server ignores Range and streams the full body.

Per Copilot review feedback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 11, 2026 07:31
dd32 and others added 2 commits May 11, 2026 17:34
Bulk runs are more tolerant of slow SVN responses than the per-commit import path. Uses http_request_args (not http_request_timeout) so the override applies after enrich_asset_dimensions() sets its own timeout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.

dd32 and others added 2 commits May 11, 2026 17:35
…builder.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…docblock.

Template::get_asset_url() dereferences the post and would fatal on null. Both callers pass a non-null post (importer after the early throw, backfill after a guard), so the contract should reflect that.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 12, 2026 02:30
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 3 comments.

$size = false;

if ( $local && file_exists( $local ) ) {
$size = @getimagesize( $local );
Comment on lines +1158 to +1161
}

$size = @getimagesizefromstring( $body );
if ( $size ) {
Comment on lines +1140 to +1147
// PNG/GIF and most JPEGs. Fall back to a full read only when
// the prefix isn't enough to decode the header.
foreach ( array( 131072, 0 ) as $limit ) {
$args = array( 'timeout' => 15 );
if ( $limit > 0 ) {
$args['headers'] = array( 'Range' => 'bytes=0-' . ( $limit - 1 ) );
$args['limit_response_size'] = $limit;
}
dd32 and others added 2 commits May 12, 2026 12:39
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…WP_Post.

Both callsites pass a WP_Post object; document the actual contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 12, 2026 02:40
$this->plugin is guaranteed non-null by import_from_svn() before this point, so the conditional fill is unnecessary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.

dd32 and others added 2 commits May 12, 2026 12:45
Let the warnings propagate to the error log so malformed assets show up in reporting rather than being silently swallowed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rt errors.

Spell out that the implicit retry (full read) only fires on prefix-too-short-to-decode, and that wp_error / non-2xx / empty body deliberately exit instead of retrying — re-requesting without Range will not help any of those failure modes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 12, 2026 02:53
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.

Comment on lines +1126 to +1161
if ( $local && file_exists( $local ) ) {
$size = getimagesize( $local );
}

if ( ! $size ) {
$url = Template::get_asset_url( $post, $record, false /* no CDN */ );

// Range the first read to 128 KB — enough for the headers of
// PNG/GIF and most JPEGs. Fall back to a full read only when
// the prefix isn't enough to decode the header — the falsy
// `$size` at the bottom of the loop is the implicit retry.
// Transport errors / non-2xx / empty body intentionally bail
// out via `break`: those failure modes (network down, 4xx,
// 5xx, empty file) won't be helped by re-requesting the same
// URL without Range.
foreach ( array( 128 * KB_IN_BYTES, 0 ) as $limit ) {
$args = array( 'timeout' => 15 );
if ( $limit > 0 ) {
$args['headers'] = array( 'Range' => 'bytes=0-' . ( $limit - 1 ) );
$args['limit_response_size'] = $limit;
}

$response = wp_remote_get( $url, $args );
$code = wp_remote_retrieve_response_code( $response );
if ( is_wp_error( $response ) || ( 200 !== $code && 206 !== $code ) ) {
break;
}

$body = wp_remote_retrieve_body( $response );
if ( '' === $body ) {
break;
}

$size = getimagesizefromstring( $body );
if ( $size ) {
break;
wp_getimagesize() prefers Imagick when available, covering SVG/WebP and other formats the plain getimagesize() / getimagesizefromstring() pair misses. Drops the PNG/JPG/GIF extension filter — let wp_getimagesize() decide what it can read. Remote fetches now stream into a temp file via wp_safe_remote_get( ... stream => true ) instead of buffering the body in memory; the same 128 KB Range / full-read two-pass loop is preserved against the on-disk file. The bin backfill drops the "skipped" counter accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bazza bazza closed this in 359e8b0 May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants