feat(extensions,presets): authenticate GitHub-hosted catalog and download requests with GITHUB_TOKEN/GH_TOKEN #2331
…load requests with GITHUB_TOKEN/GH_TOKEN

Squashed from github#2087 (original author: @anasseth).

Adds GitHub-token authentication to extension and preset catalog fetching and ZIP downloads so private GitHub repos work when GITHUB_TOKEN/GH_TOKEN is set, while preventing credential leakage to non-GitHub hosts.

- Introduces shared _github_http module with build_github_request() and open_github_url() helpers
- Routes ExtensionCatalog and PresetCatalog network calls through GitHub-auth-aware opener
- Adds comprehensive unit/integration tests for auth header behavior
- Updates user docs for both extensions and presets

Co-authored-by: anasseth <16745089+anasseth@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix redirect handler to preserve Authorization on GitHub-to-GitHub redirects (e.g. github.com → codeload.github.com). The previous implementation relied on super().redirect_request() which strips auth on cross-host redirects, breaking private repo archive downloads.
- Add codeload.github.com to documented host lists in both EXTENSION-USER-GUIDE.md and presets/README.md
- Add redirect auth-preservation and auth-stripping tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds GitHub token–authenticated fetching for extension and preset catalogs/downloads, closing the gap where private GitHub-hosted resources failed due to unauthenticated urlopen() calls.
Changes:
- Introduces a shared `specify_cli._github_http` helper module to build auth-aware `Request`s and open URLs with redirect-safe auth handling.
- Routes extension + preset catalog fetch and ZIP download call sites through the shared auth-aware opener.
- Expands unit/integration coverage for token selection, hostname allowlisting/spoofing, and redirect behavior; updates user docs to include `codeload.github.com`.
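To make the token-selection and host-allowlisting behavior concrete, here is a minimal sketch of how an auth-aware request builder could look. It is not the PR's actual `build_github_request()`: the names `sketch_build_request`, `select_token`, and `SKETCH_GITHUB_HOSTS` are hypothetical, the allowlist contains only the hosts named in this conversation (the real list may be longer), and the precedence of `GITHUB_TOKEN` over `GH_TOKEN` is an assumption.

```python
import os
import urllib.request
from typing import Mapping, Optional
from urllib.parse import urlparse

# Assumption: only the hosts mentioned in this PR; the real allowlist may differ.
SKETCH_GITHUB_HOSTS = frozenset({"github.com", "api.github.com", "codeload.github.com"})


def select_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-blank token, checking GITHUB_TOKEN before GH_TOKEN."""
    env = os.environ if env is None else env
    for name in ("GITHUB_TOKEN", "GH_TOKEN"):
        value = (env.get(name) or "").strip()  # whitespace-only counts as unset
        if value:
            return value
    return None


def sketch_build_request(url: str, env: Optional[Mapping[str, str]] = None) -> urllib.request.Request:
    """Build a Request, attaching a Bearer token only for allowlisted hosts."""
    request = urllib.request.Request(url)
    token = select_token(env)
    # Compare the parsed hostname exactly; substring checks would also match
    # lookalikes such as "github.com.evil.example" or "evil.example/github.com".
    hostname = urlparse(url).hostname
    if token and hostname in SKETCH_GITHUB_HOSTS:
        request.add_header("Authorization", f"Bearer {token}")
    return request
```

With no token in the environment the function returns a plain `Request`, which matches the stated goal of no behavior change for users without a token.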
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/specify_cli/_github_http.py` | New shared helpers for GitHub-host allowlisting, token header injection, and redirect-safe auth stripping/preservation. |
| `src/specify_cli/extensions.py` | Replaces direct `urlopen()` usage with `_open_url()` helper for catalog fetch + extension ZIP download. |
| `src/specify_cli/presets.py` | Replaces direct `urlopen()` usage with `_open_url()` helper for catalog fetch + preset pack ZIP download. |
| `tests/test_extensions.py` | Adds extensive auth/host validation tests plus redirect handler tests and integration-style request capture checks. |
| `tests/test_presets.py` | Adds analogous auth/host validation tests and integration-style request capture checks for presets. |
| `extensions/EXTENSION-USER-GUIDE.md` | Documents token usage and adds `codeload.github.com` to the GitHub host list with a private-catalog example. |
| `presets/README.md` | Documents token usage and adds `codeload.github.com` to the GitHub host list with a private-catalog example. |
mnriem
left a comment
Please address Copilot feedback. Thank you for picking this one up!
Aligns with the rest of the codebase (e.g. __init__.py:1721) and GitHub's current API guidance. Updates all test assertions accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
- Fix docstring to say Bearer instead of token (matches implementation)
- Remove unused imports/fixtures from redirect tests (GITHUB_HOSTS,
MagicMock, temp_dir, monkeypatch)
- Replace __import__('io').BytesIO() with normal import io pattern
in test_presets.py
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot's findings
- Files reviewed: 7/7 changed files
- Comments generated: 0 new
Description
Supersedes #2087 (original author: @anasseth). Fixes #2037.
Closes the authentication gap introduced when multi-catalog support landed in #1707. Before this change, all network requests in `ExtensionCatalog` and `PresetCatalog` used bare `urllib.request.urlopen(url)` with no headers. Any catalog or extension/preset ZIP hosted in a private GitHub repository would silently fail with HTTP 404, regardless of whether `GITHUB_TOKEN` or `GH_TOKEN` was set.
What's new
- `_github_http` module: `build_github_request()` and `open_github_url()` helpers used by both `ExtensionCatalog` and `PresetCatalog`
- `urlparse().hostname` is matched against an exact allowlist (not substring matching) to prevent token leakage to lookalike hosts
- `_StripAuthOnRedirect` preserves the `Authorization` header on GitHub-to-GitHub redirects (e.g. `github.com` → `codeload.github.com`) while stripping it for non-GitHub redirect targets
Changes from #2087
This PR includes all changes from #2087 plus fixes for the outstanding review feedback:
- `redirect_request()` relied on `super()`, which already strips auth on cross-host redirects. Redirects between GitHub hosts (e.g. `github.com` → `codeload.github.com`) would lose the token, breaking private repo archive downloads. Fixed by capturing the original auth header before calling `super()` and re-attaching it for trusted GitHub domains.
- Added `codeload.github.com` to the documented host lists in both `EXTENSION-USER-GUIDE.md` and `presets/README.md`
Affected call sites
- `ExtensionCatalog._fetch_single_catalog`: catalog JSON fetch
- `ExtensionCatalog.fetch_catalog`: legacy single-catalog path
- `ExtensionCatalog.download_extension`: extension ZIP download
- `PresetCatalog._fetch_single_catalog`: preset catalog JSON fetch
- `PresetCatalog.fetch_catalog`: legacy single-catalog path
- `PresetCatalog.download_pack`: preset ZIP download
No behavior change for users without a token set.
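The redirect fix described above (capture the auth header, call `super()`, then re-attach only for trusted hosts) could be sketched roughly as follows. This is an illustration, not the PR's `_StripAuthOnRedirect`: the class name `SketchRedirectHandler` and the `TRUSTED_REDIRECT_HOSTS` set are hypothetical. Instead of relying on how the base handler treats the header, the sketch removes it from the redirected request and then decides explicitly.

```python
import urllib.request
from urllib.parse import urlparse

# Assumption: only the hosts named in this PR; the real allowlist may differ.
TRUSTED_REDIRECT_HOSTS = frozenset({"github.com", "api.github.com", "codeload.github.com"})


class SketchRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Keep Authorization across GitHub-to-GitHub redirects, drop it elsewhere."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Capture the header before the base class builds the follow-up request.
        original_auth = req.get_header("Authorization")
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        if new_req is None:
            return None
        # Start from a clean slate so the outcome does not depend on what the
        # base implementation copied over.
        new_req.remove_header("Authorization")
        target_host = urlparse(new_req.full_url).hostname
        if original_auth and target_host in TRUSTED_REDIRECT_HOSTS:
            new_req.add_header("Authorization", original_auth)
        return new_req
```

A handler like this would typically be installed with `urllib.request.build_opener(SketchRedirectHandler())`, and the resulting opener's `open()` used in place of `urlopen()` at the affected call sites.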
Testing
- `_make_request`: no-token, `GITHUB_TOKEN`, `GH_TOKEN` fallback, precedence, whitespace handling, non-GitHub URL, spoofing cases (lookalike host, `github.com` in path/query), `api.github.com`, `codeload.github.com`
- Integration-style: mock `urlopen` and assert the `Request` carries the auth header
AI Disclosure
This PR was prepared with Copilot CLI, building on @anasseth's original implementation from feat(extensions): authenticate GitHub-hosted catalog and download requests with GITHUB_TOKEN/GH_TOKEN #2087.
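For readers unfamiliar with the request-capture style mentioned under Testing, the following sketch shows the general pattern: patch `urlopen`, record the `Request` it receives, and assert on its headers. The function `fetch_catalog_sketch` is a hypothetical stand-in for a catalog fetch, not code from this PR, and the catalog payload is made up for the example.

```python
import io
import json
import urllib.request
from unittest import mock


def fetch_catalog_sketch(url, token=None):
    """Hypothetical catalog fetch that attaches a Bearer token when given one."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def capture_request(url, token):
    """Run the fetch with urlopen patched and return (captured Request, parsed data)."""
    captured = {}

    def fake_urlopen(request, timeout=None):
        captured["request"] = request
        # BytesIO supports read() and the context-manager protocol,
        # which is all the fetch function needs from a response.
        return io.BytesIO(b'{"extensions": {}}')

    with mock.patch("urllib.request.urlopen", side_effect=fake_urlopen):
        data = fetch_catalog_sketch(url, token)
    return captured["request"], data
```

No network access is needed, so checks of this kind can run in CI without a real token.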