Who let conda in? #717

@cailmdaley

Context

After #706 merged, the Dockerfile on develop looks quite different from what it was a couple weeks ago. We'd like to understand the intent and raise a few concerns so we can agree on direction before building more on top.

What changed

Before v2.0 (develop, up through commit 7f81aabe):

FROM python:3.12-slim-bookworm
LABEL Description="Conda-Free ShapePipe Docker Image"
# ... apt deps, manual cdsclient build ...
RUN pip install --no-cache-dir astropy==6.1.0 galsim==2.5.3 ... ngmix @ git+...

Now on develop:

FROM images.canfar.net/skaha/astroml:latest
# ... apt deps, weightwatcher from source ...
ENV PATH /opt/conda/bin:$PATH
RUN pip install --no-cache-dir ipython jupyterlab snakemake
RUN pip install --no-cache-dir -e ".[fitsio]" ...

The scientific stack (numpy, matplotlib, astropy, galsim, mccd, ...) is now implicit: part comes pre-baked from astroml:latest, part is installed on top via pip install -e ".[fitsio]" + pyproject.toml.

Claude traced the base-image swap to 2026-04-07 (8e7cd1ad9261488c), where Dockerfile.base was created pointing at astroml:latest so it could be shared with Dockerfile.jupyter; Dockerfile.base was later deleted and its contents inlined back into a single Dockerfile.

Things worth talking about

  1. Was the base-image swap a deliberate architectural decision? From the commit trail, it looks like astroml:latest arrived as a convenient base for the Dockerfile / Dockerfile.jupyter sharing experiment rather than as a "let's adopt CANFAR's base image" decision in its own right. If the intent is deployment-coupling to Skaha, fair — we just want to make that explicit.

  2. Conda + pip in the same environment. astroml:latest is a conda image; pip install -e ".[fitsio]" then installs on top. When a package is provided by both (numpy, matplotlib, astropy, ...), pip can replace or shadow the conda-installed copy, so what actually gets imported may no longer match what conda has on record; if both copies remain on the path, import order decides which wins. This is a long-standing source of subtle breakage. We don't know how extensive the overlap is in practice; worth checking.

  3. :latest is a floating tag. The same docker build today vs next month can produce different runtime stacks depending on what CANFAR pushes upstream. For a pipeline where numerical reproducibility matters, this is worth pinning — either to a specific astroml image digest or to a version tag.

  4. Deployment surface. images.canfar.net is CANFAR-hosted, which means builds outside CANFAR need registry access and the full image pull (multi-GB). We recall you prefer running tests on Candide over Skaha — the current setup does couple us to Skaha's base.
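
For point 2, one way to measure the overlap would be to run something like the following inside the built image. This is a rough sketch, not something taken from the repository: it lists every distribution visible to the interpreter via importlib.metadata, reads the INSTALLER file each one ships, and reports names that appear more than once. It assumes conda-installed packages record "conda" in INSTALLER, which is common but not guaranteed, and all function names are illustrative.

```python
from collections import defaultdict
from importlib import metadata
import re


def normalize(name):
    """Normalize a distribution name (case-insensitive; '-', '_', '.' equivalent)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_records():
    """Yield (name, version, installer, location) for every visible distribution."""
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue  # skip entries with broken or missing metadata
        version = dist.metadata.get("Version", "unknown")
        # INSTALLER is optional; pip writes "pip", conda usually writes "conda".
        installer = (dist.read_text("INSTALLER") or "unknown").strip()
        yield normalize(name), version, installer, str(dist.locate_file(""))


def find_duplicates(records):
    """Return {name: [(version, installer, location), ...]} for names seen more than once."""
    seen = defaultdict(list)
    for name, version, installer, location in records:
        seen[name].append((version, installer, location))
    return {name: copies for name, copies in seen.items() if len(copies) > 1}


if __name__ == "__main__":
    duplicates = find_duplicates(installed_records())
    for name, copies in sorted(duplicates.items()):
        print(name)
        for version, installer, location in copies:
            print(f"  {version:<12} installer={installer:<8} {location}")
    if not duplicates:
        print("No distribution is recorded more than once.")
```

An empty report would not prove the environment is consistent, since pip may have replaced a conda copy cleanly while conda's own records still list the old version. Comparing against conda list output would be the natural follow-up.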

Options we see

  • (a) Status quo: keep astroml:latest, treat the Dockerfile as a Skaha deployment spec, accept the conda+pip overlap and :latest drift.
  • (b) Lockfile + pinned astroml: stay on astroml but pin to a specific image digest, plus add a uv.lock-style lockfile so the pip side is also frozen. Addresses reproducibility without changing the deployment story.
  • (c) Pure-pip reproducible base: revert to a slim Python base (e.g. python:3.12-slim-bookworm), install everything via pip + lockfile. Simpler and conda-free, but loses the Skaha convenience.
  • (d) Two Dockerfiles: a reproducible Dockerfile (option c) for general use, plus Dockerfile.skaha (option a) for Skaha-specific deployments.
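
Whichever option we pick, a small check could keep floating tags from creeping back in. The sketch below is hypothetical (none of these names exist in the repository): it scans Dockerfile text for FROM lines and classifies each image reference as digest-pinned, version-tagged, or floating. It does not resolve ARG substitutions and treats a bare FROM scratch as floating, so it is a starting point rather than a complete linter.

```python
import re

# Matches "FROM [--flag=value ...] image"; the image is the first non-flag token.
FROM_RE = re.compile(r"^\s*FROM\s+(?:--\S+\s+)*(\S+)", re.IGNORECASE)


def classify_image(ref):
    """Classify an image reference as 'digest', 'tag', or 'floating'."""
    if "@sha256:" in ref:
        return "digest"
    # Only look for a tag after the last '/', so a registry port such as
    # host:5000/image is not mistaken for a tag.
    last_part = ref.rsplit("/", 1)[-1]
    if ":" not in last_part:
        return "floating"  # no tag means an implicit :latest
    tag = last_part.split(":", 1)[1]
    return "floating" if tag == "latest" else "tag"


def check_dockerfile(text):
    """Return [(line_number, image, classification)] for each FROM line."""
    results = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = FROM_RE.match(line)
        if match:
            image = match.group(1)
            results.append((lineno, image, classify_image(image)))
    return results


if __name__ == "__main__":
    current = "FROM images.canfar.net/skaha/astroml:latest\n"
    for lineno, image, kind in check_dockerfile(current):
        print(f"line {lineno}: {image} -> {kind}")
```

Note that a version tag can still be re-pushed upstream; only a digest reference identifies one exact image, which is why option (b) mentions the digest first.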

We're happy to draft whichever direction makes sense — no urgency, and we don't want to add to the firehose. Let's talk through it when you're free; Monday works well on our end.

@martinkilbinger
