Unified authorization service for neuroscience datasets.
DatasetGateway is a single Django service that centralizes dataset access control across multiple platforms:
- CAVE — drop-in replacement for middle_auth with compatible API endpoints
- Neuroglancer — implements the ngauth protocol for GCS token-based access
- Clio and neuprint — provides authorization APIs these services call to check user permissions
- WebKnossos — planned; will require building compatible APIs based on their open source code, similar to the CAVE integration approach
Requirements:

- pixi
- Docker (for production deployment only)
- A Google OAuth 2.0 client (for login — the setup wizard walks you through it)
```sh
cd dsg
pixi install
pixi run setup   # interactive wizard — generates .env, runs migrations
```

`pixi run serve` starts the Django dev server. If .env doesn't exist yet, the setup wizard runs automatically.
To run detached (survives logout; logs to dsg/serve.log, PID in dsg/serve.pid):

```sh
pixi run serve-bg
pixi run stop-serve   # to stop
```

`pixi run deploy` builds the Docker image, starts the container, and runs migrations and seed commands. Put a reverse proxy (nginx/caddy) in front for TLS.
The Django admin is at /admin/.
Login requires a Google OAuth 2.0 client. Without one the server runs but
all login/authorize links will fail with a client_id error. The setup
wizard (pixi run setup) will walk you through creating one if
secrets/client_credentials.json is missing.
Alternatively, you can set it up manually:
- Go to the Google Cloud Console and create an OAuth 2.0 Client ID (type: Web application).
- Add `http://localhost:8200/accounts/google/login/callback/` as an authorized redirect URI (and your production URI if known).
- Download the JSON credentials and save them:

```sh
mkdir -p dsg/secrets
cp ~/Downloads/client_secret_*.json dsg/secrets/client_credentials.json
```

The secrets/ directory is gitignored. Alternatively, you can set environment variables instead of using the JSON file:

```sh
export GOOGLE_CLIENT_ID="your-client-id.apps.googleusercontent.com"
export GOOGLE_CLIENT_SECRET="your-client-secret"
```

All users authenticate via Google OpenID Connect. On successful login,
the server creates a DB-stored API key and sets it as the dsg_token
cookie. This single cookie is shared by all services in the ecosystem.
API requests are authenticated by checking for the token in this order:

1. `dsg_token` cookie
2. `Authorization: Bearer {token}` header
3. `?dsg_token=` query parameter
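The lookup order above can be sketched as a small resolver. This is illustrative only, not DatasetGateway's actual implementation; the function name and argument shapes are assumptions.

```python
# Illustrative sketch of the documented token lookup order
# (cookie, then Bearer header, then query parameter).
from typing import Mapping, Optional

def resolve_token(cookies: Mapping[str, str],
                  headers: Mapping[str, str],
                  query: Mapping[str, str]) -> Optional[str]:
    # 1. dsg_token cookie
    if "dsg_token" in cookies:
        return cookies["dsg_token"]
    # 2. Authorization: Bearer {token} header
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        return auth[len("Bearer "):]
    # 3. ?dsg_token= query parameter (None if absent)
    return query.get("dsg_token")
```

A cookie wins even when a Bearer header is also present, matching the order above.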
CAVE services (MaterializationEngine, AnnotationEngine, etc.) call
DatasetGateway's /api/v1/user/cache endpoint on every request to validate
the user's token and retrieve their permissions. This is a drop-in
replacement for CAVE's original middle_auth server — CAVE services
only need their AUTH_URL environment variable pointed at DatasetGateway.
Users log in via /api/v1/authorize, which redirects through Google
OAuth and sets the dsg_token cookie.
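Repointing a CAVE service is then a single environment change; a sketch, with a hypothetical gateway hostname:

```sh
# In the CAVE service's environment (hostname is a placeholder)
export AUTH_URL=https://dataset-gateway.example.org
```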
Neuroglancer uses the ngauth protocol.
Users log in via a popup that hits /auth/login → Google OAuth →
dsg_token cookie. Because Neuroglancer runs on a different origin
(e.g., neuroglancer.org), it cannot read the cookie directly. Instead
it calls POST /token, which reads the cookie server-side and returns a
short-lived token. Neuroglancer then exchanges that token for a
time-limited GCS access credential via POST /gcs_token, which grants
read access to the specific cloud storage bucket holding the dataset.
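The two-step exchange might look like the following client-side sketch. The endpoints come from the description above, but the request and response shapes are assumptions, not a verified API.

```python
# Hypothetical sketch of the ngauth exchange; payload and response
# fields are assumptions based on the protocol description above.
import json
import urllib.request

def fetch_gcs_token(gateway: str, dsg_cookie: str) -> dict:
    """Exchange the dsg_token cookie for a time-limited GCS credential."""
    # Step 1: POST /token reads the cookie server-side and returns
    # a short-lived token usable from another origin.
    req = urllib.request.Request(
        f"{gateway}/token", method="POST",
        headers={"Cookie": f"dsg_token={dsg_cookie}"})
    with urllib.request.urlopen(req) as resp:
        short_lived = resp.read().decode()
    # Step 2: POST /gcs_token exchanges it for a GCS credential
    # scoped to the dataset's bucket ("token" field is an assumption).
    req = urllib.request.Request(
        f"{gateway}/gcs_token", method="POST",
        data=json.dumps({"token": short_lived}).encode(),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```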
Other services (neuPrint, celltyping-light, Clio) validate users by
calling /api/v1/user/cache with the dsg_token value, the same way
CAVE services do. When all services share a cookie domain (configured
via AUTH_COOKIE_DOMAIN), users log in once and are authenticated
everywhere.
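For instance, if the gateway and the other services all live under one parent domain, a single .env line enables the shared cookie (the domain is a placeholder):

```sh
AUTH_COOKIE_DOMAIN=.example.org
```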
```sh
cd dsg
pixi run -e dev python -m pytest
```

DatasetGateway is designed for a single-server Docker deployment behind a reverse proxy that handles TLS.
```sh
cd dsg
pixi run setup    # generates .env interactively (set DJANGO_DEBUG=False for production)
pixi run deploy   # builds Docker image, starts container, runs migrations + seeds
```

Then create an admin user:

```sh
docker compose -f docker-compose.yml exec dsg python manage.py make_admin user@example.com
```

Put a reverse proxy (nginx or Caddy) in front for TLS, pointed at
localhost:8080. The setup wizard defaults SECURE_SSL_REDIRECT=False
since most deployments terminate TLS at the proxy.
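A minimal Caddy config for this layout might look like the following (the hostname is a placeholder; Caddy obtains TLS certificates automatically):

```
dataset-gateway.example.org {
    reverse_proxy localhost:8080
}
```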
The SQLite database and static files are stored in Docker volumes
(dsg-data and dsg-static) so they survive container
restarts. If you need PostgreSQL or Redis, swap the DATABASES / CACHES
settings and add services to docker-compose.yml.
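A hypothetical PostgreSQL swap might look like the sketch below. Every value is a placeholder, not DatasetGateway's actual configuration, and it assumes a psycopg driver plus a `db` service added to docker-compose.yml.

```python
import os

# Hypothetical replacement for the SQLite DATABASES setting in settings.py;
# names, host, and credentials are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "dsg",
        "USER": "dsg",
        "PASSWORD": os.environ.get("DATABASE_PASSWORD", ""),
        "HOST": "db",    # the service name added to docker-compose.yml
        "PORT": "5432",
    }
}
```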
| Variable | Default | Description |
|---|---|---|
| `DJANGO_SECRET_KEY` | insecure dev key | Secret key for sessions and CSRF. Set in production. |
| `DJANGO_DEBUG` | `True` | Set to `False` in production. |
| `DJANGO_ALLOWED_HOSTS` | `*` | Comma-separated list of allowed hostnames. |
| `DATABASE_PATH` | `db.sqlite3` | Path to SQLite database file. |
| `SECURE_SSL_REDIRECT` | `True` (prod) | Set to `False` if reverse proxy handles TLS. |
| `DSG_ORIGIN` | (empty) | Public origin for CSRF trusted origins (e.g., `https://dataset-gateway.mydomain.org`). |
| `DSG_PORT` | `8200` | Port for the development server. |
| `GOOGLE_CLIENT_ID` | (empty) | Google OAuth 2.0 client ID (overrides `client_credentials.json`). |
| `GOOGLE_CLIENT_SECRET` | (empty) | Google OAuth 2.0 client secret (overrides `client_credentials.json`). |
| `NGAUTH_ALLOWED_ORIGINS` | `^https?://.*\.neuroglancer\.org$` | Regex for allowed CORS origins. |
| `AUTH_COOKIE_DOMAIN` | (empty) | Cookie domain for cross-subdomain auth (e.g., `.example.org`). |
| `PORT` | `8080` | Port for gunicorn (Docker). |
| `GUNICORN_WORKERS` | `2` | Number of gunicorn worker processes. |
| `LOG_LEVEL` | `info` | Gunicorn log level. |
- User manual — setup, admin workflows, user workflows, management commands
- Architecture — system design, authorization model, deployment strategy
- CAVE auth endpoints — CAVE API compatibility reference and SCIM 2.0 provisioning
- Implementation record — what was built, with retrospective notes on deviations from the original plan
- Admin manual — administration and operational reference