26 commits
34a3012
[maven-release-plugin] prepare for next development iteration
namedgraph Apr 6, 2026
cbc5e7a
Post-release version bump
namedgraph Apr 6, 2026
e20d1c1
Fallback to synthesized predicate description when proxy unavailable
namedgraph Apr 8, 2026
00f0d60
Proxy URIs from namespace (#285)
namedgraph Apr 9, 2026
11da65e
Fix CORS response headers (#286)
namedgraph Apr 10, 2026
f33b394
Enable gzip compression in nginx for RDF and JSON content types (#290)
namedgraph Apr 11, 2026
c901a0d
Add 3D Linked Data browser (#288)
namedgraph Apr 11, 2026
11f2c33
gzip fix for RDF documents
namedgraph Apr 11, 2026
fdd5a72
Scope nginx gzip to static locations only
namedgraph Apr 12, 2026
b96f2fd
Fix connection pool exhaustion from proxy requests (#292)
namedgraph Apr 13, 2026
224e7c3
Remove debug xsl:message statements from 3D graph XSL
namedgraph Apr 13, 2026
73d540f
README update
namedgraph Apr 20, 2026
bb28205
Document tabs (#294)
namedgraph May 9, 2026
420142c
Build static resource URLs from dataspace origin
namedgraph May 9, 2026
747b6ec
Move ldh:view block injection client-side under Saxon-JS (#295)
namedgraph May 10, 2026
4ec275a
Surface document and primary topic first in client-side rdf:RDF rende…
namedgraph May 11, 2026
300488f
Generalise object metadata loading (#297)
namedgraph May 11, 2026
ae475fa
Tolerate SPARQL failures when loading server-side object metadata
namedgraph May 11, 2026
e765a60
HTTP test for accept param on non-existent dataspaces
namedgraph May 11, 2026
6435b32
Apply ?accept override before app matching so 404 on unknown dataspac…
namedgraph May 11, 2026
23efdb3
Render proxied RDF responses in tab panes and tidy navbar templates f…
namedgraph May 11, 2026
d216841
Client side property/object metadata (#298)
namedgraph May 12, 2026
26d1d94
Move http:Response bs2:Header templates from layout.xsl to document.xsl
namedgraph May 12, 2026
c4124b4
Align ac:property-label cache lookup with documentPool key shape
namedgraph May 12, 2026
94bc699
Load metadata in ontology-view block render chain; rename for clarity
namedgraph May 12, 2026
cae32c5
Preserve mode query param when navigating from links on proxied pages
namedgraph May 13, 2026
35 changes: 32 additions & 3 deletions CLAUDE.md
@@ -80,9 +80,18 @@ find ./document-hierarchy/ -name '*.sh' -exec bash {} \;
- `ServiceContext` decouples HTTP infrastructure from `Service`, holding dataspace and service metadata separately
- Dataspace metadata and service metadata are split in configuration; types for `lapp:endUserApplication`/`lapp:adminApplication` are inferred on the fly from `system.trig`

### Dataspaces
Since v5.1.0, a single LDH instance supports multiple **dataspaces**, each identified by a distinct subdomain (origin). Each dataspace is a pair of applications: an end-user app (`<subdomain>`) and an admin app (`admin.<subdomain>`), routed by nginx via wildcard subdomain matching.

Configuration is split across two files (see the sketch below):
- `config/dataspaces.trig` — public metadata: origins (`lapp:origin`), ontologies (`ldt:ontology`), stylesheets (`ac:stylesheet`)
- `config/system.trig` — internal wiring: maps apps to SPARQL services (`ldt:service`) and assigns types (`lapp:AdminApplication`/`lapp:EndUserApplication`)

Multiple dataspaces can share the same backend SPARQL service.
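
A minimal sketch of the public half, assuming hypothetical resource URIs (the exact graph layout and prefix declarations should be copied from the shipped `config/dataspaces.trig`):

```trig
# Sketch only: app URIs, origin, and ontology/stylesheet URLs are placeholders;
# prefix declarations (lapp, ldt, ac) are omitted for brevity.
<urn:linkeddatahub:apps/demo#end-user>
    lapp:origin <https://demo.localhost:4443> ;
    ldt:ontology <https://demo.localhost:4443/ns#> ;
    ac:stylesheet <https://demo.localhost:4443/static/demo.xsl> .

<urn:linkeddatahub:apps/demo#admin>
    lapp:origin <https://admin.demo.localhost:4443> .
```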

### Service Architecture
The application runs as a multi-container setup:
- **nginx**: Reverse proxy and SSL termination
- **nginx**: Reverse proxy and SSL termination (wildcard subdomain routing for dataspaces)
- **linkeddatahub**: Main Java application (Tomcat)
- **fuseki-admin/fuseki-end-user**: Separate SPARQL stores
- **varnish-frontend/varnish-admin/varnish-end-user**: Caching layers
@@ -91,8 +100,28 @@ The application runs as a multi-container setup:
1. Requests come through nginx proxy
2. Varnish provides caching layer
3. LinkedDataHub application handles business logic
4. Data persisted to appropriate Fuseki triplestore
5. XSLT transforms data for client presentation
4. RDF data is read/written via the **Graph Store Protocol** — each document in the hierarchy corresponds to a named graph in the triplestore; the document URI is the graph name (see the sketch below)
5. Data persisted to appropriate Fuseki triplestore
6. XSLT transforms data for client presentation
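
As an illustration of step 4, a hedged curl sketch (the `people/` document and base URL are hypothetical; the certificate variables follow the `http-tests` conventions): because the document URI doubles as the graph name, an RDF-typed GET on a document reads the corresponding named graph.

```bash
# Read the named graph behind a document by requesting the document URI
# with an RDF media type. URL and certificate variables are placeholders.
curl -k -f -s \
    -E "$AGENT_CERT_FILE":"$AGENT_CERT_PWD" \
    -H "Accept: text/turtle" \
    "https://localhost:4443/people/"
```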

### Linked Data Proxy and Client-Side Rendering

LDH includes a Linked Data proxy that dereferences external URIs on behalf of the browser. The original design rendered proxied resources identically to local ones — server-side RDF fetch + XSLT. This created a DDoS/resource-exhaustion vector: scraper bots routing arbitrary external URIs through the proxy would trigger a full server-side pipeline (HTTP fetch → XSLT rendering) per request, exhausting HTTP connection pools and CPU.

The current design splits rendering by request origin (a curl sketch follows this list):

- **Browser requests** (`Accept: text/html`): `ProxyRequestFilter` bypasses the proxy entirely. The server returns the local application shell. Saxon-JS then issues a second, RDF-typed request (`Accept: application/rdf+xml`) from the browser.
- **RDF requests** (API clients, Saxon-JS second pass): `ProxyRequestFilter` fetches the external RDF, parses it, and returns it to the caller. No XSLT happens server-side.
- **Client-side rendering**: Saxon-JS receives the raw RDF and applies the same XSLT 3 templates used server-side (shared stylesheet), so proxied resources look almost identical to local ones.
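
A curl sketch of the two request shapes (the DBpedia URI and base URL are placeholders; the `?uri=` parameter shape matches the proxy tests):

```bash
# 1. Browser navigation: HTML Accept, so ProxyRequestFilter bypasses the proxy
#    and the local application shell is returned.
curl -k -s -G \
    -H "Accept: text/html" \
    --data-urlencode "uri=https://dbpedia.org/resource/Copenhagen" \
    "https://localhost:4443/"

# 2. RDF request (the second pass Saxon-JS issues): the filter dereferences
#    the external URI and returns the parsed RDF to the caller.
curl -k -s -G \
    -H "Accept: application/rdf+xml" \
    --data-urlencode "uri=https://dbpedia.org/resource/Copenhagen" \
    "https://localhost:4443/"
```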

Key implementation files:
- `ProxyRequestFilter.java` — intercepts `?uri=` and `lapp:Dataset` proxy requests; HTML bypass; forwards external `Link` headers
- `ApplicationFilter.java` — registers external proxy target URI in request context (`AC.uri` property) as authoritative proxy marker
- `ResponseHeadersFilter.java` — skips local-only hypermedia links (`sd:endpoint`, `ldt:ontology`, `ac:stylesheet`) for proxy requests; external ones are forwarded by `ProxyRequestFilter`
- `client.xsl` (`ldh:rdf-document-response`) — receives the RDF proxy response client-side; extracts `sd:endpoint` from `Link` header; stores it in `LinkedDataHub.endpoint`
- `functions.xsl` (`sd:endpoint()`) — returns `LinkedDataHub.endpoint` when set (external proxy), otherwise falls back to the local SPARQL endpoint

The SPARQL endpoint forwarding chain ensures ContentMode blocks (charts, maps) query the **remote** app's SPARQL endpoint, not the local one. `LinkedDataHub.endpoint` is reset to the local endpoint by `ldh:HTMLDocumentLoaded` on every HTML page navigation, so there is no stale state when navigating back to local documents.
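
A simplified sketch of the `sd:endpoint()` fallback (not the shipped code; the `$local-endpoint` parameter and its default stand in for however `functions.xsl` resolves the local endpoint):

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:sd="http://www.w3.org/ns/sparql-service-description#"
    xmlns:ixsl="http://saxonica.com/ns/interactiveXSLT">

    <!-- placeholder for the local SPARQL endpoint -->
    <xsl:param name="local-endpoint" as="xs:anyURI"
        select="xs:anyURI('https://localhost:4443/sparql')"/>

    <xsl:function name="sd:endpoint" as="xs:anyURI">
        <!-- LinkedDataHub.endpoint is set by ldh:rdf-document-response for
             proxied documents and reset on every HTML page navigation -->
        <xsl:variable name="proxied"
            select="ixsl:get(ixsl:window(), 'LinkedDataHub.endpoint')"/>
        <xsl:sequence
            select="if (exists($proxied)) then xs:anyURI($proxied) else $local-endpoint"/>
    </xsl:function>

</xsl:stylesheet>
```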

### Key Extension Points
- **Vocabulary definitions** in `com.atomgraph.linkeddatahub.vocabulary`
2 changes: 2 additions & 0 deletions Dockerfile
@@ -109,6 +109,8 @@ ENV MAX_TOTAL_CONN=40

ENV MAX_REQUEST_RETRIES=3

ENV CONNECTION_REQUEST_TIMEOUT=30000

ENV IMPORT_KEEPALIVE=

ENV MAX_IMPORT_THREADS=10
8 changes: 7 additions & 1 deletion README.md
@@ -153,7 +153,13 @@ The following tools are required for CLI scripts in the `bin/` directory:

### Dataspaces

Dataspaces are configured in [`config/system.trig`](https://github.com/AtomGraph/LinkedDataHub/blob/master/config/system.trig). Relative URIs will be resolved against the base URI configured in the `.env` file.
Since version 5.1.0, a single LinkedDataHub instance supports multiple **dataspaces**, each identified by a distinct subdomain (origin). Each dataspace consists of a pair of applications: an end-user app (e.g. `https://northwind-traders.demo.localhost:4443`) and an admin app on the `admin.` subdomain (e.g. `https://admin.northwind-traders.demo.localhost:4443`).

Dataspace configuration is split across two files:
- [`config/dataspaces.trig`](https://github.com/AtomGraph/LinkedDataHub/blob/master/config/dataspaces.trig) — public metadata: origins (`lapp:origin`), ontologies, stylesheets
- [`config/system.trig`](https://github.com/AtomGraph/LinkedDataHub/blob/master/config/system.trig) — internal wiring: SPARQL service bindings and application types (`lapp:AdminApplication`/`lapp:EndUserApplication`)

To add a new dataspace, add corresponding entries to both files. Relative URIs will be resolved against the base URI configured in the `.env` file.
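
For the `config/system.trig` side, a minimal sketch following the `urn:` recommendation below (identifiers and service URIs are placeholders; prefix declarations are omitted):

```trig
# Sketch only: type each app and bind it to its SPARQL service.
<urn:linkeddatahub:apps/new-dataspace#end-user> a lapp:EndUserApplication ;
    ldt:service <urn:linkeddatahub:services/new-dataspace-end-user> .

<urn:linkeddatahub:apps/new-dataspace#admin> a lapp:AdminApplication ;
    ldt:service <urn:linkeddatahub:services/new-dataspace-admin> .
```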

_:warning: Do not use blank nodes to identify applications or services. We recommend using the `urn:` URI scheme, since LinkedDataHub application resources are not accessible under their own dataspace._

49 changes: 47 additions & 2 deletions docker-compose.yml
@@ -65,6 +65,7 @@ services:
- SIGN_UP_CERT_VALIDITY=180
- MAX_CONTENT_LENGTH=${MAX_CONTENT_LENGTH:-2097152}
- ALLOW_INTERNAL_URLS=${ALLOW_INTERNAL_URLS:-}
- CONNECTION_REQUEST_TIMEOUT=${CONNECTION_REQUEST_TIMEOUT:-}
- NOTIFICATION_ADDRESS=LinkedDataHub <notifications@localhost>
- MAIL_SMTP_HOST=email-server
- MAIL_SMTP_PORT=25
@@ -204,6 +205,20 @@ configs:
ssl_verify_client ${NGINX_SSL_VERIFY_CLIENT:-optional_no_ca};

location / {
add_header Access-Control-Allow-Origin "*" always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS" always;
add_header Access-Control-Allow-Headers "Accept, Content-Type, Authorization" always;
add_header Access-Control-Expose-Headers "Link, Content-Location, Location" always;

if ($$request_method = OPTIONS) {
add_header Access-Control-Allow-Origin "*";
add_header Access-Control-Allow-Methods "GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS";
add_header Access-Control-Allow-Headers "Accept, Content-Type, Authorization";
add_header Access-Control-Expose-Headers "Link, Content-Location, Location";
add_header Access-Control-Max-Age "1728000";
return 204;
}

proxy_pass http://linkeddatahub;
#proxy_cache backcache;
limit_req zone=linked_data burst=30 nodelay;
@@ -215,11 +230,14 @@

proxy_set_header Client-Cert '';
proxy_set_header Client-Cert $$ssl_client_escaped_cert;

# add_header Cache-Control "public, max-age=86400";
}

location ^~ /uploads/ {
gzip on;
gzip_proxied any;
gzip_types *;
gzip_min_length 1024;

proxy_pass http://linkeddatahub;
limit_req zone=static_files burst=20 nodelay;

@@ -235,9 +253,15 @@
}

location ^~ /static/ {
gzip on;
gzip_proxied any;
gzip_types *;
gzip_min_length 1024;

proxy_pass http://linkeddatahub;
limit_req zone=static_files burst=50 nodelay;

add_header Access-Control-Allow-Origin "*" always;
add_header Cache-Control "public, max-age=604800, immutable";
}
}
@@ -253,6 +277,20 @@
ssl_verify_client optional_no_ca;

location / {
add_header Access-Control-Allow-Origin "*" always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS" always;
add_header Access-Control-Allow-Headers "Accept, Content-Type, Authorization" always;
add_header Access-Control-Expose-Headers "Link, Content-Location, Location" always;

if ($$request_method = OPTIONS) {
add_header Access-Control-Allow-Origin "*";
add_header Access-Control-Allow-Methods "GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS";
add_header Access-Control-Allow-Headers "Accept, Content-Type, Authorization";
add_header Access-Control-Expose-Headers "Link, Content-Location, Location";
add_header Access-Control-Max-Age "1728000";
return 204;
}

proxy_pass http://linkeddatahub;
#proxy_cache backcache;
limit_req zone=linked_data burst=30 nodelay;
@@ -267,8 +305,15 @@
}

location ^~ /static/ {
gzip on;
gzip_proxied any;
gzip_types *;
gzip_min_length 1024;

proxy_pass http://linkeddatahub;
limit_req zone=static_files burst=50 nodelay;

add_header Access-Control-Allow-Origin "*" always;
}
}

18 changes: 18 additions & 0 deletions http-tests/dataspaces/non-existent-dataspace-accept-param.sh
@@ -0,0 +1,18 @@
#!/usr/bin/env bash
set -euo pipefail

# Regression: ?accept= param must be honoured even when the dataspace does not exist

# admin app
content_type=$(curl -k -s -G -w "%{content_type}" -o /dev/null \
--data-urlencode "accept=text/turtle" \
"https://admin.non-existing.localhost:4443/")

echo "$content_type" | grep -q "text/turtle"

# end-user app
content_type=$(curl -k -s -G -w "%{content_type}" -o /dev/null \
--data-urlencode "accept=text/turtle" \
"https://non-existing.localhost:4443/")

echo "$content_type" | grep -q "text/turtle"
2 changes: 1 addition & 1 deletion http-tests/misc/cors-jaxrs.sh
@@ -7,7 +7,7 @@ purge_cache "$END_USER_VARNISH_SERVICE"
purge_cache "$ADMIN_VARNISH_SERVICE"
purge_cache "$FRONTEND_VARNISH_SERVICE"

# Test JAX-RS CORSFilter on dynamic content (GET request)
# Test nginx CORS headers on dynamic content (GET request)

response=$(curl -i -k -s \
-H "Origin: https://example.com" \
18 changes: 18 additions & 0 deletions http-tests/misc/gzip-sefjson.sh
@@ -0,0 +1,18 @@
#!/usr/bin/env bash
set -euo pipefail

# Test that nginx gzip compression is active for static JSON (SEF file)

response=$(curl -k -s -D - -o /dev/null \
-H "Accept-Encoding: gzip" \
"${END_USER_BASE_URL}static/com/atomgraph/linkeddatahub/xsl/client.xsl.sef.json")

if ! echo "$response" | grep -qi "Content-Encoding: gzip"; then
echo "Content-Encoding: gzip not found on client.xsl.sef.json"
exit 1
fi

if ! echo "$response" | grep -q "HTTP/.* 200"; then
echo "client.xsl.sef.json did not return 200 OK"
exit 1
fi
39 changes: 39 additions & 0 deletions http-tests/proxy/GET-proxied-accept-forwarded.sh
@@ -0,0 +1,39 @@
#!/usr/bin/env bash
set -euo pipefail

initialize_dataset "$END_USER_BASE_URL" "$TMP_END_USER_DATASET" "$END_USER_ENDPOINT_URL"
initialize_dataset "$ADMIN_BASE_URL" "$TMP_ADMIN_DATASET" "$ADMIN_ENDPOINT_URL"
purge_cache "$END_USER_VARNISH_SERVICE"
purge_cache "$ADMIN_VARNISH_SERVICE"
purge_cache "$FRONTEND_VARNISH_SERVICE"

# add agent to the readers group to be able to read documents

add-agent-to-group.sh \
-f "$OWNER_CERT_FILE" \
-p "$OWNER_CERT_PWD" \
--agent "$AGENT_URI" \
"${ADMIN_BASE_URL}acl/groups/readers/"

# Regression: ProxyRequestFilter must forward the client's Accept header verbatim to the
# upstream, NOT substitute its own readable-types list. Previously the filter built its
# outbound Accept from MediaTypes.getReadable(Model.class) + getReadable(ResultSet.class)
# (everything Jena could ingest, all q=1.0), discarding what the client actually asked for.
# The upstream then content-negotiated against that broad list and could legally pick any
# RDF format — e.g. application/rdf+thrift — even when the client (e.g. SaxonJS document())
# explicitly requested application/rdf+xml or application/xml.
#
# Verify by requesting one specific RDF type and asserting the response matches it.

for accept in 'application/rdf+xml' 'text/turtle' 'application/n-triples'; do
content_type=$(curl -k -f -s -G -w "%{content_type}" -o /dev/null \
-E "$AGENT_CERT_FILE":"$AGENT_CERT_PWD" \
-H "Accept: $accept" \
--data-urlencode "uri=${END_USER_BASE_URL}" \
"$ADMIN_BASE_URL")

case "$content_type" in
"$accept"*) ;;
*) exit 1 ;;
esac
done
40 changes: 40 additions & 0 deletions http-tests/proxy/GET-proxied-accept-html-not-preferred.sh
@@ -0,0 +1,40 @@
#!/usr/bin/env bash
set -euo pipefail

initialize_dataset "$END_USER_BASE_URL" "$TMP_END_USER_DATASET" "$END_USER_ENDPOINT_URL"
initialize_dataset "$ADMIN_BASE_URL" "$TMP_ADMIN_DATASET" "$ADMIN_ENDPOINT_URL"
purge_cache "$END_USER_VARNISH_SERVICE"
purge_cache "$ADMIN_VARNISH_SERVICE"
purge_cache "$FRONTEND_VARNISH_SERVICE"

# add agent to the readers group to be able to read documents

add-agent-to-group.sh \
-f "$OWNER_CERT_FILE" \
-p "$OWNER_CERT_PWD" \
--agent "$AGENT_URI" \
"${ADMIN_BASE_URL}acl/groups/readers/"

# Regression: when a client lists application/xhtml+xml (or text/html) in Accept at a
# LOWER q-value than another supported type, the proxy must treat the request as
# API-client intent and forward — not as browser navigation that wants the app shell.
# Previously, ProxyRequestFilter bypassed on anyMatch(HTML or XHTML in Accept) without
# checking q-rank, so it false-fired on any Accept that mentioned HTML at all and
# returned the local app shell instead of the proxied response.
#
# Discriminator is HTTP status — content-type cannot tell bypass from forward because
# admin and end-user share writer configs (same Accept → same negotiated type on both).
# A UUID-named path that doesn't exist on either origin disambiguates:
# - bypass: ApplicationFilter strips ?uri= → request URI becomes admin root → 200
# - forward: proxy forwards the actual UUID path to end-user → 404

accept_header='application/xml, text/xml;q=0.9, application/xhtml+xml;q=0.8, */*;q=0.7'
non_existing_uri="${END_USER_BASE_URL}$(cat /proc/sys/kernel/random/uuid 2>/dev/null || uuidgen)/"

status=$(curl -k -s -G -o /dev/null -w "%{http_code}" \
-E "$AGENT_CERT_FILE":"$AGENT_CERT_PWD" \
-H "Accept: $accept_header" \
--data-urlencode "uri=${non_existing_uri}" \
"$ADMIN_BASE_URL")

[ "$status" = "$STATUS_NOT_FOUND" ] || exit 1
1 change: 1 addition & 0 deletions http-tests/proxy/GET-proxied-external-502.sh
@@ -19,6 +19,7 @@ add-agent-to-group.sh \

curl -k -w "%{http_code}\n" -o /dev/null -s \
-G \
-H "Accept: application/n-triples" \
-E "$AGENT_CERT_FILE":"$AGENT_CERT_PWD" \
--data-urlencode "uri=http://f1d2d4cf-90bb-4f5b-ae4b-921e584b6edd.org" \
"$END_USER_BASE_URL" \
66 changes: 66 additions & 0 deletions http-tests/proxy/GET-proxied-ontology-ns.sh
@@ -0,0 +1,66 @@
#!/usr/bin/env bash
set -euo pipefail

initialize_dataset "$END_USER_BASE_URL" "$TMP_END_USER_DATASET" "$END_USER_ENDPOINT_URL"
initialize_dataset "$ADMIN_BASE_URL" "$TMP_ADMIN_DATASET" "$ADMIN_ENDPOINT_URL"
purge_cache "$END_USER_VARNISH_SERVICE"
purge_cache "$ADMIN_VARNISH_SERVICE"
purge_cache "$FRONTEND_VARNISH_SERVICE"

# add agent to the readers group to be able to read documents

add-agent-to-group.sh \
-f "$OWNER_CERT_FILE" \
-p "$OWNER_CERT_PWD" \
--agent "$AGENT_URI" \
"${ADMIN_BASE_URL}acl/groups/readers/"

# use a made-up hash-based namespace: not mapped as a static file, not a registered app
namespace_uri="http://made-up-test-ns.example/ns"
class1="${namespace_uri}#ClassOne"
class2="${namespace_uri}#ClassTwo"
ontology_doc="${ADMIN_BASE_URL}ontologies/namespace/"
namespace="${END_USER_BASE_URL}ns#"

# add two classes with URIs in the made-up namespace to the app's ontology

add-class.sh \
-f "$OWNER_CERT_FILE" \
-p "$OWNER_CERT_PWD" \
-b "$ADMIN_BASE_URL" \
--uri "$class1" \
--label "Class One" \
"$ontology_doc"

add-class.sh \
-f "$OWNER_CERT_FILE" \
-p "$OWNER_CERT_PWD" \
-b "$ADMIN_BASE_URL" \
--uri "$class2" \
--label "Class Two" \
"$ontology_doc"

# clear the in-memory ontology so the new classes are present on next request

clear-ontology.sh \
-f "$OWNER_CERT_FILE" \
-p "$OWNER_CERT_PWD" \
-b "$ADMIN_BASE_URL" \
--ontology "$namespace"

# request the namespace document URI (without fragment) via ?uri= proxy.
# the namespace document is not DataManager-mapped and not a registered app,
# so ProxyRequestFilter falls through to the OntModel DESCRIBE path, which
# returns descriptions of all #-fragment terms in that namespace.

response=$(curl -k -f -s \
-G \
-E "$AGENT_CERT_FILE":"$AGENT_CERT_PWD" \
-H "Accept: application/n-triples" \
--data-urlencode "uri=${namespace_uri}" \
"$END_USER_BASE_URL")

# verify both class descriptions are present in the response

echo "$response" | grep -q "$class1"
echo "$response" | grep -q "$class2"