Problem
While reconciling material facets between progressive_globe and notebook queries, I found two material URIs that appear in facet counts but have no matching pref_label in vocab_labels.parquet.
This causes blank/None friendly labels in diagnostics and forces fallback to URI-tail rendering.
Repro (DuckDB)
-- URIs present in material facet counts
SELECT facet_value, count
FROM read_parquet('https://data.isamples.org/isamples_202601_facet_summaries.parquet')
WHERE facet_type = 'material'
  AND (facet_value ILIKE '%organicanimalproduct%'
       OR facet_value ILIKE '%plantmaterial%');
Returns:
https://w3id.org/isample/opencontext/material/0.1/organicanimalproduct (261)
https://w3id.org/isample/opencontext/material/0.1/plantmaterial (1)
-- No matching labels in vocab_labels
WITH facet AS (
  SELECT DISTINCT facet_value AS uri
  FROM read_parquet('https://data.isamples.org/isamples_202601_facet_summaries.parquet')
  WHERE facet_type = 'material'
), labels AS (
  SELECT DISTINCT uri
  FROM read_parquet('https://data.isamples.org/vocab_labels.parquet')
  WHERE lang = 'en'
)
SELECT f.uri
FROM facet f
LEFT JOIN labels l USING (uri)
WHERE l.uri IS NULL
ORDER BY f.uri;
Returns exactly those same 2 URIs.
Additional context
scripts/build_vocab_labels.py currently builds labels from the TTL list including:
opencontext_material_extension.ttl
But that TTL appears to contain:
.../organicplantmaterial
.../organicanimalmaterial
and not the two URIs present in facet data (.../plantmaterial, .../organicanimalproduct).
So this looks like a vocabulary/data-term drift (or legacy aliases), not a rendering bug.
Why this issue fits here
This repo owns:
- scripts/build_vocab_labels.py
- tutorial consumers (progressive_globe.qmd, isamples_explorer.qmd)
- substrate docs (SERIALIZATIONS.md, data.qmd, how-to-use.qmd)
Even if the canonical fix is upstream (vocabulary terms/aliases), this repo is the right integration point to track and guard against missing label coverage.
Proposed actions
- Add a CI/data check: every facet_summaries(facet_type IN material/context/object_type).facet_value must resolve to vocab_labels.uri.
- Decide canonical handling for these two OpenContext URIs:
  - map to canonical terms during export, or
  - add alias/deprecated concept coverage in upstream vocab, and ensure vocab_labels includes them.
- Add a temporary fallback map in the UI or label-build pipeline so users do not see missing labels.
- Document the label-coverage expectation in SERIALIZATIONS.md.
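The CI/data check in the first action could be sketched roughly as below. This is a minimal sketch, not the repo's existing code: `find_missing_labels` and `CHECKED_FACET_TYPES` are hypothetical names, and it assumes the `(facet_type, facet_value)` rows and the `vocab_labels.uri` values have already been loaded from the parquet files (for example with the DuckDB queries above).

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable

# Facet types named in the proposed check (assumed to match facet_type values).
CHECKED_FACET_TYPES = ("material", "context", "object_type")


def find_missing_labels(
    facet_rows: Iterable[tuple[str, str]],
    label_uris: Iterable[str],
    facet_types: tuple[str, ...] = CHECKED_FACET_TYPES,
) -> dict[str, list[str]]:
    """Return {facet_type: sorted facet_values that have no URI in label_uris}."""
    known = set(label_uris)
    missing: dict[str, set[str]] = defaultdict(set)
    for facet_type, facet_value in facet_rows:
        # Only the checked facet types are required to resolve to a label.
        if facet_type in facet_types and facet_value not in known:
            missing[facet_type].add(facet_value)
    return {ft: sorted(values) for ft, values in missing.items()}
```

A CI step could call this and fail the build whenever the returned dict is non-empty, printing the offending URIs so the drift is visible in the log.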
Acceptance criteria
- LEFT JOIN coverage query above returns 0 missing URIs for material facets.
- Both Search Explorer and Progressive Globe render friendly labels for all material facet values.
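Until the canonical handling is decided, the temporary fallback map from the proposed actions might look something like the sketch below. `friendly_label` and `FALLBACK_LABELS` are hypothetical names, the label strings are placeholders rather than vetted vocabulary labels, and `vocab_labels` is assumed to be a plain dict of URI to pref_label built from vocab_labels.parquet.

```python
from __future__ import annotations

BASE = "https://w3id.org/isample/opencontext/material/0.1/"

# Temporary overrides for the two unlabeled URIs.
# The label text is a placeholder assumption, not an upstream pref_label.
FALLBACK_LABELS = {
    BASE + "organicanimalproduct": "Organic animal product",
    BASE + "plantmaterial": "Plant material",
}


def friendly_label(uri: str, vocab_labels: dict[str, str | None]) -> str:
    """Resolve a display label: vocab first, then fallback map, then URI tail."""
    label = vocab_labels.get(uri) or FALLBACK_LABELS.get(uri)
    if label:
        return label
    # Last resort: the final path segment of the URI.
    return uri.rstrip("/").rsplit("/", 1)[-1]
```

Checking the vocab first means the overrides stop mattering as soon as vocab_labels gains real entries for these URIs, so the map can be deleted without touching callers.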