
feat: UC Metric Views on Analytics#341

Draft
atilafassina wants to merge 34 commits into main from analytics-exp

Conversation

@atilafassina
Contributor

WIP

Lands the integration spine for UC Metric View consumption inside the
analytics plugin: metric.json config + JSON Schema, type-gen pipeline
emitting MetricRegistry augmentation (DESCRIBE TABLE EXTENDED ... AS JSON
mocked in tests), POST /api/analytics/metric/:key route with zod body
validation, basic SQL constructor (SELECT MEASURE(<m>) FROM <fqn> [LIMIT n]),
metric: cache-key namespace reserved, useMetricView React hook, and tests
at every layer (94 test files, 1825 tests passing).

knip.json: removed obsolete entries (**/*.css, json-schema-to-typescript,
tailwindcss, tw-animate-css) made redundant by the metric-source schema
generation and the existing Tailwind CSS imports.

Per prd/analytics-metric-view-source and tasks/.../Phase 1.
xavier loop iterations 1+2.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Extend Phase 1's walking skeleton with dimension support and time-grain
truncation. The result row narrows at the call site via Pick<MetricRow<K>,
M[number] | D[number]>; TimeGrain<K> narrows per-metric-view to YAML-allowed
grains.

Server: validator rejects unknown dimensions, unknown grains, and timeGrain-
without-time-typed-dim. SQL constructor adds GROUP BY ALL when dimensions
are present and date_trunc('<grain>', <col>) for time-typed dims when
timeGrain is set. Cache key composition extended with sorted dims and grain.

Type-gen: extractMetricColumns reads YAML 1.1 time_grain attribute (tolerant
parser, accepts list / metadata.time_grain / { grains: [...] } shapes);
emits per-metric timeGrains union and @timeGrain JSDoc on time-typed dim
entries.

Hook: useMetricView<K, const M, const D, F> widens args with dimensions and
timeGrain. Type-level tests (expectTypeOf) cover narrowing across measures-
only, dims-only, and combined cases.

Tests: 1878 total (95 files), +49 from Phase 1. Backpressure (build, docs,
check:fix, typecheck, test, knip) all green.

Per prd/analytics-metric-view-source and tasks/.../Phase 2.
xavier loop iteration 3.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Add the recursive Filter<K> type to the metric-view request body, validator,
and SQL constructor. 12 v1 operators, AND/OR composition, member restricted
to dimensions, all values bound via the existing Statement Execution
parameter path. No string concatenation of user input.

Filter<K> = Predicate<K> | { and: Filter<K>[] } | { or: Filter<K>[] }
Predicate<K> = { member: DimensionKey<K>; operator; values? }

Operators: equals, notEquals, in, notIn, gt, gte, lt, lte, contains,
notContains, set, notSet. Member must resolve in DimensionKey<K>; op⇄type
compatibility (range ops on numeric/date only, string ops on string only,
others any-type) enforced when knownDimensionTypes is populated. Cardinality
rules per op (single / list / null). Recursion depth cap = 8.
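The op⇄type compatibility rule above (range ops on numeric/date only, string ops on string only, everything else any-type, fall-open when the dimension type is unknown) can be sketched as a small predicate. All names and shapes here are hypothetical, not the actual validator implementation:

```typescript
type DimType = "string" | "numeric" | "date";
type Op =
  | "equals" | "notEquals" | "in" | "notIn"
  | "gt" | "gte" | "lt" | "lte"
  | "contains" | "notContains" | "set" | "notSet";

// Range comparisons only make sense on ordered types; substring ops only on strings.
const RANGE_OPS = new Set<Op>(["gt", "gte", "lt", "lte"]);
const STRING_OPS = new Set<Op>(["contains", "notContains"]);

function isOpCompatible(op: Op, dimType?: DimType): boolean {
  if (dimType === undefined) return true; // fall open when knownDimensionTypes is empty
  if (RANGE_OPS.has(op)) return dimType === "numeric" || dimType === "date";
  if (STRING_OPS.has(op)) return dimType === "string";
  return true; // equals / in / set etc. apply to any type
}
```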

SQL: recursive renderFilter emits parameterized fragments with named binds
(:f_<idx>). AND/OR groups parenthesized. WHERE clause omitted when filter
is empty. Cache key composer extends with sorted filter children inside
each AND/OR group (sort-before-hash for the filter sub-tree; full args sort
lands in Phase 4).
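The named-bind rendering can be illustrated with a stripped-down recursive walker. This is a sketch of the idea only (single operator, hypothetical shape) — the real renderFilter covers all 12 operators; the invariant it shares with this sketch is that user values only ever appear in the binds map, never in the SQL text:

```typescript
type Filter =
  | { member: string; operator: string; values?: unknown[] }
  | { and: Filter[] }
  | { or: Filter[] };

// Each user value becomes a named bind :f_<idx>; only validated member
// names and fixed operator tokens are interpolated into the SQL string.
function renderFilter(
  f: Filter,
  binds: Record<string, unknown>,
  idx = { n: 0 },
): string {
  if ("and" in f) {
    return "(" + f.and.map((c) => renderFilter(c, binds, idx)).join(" AND ") + ")";
  }
  if ("or" in f) {
    return "(" + f.or.map((c) => renderFilter(c, binds, idx)).join(" OR ") + ")";
  }
  const name = `f_${idx.n++}`;
  binds[name] = f.values?.[0];
  return `${f.member} = :${name}`; // real implementation maps all 12 operators
}
```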

Tests: +58 (1936 total). 12 op cases, AND/OR composition, depth-cap, value-
not-in-SQL parameterization assertions, validator rejections, type-level
narrowing for Predicate.member and Filter recursion, hook payload assembly
preserves recursive structure.

Generated-file diffs in registry/types.generated.ts and shared/schemas/
*.generated.ts are benign Biome formatting picked up by check:fix during
the build cycle — no logic change.

Per prd/analytics-metric-view-source and tasks/.../Phase 3.
xavier loop iteration 4.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
…sition

Activate the OBO execution lane in metric.json (Phase 1 reserved the schema
slot but only SP-lane was wired) and finalize the cache-key composition
with full sort-before-hash.

Server: _handleMetricRoute now dispatches via this.asUser(req) when the
metric registration's lane is "obo", reusing the existing Plugin.asUser
Proxy that .obo.sql files already use. New deriveMetricExecutorKey helper
returns the literal "sp" for SP-lane entries and a sha256 hex digest of
the user identity for OBO-lane entries; the raw x-forwarded-user value is
hashed at the boundary and never reaches the cache key, telemetry, or
error messages. Empty/whitespace identities collapse to an "anonymous"
sentinel before hashing to prevent silent shared-cache scope across
unauthenticated callers.

Cache key invariants finalized: same args in any order produce the same
key (sort-before-hash on measures, dimensions, and filter predicates within
each AND/OR group); different args, lanes, or users produce different keys;
SP literal "sp" cannot collide with an OBO user named "sp" because OBO
identities are always hashed.
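The non-collision argument can be made concrete with a sketch of the helper (the name deriveMetricExecutorKey and the "sp"/digest/sentinel behavior come from this commit; the body below is an assumed reconstruction, not the shipped code):

```typescript
import { createHash } from "node:crypto";

// SP lane is the literal "sp"; OBO lane is always a sha256 hex digest of the
// caller identity. A user literally named "sp" therefore hashes to a 64-char
// digest and can never collide with the SP literal. Empty/whitespace
// identities collapse to an "anonymous" sentinel before hashing.
function deriveMetricExecutorKey(lane: "sp" | "obo", userId?: string): string {
  if (lane === "sp") return "sp";
  const identity = userId?.trim() || "anonymous";
  return createHash("sha256").update(identity).digest("hex");
}
```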

dev-playground gains an OBO sample entry (customer_metrics) alongside
Phase 1's SP entry (revenue).

Tests: 1958 total (+22 from Phase 3's 1936). Coverage: deriveMetricExecutorKey
(SP literal, OBO digest stability + uniqueness + privacy + sentinel), cache
key invariants (8 tests), full route handler integration (cross-user, cross-
lane, cache hit, default 1-hour TTL, registry rejects duplicate keys cross-
lane). Backpressure (build, docs, check:fix, typecheck, test, knip) all
green.

Per prd/analytics-metric-view-source and tasks/.../Phase 4.
xavier loop iteration 5.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
…ties

Make UC YAML 1.1 semantic metadata (display names, format specs, descriptions)
available to consumers via build-time bundling and library-agnostic format
utilities. No changes to existing chart components — the BI story uses Plotly
or comparable third-party libs and consumes metadata via the new helpers.

Type-gen: extractMetricColumns now captures display_name (and camelCase
displayName), format / format_spec, and description with tolerant parsers
that read from direct fields, metadata.<name>, and the YAML's own attribute
shapes. New buildMetricsMetadataBundle + generateMetricsMetadataJson emit
shared/appkit-types/metrics.metadata.json next to metric.d.ts. The .d.ts
adds a typed metadata field on each MetricRegistry entry so call-sites
narrow on literal display_name / format / time_grain.

Hook: useMetricView return shape extends to { data, metadata, loading,
error }. metadata is the per-metric subset, available before data loads,
stable across re-renders for the same key. Distribution via runtime
registerMetricsMetadata(bundle) singleton — customers call once at startup
because hooks can't statically import customer-side build artifacts.

Format utilities at @databricks/appkit-ui/format:
- formatValue(value, format?) — printf-like format string parser (currency
  prefix/suffix, percent, plain number); falls back to localized
  Intl.NumberFormat for unrecognized formats; tolerant for non-numeric and
  null/NaN
- formatLabel(name, columnMetadata?) — display_name when present;
  camelCase / snake_case / SCREAMING_SNAKE / PascalCase humanization
  fallback
- toD3Format(format?) — converts UC printf strings to d3-format-compatible
  strings ($,.2f, .1%, ,.0f); returns identity for unknown shapes so
  Plotly / ECharts can no-op gracefully
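The formatLabel humanization fallback is the easiest of the three to sketch. This is a hypothetical re-implementation of the described behavior (display_name wins; otherwise snake_case / camelCase / SCREAMING_SNAKE / PascalCase names are humanized), not the shipped utility:

```typescript
// Prefer the UC display_name; otherwise humanize the raw column name.
function formatLabel(name: string, meta?: { display_name?: string }): string {
  if (meta?.display_name) return meta.display_name;
  return name
    .replace(/_/g, " ")                            // snake_case / SCREAMING_SNAKE
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")        // camelCase / PascalCase boundaries
    .toLowerCase()
    .replace(/(^|\s)\S/g, (c) => c.toUpperCase()); // Title Case
}
```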

Tests: 2041 total (+83 from Phase 4's 1958). Coverage: 18 type-gen tests,
48 format unit tests across all three utilities + library-agnostic Plotly
/ ECharts / table / KPI consumption flows, 8 registry tests, 9 hook
metadata stability tests, 5 type-level narrowing tests.

Per prd/analytics-metric-view-source and tasks/.../Phase 5.
xavier loop iteration 6.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Add a top-level `metric` command to the AppKit CLI with a `sync` subcommand
that exposes the same syncMetrics() core that the Vite plugin runs in dev
mode. Useful for CI checks, non-Vite builds, manual refreshes after pulling
a teammate's metric.json, and pre-commit hooks.

Surface:
  npx appkit metric sync [--warehouse-id <id>] [--metric-json-path <path>]
                         [--output-dir <dir>] [--root-dir <dir>] [--silent]
                         [--json]

Resolution order: CLI flags > env vars (DATABRICKS_*) > @clack/prompts.
--silent / --json modes skip prompts and fail-fast on missing inputs.

Validates metric.json against the JSON Schema (AJV) before invoking
syncMetrics(); rejects malformed configs early without any network/SDK
load. Wraps the Phase 1 fetcher to capture per-entry errors first-failure-
wins, then classifies via heuristic message-substring match into 5 error
modes:

  exit 0  success
  exit 1  missing-fqn        (404 / "not found" / "does not exist")
  exit 2  warehouse-unreach  (ECONNREFUSED / ETIMEDOUT)
  exit 3  malformed-config   (JSON parse / schema validation / bare-string)
  exit 4  auth-failed        (401 / 403 / token expired)
  exit 5  unknown            (catch-all, original message preserved)
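The heuristic message-substring classification might look like the following sketch (mode names and exit codes from the table above; the matching patterns are an assumption about the described heuristics, not the real classifier):

```typescript
// First-failure-wins classification of a captured error message into the
// five exit-code modes; falls through to 5 (unknown) with the message preserved.
function classifySyncError(message: string): number {
  const m = message.toLowerCase();
  if (/\b404\b|not found|does not exist/.test(m)) return 1;          // missing-fqn
  if (/econnrefused|etimedout/.test(m)) return 2;                    // warehouse-unreach
  if (/json parse|schema validation|bare-string/.test(m)) return 3;  // malformed-config
  if (/\b401\b|\b403\b|token expired/.test(m)) return 4;             // auth-failed
  return 5;                                                          // unknown
}
```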

Lazy-loads @databricks/appkit/type-generator (mirrors generate-types.ts)
so the CLI compiles without appkit being built. MetricSyncDependencies
test seam for unit tests. metric-source.schema.json copied into
packages/shared/dist/schemas via tsdown so the validator finds it at
runtime.

Tests: 2075 total (+34 from Phase 5's 2041). 22 sync tests covering all
five exit codes, stdout snapshots for success and multi-issue error,
empty-metric.json no-op, flag-override paths. 12 validator tests covering
8 acceptance/rejection cases plus 4 error-formatter cases.

Per prd/analytics-metric-view-source and tasks/.../Phase 6.
xavier loop iteration 7.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
…integration

Wrap the framework feature with documentation and a working reference app
demo. With this commit, the Metric View Source feature is shippable as v1
end to end.

docs/docs/plugins/analytics-metric-views.md covers all 8 sections per the
PRD: metric.json shape, route contract, useMetricView hook reference,
filter spec (12 ops + AND/OR examples), time-grain semantics, metadata
flow + format utilities (Plotly + ECharts examples), CLI command reference
(npx appkit metric sync), and security model.

apps/dev-playground/client/src/routes/metrics.route.tsx demonstrates the
full stack:
- SP-lane revenue Plotly line chart with toD3Format / formatLabel helpers
  wiring metadata into Plotly's tickformat and trace name
- OBO-lane customer metrics with current-user header
- Hardcoded structured filter (region in EMEA/APAC/AMER) showing the spec
- Graceful "could not load" fallback when the underlying UC metric view
  doesn't exist live in the dev workspace (the demo route still compiles
  and renders against the typed surfaces — the metric view existing in
  UC is decoupled from the feature working in code)

apps/dev-playground/shared/appkit-types/{metrics.d.ts, metrics.metadata.json}
are hand-authored equivalents of what `npx appkit metric sync` would emit
if the demo metric views existed in the dev workspace. registerMetricsMetadata
called once at startup in main.tsx; format utilities resolve correctly.

Plotly (react-plotly.js + plotly.js + types) added to dev-playground/client
only — not to @databricks/appkit-ui core. BI customers wire their own chart
library; the framework stays library-agnostic.

5 Playwright tests in apps/dev-playground/tests/metrics.spec.ts cover the
happy path (page load, chart render with formatted axis, OBO header) and
the error path (graceful fallback). Both /api/analytics/metric/{revenue,
customer_metrics} routes are exercised via mocked SSE responses.

Backpressure: 2075/2075 tests, all 6 commands green. No regression in
existing dev-playground routes.

Per prd/analytics-metric-view-source and tasks/.../Phase 7.
xavier loop iteration 8 — final.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
When the metric route fails at the connection layer (server-thrown 500,
stream close before SSE error event, etc.) onError previously dropped
err.message and replaced it with the generic "Unable to load data, please
try again". This matched the production UX intent but was hostile to
development — schema-not-found, auth-failed, and other diagnosable
failures were invisible to the developer.

In dev (import.meta.env.DEV), surface err.message verbatim. In prod,
keep the generic line. Full error stays in console.error in both modes.

Mirrors the same pattern useAnalyticsQuery uses (and which has the same
bug at use-analytics-query.ts:177); not fixing the analytics hook in this
commit because the analytics surface is unchanged scope from this loop.

Tests: +2 cases covering dev-mode passthrough and prod-mode generic
fallback (vi.stubEnv("DEV", "")). 20/20 pass.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
…n artifact

Phase 7 hand-authored metrics.d.ts (plural) as a fallback type-registry
augmentation while the dev workspace lacked the underlying UC metric views.
Phase 1's type-gen always emitted metric.d.ts (singular). The naming
diverged.

Now that the seed creates the actual views, `npx appkit metric sync`
emits the canonical singular file. Drop the plural hand-authored stand-in;
the auto-generated singular file is the source of truth.

Also fixes vi.stubEnv("DEV", "") → vi.stubEnv("DEV", false) — vitest's
typed signature requires a boolean for the DEV key (the empty-string
worked at runtime but failed typecheck).

Updates one stale `metrics.d.ts` reference in the dev-playground demo
route's UI text.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Without an alias, Databricks returns measure columns named with the literal
"measure(arr)" rather than "arr". Frontend consumers reading row.<measureName>
got undefined; the customer_metrics OBO panel in dev-playground rendered
empty cells for active_accounts and churn_rate while the segment
dimension (which doesn't go through MEASURE()) rendered fine.

Add `AS <name>` to each MEASURE() emission. The measure name is already
validated against MEASURE_NAME_PATTERN and the registry's known-measure
list before this point, so it is safe to interpolate.
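The validate-then-interpolate shape can be sketched as follows (function name and pattern value are assumptions; the point is that only identifiers that pass the allowlist pattern ever reach string interpolation):

```typescript
// Plausible identifier pattern; the real MEASURE_NAME_PATTERN may differ.
const MEASURE_NAME_PATTERN = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Emit dimensions as-is and each measure as MEASURE(<m>) AS <m>, so the
// result column is named "arr", not the literal "measure(arr)".
function buildSelectList(measures: string[], dimensions: string[] = []): string {
  for (const m of measures) {
    if (!MEASURE_NAME_PATTERN.test(m)) throw new Error(`invalid measure name: ${m}`);
  }
  return [...dimensions, ...measures.map((m) => `MEASURE(${m}) AS ${m}`)].join(", ");
}
```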

Tests: snapshot updates for buildMetricSql + 7 string assertions in the
route-handler tests. 131/131 metric tests pass; full backpressure (build,
docs, check:fix, typecheck, test, knip) green.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
…EAM_ERROR

The SQL warehouse client (executeStatement and getStatement polling
catches at client.ts:249, 389) unconditionally re-wrapped non-AppKitError
errors as ExecutionError.statementFailed(error.message). When the error
is the SDK's native AbortError, the wrap destroys name === "AbortError"
and stamps statusCode 500 — the downstream stream-manager._categorizeError
then matched the "statusCode" duck-type fallback and returned
SSEErrorCode.UPSTREAM_ERROR, bypassing the "client cancellation is
normal" short-circuit. Result: every React 19 StrictMode dev-mode
double-mount (or any legitimate client cancel) logged at error level
as if the warehouse failed.

Defense-in-depth fix at both layers:

1. connectors/sql-warehouse/client.ts (both wrap sites): re-throw native
   AbortError as-is before the ExecutionError wrap.

2. stream/stream-manager.ts._categorizeError: message-substring check for
   "operation was aborted" / "the request was aborted" before falling
   through to the statusCode-based UPSTREAM_ERROR classification.
   Catches any future code path where AbortError identity gets laundered.
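The layered categorization order might be sketched like this (enum values and the abort substrings come from this commit; the function structure is an assumed reconstruction of _categorizeError, not the shipped code):

```typescript
const SSEErrorCode = {
  STREAM_ABORTED: "STREAM_ABORTED",
  UPSTREAM_ERROR: "UPSTREAM_ERROR",
  UNKNOWN: "UNKNOWN",
} as const;

// Check AbortError identity first, then abort-message substrings (catching
// wraps that overwrote the name), and only then the statusCode duck-type.
function categorizeError(err: { name?: string; message?: string; statusCode?: number }): string {
  if (err.name === "AbortError") return SSEErrorCode.STREAM_ABORTED;
  const msg = (err.message ?? "").toLowerCase();
  if (msg.includes("operation was aborted") || msg.includes("the request was aborted")) {
    return SSEErrorCode.STREAM_ABORTED; // laundered AbortError caught by message
  }
  if (typeof err.statusCode === "number") return SSEErrorCode.UPSTREAM_ERROR;
  return SSEErrorCode.UNKNOWN;
}
```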

Tests: new "error categorization" describe block in stream.test.ts
covering native AbortError → STREAM_ABORTED, wrapped AbortError (whose
name was overwritten) → STREAM_ABORTED via message match, real upstream
error with statusCode → UPSTREAM_ERROR, timeout, ECONNREFUSED, opaque
fallback. 6 new tests; full backpressure (build, docs, check:fix,
typecheck, test, knip) green.

Also: switch the dev-playground /metrics route's error fallbacks from
amber-saturated alerts to neutral border-border + bg-muted/40 with the
underlying error rendered in destructive-token monospace. Customer
metrics table headers gain text-muted-foreground + uppercase tracking,
numeric columns right-align with tabular-nums, rows gain a hover state.

See investigations/appkit_2026-04-30_aborted-misclassified-as-upstream-error.md
for the 5-remora investigation that produced this fix.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
…ent YAML attribute

Phase 2 originally read a `time_grain` attribute on each column in the
DESCRIBE TABLE EXTENDED ... AS JSON response, with a "tolerant parser"
accepting list / metadata.time_grain / { grains: [...] } shapes. The
field does not exist in the UC metric-view schema:

  - Rust serde at universe/reyden/metric-view-serde/src/v11/column.rs
    enumerates the 7 known column properties: window, expr, format,
    display_name, name, comment, synonyms.
  - CREATE rejects unknown fields with "Unrecognized field 'time_grain'
    (class com.databricks.sql.serde.v11.Column), not marked as
    ignorable".
  - Grep for "time_grain" / "TimeGrain" across the entire serde tree
    (Rust port + Scala source under sql/catalyst/.../serde/) returns
    zero hits.

So the previous implementation was reading a phantom field — when fed
real DESCRIBE output, every dimension's `timeGrains` came back empty,
which surfaces as `timeGrains: never` in the generated metric.d.ts and
causes the server validator to reject `timeGrain: "month"` requests.

Fix: replace `extractTimeGrains(obj)` (YAML-attribute lookup) with
`inferTimeGrains(type)` (SQL-type-based inference):

  TIMESTAMP / TIMESTAMP_LTZ / TIMESTAMP_NTZ → 7 grains (minute..year)
  DATE → 5 grains (day..year, no sub-day)
  everything else → undefined (not time-typed)

Inference is gated on `isMeasure: false` — measures aren't grouped on
even when they resolve to a temporal type. Type matching is
case-insensitive and strips parameterized suffixes (TIMESTAMP(6)).
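The inference table above reduces to a short function. The name inferTimeGrains and the grain sets come from this commit; the body is a sketch of the described rules, not the shipped implementation:

```typescript
const SUB_DAY = ["minute", "hour"];
const DAY_AND_UP = ["day", "week", "month", "quarter", "year"];

// Grains are inferred from the SQL type, not read from a YAML attribute.
// Case-insensitive; parameterized suffixes like TIMESTAMP(6) are stripped.
function inferTimeGrains(sqlType: string, isMeasure = false): string[] | undefined {
  if (isMeasure) return undefined; // measures aren't grouped on
  const base = sqlType.toUpperCase().replace(/\(.*\)$/, "");
  if (base === "TIMESTAMP" || base === "TIMESTAMP_LTZ" || base === "TIMESTAMP_NTZ") {
    return [...SUB_DAY, ...DAY_AND_UP]; // 7 grains, minute..year
  }
  if (base === "DATE") return DAY_AND_UP; // 5 grains, no sub-day
  return undefined; // not time-typed
}
```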

Updates 6 broken extractMetricColumns tests (which passed `time_grain:
[...]` fixtures testing imaginary behavior) to type-driven equivalents,
plus 5 downstream tests that fed `time_grain` into syncMetrics /
buildMetricsMetadataBundle / generateMetricTypeDeclarations /
generateMetricsMetadataJson fixtures. Snapshots regenerated. 41/41
type-gen tests pass; full backpressure (build, docs, check:fix,
typecheck, test, knip) green.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
…egistry

Phase 5 emitted a build-time metadata bundle (`metrics.metadata.json`)
but the server's `loadMetricRegistry()` was called without it. Result:
`knownTimeGrainsByDim` was always `{}` at runtime, so the validator
rejected every legit `timeGrain: "month"` request with "no time-typed
dimension is included in 'dimensions'" — the user's regenerated bundle
sat on disk unread.

Two fixes:

1. **Auto-discover the metadata bundle** — `loadMetricRegistry()` now
   reads `<cwd>/shared/appkit-types/metrics.metadata.json` (matching the
   default `metricMetadataOutFile` from the type-gen Vite plugin) when
   the caller doesn't pass metadata explicitly. New `readMetricsMetadataBundle()`
   helper reads, parses, and transforms the bundle's per-column shape
   (`{ measures: { [k]: { type, ... } }, dimensions: { [k]: { type, time_grain?, ... } } }`)
   into the simpler `MetricBuildTimeMetadata` shape the registry expects.
   Also captures `dimensionTypes` from the bundle into `knownDimensionTypes`
   so the filter validator's op-vs-type compatibility checks light up.

2. **Fall-open behavior on the cross-field timeGrain rule** — when
   `knownTimeGrainsByDim` is empty (no metadata available), the validator
   no longer rejects requests carrying `timeGrain`. Mirrors the existing
   dimensions-fall-open behavior at `metric.ts:274-281`. The warehouse
   handles incompatible grains itself. Same fall-open applied to
   `buildMetricSql`'s mirror check.

Behavior on disk now matches the v1 PRD intent end-to-end: type-gen
emits the bundle → server reads it on boot → validator narrows
`timeGrain` to the YAML-allowed set per dimension. When the bundle is
absent, the request is accepted and warehouse rejection becomes the
late-binding signal.

Tests: existing "rejects timeGrain when no time-typed dim" updated to
test the metadata-available case + new "falls open when metadata empty"
test covering the new contract. 2086/2086 tests pass; full backpressure
green.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
… access

The /metrics demo route accessed `metadata?.measures.arr.format` and the
sibling field for OBO measures via `metadata?.measures.<name>.format`.
The optional-chain guards `metadata` itself, but a real-world bundle can
ship with `measures: {}` (e.g., when `metric sync` couldn't extract
columns from a particular view's DESCRIBE response — see the warning
this commit adds upstream). In that case `metadata.measures.<name>` is
undefined and the `.format` access throws, the React ErrorBoundary
catches the crash, and BOTH panels disappear behind a generic "Something
went wrong" page even though only one panel is actually broken.

Add `?.` between the column lookup and `.format`. formatLabel and
formatValue / toD3Format already handle undefined gracefully (humanized
fallback for labels, identity for tickformat), so the chart still
renders — it just shows bare labels until the bundle is regenerated
with column metadata.

Also add a warn log to syncMetrics when extractMetricColumns succeeds
but returns zero columns. Previously the path was silent; the only
signal was a downstream React crash. Now misshapen DESCRIBE responses
surface at sync time with a hint about the expected shape.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
DESCRIBE TABLE EXTENDED ... AS JSON for a metric view stores the YAML 1.1
`format` attribute as a structured object, not a printf string:

  {
    "currency": {
      "decimal_places": { "type": "EXACT", "places": 2 },
      "currency_code": "USD"
    }
  }
  { "percent": { "decimal_places": { "places": 1 } } }
  { "number":  { "decimal_places": { "places": 0 } } }

Phase 5's extractor only accepted strings, so the bundle's `format` field
was never populated even after a successful sync. `formatValue` and
`toD3Format` then fell back to default localized formatting.

New extractFormatString() handles both shapes:
  - String: passthrough (legacy / hand-authored bundles)
  - Object: dispatch on outer key (currency / percent / number) and
    translate to a printf string consumable by the existing format
    utilities.

Currency code → symbol map covers USD ($), EUR (€), GBP (£), JPY/CNY (¥),
INR (₹), BRL (R$). Unknown codes fall back to "<CODE> " prefix so the
information isn't lost — formatValue and d3-format render either way.

Defaults: 2 decimal places + USD when the structured currency object
omits those fields. Tolerates `decimal_places` as either a nested
{ places: n } object (canonical) or a bare number (legacy/short form).
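The dispatch-on-outer-key translation can be sketched as follows. The function name and the structured input shapes come from this commit; the exact printf output strings are an assumption modeled on the $,.2f-style examples earlier in this PR:

```typescript
const CURRENCY_SYMBOLS: Record<string, string> = {
  USD: "$", EUR: "€", GBP: "£", JPY: "¥", CNY: "¥", INR: "₹", BRL: "R$",
};

type DecimalPlaces = number | { places?: number };

// Tolerates decimal_places as { places: n } (canonical) or a bare number (legacy).
function resolvePlaces(dp: DecimalPlaces | undefined, fallback: number): number {
  if (typeof dp === "number") return dp;
  return dp?.places ?? fallback;
}

function extractFormatString(format: unknown): string | undefined {
  if (typeof format === "string") return format; // legacy / hand-authored passthrough
  if (typeof format !== "object" || format === null) return undefined;
  const f = format as Record<string, any>;
  if (f.currency) {
    const code = f.currency.currency_code ?? "USD";   // defaults: USD, 2 places
    const symbol = CURRENCY_SYMBOLS[code] ?? `${code} `;
    return `${symbol},.${resolvePlaces(f.currency.decimal_places, 2)}f`;
  }
  if (f.percent) return `.${resolvePlaces(f.percent.decimal_places, 0)}%`;
  if (f.number) return `,.${resolvePlaces(f.number.decimal_places, 0)}f`;
  return undefined; // unrecognized format object
}
```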

9 new tests covering: USD currency, EUR with 0 places, unknown code
fallback, percent with 1 place, percent with 0 places, number with comma
grouping, unrecognized format object → undefined, currency defaults
when fields missing, decimal_places as bare number. 50/50 type-gen
tests pass; full backpressure (build, docs, check:fix, typecheck, test,
knip) green.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Plotly assumes a light background by default — text and gridlines stayed
near-black even when the playground was in dark mode, leaving the chart
nearly unreadable. Wire the chart layout to the theme via a small
useIsDarkMode() hook that watches the `dark` class on <html> (set by
the existing ThemeSelector component).

Backgrounds (paper + plot) are transparent so the Card's theme-aware
bg shows through unchanged. Font / grid / axis colors swap between
zinc-200/zinc-800 (dark) and zinc-900/zinc-200 (light) — close enough
to the design tokens to feel native without plumbing CSS-var reads
through Plotly's layout object.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Three review-followup fixes on the client side of the metric-view surface:

- Defend against unstable args: serialize once per render via
  argsKey = JSON.stringify(args) and read fresh values through a ref. Inline
  args objects no longer cause an infinite refetch loop; consumers who
  memoize args see the same path with no extra cost. Docs tip is downgraded
  from required to optional.
- Scrub raw warehouse / server error text in production. The SSE error path
  sets a generic React error in prod; dev mode keeps the passthrough so
  [TABLE_OR_VIEW_NOT_FOUND] etc. still surface for diagnosis. Full message
  stays in console.error for ops.
- Drop source (UC FQN) and lane from the public MetricMetadata /
  MetricSemanticMetadata shape. Those are server-side concerns; the client
  bundle should carry only display-name / format / time-grain hints. Tests
  assert these fields are no longer reachable.

Co-authored-by: Isaac
…ctness)

Bundles the server / CLI fixes from the same review pass so the metric-view
surface is consistent with the appkit-ui changes that just landed.

Security:
- Fail-closed registry. POST /api/analytics/metric/:key returns 503
  METRIC_REGISTRY_NOT_READY when the registered metric has no build-time
  measure list. Without this gate the validator falls open and arbitrary
  identifiers reach the warehouse, which both leaks schema and runs
  unintended SQL.
- Generic public 400 for validation failures. validateMetricRequest now
  returns the offending field paths only (e.g. fields: filter.member);
  the full Zod issue list, including allowlist enumerations, stays in
  the AppKitError context for telemetry.
- Drop source (UC FQN) and lane from the build-time
  metrics.metadata.json bundle. Server reads them from metric.json
  directly; the client bundle is now display-name / format /
  time-grain only.

Correctness:
- Reject empty { or: [] } at validation time; the SQL builder renders an
  empty `or` as 1 = 0 as defense in depth, so a future validator bypass
  cannot turn it into match-everything.
- Op-vs-value type checks: contains/notContains require a string;
  range ops on a numeric dim require a numeric value. Surfaces 400
  instead of letting buildMetricSql throw a 500.
- canonicalizeFilter uses JSON.stringify on each value segment so
  ["a", "b"] no longer collides with ["a|string:b"] in the cache key.
- Tighten timeGrain rejection: when the registry knows the metric's
  dims but none are time-typed, timeGrain is meaningless and is now
  rejected at validation time. Pure fall-open is reserved for the
  no-metadata-at-all path, which the route's fail-closed gate prevents
  in production.
- Wrap OBO executor setup in the route's try/catch so asUser /
  resolveUserId throws (e.g. AuthenticationError on missing token)
  surface through the canonical error envelope.
- syncMetrics now returns { schemas, failures }; the CLI surfaces
  failures (parse errors, zero-column extraction) as MetricSyncError so
  CI catches a silently-broken metric view instead of shipping an
  empty bundle entry.
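The canonicalizeFilter collision fix in the correctness list can be demonstrated in isolation. This is a minimal sketch of the value-segment idea (helper name hypothetical), not the shipped canonicalizer: JSON.stringify escapes quotes and delimiters, so list boundaries survive in the canonical string:

```typescript
// Raw join would make ["a", "b"] and ["a|b"] canonicalize identically;
// stringifying each segment keeps the list structure distinguishable.
function canonicalizeValues(values: unknown[]): string {
  return values.map((v) => JSON.stringify(v)).join("|");
}
```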

Tests updated to assert path-only public messages, the new fail-closed
behavior, the cache-key non-collision, and the per-entry sync failure
propagation. Snapshot regenerated.

Co-authored-by: Isaac
The previous lockfile had 268 resolved URLs pointing at the internal
npm-proxy.dev.databricks.com proxy (plotly.js transitive deps — mapbox,
choojs, etc.). Those got baked in when plotly was added on a developer
machine whose ~/.npmrc routes through the internal proxy.

CI runners cannot reach npm-proxy.dev.databricks.com — when pnpm's
lifecycle hook ran `cd client && npm install`, npm tried to fetch from
the proxy URLs in the lockfile, hung for 8 minutes, and crashed with
`Exit handler never called!`. Both prior CI runs on this PR failed at
this step.

Regenerating with --package-lock-only (post-rewriting the host portion
of the @esbuild/* platform-optional entries that retained explicit
`resolved` URLs) produces a lockfile with zero proxy URLs and 59 entries
on registry.npmjs.org. The other 695 packages defer to whatever
registry the install user has configured at install time.

Co-authored-by: Isaac
…ertions

The previous test attached `page.on("request")` after `beforeEach` had
already registered `page.route()` mocks. Because listeners attach per page
and same-page route mocks fulfill synchronously, the `request` event for
the very first navigation can slip past a listener attached inside the
test body — it captured 0 calls even though all 4 sibling tests in the
same file rely on those exact requests firing (chart renders, OBO error
banner, etc.).

`waitForRequest` is the established idiom in this repo (see
reconnect.spec.ts) and is the form Playwright reliably fires for mocked
routes — the matcher is registered as a one-shot promise before the
navigation triggers it, so listener-attachment ordering cannot drop
events. Each await doubles as the "received >= 1" assertion: it throws
on timeout otherwise.

Co-authored-by: Isaac
Two security blockers from the round-2 review:

- The catch fallback in `_handleMetricRoute` was echoing `err.message` for
  any non-AppKitError throw site. Hard-code the public message; route the
  raw detail to telemetry only. Closes the recurring
  `error-messages-leaking-details` pattern at the validation boundary.
- The SSE error envelope was forwarding raw warehouse errors verbatim for
  4xx-shaped responses (e.g. `[TABLE_OR_VIEW_NOT_FOUND] catalog.schema.fqn`).
  The round-1 client-side scrub depends on the React hook running; any
  non-React consumer (curl, alternate SDK) would still see the raw text.
  Wrap the executeStream callback so any error inside the metric query is
  caught, telemetered with full detail, and re-thrown as a sanitized
  ExecutionError. Production gets a generic message; dev keeps the
  original for diagnostics.

Co-authored-by: Isaac
…ocations

Three blockers from the round-2 review:

- Cap `measures`, `dimensions`, `filter.values`, and `limit` at the
  validator level. Closes the recurring `unbounded-request-parameters`
  pattern (3+ past reviews) — without these caps, a hostile caller could
  send `values: [...10M items...]` and force the validator + named
  bind-var step to chew through unbounded work. Limits are deliberately
  generous (50 measures, 20 dimensions, 1000 filter values, 100k row
  limit) so legitimate BI traffic never trips them.
- Memoize the per-registration Zod schema via a WeakMap.
  `validateMetricRequest` was rebuilding a recursive schema with ~10
  chained refinements on every request. The WeakMap lets the cache
  empty automatically when the registry is reloaded (dev hot-reload of
  metric.json) — old registration objects become unreferenced and the
  entry is GC'd.
- Memoize `Intl.NumberFormat` and parsed format specs in formatValue.
  Charts/tables call this per cell; allocation is notoriously slow in
  V8. A 1000-row × 5-col table was paying ~5000 instantiations per
  render. Module-level Maps keyed on serialized options / format
  strings; cardinality is tiny in practice (one entry per metric-view
  measure with a format).
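The Intl.NumberFormat memo from the last bullet can be sketched as a module-level cache keyed on serialized options (cache shape is an assumption; the shipped code keys on format strings as well):

```typescript
// One Intl.NumberFormat instance per (locale, options) pair; instantiation
// is the expensive part, format() calls on a cached instance are cheap.
const numberFormatCache = new Map<string, Intl.NumberFormat>();

function getNumberFormat(
  locale: string,
  options: Intl.NumberFormatOptions = {},
): Intl.NumberFormat {
  const key = `${locale}:${JSON.stringify(options)}`;
  let nf = numberFormatCache.get(key);
  if (nf === undefined) {
    nf = new Intl.NumberFormat(locale, options);
    numberFormatCache.set(key, nf);
  }
  return nf;
}
```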

Co-authored-by: Isaac
Round-3 review surfaced four issues in the catch I added in 38bf39a and
the surrounding error-response shape. All confirmed by the recurring
error-messages-leaking-details pattern (3+ past reviews):

- Flip NODE_ENV polarity to fail-closed. The previous check
  `process.env.NODE_ENV === "production"` left the prod scrub disabled
  in any environment that doesn't set NODE_ENV explicitly (containers,
  serverless). Switch to `!== "development"` so anything but explicit
  dev mode is treated as prod.
- Stop fast-pathing 5xx-class AppKitErrors past the scrub. The SQL
  connector throws ExecutionError (statusCode 500) on warehouse
  failures, and its message carries raw warehouse text including
  catalog/schema FQNs. Sanitize 5xx AppKitErrors in production; let
  4xx-class through (ValidationError/AuthenticationError messages are
  constructed by us with known-clean content).
- Let AbortError pass through unwrapped. The new try/catch was
  swallowing client-driven cancellations and re-emitting them as
  ExecutionError, polluting telemetry and changing the framework's
  cancellation semantics.
- Drop the user-supplied key from the 404 body. Echoing
  `Metric "X" not registered` lets an unauthenticated probe enumerate
  registered keys by elimination. The 404 status stays for tooling;
  the body becomes a generic "Metric not found" and the actual key
  goes to telemetry only.
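
The four fixes compose into a scrub step along these lines; `AppKitError` here is a stand-in class, and the exact generic message is an assumption:

```typescript
// Sketch of the fail-closed scrub. Cancellations pass through, 4xx messages
// are known-clean, 5xx detail is masked unless explicitly in dev mode.
class AppKitError extends Error {
  constructor(message: string, public statusCode: number) {
    super(message);
    this.name = "AppKitError";
  }
}

function scrub(
  err: unknown,
  // Fail-closed polarity: unset NODE_ENV (containers, serverless) = prod.
  prod: boolean = process.env.NODE_ENV !== "development",
): unknown {
  if (err instanceof Error && err.name === "AbortError") return err; // cancellation: pass through unwrapped
  if (err instanceof AppKitError) {
    if (err.statusCode < 500) return err; // 4xx messages are constructed by us
    if (!prod) return err;                // dev keeps raw warehouse detail
    return new AppKitError("Internal error", 500); // prod: generic message only
  }
  return prod ? new AppKitError("Internal error", 500) : err;
}
```

The raw message still goes to telemetry before scrubbing; only the client-facing envelope is masked.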

Co-authored-by: Isaac
Zod's union/object parsers walk the filter tree recursively (via z.lazy)
before our `validateFilterTree` superRefine depth cap fires. A
pathologically deep `{ and: [{ and: [...] }] }` payload — say 10000
levels — would stack-overflow the Node process during parse, never
reaching the validator's depth check.

Add an iterative pre-parse walk in `validateMetricRequest` that uses an
explicit stack and aborts as soon as `METRIC_FILTER_MAX_DEPTH` is
exceeded. Predicate leaves don't count toward depth — only nested `and`
/ `or` wrappers, matching the existing in-tree validator's rule.

Closes the recurring `unbounded-request-parameters` pattern at the
call-stack vector that round-2's array-size caps did not cover.

New regression test: 10000-deep AND tree previously crashed the worker;
now rejects cleanly with the canonical "fields: filter" 400.

Co-authored-by: Isaac
…Grains

Two small follow-ups from the round-3 perf review:

- `useMetricView` was allocating a `Blob` purely to count UTF-8 bytes
  inside the payload memo. Swap to `new TextEncoder().encode(s).length`
  — same byte count, no `Blob` object, hot enough on dashboards with
  many metric tiles to be worth fixing.
- `collectAllowedGrains(grainsByDim)` rebuilt a `Set` and re-sorted on
  every `buildMetricSql` call. The input is static per
  `MetricRegistration`, so memoize via `WeakMap` keyed on the
  `grainsByDim` reference. Steady-state SQL construction reuses the
  sorted array; reload of the registry frees the entry automatically.
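
The `WeakMap` memo can be sketched as follows; the `GrainsByDim` shape and the helper's real signature are assumptions:

```typescript
// grainsByDim is static per MetricRegistration, so the object reference
// itself is a valid cache key. WeakMap lets the entry be GC'd when the
// registry is reloaded and the old registration becomes unreferenced.
type GrainsByDim = Record<string, string[]>;

const grainsCache = new WeakMap<GrainsByDim, string[]>();

function collectAllowedGrains(grainsByDim: GrainsByDim): string[] {
  const cached = grainsCache.get(grainsByDim);
  if (cached) return cached;
  // Built once per registration: dedupe across dims, then sort.
  const sorted = [...new Set(Object.values(grainsByDim).flat())].sort();
  grainsCache.set(grainsByDim, sorted);
  return sorted;
}
```

Steady-state `buildMetricSql` calls then reuse the same sorted array instead of rebuilding a `Set` per call.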

Co-authored-by: Isaac
Round-4 review surfaced a critical bypass in the round-3 pre-check that
both reviewers (security + perf) reproduced independently:

The previous pre-check used `if (obj.and) ... else if (obj.or) ...` and
walked only one branch. A payload like `{ and: [], or: <10000-deep> }`
slid past the empty-`and` walk and Zod's union recursion then
stack-overflowed on the `or` branch — the exact DoS the pre-check was
designed to prevent.

Two fixes:

- Inspect BOTH `and` and `or` on every node. Zod's `.strict()` will
  reject the multi-key shape downstream, but the pre-check has to walk
  every branch Zod might recurse into.
- Cap per-group child count at 100 in both the iterative pre-check and
  the Zod schema. Without it, a flat `{ and: [...10M items...] }`
  would push 10M frames onto the explicit stack — OOM before validation
  even reached Zod.

Two regression tests: the `{ and: [], or: <deep> }` bypass case (which
previously crashed with `RangeError: Maximum call stack size exceeded`)
and a flat-breadth case to hold the new `.max(100)` group cap.

Co-authored-by: Isaac
…ationale

Two follow-ups from round 4:

- Extend the metric-route fail-closed gate to trip on either empty
  `knownMeasures` OR empty `knownDimensions`. The round-1 gate only
  guarded the measure list; a metric whose DESCRIBE produced measures
  but zero dimensions still admitted arbitrary `dimensions` and
  `filter.member` identifiers, and those flow directly into the SQL
  `WHERE`/`GROUP BY` clauses. Same class of fall-open as round 1, just
  on the sibling field. Regression test: empty knownDimensions on a
  registration with non-empty knownMeasures must produce 503
  METRIC_REGISTRY_NOT_READY.
- Hoist `TextEncoder` to module scope and correct the misleading
  "constant-time" comment in `useMetricView` — `encode()` is O(bytes),
  same big-O as Blob's internal encoding; the win was avoiding the
  Blob wrapper allocation, not constant-time-ness. The shared encoder
  is stateless and safe to reuse.

Co-authored-by: Isaac
…ll-open

Round-4 correctness review surfaced that the round-4 fail-closed gate
widening I shipped in 4b6d07f broke a legitimate shape: a measure-only
KPI metric registers with `knownDimensions: []`, and the gate then
returned 503 for every request — including the request shapes that
contract types declare as legal (`dimensions?` is optional).

Two GPT runs reached opposing conclusions in r4:
- Security: empty knownDimensions is a fall-open vector
- Correctness: empty knownDimensions is a legitimate shape

Both correct. The clean fix is to separate the two concerns:

- Revert the route gate to `knownMeasures.length === 0` only.
  `knownMeasures` is unambiguous (every metric has at least one measure
  by definition), so an empty array always means metadata is missing.
- Tighten the validator instead of the gate: when `knownDimensions` is
  empty, `dimensions` items are typed as `z.never()` and any filter
  predicate's `member` is rejected. Measure-only metrics still accept
  requests that omit dimensions/filter; the security fall-open path
  (where empty knownDimensions let arbitrary identifiers through to
  SQL) is now closed at validation time, not at the gate.

Replaced the round-4 503 regression test with three new ones: KPI
request accepted, dimensions rejected on measure-only, filter rejected
on measure-only. The pre-existing "falls open on timeGrain when
metadata is empty" test depended on dimensions ALSO falling open, so
it's now unreachable — deleted with a comment explaining the new
state.
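
The split between gate and validator can be sketched zod-free (in the real schema the rejection is expressed as `z.never()` on the `dimensions` items); the `Registration`/`Request` shapes and status codes are stand-ins:

```typescript
// Gate on the unambiguous signal only; tighten the ambiguous one in the
// validator instead.
interface Registration { knownMeasures: string[]; knownDimensions: string[]; }
interface MetricReq { measures: string[]; dimensions?: string[]; filterMembers?: string[]; }

function handleMetricRequest(reg: Registration, req: MetricReq): number {
  // Route gate: every metric has at least one measure by definition, so an
  // empty knownMeasures always means metadata is missing.
  if (reg.knownMeasures.length === 0) return 503;
  // Validator: measure-only metric — any dimension or filter.member
  // identifier is rejected, but omitting them entirely stays legal.
  if (reg.knownDimensions.length === 0) {
    if ((req.dimensions?.length ?? 0) > 0) return 400;
    if ((req.filterMembers?.length ?? 0) > 0) return 400;
  }
  return 200;
}
```

This keeps the measure-only KPI shape working while closing the path where arbitrary identifiers flowed into `WHERE`/`GROUP BY`.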

Co-authored-by: Isaac
…ey collision

Three round-5 blockers landed in one go:

- timeGrain z.never() when no time-typed dimensions are registered.
  Mirrors the round-4 dimensions tightening: the previous fall-open
  schema accepted any token; the SQL came out identical (no time-typed
  dim → no `date_trunc`), but `composeMetricCacheKey` salts the cache
  entry with the raw token. A hostile caller could vary
  `month/week/foo_bar/...` to force unbounded cache misses + warehouse
  re-execution. CWE-400 / CWE-20.

- Re-check `signal.aborted` in the executeStream catch before scrubbing.
  The connector throws `ExecutionError.canceled()` (name
  "ExecutionError", message "Statement was canceled") on cancellation.
  The round-3 fast-path keyed on `name === "AbortError"` and missed it,
  the AppKitError 5xx scrub then masked the message, and stream-manager's
  substring fallback (`"operation was aborted"`) no longer matched —
  client cancellations surfaced as opaque UPSTREAM_ERROR. The
  `signal.aborted` re-check is class-agnostic: any error observed while
  the abort signal was already fired is a cancellation, regardless of
  the error's class or message.

- OBO `deriveMetricExecutorKey` throws AuthenticationError on missing /
  whitespace-only identity instead of falling back to a shared
  "anonymous" sentinel. Two distinct misconfigured OBO callers were
  otherwise hashing to the same cache scope — a cross-user data leak.
  The route's existing try/catch wraps this call, so the throw lands on
  the canonical 401 envelope.

Tests: measure-only timeGrain rejection; OBO empty-identity
AuthenticationError (covering null/undefined/empty/whitespace).
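
The executor-key change can be sketched like this; `AuthenticationError` is a stand-in class and the sha256 derivation is an assumption about the key scheme:

```typescript
import { createHash } from "node:crypto";

class AuthenticationError extends Error {}

// No shared "anonymous" sentinel: two misconfigured OBO callers must never
// hash to the same cache scope, so a missing identity is a hard failure.
function deriveMetricExecutorKey(identity: string | null | undefined): string {
  if (identity == null || identity.trim() === "") {
    throw new AuthenticationError("Missing OBO identity");
  }
  return createHash("sha256").update(identity.trim()).digest("hex");
}
```

The route's existing try/catch turns the throw into the canonical 401 envelope, so no new error-handling surface is needed.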

Co-authored-by: Isaac
… by error shape

The round-5 `signal?.aborted` bypass I added in 614c212 created a security
race: if `TimeoutInterceptor` aborted the signal during a warehouse error
response (e.g., `TABLE_OR_VIEW_NOT_FOUND` carrying the FQN), the bypass
rethrew the raw warehouse text unscrubbed and the framework's stream
layer broadcast it to the still-connected client. Round-6 security
review caught it (CWE-209 + CWE-362).

Stop trusting `signal.aborted` as a generic bypass. Identify cancellations
by error class/message instead — two narrow shapes:

- real `AbortError` from fetch / `AbortController.abort()`
- `ExecutionError` with the static message constructed by
  `ExecutionError.canceled()` ("Statement was canceled"). Match the
  message exactly so warehouse errors that happen to contain "canceled"
  cannot trick the bypass.

Also fixes round-6 correctness: rethrowing `ExecutionError.canceled()`
unchanged still classifies as UPSTREAM_ERROR in `StreamManager`, since
its detection checks `name === "AbortError"` or
`message.includes("operation was aborted")`. Normalize the connector's
canceled variant to AbortError-shaped before rethrow so disconnects
land on STREAM_ABORTED, not UPSTREAM_ERROR.
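
Shape-based detection plus the normalization can be sketched as follows; the message constants come from the commit text, the function names are assumptions:

```typescript
// The static text ExecutionError.canceled() constructs, per the commit.
const CANCELED_MESSAGE = "Statement was canceled";

// Only two narrow shapes count as cancellation — no generic signal.aborted
// bypass, so a warehouse error racing an abort still gets scrubbed.
function isCancellation(err: unknown): boolean {
  if (!(err instanceof Error)) return false;
  if (err.name === "AbortError") return true;
  // Exact match: warehouse errors merely containing "canceled" must not
  // trick the bypass.
  return err.name === "ExecutionError" && err.message === CANCELED_MESSAGE;
}

// Normalize the connector's canceled variant to AbortError shape so the
// downstream stream layer classifies it as STREAM_ABORTED.
function normalizeCancellation(err: Error): Error {
  if (err.name === "ExecutionError" && err.message === CANCELED_MESSAGE) {
    const abort = new Error("The operation was aborted");
    abort.name = "AbortError";
    return abort;
  }
  return err;
}
```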

Co-authored-by: Isaac
… on the route

Two follow-ups from the review-pr pass:

- `resolveUserId()` and `asUser()` now `.trim()` the
  `x-forwarded-user` / `x-forwarded-access-token` headers before the
  truthiness check. A whitespace-only value previously passed through
  as a valid identity, so the cache layer collapsed distinct
  misconfigured callers into a shared scope. The round-5 fix in
  `deriveMetricExecutorKey` only caught this for metrics; tightening
  the shared helper covers all OBO paths (`.obo.sql` queries
  included).

- `AnalyticsPlugin.setup()` now latches a `loadMetricRegistry()`
  failure on `metricRegistryLoadError` and `_handleMetricRoute`
  returns 503 `METRIC_REGISTRY_LOAD_FAILED` until it clears. The
  previous behavior — empty registry plus a warn log — turned a
  malformed `metric.json` into "every metric returns 404 Metric not
  found", which masked deployment errors. The 503 + distinct code
  give deployment pipelines + the route surface itself a clear
  signal. Other analytics routes (`/query/:key`, `/arrow-result`)
  stay available so a single-route configuration error doesn't break
  the whole plugin. Public message is generic; the parser-failure
  detail goes to `event.setContext()` only.
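
The header tightening can be sketched like this; the header name comes from the commit text, the resolver shape is an assumption:

```typescript
// Trim before the truthiness check: a whitespace-only x-forwarded-user must
// not pass as a valid identity and collapse callers into a shared scope.
function resolveUserId(
  headers: Record<string, string | undefined>,
): string | null {
  const trimmed = headers["x-forwarded-user"]?.trim();
  return trimmed ? trimmed : null;
}
```

The same pattern applies to `x-forwarded-access-token` in `asUser()`, covering all OBO paths rather than just the metric route.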

Co-authored-by: Isaac
…y fetcher

`createWorkspaceDescribeFetcher` was calling
`client.statementExecution.executeStatement(...)` without `wait_timeout`.
On a cold (or recently-resumed) SQL warehouse, the API returns
synchronously with `state: "PENDING"` and an empty `result.data_array`
before the statement has actually run. `parseDescribeTableExtendedJson`
reads that as "DESCRIBE TABLE EXTENDED returned no rows", marks the
metric as a failure, and the build-time `metrics.metadata.json` ships
with empty `measures` / `dimensions` for the affected metric. The
runtime fail-closed gate (`knownMeasures.length === 0`) then 503s every
metric request — even for views that exist in the warehouse.

Add `wait_timeout: "30s"` to match the pattern in the SDK's own
example (`examples/workspace/sql/execute-query.ts`). The API now
blocks up to 30 seconds waiting for the statement to complete; once the
warehouse is warm, DESCRIBE returns synchronously with populated
`data_array`. `extractMetricColumns` then sees real columns and the
build-time bundle ships with proper measure / dimension metadata.
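
The fix can be sketched against a stand-in client interface (the real call is the Databricks SDK's `client.statementExecution.executeStatement`; the arg shape here is reduced to what the commit discusses, and the warehouse id is illustrative):

```typescript
// Minimal stand-in for the SDK surface this fetcher touches.
interface ExecuteArgs {
  statement: string;
  warehouse_id: string;
  wait_timeout?: string;
}
interface StatementClient {
  executeStatement(args: ExecuteArgs): { state: string };
}

function describeMetricView(client: StatementClient, fqn: string) {
  return client.executeStatement({
    statement: `DESCRIBE TABLE EXTENDED ${fqn} AS JSON`,
    warehouse_id: "wh-123", // illustrative
    // Without this, a cold warehouse returns state: "PENDING" with an empty
    // data_array; with it, the API blocks up to 30s for completion.
    wait_timeout: "30s",
  });
}
```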

Symptom this resolves: dev-playground `/metrics` route showed
"Could not load the revenue metric." for both demo metrics even
though `appkit_demo.public.revenue_metrics` and
`appkit_demo.public.customer_metrics` exist in the workspace and
DESCRIBE-by-curl returns full schema.

Co-authored-by: Isaac
useMetricView's `onError` callback bails silently when
`abortController.signal.aborted` is true, but `onMessage` had no
matching guard. Under React StrictMode's double-mount, the first
mount's cleanup aborted controller A, but the server-side SSE writer
still emitted a final `event: error` envelope on the already-open
stream (cancellation hand-off). That envelope arrived at the hook's
onMessage, sailed past the missing guard, hit the
`parsed.type === "error"` branch, and surfaced a user-visible error
before the second mount's data arrived. Production builds (StrictMode
no-op) didn't see it; dev did. Pages with concurrent useMetricView
subscriptions to the same metric (e.g. acmecorp-arr's /home with
rollup + at-risk on the same view) amplified the visible window.

Mirror the `onError` guard at the top of `onMessage`. Once the
controller is aborted, drop ALL events from that stream uniformly
(`result`, `arrow`, `error`) — the next mount/refetch creates a fresh
controller and runs cleanly.

Regression test simulates a late "error" envelope arriving after
manual abort and asserts `error` stays `null`.
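
The mirrored guard can be sketched like this; the envelope type and the handler factory are assumptions, not the hook's real internals:

```typescript
// Post-abort, drop ALL event types uniformly — result, arrow, and error —
// so a late SSE envelope from the old stream can't surface a stale error.
type Envelope = { type: "result" | "arrow" | "error"; payload?: unknown };

function makeOnMessage(
  controller: AbortController,
  setError: (e: string | null) => void,
  setData: (d: unknown) => void,
) {
  return (parsed: Envelope) => {
    // Same guard onError already had: aborted controller means this
    // stream's events belong to a dead mount.
    if (controller.signal.aborted) return;
    if (parsed.type === "error") setError(String(parsed.payload));
    else setData(parsed.payload);
  };
}
```

The next mount or refetch creates a fresh `AbortController`, so live streams are unaffected by the guard.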

Co-authored-by: Isaac
…ive DESCRIBE

These artifacts had been generated against an unpopulated DESCRIBE
response (the empty-`data_array` PENDING-state case fixed in
cc5639a). With `wait_timeout: "30s"` now in place, the build-time
type-generator produces complete metadata for both demo metric views:

- revenue: 4 measures (mrr, arr, new_arr, churned_arr), 3 dimensions
  (region, segment, created_at) with full time-grain enum on
  created_at.
- customer_metrics: 3 measures (active_accounts, churn_rate, avg_ltv),
  3 dimensions (segment, region, csm_email).

Co-authored-by: Isaac