From b7df01454cb8aec45fd709cb334e13a27fabfae7 Mon Sep 17 00:00:00 2001
From: Zack Asofsky
Date: Fri, 5 Jun 2026 10:21:22 -0400
Subject: [PATCH] Add CLI/golden-validation docs and skills; open validate read-only
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Documentation:
- README: document all seven CLI commands (was only export/report) and
  switch examples to the built MemorySnapshotDataTools binary on PATH
  instead of `dotnet run --project ...`.
- Add docs/golden-validation.md covering the Unity golden extractor →
  export → validate workflow, compared metrics, tolerances, JSON schemas,
  and failure formats; register it in the mkdocs nav.
- Add `add-cli-command` skill describing the CliOptions →
  CommandLineBuilder → Program → Core runner wiring, with README
  documentation as a required step.
- Add `validate-golden` skill; cross-link it and the doc from the
  memory-snapshot-report skill and README.

Code:
- GoldenValidationRunner now opens the database read-only
  (ACCESS_MODE=READ_ONLY / Mode=ReadOnly), matching the report backends
  per CLAUDE.md rule 5 (validation only reads).

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 .claude/skills/add-cli-command/SKILL.md       | 178 +++++++++++++++
 .../skills/memory-snapshot-report/SKILL.md    |   4 +
 .claude/skills/validate-golden/SKILL.md       |  92 ++++++++
 Core/Validation/GoldenValidationRunner.cs     |   6 +-
 README.md                                     | 120 +++++++++-
 docs/golden-validation.md                     | 210 ++++++++++++++++++
 mkdocs.yml                                    |   1 +
 7 files changed, 597 insertions(+), 14 deletions(-)
 create mode 100644 .claude/skills/add-cli-command/SKILL.md
 create mode 100644 .claude/skills/validate-golden/SKILL.md
 create mode 100644 docs/golden-validation.md

diff --git a/.claude/skills/add-cli-command/SKILL.md b/.claude/skills/add-cli-command/SKILL.md
new file mode 100644
index 0000000..b01973e
--- /dev/null
+++ b/.claude/skills/add-cli-command/SKILL.md
@@ -0,0 +1,178 @@
+---
+name: add-cli-command
+description: Add a new subcommand to the Memory Snapshot Data Tool CLI (the .NET 10 `MemorySnapshotDataTools` exe), wiring it through CliOptions → CommandLineBuilder → Program → a Core runner, then documenting it in README.md. Use when asked to add, create, or expose a new CLI command/subcommand (alongside export, batch-export, report, multi-report, validate, summary, upgrade), or a new flag/argument on an existing one.
+---
+
+# Add a new CLI command
+
+The CLI is built with `System.CommandLine`. A subcommand is wired through **three files in
+`Cli/`** plus a **Core runner** that holds the logic, and is then **documented in `README.md`**.
+Match the existing commands (`export`, `batch-export`, `report`, `multi-report`, `validate`,
+`summary`, `upgrade`) — don't invent a new shape.
+
+**Paths below are relative to the repo root** (the directory containing `MemorySnapshotDataTools.sln`).
+
+## When to use
+
+- Adding a brand-new subcommand (e.g. `diff`, `prune`, `export-csv`).
+- Adding a new argument or `--option` to an existing subcommand.
+- Exposing an existing Core capability through the CLI.
+
+For SQL-touching logic inside the new command, also follow [`CLAUDE.md`](../../../CLAUDE.md) and
+[`docs/sql-safety.md`](../../../docs/sql-safety.md) — SQL safety is a first-class rule here.
+
+## The files you touch
+
+| File | What you add |
+|------|--------------|
+| [`Cli/CliOptions.cs`](../../../Cli/CliOptions.cs) | A `CommandKind` enum value + any new option/arg properties on `CliOptions`. |
+| [`Cli/CommandLineBuilder.cs`](../../../Cli/CommandLineBuilder.cs) | The `Command`, its `Argument<>`/`Option<>`s, the `SetAction` handler, `root.Add(...)`, and a new `Func<CliOptions, int>` parameter on `Build`. |
+| [`Cli/Program.cs`](../../../Cli/Program.cs) | A `RunXxx(CliOptions)` handler that calls the Core runner, plus threading it through the `Build(...)` call in `Main`. |
+| `Core/...` (a runner) | The actual logic — a `XxxRunner` + a `XxxRunOptions`, mirroring `Core/Report/SummaryReportRunner.cs`, `Core/Export/ExportRunner.cs`, etc. |
+| [`Tests/`](../../../Tests) | Tests for the **Core runner / calculator** (there are no `CommandLineBuilder` tests; test the logic, not the parser). |
+| [`README.md`](../../../README.md) | **Required** — a usage subsection + example for the new command. See [Document in README.md](#document-in-readmemd). |
+
+## Step by step
+
+### 1. `Cli/CliOptions.cs`
+
+- Add a value to the `CommandKind` enum.
+- Add a property to `CliOptions` for each new argument/option. Reuse existing ones where they fit
+  (`Verbose`, `ReportDbPath`, `Destination`, …) before adding new fields.
+
+### 2. `Cli/CommandLineBuilder.cs`
+
+Inside `Build(...)`, following the existing blocks:
+
+```csharp
+// ---- mycommand ----
+var myCmd = new Command("mycommand", "One-line description shown in --help.");
+var inputArg = new Argument<string>("input")
+{
+    Description = "Path to ...",
+    Arity = ArgumentArity.ExactlyOne,
+};
+myCmd.Add(inputArg);
+
+var someOpt = new Option<string>("--mode")
+{
+    Description = "...: a, b, or c.",
+    DefaultValueFactory = _ => "a",
+};
+someOpt.AcceptOnlyFromAmong("a", "b", "c"); // validate enum-like options at parse time
+myCmd.Add(someOpt);
+
+myCmd.SetAction((ParseResult parseResult) =>
+{
+    var inputPath = ExpandPath(parseResult.GetValue(inputArg)!); // ALWAYS ExpandPath path args
+    if (!File.Exists(inputPath))                                 // validate existence, return 1
+    {
+        Console.Error.WriteLine($"Input file not found: {inputPath}");
+        return 1;
+    }
+    var options = new CliOptions
+    {
+        Command = CommandKind.MyCommand,
+        // ... map parsed values onto CliOptions ...
+    };
+    return runMyCommand(options);
+});
+```
+
+Then:
+- Register it: `root.Add(myCmd);` near the bottom of `Build`.
+- Add a `Func<CliOptions, int> runMyCommand` parameter to the `Build(...)` signature (the params are
+  positional — keep the order consistent with `Program.Main`).
+
+### 3. `Cli/Program.cs`
+
+- Add a `RunMyCommand(CliOptions options)` static handler that delegates to your Core runner.
+  - If the command **reads an exported database**, call `SchemaGate.Check(path)` first (see
+    `RunReport`/`RunSummary`).
+  - If it does cancellable work, use `CreateCancellationSource()` and catch
+    `OperationCanceledException` → return `2` (see `RunExport`).
+- Add `RunMyCommand` to the `CommandLineBuilder.Build(...)` call in `Main` in the matching position.
+
+### 4. Core runner (the logic)
+
+Put real work in `Core/...`, not in `Cli/`. Mirror an existing runner
+(`SummaryReportRunner`, `ExportRunner`): a `XxxRunOptions` record + a `static int Run(...)` that
+returns an exit code and reports via `IProgressReporter`. If it builds SQL, parameterize values and
+validate identifiers per [`docs/sql-safety.md`](../../../docs/sql-safety.md); open the DB
+**read-only** if it only reads (`Mode=ReadOnly` / `ACCESS_MODE=READ_ONLY`).
+
+### 5. Tests
+
+Add tests under [`Tests/`](../../../Tests) targeting the Core runner / calculator (e.g.
+`BatchExportRunnerTests.cs`, `SummaryMetricsCalculatorTests.cs`). Helpers you assert on must be
+**`public`** — `InternalsVisibleTo` is a no-op here, so Core internals are not visible to Tests.
+
+### 6. Document in README.md
+
+**This step is required — a new command is not done until the README documents it.** In
+[`README.md`](../../../README.md):
+
+- Add a `### ` subsection under **How to use**, after the existing
+  command subsections, with the invocation form and a worked example, matching the style of the
+  `export` and `report` sections:
+
+  ````markdown
+  ### 
+
+  ```bash
+  dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj -- mycommand [options]
+  ```
+
+  - **`--option`:** what it does (default: …).
+
+  **Example:**
+
+  ```bash
+  dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj -- mycommand ./in.duckdb --mode b
+  ```
+  ````
+
+- If the command changes the headline behavior, add a bullet to the **What it does** list at the
+  top, and update **Output**/**Schema** if it writes new tables/files.
+- Consider also updating, when relevant: the **Direct CLI invocation** list in
+  [`run-memory-snapshot-data-tool/SKILL.md`](../run-memory-snapshot-data-tool/SKILL.md), and
+  `docs/intro.md` / `docs/runbook.md`.
+
+### 7. Build, test, and verify it runs
+
+```bash
+dotnet build MemorySnapshotDataTools.sln
+dotnet test MemorySnapshotDataTools.sln
+dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj -- --help            # command listed?
+dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj -- mycommand --help  # args/options correct?
+```
+
+To exercise the full pipeline (build → run → screenshot), use the `run-memory-snapshot-data-tool`
+skill's driver.
+
+## Conventions & gotchas
+
+- **Always `ExpandPath(...)` path arguments/options** before use — it expands `~`, env vars, and
+  resolves to a full path. Every existing handler does this.
+- **Validate inputs in `SetAction` and return `1`** for bad args / missing files, before
+  constructing `CliOptions`.
+- **Exit codes are meaningful and asserted on:** `0` success, `1` bad args / file not found,
+  `2` cancelled (Ctrl-C), `3` validation/error. Keep yours consistent.
+- **Enum-like string options** → `option.AcceptOnlyFromAmong("a","b","c")` so bad values are
+  rejected at parse time (see `--validate`, `--destination`).
+- **`Build`'s handlers are positional `Func<CliOptions, int>` parameters.** Add yours in the same
+  position in both the `Build` signature and the `Main` call site, or the wrong handler runs.
+- **Reading an exported DB?** Call `SchemaGate.Check(path)` first so a stale/newer schema fails
+  with a clear message instead of a confusing query error.
+- **SQL safety is non-negotiable** — never interpolate external values into SQL; bind parameters and
+  validate identifiers (`CLAUDE.md`, `docs/sql-safety.md`).
+
+## Checklist
+
+- [ ] `CommandKind` value + `CliOptions` properties added (`Cli/CliOptions.cs`).
+- [ ] Command, args/options, `SetAction`, `root.Add`, and new `Build` param added (`Cli/CommandLineBuilder.cs`).
+- [ ] `RunXxx` handler added and threaded through `Main` (`Cli/Program.cs`).
+- [ ] Logic lives in a Core `XxxRunner` (+ `XxxRunOptions`); SQL is parameterized / read-only.
+- [ ] Tests added for the Core logic (public helpers).
+- [ ] **`README.md` documents the command** (usage + example), other docs/skill updated if relevant.
+- [ ] `dotnet build` + `dotnet test` pass; `-- --help` and `-- mycommand --help` look right.
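Note on step 7 of the skill above: the "command listed?" check can be scripted instead of eyeballed. The sketch below is only an illustration and not part of the patch; `command_listed` and `fake_tool` are hypothetical helper names, and the stand-in exists so the snippet runs without the real binary.

```bash
#!/usr/bin/env bash
# Hypothetical smoke check: does a subcommand name appear in `TOOL --help`?
# Assumption: the help text lists each subcommand name as its own word.

# command_listed TOOL NAME -> exit status 0 if NAME appears as a whole word.
# Note: grep -w treats "-" as a word boundary, so "export" also matches
# inside "batch-export".
command_listed() {
  local tool=$1 name=$2
  "$tool" --help 2>&1 | grep -qw -- "$name"
}

# Stand-in for the real CLI so the sketch is self-contained.
fake_tool() {
  printf 'Commands:\n  export\n  report\n  mycommand\n'
}

if command_listed fake_tool mycommand; then
  echo "mycommand is listed"
else
  echo "mycommand is missing from --help" >&2
fi
```

For real use, pass the built `MemorySnapshotDataTools` binary (or a wrapper around the `dotnet run ... --` form) in place of `fake_tool`.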
diff --git a/.claude/skills/memory-snapshot-report/SKILL.md b/.claude/skills/memory-snapshot-report/SKILL.md
index d09cfc2..a70ad92 100644
--- a/.claude/skills/memory-snapshot-report/SKILL.md
+++ b/.claude/skills/memory-snapshot-report/SKILL.md
@@ -50,6 +50,10 @@ dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj -- batch-export .snap .duckdb --validate minimal
+```
+
+DuckDB recommended (`.db` + `--destination sqlite` also works). Needs the `native_objects`,
+`native_roots`, and `summary_metrics` tables — a current-schema export has them.
+
+### 3. Run `validate`
+
+```bash
+dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj -- validate _golden.json .duckdb [--out result.json]
+```
+
+- Writes `{name}_golden_validation_result.json` next to the golden file unless `--out` is set, and
+  prints the result to stdout.
+- **Exit codes:** `0` = passed, `1` = metric mismatch(es), `3` = error (bad input / unparseable
+  golden / unsupported DB extension). The DB is schema-gated first; an older **major** schema is
+  rejected — re-export rather than `upgrade` (upgrade only covers minor analysis-view changes).
+
+## What it compares (summary — details in the doc)
+
+- **Native types:** `AssetBundle` (from `native_objects`) and `SerializedFile` (from `native_roots`,
+  `area_name LIKE '%serializedfile%'`) — Count and AllocatedBytes **exact**, ResidentBytes within tolerance.
+- **PMR:** sum of PersistentManager Remapper roots — Allocated exact, Resident within tolerance.
+- **Summary page:** Totals + `AllocatedMemoryDistribution` + `ManagedHeapUtilization` rows from the
+  `summary_metrics` table.
+
+**Tolerances:** counts & allocated bytes must be **exact**; resident = `max(64 KB, 1%)`; summary
+committed = `max(64 KB, 1%)`, except **estimated** rows (Graphics, Untracked — golden
+`ResidentAvailable=false`) which get `max(1 MB, 5%)`. Summary comparison is **skipped** when the
+golden file has no Summary rows (older/partial golden files still validate on native metrics).
+
+## Reading a failure
+
+Each `Failures[]` entry is `\: expected=…, actual=…`, e.g.
+`SerializedFile.Count: expected=34, actual=33` or
+`Summary[AllocatedMemoryDistribution].Native.Committed: expected=…, actual=…`. A
+`… row missing from export` means the export lacks a Summary row the golden file has → re-export
+with the current tool. Everything mismatching → golden and DB came from different snapshots.
+
+## Changing what validation compares
+
+Touch **both** sides so they don't drift:
+
+- **Tool:** SQL constants in `Core/Validation/GoldenValidationQueries.cs`, comparison/tolerances in
+  `Core/Validation/GoldenValidationRunner.cs`, models in `Core/Validation/GoldenValidationModels.cs`.
+- **Unity extractor:** `UnityPackage/com.unity.memory-snapshot-data-tools/Editor/GoldenValueExtractor.cs`
+  and shared names in `…/Editor/MemorySnapshotValidationHelpers.cs`.
+- Validation SQL must stay constant-only (no interpolated external values) per
+  [`docs/sql-safety.md`](../../../docs/sql-safety.md).
+- Add/extend tests in [`Tests/GoldenValidationRunnerTests.cs`](../../../Tests/GoldenValidationRunnerTests.cs)
+  (they build a SQLite DB + golden JSON in temp and assert pass/fail). Test helpers must be `public`.
+- Run `dotnet test MemorySnapshotDataTools.sln` and update [`docs/golden-validation.md`](../../../docs/golden-validation.md).
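Note on the tolerances summarized in the skill above: the rule (pass when the difference is within `max(absolute, relative × max(|expected|, |actual|))`) can be worked through by hand. The sketch below is an illustration only, not the runner's actual code. It assumes 64 KB means 65536 bytes, that the relative part is a whole-number percentage, and that integer division is acceptable; `within_tolerance` and `abs_val` are hypothetical names.

```bash
#!/usr/bin/env bash
# Sketch of the documented tolerance rule; not the tool's implementation.
# Assumptions: byte values are integers, 64 KB = 65536 bytes, and the
# percentage is applied with truncating integer division.

abs_val() {
  local v=$1
  if (( v < 0 )); then echo $(( -v )); else echo "$v"; fi
}

# within_tolerance EXPECTED ACTUAL ABS_BYTES REL_PERCENT
# Exit status 0 = within tolerance, 1 = outside.
within_tolerance() {
  local expected=$1 actual=$2 abs_bytes=$3 rel_percent=$4
  local e a larger rel tol diff
  e=$(abs_val "$expected")
  a=$(abs_val "$actual")
  larger=$(( e > a ? e : a ))
  rel=$(( larger * rel_percent / 100 ))
  tol=$(( abs_bytes > rel ? abs_bytes : rel ))
  diff=$(abs_val $(( expected - actual )))
  (( diff <= tol ))
}

# Resident-style check, max(64 KB, 1%): a 5000-byte delta on ~1 MB.
if within_tolerance 1000000 1005000 65536 1; then echo "pass"; else echo "fail"; fi

# The Native.Committed values from the sample result JSON (5000000 vs 9000000).
if within_tolerance 5000000 9000000 65536 1; then echo "pass"; else echo "fail"; fi
```

Passing `0 0` for the last two arguments reduces the check to the exact comparison used for counts and allocated bytes.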
diff --git a/Core/Validation/GoldenValidationRunner.cs b/Core/Validation/GoldenValidationRunner.cs
index 5f2c9b2..3fddfa7 100644
--- a/Core/Validation/GoldenValidationRunner.cs
+++ b/Core/Validation/GoldenValidationRunner.cs
@@ -121,14 +121,16 @@ private static ExportedMetrics QueryExportedMetrics(string databasePath)
 
     private static ExportedMetrics QueryDuckDb(string databasePath)
     {
-        using var connection = new DuckDBConnection($"Data Source={databasePath}");
+        // Validation only reads; open read-only (defense-in-depth, per CLAUDE.md rule 5).
+        using var connection = new DuckDBConnection($"Data Source={databasePath};ACCESS_MODE=READ_ONLY");
         connection.Open();
         return Query(connection, isDuckDb: true);
     }
 
     private static ExportedMetrics QuerySqlite(string databasePath)
     {
-        using var connection = new SqliteConnection($"Data Source={databasePath}");
+        // Validation only reads; open read-only (defense-in-depth, per CLAUDE.md rule 5).
+        using var connection = new SqliteConnection($"Data Source={databasePath};Mode=ReadOnly");
         connection.Open();
         return Query(connection, isDuckDb: false);
     }
diff --git a/README.md b/README.md
index ae0a7f8..e01deae 100644
--- a/README.md
+++ b/README.md
@@ -4,8 +4,15 @@ Single CLI to **export** Unity memory snapshots (`.snap`) to DuckDB or SQLite an
 
 ## What it does
 
-- **Export:** Reads a `.snap` file, parses and extracts snapshot data, and writes it to a DuckDB (default) or SQLite file.
-- **Report:** Connects to an exported database (DuckDB or SQLite), runs report queries, and produces a self-contained HTML report with sortable tables.
+The tool is a single CLI with these commands:
+
+- **`export`:** Read a `.snap` file, parse and extract its data, and write it to a DuckDB (default) or SQLite database.
+- **`batch-export`:** Export every `.snap` in a directory to its own database with the same basename.
+- **`report`:** Run report queries against one exported database and produce a self-contained HTML report with sortable tables.
+- **`multi-report`:** Produce a single HTML report comparing multiple exported databases in a directory.
+- **`validate`:** Compare an exported database against a Unity golden JSON file.
+- **`summary`:** Print a high-level memory-usage summary for a `.snap` or database (writes no database).
+- **`upgrade`:** Upgrade an exported database's analysis views/indexes to the current schema version, in place.
 
 ## Prerequisites
 
@@ -19,18 +26,23 @@ Single CLI to **export** Unity memory snapshots (`.snap`) to DuckDB or SQLite an
 
 ## How to use
 
-Use the **MemorySnapshotDataTools** directory as the project root. Run the CLI with the Cli project:
+Build a Release binary once, then add its output directory to your `PATH` so you can invoke `MemorySnapshotDataTools` directly:
 
 ```bash
-dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj -- [args...]
+dotnet build -c Release
+# Add the build output to PATH. is your runtime: osx-arm64, osx-x64, linux-x64, or win-x64.
+export PATH="$PATH:$(pwd)/Cli/bin/Release/net10.0/"
+MemorySnapshotDataTools [args...]
 ```
 
-Or from the `Cli` directory: `dotnet run -- [args...]`.
+Add that `export PATH=...` line to your shell profile to make it permanent. All examples below use `MemorySnapshotDataTools `.
+
+> Working from source without installing? Run `dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj -- ` from the repo root instead.
 
 ### Export a snapshot to a database
 
 ```bash
-dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj -- export [options]
+MemorySnapshotDataTools export [options]
 ```
 
 - Use a `.duckdb` extension for DuckDB (default) or `.db` for SQLite.
@@ -39,19 +51,37 @@ dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj -- export [options]
+```
+
+- Exports every top-level `.snap` in the directory to a `.duckdb`/`.db` file with the same basename.
+- **`--filter `:** case-insensitive substring filter on snapshot filenames (e.g. `MyGame`).
+- **`--skip-existing`:** skip a file when its output database exists and is newer than the `.snap`.
+- **`--continue-on-error`:** keep going after a single-file failure (default: `true`).
+- Also accepts the same `--destination`, `--validate`, `--batch-size`, `--queue-capacity`, and `--verbose` options as `export`.
+
+**Example:**
+
+```bash
+MemorySnapshotDataTools batch-export ./captures --filter MyGame --skip-existing --verbose
 ```
 
 ### Generate a report from a database
 
 ```bash
-dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj -- report [--out report.html] [options]
+MemorySnapshotDataTools report [--out report.html] [options]
 ```
 
 - **`--out`** path: where to write the HTML file. If omitted, writes to a temp file and opens it in the browser.
@@ -61,7 +91,73 @@ dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj -- report [--out report.html] [options]
+```
+
+- Builds one HTML report comparing every `.duckdb`/`.db` snapshot database in the directory.
+- **`--filter `:** case-insensitive substring filter on database filenames (e.g. `MyGame`).
+- **`--out`** path: where to write the HTML file. If omitted, writes to a temp file and opens it in the browser.
+- **`--title "Title"`:** report title (default: "Multi-Snapshot Memory Report").
+- **`--no-reports`:** skip generating per-snapshot drill-down reports (faster; rows are not clickable).
+- **`--verbose`:** print progress and timings.
+
+**Example:**
+
+```bash
+MemorySnapshotDataTools multi-report ./databases --out multi.html --verbose
+```
+
+### Validate an export against Unity golden values
+
+```bash
+MemorySnapshotDataTools validate [--out result.json]
+```
+
+- Compares the exported database against a `*_golden.json` produced by Unity's GoldenValueExtractor.
+- **`--out`** path: where to write the validation result JSON (default: next to the golden file).
+- Exit codes: `0` = passed, `1` = metric mismatch(es), `3` = error.
+- For the full workflow — extracting the golden file in Unity, what each metric compares, and the
+  tolerances — see [docs/golden-validation.md](docs/golden-validation.md).
+
+**Example:**
+
+```bash
+MemorySnapshotDataTools validate ./memory_golden.json ./out.duckdb
+```
+
+### Print a memory-usage summary
+
+```bash
+MemorySnapshotDataTools summary [--verbose]
+```
+
+- Prints a high-level memory breakdown to the console. Accepts **either** a `.snap` snapshot or an exported database, and writes **no** database. (Decoding a raw `.snap` is slower than reading a database.)
+- **`--verbose`:** print progress while decoding a snapshot.
+
+**Example:**
+
+```bash
+MemorySnapshotDataTools summary ./out.duckdb
+```
+
+### Upgrade a database's schema in place
+
+```bash
+MemorySnapshotDataTools upgrade
+```
+
+- Upgrades an exported database's analysis views/indexes to the current minor schema version, in place — no re-export needed. If the database's major version is behind, it reports that a re-export is required instead.
+
+**Example:**
+
+```bash
+MemorySnapshotDataTools upgrade ./out.duckdb
 ```
 
 ## Output
@@ -93,7 +189,7 @@ dotnet build
 dotnet test
 ```
 
-To run the CLI: `dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj --` or publish the Cli project (see below).
+To run the CLI after building, put `Cli/bin/Release/net10.0/` on your `PATH` (see [How to use](#how-to-use)), or publish the Cli project (see below).
 
 ## Publish (versioned artifacts)
 
@@ -101,4 +197,4 @@ From the project root, run `./publish.sh` (macOS/Linux) or `./publish.ps1` (Wind
 
 ## AI IDE integration
 
-A project skill for Cursor (and similar AI IDEs) is in `.cursor/skills/memory-snapshot-report/`. It describes the export and report workflow and when to use it.
+Project skills for Claude (and similar AI IDEs) are in `.claude/skills/**`.
diff --git a/docs/golden-validation.md b/docs/golden-validation.md
new file mode 100644
index 0000000..1862e0a
--- /dev/null
+++ b/docs/golden-validation.md
@@ -0,0 +1,210 @@
+# Golden value validation
+
+Golden validation answers one question: **does the database this tool exports from a `.snap`
+agree with what Unity's own Memory Profiler reports for the same snapshot?** You capture
+reference ("golden") numbers from inside Unity, then ask the CLI to re-derive the same numbers
+from an exported database and diff the two within set tolerances.
+
+Use it to catch export/parsing regressions — if a schema, query, or extraction change quietly
+shifts a size or count, validation fails with the exact metric and the expected/actual values.
+
+## End-to-end workflow
+
+```
+Unity Editor                          MemorySnapshotDataTools CLI
+────────────                          ───────────────────────────
+.snap ──(Memory Profiler)──► {name}_golden.json
+.snap ──────────────────────────────► export ──► {name}.duckdb / .db
+                                      validate {name}_golden.json {name}.duckdb
+                                              │
+                                              ▼
+                              {name}_golden_validation_result.json (+ exit code)
+```
+
+The golden file and the database **must come from the same `.snap`**, or every metric will
+differ.
+
+> The CLI examples below call `MemorySnapshotDataTools` directly. Build it once with
+> `dotnet build -c Release` and put `Cli/bin/Release/net10.0/` on your `PATH` (see the
+> README's *How to use*), or run from source with
+> `dotnet run --project Cli/MemorySnapshotDataTools.Cli.csproj -- `.
+
+## Step 1 — Extract golden values in Unity
+
+The extractor is a separate Unity Editor package, **not** part of the .NET CLI:
+
+- **Package:** `com.unity.memory-snapshot-data-tools` (v0.2.0), in
+  [`UnityPackage/`](https://github.com/Unity-Technologies/MemorySnapshotDataTools/tree/main/UnityPackage)
+  of this repo. Requires Unity **2022.3+** and `com.unity.memoryprofiler` **1.1.12**.
+- **Import:** add it to the target Unity project's `Packages/manifest.json` via a local `file:`
+  path pointing at `UnityPackage/com.unity.memory-snapshot-data-tools`.
+- **Run:** **Tools → Memory Snapshot Validation → Extract Golden Values**, then pick a `.snap`.
+  It writes `{name}_golden.json` next to the snapshot and reveals it in the file browser. The
+  Console logs a summary of the extracted metrics.
+
+Under the hood the extractor loads the snapshot through the Memory Profiler, reads
+`ProcessedNativeRoots`, and invokes the Memory Profiler's **own** summary model builders
+(`AllMemorySummaryModelBuilder`, `ManagedMemorySummaryModelBuilder`,
+`ResidentMemorySummaryModelBuilder`). That is deliberate: the golden Summary-page numbers are
+produced by the same code as the Memory Profiler UI, so a passing validation means the tool
+matches what a developer sees in the profiler — not just an independent re-implementation.
+
+If Summary-model extraction fails for a snapshot, the native metrics are still written and the
+summary arrays are left empty. Such a golden file is still valid; the tool simply skips the
+Summary comparison for it (see [backward compatibility](#backward-compatibility)).
+
+## Step 2 — Export the same snapshot to a database
+
+```bash
+MemorySnapshotDataTools export .snap .duckdb --validate minimal
+```
+
+DuckDB (`.duckdb`) is recommended; SQLite (`.db`, with `--destination sqlite`) also works.
+Validation needs the `native_objects`, `native_roots`, and `summary_metrics` tables, which a
+current-schema export produces.
+
+## Step 3 — Run `validate`
+
+```bash
+MemorySnapshotDataTools validate _golden.json .duckdb [--out result.json]
+```
+
+- The database is checked against the current schema version first; a database from an older
+  **major** schema (missing tables/columns validation needs) is rejected — re-export it.
+- The result JSON is written next to the golden file as `{name}_golden_validation_result.json`
+  unless `--out` is given. The full result is also printed to stdout.
+- **Exit codes:** `0` = passed, `1` = one or more metric mismatches, `3` = error (bad input,
+  unparseable golden JSON, unsupported database extension, etc.).
+
+## What gets compared
+
+| Metric | Golden source | Exported-DB source | Tolerance |
+|--------|---------------|--------------------|-----------|
+| `AssetBundle` Count / Allocated | `NativeTypeMetrics[AssetBundle]` | `native_objects` where `native_type_name='AssetBundle'` and not destroyed: `COUNT`, `SUM(size_bytes)` | exact |
+| `AssetBundle` Resident | same | `SUM(resident_size_bytes)` | resident |
+| `SerializedFile` Count / Allocated | `NativeTypeMetrics[SerializedFile]` | `native_roots` where `area_name LIKE '%serializedfile%'`: `COUNT`, `SUM(accumulated_size_bytes)` | exact |
+| `SerializedFile` Resident | same | `SUM(resident_size_bytes)` | resident |
+| `PMR` Allocated | sum of `NativeRootMetrics[*].AllocatedBytes` | sum of `native_roots` Remapper / `PersistentManager…Remapper` rows' `accumulated_size_bytes` | exact |
+| `PMR` Resident | sum of `NativeRootMetrics[*].ResidentBytes` | sum of those rows' `resident_size_bytes` | resident |
+| `Summary.TotalAllocated` / `TotalResident` | `TotalAllocatedBytes` / `TotalResidentBytes` | `summary_metrics` Totals row | committed / resident |
+| `Summary[AllocatedMemoryDistribution].*` | `AllocatedMemoryDistribution[]` | `summary_metrics` group rows | committed / resident |
+| `Summary[ManagedHeapUtilization].*` | `ManagedHeapUtilization[]` | `summary_metrics` group rows | committed / resident |
+
+"PMR" = PersistentManager Remapper; the golden side lists individual Remapper roots and the tool
+compares their **sum**, not each row.
+
+### Tolerances
+
+Memory accounting diverges slightly between Unity's full memory-map post-processing and the
+exported tables, so non-counting metrics allow a small delta. A value passes when it is within
+`max(absolute, relative × max(|expected|, |actual|))`:
+
+| Comparison | Rule |
+|------------|------|
+| Counts (`*.Count`) and allocated bytes for tracked types / PMR | **exact** — must be equal |
+| Resident bytes (everywhere) | `max(64 KB, 1%)` |
+| Summary committed — normal rows | `max(64 KB, 1%)` |
+| Summary committed — **estimated** rows (Graphics, Untracked) | `max(1 MB, 5%)` |
+
+A row is "estimated" when its golden `ResidentAvailable` is `false` (Graphics and Untracked are
+derived from platform stats, not measured directly). Resident is only compared when **both**
+golden and exported rows have resident available. The Memory Profiler labels Untracked as
+`Untracked*`; the trailing `*` is tolerated on category-name matching.
+
+### Backward compatibility
+
+The Summary comparison is skipped entirely when the golden file has no
+`AllocatedMemoryDistribution` and no `ManagedHeapUtilization` rows. Older golden files (and any
+where Summary extraction failed in Unity) therefore still validate on the native metrics alone.
+
+## Golden JSON schema
+
+Produced by `GoldenValueExtractor`; consumed by `GoldenSnapshotFile`.
+
+```jsonc
+{
+  "SnapshotName": "MyGame",
+  "SnapshotPath": "/path/to/MyGame.snap",
+  "FormatVersion": 17,
+  "ExtractedAtUtc": "2026-01-01T00:00:00.0000000Z",
+  "NativeTypeMetrics": [
+    { "NativeTypeName": "AssetBundle", "Count": 12, "AllocatedBytes": 0, "ResidentBytes": 0 },
+    { "NativeTypeName": "SerializedFile", "Count": 34, "AllocatedBytes": 0, "ResidentBytes": 0 }
+  ],
+  "NativeRootMetrics": [
+    { "AreaName": "PersistentManager.Remapper", "ObjectName": "Remapper", "AllocatedBytes": 0, "ResidentBytes": 0 }
+  ],
+  "TotalAllocatedBytes": 0,
+  "TotalResidentBytes": 0,
+  "AllocatedMemoryDistribution": [
+    { "Name": "Native", "CommittedBytes": 0, "ResidentBytes": 0, "ResidentAvailable": true },
+    { "Name": "Managed", "CommittedBytes": 0, "ResidentBytes": 0, "ResidentAvailable": true },
+    { "Name": "Executables & Mapped", "CommittedBytes": 0, "ResidentBytes": 0, "ResidentAvailable": true },
+    { "Name": "Graphics (Estimated)", "CommittedBytes": 0, "ResidentBytes": 0, "ResidentAvailable": false },
+    { "Name": "Untracked", "CommittedBytes": 0, "ResidentBytes": 0, "ResidentAvailable": false }
+  ],
+  "ManagedHeapUtilization": [
+    { "Name": "Virtual Machine", "CommittedBytes": 0, "ResidentBytes": 0, "ResidentAvailable": true },
+    { "Name": "Objects", "CommittedBytes": 0, "ResidentBytes": 0, "ResidentAvailable": true },
+    { "Name": "Empty Heap Space", "CommittedBytes": 0, "ResidentBytes": 0, "ResidentAvailable": true }
+  ]
+}
+```
+
+## Validation result JSON
+
+Produced by `GoldenValidationResult`:
+
+```jsonc
+{
+  "SnapshotName": "MyGame",
+  "GoldenPath": "/path/to/MyGame_golden.json",
+  "DatabasePath": "/path/to/MyGame.duckdb",
+  "ValidatedAtUtc": "2026-06-05T12:00:00.0000000Z",
+  "Passed": false,
+  "Failures": [
+    "SerializedFile.Count: expected=34, actual=33",
+    "Summary[AllocatedMemoryDistribution].Native.Committed: expected=5000000, actual=9000000"
+  ]
+}
+```
+
+### Failure string formats
+
+Each entry in `Failures` is one of:
+
+- `{Type}.Count` / `{Type}.AllocatedBytes` / `{Type}.ResidentBytes` — for `AssetBundle`, `SerializedFile`.
+- `PMR.AllocatedBytes` / `PMR.ResidentBytes`.
+- `Summary.TotalAllocated` / `Summary.TotalResident`, or `Summary.Total: row missing from export`.
+- `Summary[{group}].{name}.Committed` / `Summary[{group}].{name}.Resident`, or
+  `Summary[{group}].{name}: row missing from export` — where `{group}` is
+  `AllocatedMemoryDistribution` or `ManagedHeapUtilization`.
+
+All comparison failures carry `expected=…, actual=…`.
+
+## Troubleshooting
+
+- **Everything mismatches** → the golden file and the database came from *different* snapshots,
+  or different captures of the same scene. Re-extract and re-export from one `.snap`.
+- **`Summary.* row missing from export`** → the export predates the `summary_metrics` table, or
+  the golden file has Summary rows the export lacks. Re-export with the current tool.
+- **Schema-gate rejection on `validate`** → the database is from an older major schema. Re-export
+  from the `.snap` (an in-place `upgrade` only covers minor analysis-view changes, not new
+  tables/columns validation depends on).
+- **`Unsupported database extension`** → pass a `.duckdb` or `.db` file, not a `.snap`.
+- **Golden has empty Summary arrays** → Summary extraction failed in Unity (a `Debug.LogWarning`
+  was logged there). Native metrics still validate; re-extract in Unity to get Summary coverage.
+
+## Keeping the two sides in sync
+
+The Unity extractor and the .NET tool intentionally share definitions so they don't drift:
+
+- Tracked type names (`AssetBundle`, `SerializedFile`) and the SerializedFile area predicate are
+  defined on both sides — `MemorySnapshotValidationHelpers` (Unity) and
+  `GoldenValidationQueries` (tool). Change them together.
+- The category names and `ResidentAvailable` semantics mirror the Memory Profiler Summary rows;
+  the tool reads them from the export's `summary_metrics` table.
+
+The validation queries are constants in `GoldenValidationQueries` (no external values are
+interpolated into SQL), per the repo's [SQL safety rules](sql-safety.md). For the database tables
+and columns referenced above, see the [database schema](database-schema.md).
diff --git a/mkdocs.yml b/mkdocs.yml
index 3950aaf..b250afa 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -14,6 +14,7 @@ nav:
   - Architecture and design: "design.md"
   - Snap File Format: "snap-file-format.md"
   - Database schema: "database-schema.md"
+  - Golden value validation: "golden-validation.md"
   - SQL safety: "sql-safety.md"
   - Installing for local development: "installation.md"
   - Troubleshooting: "troubleshooting.md"
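Note on the workflow documented in this patch: the export and validate steps, plus the documented `validate` exit codes (`0` passed, `1` mismatch, `3` error), chain naturally in a small wrapper. The sketch below is a hedged illustration, not something shipped in the repository. `validate_snapshot` and `fake_tool` are hypothetical names, and it assumes the `.snap` and its `_golden.json` sit side by side under a shared base name.

```bash
#!/usr/bin/env bash
# Sketch: export a snapshot, validate it against its golden file, and
# describe the documented validate exit codes. Assumptions: BASE.snap and
# BASE_golden.json exist side by side, and TOOL accepts the `export` and
# `validate` subcommands as shown in the docs.

# validate_snapshot TOOL BASE -> returns the exit code of the failing step.
validate_snapshot() {
  local tool=$1 base=$2 code=0
  "$tool" export "$base.snap" "$base.duckdb" --validate minimal || return $?
  "$tool" validate "${base}_golden.json" "$base.duckdb" || code=$?
  case $code in
    0) echo "validation passed" ;;
    1) echo "metric mismatch: see ${base}_golden_validation_result.json" ;;
    3) echo "validation error: bad input, unparseable golden, or unsupported extension" ;;
    *) echo "unexpected exit code: $code" ;;
  esac
  return "$code"
}

# Hypothetical stand-in so the sketch runs without the real binary:
# export succeeds, validate reports a mismatch.
fake_tool() {
  if [ "$1" = "validate" ]; then return 1; fi
  return 0
}

validate_snapshot fake_tool ./MyGame || echo "exit code: $?"
```

With the real binary on `PATH`, the call would presumably look like `validate_snapshot MemorySnapshotDataTools ./captures/MyGame`, where the path is a placeholder.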