Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
75 changes: 75 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
# Project guidance for Claude

MemorySnapshotDataTool parses Unity memory snapshot (`.snap`) files and exports them to
DuckDB / SQLite databases, then runs SQL to build HTML reports. Because the whole tool is
built around composing and executing SQL, **SQL safety is a first-class rule in this repo.**

## Rule: never build SQL from unsanitized external data

External / untrusted data includes anything not hard-coded in source: CLI arguments,
file names and paths, environment values, fields read from a `.snap` file, values read
back out of the database, and anything derived from them. **Never** concatenate or string-
interpolate such data into a SQL statement.

When you write or modify any code that builds a SQL string, follow these in order:

1. **Bind values as parameters — always the default for any *value*.**
- **SQLite** (`Microsoft.Data.Sqlite`) — named parameters with a `$` prefix:
```csharp
cmd.CommandText = "SELECT 1 FROM pragma_table_info($t) WHERE name = $c LIMIT 1";
cmd.Parameters.AddWithValue("$t", tableName);
cmd.Parameters.AddWithValue("$c", columnName);
```
- **DuckDB** (`DuckDB.NET`) — positional `?` parameters in order:
```csharp
cmd.CommandText = "... WHERE table_name = ? AND column_name = ? LIMIT 1";
cmd.Parameters.Add(new DuckDBParameter { Value = tableName });
cmd.Parameters.Add(new DuckDBParameter { Value = columnName });
```

2. **Identifiers (table / column names) can't be parameters in most positions.**
Validate them against a hard-coded safe-list, or query a catalog table that *does*
accept parameters (`information_schema.columns` for DuckDB, `pragma_table_info($t)`
for SQLite) instead of splicing the name into the statement.

3. **Only interpolate a value directly if it is a non-string numeric type** (`int`,
`long`, …) that you control — a strongly-typed number cannot carry a payload. Add a
comment saying why it is safe, and prefer a parameter anyway when the API allows one.

4. **Never** do `"... WHERE x = '" + value + "'"` or `$"... '{value}'"` with a string value.

5. **Open read-only when a path only reads.** Report/analysis code that never writes opens the
database with least privilege — `Data Source=<path>;Mode=ReadOnly` (SQLite) or
`Data Source=<path>;ACCESS_MODE=READ_ONLY` (DuckDB) — so a bad query can't mutate data. Only
the export writers open read-write.

## Canonical safe examples already in this repo

Match these when you touch SQL — don't reinvent the pattern:

- `Core/Report/Queries/SqliteReportQueries.cs` `HasColumn` — parameterized `pragma_table_info`.
- `Core/Report/Queries/DuckDbReportQueries.cs` `HasColumn` — identifier check via catalog table.
- `Core/Report/MultiSnapshotReport/MultiSnapshotReportBuilder.cs` `HasColumn` — both dialects parameterized.
- `Core/ExportDestination/DuckDbExportDestination.cs` — positional `?` parameters for inserts.
- `Core/ExportDestination/SqliteWriter.cs` — bulk inserts with `$pN` parameters.

## The `ExecuteQuery(string sql)` contract

`IReportQueryBackend.ExecuteQuery(string sql)` runs a raw SQL string and has **no parameter
overload**. It must therefore only ever receive an internally-constructed query — a constant
from `ReportSql` or one of its builders — never external input. The single dynamic builder,
`ReportSql.DownstreamStats(long rootIdx)`, interpolates a numeric `long`, which is injection-
safe. If you ever need to pass a *value* into a report query, add a parameterized path rather
than interpolating it into the SQL string. As defense-in-depth, the backends behind this sink
open the database read-only (rule 5 above), so a malformed query cannot mutate data.

## Best-practices reference

Full guidance for both humans and Claude — with rationale, anti-patterns, and review
checklist — is in [`docs/sql-safety.md`](docs/sql-safety.md). Read it before adding any new
query path or query builder.

## Build & test

- Build: `dotnet build MemorySnapshotDataTools.sln`
- Test: `dotnet test MemorySnapshotDataTools.sln`
39 changes: 28 additions & 11 deletions Core/Report/MultiSnapshotReport/MultiSnapshotReportBuilder.cs
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,8 @@ private static SnapshotMetricsRow QueryDatabase(string dbPath)

private static SnapshotMetricsRow QueryDuckDb(string dbPath)
{
using var connection = new DuckDBConnection($"Data Source={dbPath}");
// Read-only: this path only queries metrics, never writes. See docs/sql-safety.md.
using var connection = new DuckDBConnection($"Data Source={dbPath};ACCESS_MODE=READ_ONLY");
connection.Open();

var snapshotMeta = QuerySnapshotMetadata(connection, isDuckDb: true);
Expand All @@ -81,7 +82,8 @@ private static SnapshotMetricsRow QueryDuckDb(string dbPath)

private static SnapshotMetricsRow QuerySqlite(string dbPath)
{
using var connection = new SqliteConnection($"Data Source={dbPath}");
// Read-only: this path only queries metrics, never writes. See docs/sql-safety.md.
using var connection = new SqliteConnection($"Data Source={dbPath};Mode=ReadOnly");
connection.Open();

var snapshotMeta = QuerySnapshotMetadata(connection, isDuckDb: false);
Expand Down Expand Up @@ -112,12 +114,15 @@ private static Dictionary<string, NativeTypeSnapshotMetrics> QueryNativeTypes(ob
? "COALESCE(SUM(resident_size_bytes), 0)"
: nullResidentExpr;

// The resident expressions above are a closed set of hard-coded SQL fragments chosen by HasColumn;
// the native type name is a value, so it is bound as a parameter (DuckDB '?', SQLite '$nativeType').
var nativeTypeParam = isDuckDb ? "?" : "$nativeType";
var assetBundleSql = $"""
SELECT COUNT(*) AS obj_count,
COALESCE(SUM(size_bytes), 0) AS allocated_bytes,
{objectResidentExpr} AS resident_bytes
FROM native_objects
WHERE native_type_name = '{GoldenValidationQueries.AssetBundleNativeTypeName}'
WHERE native_type_name = {nativeTypeParam}
AND is_destroyed = false
""";

Expand All @@ -129,26 +134,32 @@ FROM native_roots
WHERE {GoldenValidationQueries.SerializedFileAreaPredicate}
""";

ReadNativeTypeAggregate(connection, isDuckDb, assetBundleSql, GoldenValidationQueries.AssetBundleNativeTypeName, result);
ReadNativeTypeAggregate(connection, isDuckDb, assetBundleSql, GoldenValidationQueries.AssetBundleNativeTypeName, result,
("$nativeType", GoldenValidationQueries.AssetBundleNativeTypeName));
ReadNativeTypeAggregate(connection, isDuckDb, serializedFileSql, GoldenValidationQueries.SerializedFileMetricName, result);
return result;
}

// Table/column names are bound as parameters rather than interpolated, so this never builds SQL
// by concatenating identifiers. DuckDB queries information_schema.columns (a regular table that
// accepts bind parameters, matching DuckDbReportQueries.HasColumn); SQLite uses pragma_table_info
// with named parameters, matching SqliteReportQueries.HasColumn.
private static bool HasColumn(object connection, bool isDuckDb, string tableName, string columnName)
{
var sql = isDuckDb
? $"SELECT 1 FROM pragma_table_info('{tableName}') WHERE name = '{columnName}' LIMIT 1"
: $"SELECT 1 FROM pragma_table_info('{tableName}') WHERE name = '{columnName}' LIMIT 1";

if (isDuckDb)
{
using var cmd = ((DuckDBConnection)connection).CreateCommand();
cmd.CommandText = sql;
cmd.CommandText =
"SELECT 1 FROM information_schema.columns WHERE table_schema = 'main' AND table_name = ? AND column_name = ? LIMIT 1";
cmd.Parameters.Add(new DuckDBParameter { Value = tableName });
cmd.Parameters.Add(new DuckDBParameter { Value = columnName });
return cmd.ExecuteScalar() != null;
}

using var sqliteCmd = ((SqliteConnection)connection).CreateCommand();
sqliteCmd.CommandText = sql;
sqliteCmd.CommandText = "SELECT 1 FROM pragma_table_info($t) WHERE name = $c LIMIT 1";
sqliteCmd.Parameters.AddWithValue("$t", tableName);
sqliteCmd.Parameters.AddWithValue("$c", columnName);
return sqliteCmd.ExecuteScalar() != null;
}

Expand All @@ -157,12 +168,16 @@ private static void ReadNativeTypeAggregate(
bool isDuckDb,
string sql,
string typeName,
Dictionary<string, NativeTypeSnapshotMetrics> result)
Dictionary<string, NativeTypeSnapshotMetrics> result,
params (string Name, object Value)[] parameters)
{
if (isDuckDb)
{
using var cmd = ((DuckDBConnection)connection).CreateCommand();
cmd.CommandText = sql;
// DuckDB binds positionally ('?'), in declaration order.
foreach (var (_, value) in parameters)
cmd.Parameters.Add(new DuckDBParameter { Value = value });
using var reader = cmd.ExecuteReader();
if (!reader.Read())
return;
Expand All @@ -179,6 +194,8 @@ private static void ReadNativeTypeAggregate(

using var sqliteCmd = ((SqliteConnection)connection).CreateCommand();
sqliteCmd.CommandText = sql;
foreach (var (name, value) in parameters)
sqliteCmd.Parameters.AddWithValue(name, value);
using var sqliteReader = sqliteCmd.ExecuteReader();
if (!sqliteReader.Read())
return;
Expand Down
17 changes: 12 additions & 5 deletions Core/Report/Queries/DuckDbReportQueries.cs
Original file line number Diff line number Diff line change
Expand Up @@ -7,11 +7,13 @@ internal sealed class DuckDbReportQueries : IReportQueryBackend
{
private readonly DuckDBConnection _connection;

/// <summary>Opens a connection to the DuckDB database at the given path.</summary>
/// <summary>Opens a read-only connection to the DuckDB database at the given path.</summary>
/// <param name="dbPath">Path to the .duckdb file.</param>
public DuckDbReportQueries(string dbPath)
{
_connection = new DuckDBConnection($"Data Source={dbPath}");
// The report path only ever runs SELECTs. Open read-only (least privilege) so that even a
// malformed query reaching ExecuteQuery cannot modify or drop data. See docs/sql-safety.md.
_connection = new DuckDBConnection($"Data Source={dbPath};ACCESS_MODE=READ_ONLY");
_connection.Open();
}

Expand Down Expand Up @@ -43,9 +45,14 @@ public bool HasColumn(string tableName, string columnName)
{
try
{
var (_, rows) = ExecuteQuery(
$"SELECT 1 FROM information_schema.columns WHERE table_schema = 'main' AND table_name = '{tableName.Replace("'", "''")}' AND column_name = '{columnName.Replace("'", "''")}' LIMIT 1");
return rows.Count > 0;
// Bind table/column names as parameters rather than interpolating them. information_schema.columns
// is a regular table, so it accepts bind parameters (DuckDB uses positional '?').
using var cmd = _connection.CreateCommand();
cmd.CommandText =
"SELECT 1 FROM information_schema.columns WHERE table_schema = 'main' AND table_name = ? AND column_name = ? LIMIT 1";
cmd.Parameters.Add(new DuckDBParameter { Value = tableName });
cmd.Parameters.Add(new DuckDBParameter { Value = columnName });
return cmd.ExecuteScalar() != null;
}
catch
{
Expand Down
8 changes: 7 additions & 1 deletion Core/Report/Queries/IReportQueryBackend.cs
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,13 @@ internal interface IReportQueryBackend : IDisposable
ReportBackendDialect Dialect { get; }

/// <summary>Executes the given SQL and returns column names and rows (null for missing values).</summary>
/// <param name="sql">SQL query (single statement).</param>
/// <param name="sql">
/// SQL query (single statement). Must be an internally-constructed query — a constant from
/// <see cref="ReportSql"/> or one of its builders — never external/untrusted input. The only
/// dynamic value, <see cref="ReportSql.DownstreamStats"/>, interpolates a numeric <c>long</c>
/// index, which cannot carry a SQL-injection payload. As defense-in-depth, implementations open
/// the database read-only, so a malformed query here cannot modify or drop data.
/// </param>
/// <returns>Column names and list of row arrays.</returns>
(string[] Columns, List<object?[]> Rows) ExecuteQuery(string sql);

Expand Down
6 changes: 4 additions & 2 deletions Core/Report/Queries/SqliteReportQueries.cs
Original file line number Diff line number Diff line change
Expand Up @@ -7,11 +7,13 @@ internal sealed class SqliteReportQueries : IReportQueryBackend
{
private readonly SqliteConnection _connection;

/// <summary>Opens a connection to the SQLite database at the given path.</summary>
/// <summary>Opens a read-only connection to the SQLite database at the given path.</summary>
/// <param name="dbPath">Path to the .db or .sqlite file.</param>
public SqliteReportQueries(string dbPath)
{
_connection = new SqliteConnection($"Data Source={dbPath}");
// The report path only ever runs SELECTs. Open read-only (least privilege) so that even a
// malformed query reaching ExecuteQuery cannot modify or drop data. See docs/sql-safety.md.
_connection = new SqliteConnection($"Data Source={dbPath};Mode=ReadOnly");
_connection.Open();
}

Expand Down
5 changes: 3 additions & 2 deletions Core/Report/SummaryReportRunner.cs
Original file line number Diff line number Diff line change
Expand Up @@ -135,9 +135,10 @@ private static SummaryReport FromSnapshot(string snapshotPath, IProgressReporter
/// </summary>
private static SummaryReport FromDatabase(string databasePath, string extension)
{
// Read-only: the summary path only reads from the database. See docs/sql-safety.md.
using DbConnection connection = extension.Equals(".db", StringComparison.OrdinalIgnoreCase)
? new SqliteConnection($"Data Source={databasePath}")
: new DuckDBConnection($"Data Source={databasePath}");
? new SqliteConnection($"Data Source={databasePath};Mode=ReadOnly")
: new DuckDBConnection($"Data Source={databasePath};ACCESS_MODE=READ_ONLY");
connection.Open();

var info = ReadSnapshotInfo(connection);
Expand Down
47 changes: 47 additions & 0 deletions SECURITY.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
# Security

## Reporting a vulnerability

Report suspected security issues to **zack.asofsky@unity3d.com**. Please do not open public issues for undisclosed vulnerabilities.

## SQL safety

This tool composes and executes SQL against DuckDB and SQLite. Safe-query rules and best
practices are documented in [`docs/sql-safety.md`](docs/sql-safety.md), with the enforced rule
for contributors (and Claude) in [`CLAUDE.md`](CLAUDE.md). In short: bind values as parameters,
never interpolate external data, validate identifiers against a safe-list or parameterized
catalog lookup, and open read-only on any path that only reads.

## SAST finding triage log

Findings from static analysis (Cycode SAST) that have been reviewed and accepted are recorded
here so the rationale is durable and reviewable. Suppression itself is performed in Cycode, not
in this repo — see "How we suppress" below.

### CYCODE-SAST: Unsanitized external input in SQL query — `ExecuteQuery`

| Field | Value |
| --- | --- |
| Scanner | Cycode SAST |
| Rule | Unsanitized external input in SQL query (SQL injection) |
| Location | [`Core/Report/Queries/SqliteReportQueries.cs`](Core/Report/Queries/SqliteReportQueries.cs) — `ExecuteQuery(string sql)` (and the equivalent `DuckDbReportQueries.ExecuteQuery`) |
| Reviewed | 2026-06-01, Zack Asofsky |
| Disposition | **False positive** — no external/untrusted input reaches the sink — accepted with defense-in-depth mitigations |

**Rationale.** `ExecuteQuery(string sql)` only ever receives internally-constructed queries:
compile-time constants from `ReportSql` and its builders. The sole dynamic value,
`ReportSql.DownstreamStats(long rootIdx)`, interpolates a numeric `long`, which cannot carry an
injection payload. No CLI argument, file path, or snapshot field reaches the query string.

**Defense-in-depth mitigations applied** (branch `bugfix/sql-sanitization`):

- Report/analysis database connections are opened **read-only** (`Mode=ReadOnly` for SQLite,
`ACCESS_MODE=READ_ONLY` for DuckDB), so a malformed query reaching this sink cannot modify or
drop data. Only the export writers open read-write.
- Identifier-bearing helper queries (`HasColumn`) use parameterized catalog lookups
(`pragma_table_info($t)` / `information_schema.columns` with bind parameters) instead of
string concatenation.
- The multi-snapshot native-type filter binds its value as a parameter rather than interpolating
it.
- The `ExecuteQuery` contract is documented on `IReportQueryBackend`, and the rules are codified
in [`docs/sql-safety.md`](docs/sql-safety.md) and [`CLAUDE.md`](CLAUDE.md).
Loading
Loading