docs: hidden attributes are platform-only; clarify users cannot declare them #162
Hidden attributes (names starting with `_`) were primarily designed for
platform operations — DataJoint itself uses them for `_job_start_time`,
`_job_duration`, `_job_version` on Computed/Imported tables and for the
`_singleton` implementation detail. Some functionality is intentionally
exposed to users (notably: a unique index can reference a hidden column,
making `_params_hash`-style derived columns useful), but the feature is
not intended as a general column-hiding tool.
Reframe section 3.4 around that intent, and replace the previous
behavior table with a verified one drawn from the actual code paths:
- Distinguishes platform-managed (auto-injected) from user-defined.
- Documents the exact filter point (Heading.attributes) and lists every
  user-facing surface that consumes it: fetch, proj, joins, dict vs.
  string restrictions, insert/update1, repr, describe.
- Calls out that fetch1("_name")/proj("_name") explicitly *is* allowed,
  matching the test_hidden_job_metadata.py spec.
- Adds a round-trip caveat for describe(): platform-managed hidden
  columns regenerate fine because they're re-injected on declare,
  but user-defined hidden columns (like _params_hash) are silently
  dropped from describe() output.
- Adds guidance on when to declare a hidden attribute vs. a regular one.
Aligns with #1433 (which made user-defined hidden attributes parsable
in the first place).
Expand §3.4 with a write caveat covering the three observed behaviors:

1. update1 raises "Attribute '_name' not found": heading.names is filtered (heading.py:232).
2. insert raises "Field '_name' not in table heading": Heading.__iter__ walks the filtered view (heading.py:367).
3. insert(..., ignore_extra_fields=True) silently *drops* the hidden key without writing it. Less obvious than the loud error and easy to miss.

Also note that platform-managed hidden columns (_job_start_time, etc.) are populated by DataJoint internals via raw SQL during populate() (autopopulate.py:786), not via insert/update1. There is no public-API path to write to a hidden column today; users with a declared hidden column must reach for connection.query() or compute the value inside an auto_populate step.

Tracks the write side of the gap that #1441 leaves open.
The previous "when to declare hidden" paragraph allowed too much: backing an index was treated as sufficient reason to hide. It isn't. The clean heuristic is: if application code touches the column (computes it, inserts it, queries on it, wants it in describe() output), it should be a regular attribute. Hidden is for platform/implementation concerns the application code never references — _job_* populated by autopopulate internals, _singleton's implementation pattern, or fields that would actively interfere with natural-join semantics. Use the params_hash-with-unique-index case as a concrete example of when NOT to hide: even though it backs an index, the application code computes and inserts the hash, so it should be regular and let proj() handle visibility at the call site if needed.
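As a rough illustration of the "application code computes and inserts the hash" case, here is a minimal sketch. The helper names `params_hash` and `make_row`, the column names, and the choice of SHA-256 over canonical JSON are assumptions made for this example, not something the PR prescribes.

```python
import hashlib
import json


def params_hash(params: dict) -> str:
    """Return a stable hex digest for a JSON-serializable parameter dict.

    Keys are sorted so that logically equal dicts hash identically.
    """
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_row(param_id: int, params: dict) -> dict:
    """Build a row in which params_hash is an ordinary, visible attribute.

    Column names here are hypothetical; the point is only that the
    application computes the hash and passes it to insert like any
    other field.
    """
    return {
        "param_id": param_id,
        "params": params,
        "params_hash": params_hash(params),
    }


row = make_row(1, {"threshold": 0.5, "method": "otsu"})
```

Because the hash travels through the normal insert path, it stays visible to `describe()` and to queries. A caller that does not want it in a result can project it away at the call site.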
Updated to reflect the design decision in datajoint/datajoint-python#1441: the parser keeps rejecting leading-underscore attribute names and now returns a clear DataJointError instead of a cryptic ParseException.

Reframe §3.4 around the platform-managed-only intent:
- Lead paragraph states up-front that user-defined hidden attributes are not supported, and shows the new error message users will see.
- Drop the "User-defined hidden attributes" subsection and the _params_hash hidden example.
- Keep the platform-attributes table and the behavior matrix; both are still useful for users encountering platform-managed hidden columns (_job_start_time, etc.) in fetch results, joins, and describe output.
- Add an explanation paragraph ("Why users can't declare them") covering the no-write-path / no-round-trip / silent-filter rationale.
- Replace the user-defined example with a regular-attribute example (params_hash backing a unique index), demonstrating the recommended pattern: declare as a regular attribute, use proj() at the call site for visibility control.
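The idea of rejecting a leading-underscore name with a readable error, rather than letting a grammar failure surface, can be sketched in isolation. This is a toy stand-in and not the code in declare.py: `DeclarationError`, `check_attribute_name`, and the message wording are all hypothetical placeholders (the real check raises `DataJointError` with its own text).

```python
class DeclarationError(Exception):
    """Stand-in for DataJointError in this sketch."""


def check_attribute_name(name: str) -> str:
    """Return the name unchanged, or raise if it starts with an underscore.

    The message below is placeholder wording, not the library's message.
    """
    if name.startswith("_"):
        raise DeclarationError(
            f"Attribute name {name!r} starts with an underscore. "
            "Hidden attributes are reserved for platform use; "
            "declare a regular attribute instead."
        )
    return name


accepted = check_attribute_name("params_hash")
```

Checking the name before handing the line to the grammar is what lets the error say what went wrong in the user's own terms.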
MilagrosMarin
left a comment
Thanks @dimitri-yatsenko for this thorough rework! The reframing of §3.4 as platform-only — with the upfront error message, the platform-managed table, the "why users can't declare them" rationale, and the regular-attribute alternative — is a real improvement over the previous version. Verified the companion PR datajoint/datajoint-python#1441 (merged), and the error message block in §3.4 matches the code in declare.py:858 verbatim. ✅
Most of the behavior matrix also checks out against datajoint-python master:
- `heading.attributes` / `heading.names` / `heading.primary_key` exclude hidden ✅ (heading.py:230-247)
- `heading._attributes` includes hidden ✅ (heading.py:204)
- `to_dicts` / `to_pandas` default exclude ✅ (expression.py:899)
- Natural-join namesake matching excludes hidden ✅ (expression.py:397-398)
- Dict restriction silently ignored ✅ (condition.py:392)
- String restriction passes through to SQL ✅
- `describe()` excludes ✅ (table.py:1233)
- `ignore_extra_fields=True` silently drops hidden ✅ (table.py:1443)
- Platform-managed columns populated via raw SQL during `populate()` ✅ (autopopulate.py:766-789)
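The filter-point behavior in the list above can be modeled with a small toy class. This is a sketch for intuition only, not DataJoint's `Heading`: `Attr`, `ToyHeading`, `usable_dict_restriction`, and `join_namesakes` are hypothetical names, and the real code paths are the ones cited in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attr:
    name: str
    in_key: bool = False

    @property
    def is_hidden(self) -> bool:
        # Hidden attributes are the ones whose names start with "_".
        return self.name.startswith("_")


class ToyHeading:
    def __init__(self, attrs):
        # Full view: every column, hidden ones included.
        self._attributes = {a.name: a for a in attrs}

    @property
    def attributes(self):
        # Filtered view: the single point where hidden columns drop out.
        return {n: a for n, a in self._attributes.items() if not a.is_hidden}

    @property
    def names(self):
        return list(self.attributes)

    @property
    def primary_key(self):
        return [n for n, a in self.attributes.items() if a.in_key]


def usable_dict_restriction(heading, restriction):
    """Keep only keys naming visible attributes; other keys are ignored."""
    return {k: v for k, v in restriction.items() if k in heading.names}


def join_namesakes(left, right):
    """Attributes a natural join would match on: visible names in both."""
    return [n for n in left.names if n in right.names]


heading = ToyHeading([Attr("subject_id", True), Attr("result"), Attr("_job_start_time")])
other = ToyHeading([Attr("subject_id", True), Attr("_job_start_time")])
```

In this model every consumer reads the filtered view, which is why a hidden column shared by two tables does not take part in join matching and why a dict restriction on it has no effect.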
A few comments below — one critical accuracy issue on the fetch("_name") / proj("_name") rows of the matrix (and the corresponding example code), plus a minor wording nit on the insert/update1 row.
…essages

Per Milagros's review on PR #162: the matrix rows for fetch("_name") and proj("_name") said "Included" but the actual behavior is "Rejected". Both route through proj()'s heading.names check (visible-only list at heading.py:236-237), which raises DataJointError. The integration test tests/integration/test_hidden_job_metadata.py:170-172 confirms this constraint by dropping to raw SQL via conn.query() to inspect hidden columns.

The "Inspecting platform-managed hidden columns" example block had the same bug: the proj()/fetch1() examples would raise as written. Replaced with the raw-SQL pattern that mirrors the integration test.

Also tightened the insert/update1 row: the previous parenthetical "(Field not in table heading)" was an inexact paraphrase. insert/insert1 raise KeyError("`_name` is not in the table heading") (table.py:1424); update1 raises DataJointError("Attribute `_name` not found.") (table.py:514). Split into two rows with the verbatim messages.
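The raw-SQL inspection pattern mentioned above might look roughly like the following. To keep the sketch runnable without a database server, it uses a hypothetical `StandInConnection` backed by an in-memory SQLite database; the only thing assumed to carry over to a real DataJoint connection is a `query(sql)` method that returns a cursor. The table and helper names are invented for the example.

```python
import sqlite3


class StandInConnection:
    """Hypothetical stand-in exposing query(sql, args) that returns a cursor."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")

    def query(self, sql, args=()):
        return self._db.execute(sql, args)


def read_hidden_columns(conn, table, columns):
    """Select the named columns with plain SQL, bypassing heading filtering."""
    column_list = ", ".join(columns)
    return conn.query(f"SELECT {column_list} FROM {table}").fetchall()


conn = StandInConnection()
conn.query(
    "CREATE TABLE analysis ("
    "subject_id INTEGER PRIMARY KEY, _job_duration REAL, _job_version TEXT)"
)
conn.query("INSERT INTO analysis VALUES (?, ?, ?)", (1, 2.5, "abc123"))

rows = read_hidden_columns(
    conn, "analysis", ["subject_id", "_job_duration", "_job_version"]
)
```

Against a real schema, the same helper would presumably be given the DataJoint connection object and the table's full SQL name; that substitution is an assumption and has not been checked here.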
Thank you @MilagrosMarin for the careful read — and especially for catching the All three comments addressed in
Ready for another pass when you have a moment.
MilagrosMarin
left a comment
Thanks @dimitri-yatsenko! Re-read §3.4 at 20224ae and verified all three:
✅ Matrix rows for fetch("_name") / proj("_name") — now correctly state Rejected (Attribute not found) — use raw SQL via conn.query(...). Matches the actual proj() validation against heading.names (expression.py:574) and the integration-test workaround.
✅ Example block — replaced with the conn.query(...) raw-SQL pattern that mirrors tests/integration/test_hidden_job_metadata.py:170-172. Reads cleanly and is now actually runnable.
✅ Insert/update1 rows — split into two rows with verbatim error messages: KeyError("`_name` is not in the table heading") (table.py:1424) for insert/insert1, and DataJointError("Attribute `_name` not found.") (table.py:514) for update1. Matrix is now precise.
The whole §3.4 section now reads as a clean, accurate platform-behavior reference. LGTM — approving.
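For readers who want to see the three write-side behaviors next to each other, here is a toy model. It is not DataJoint code: `ToyTable` and `ToyDataJointError` are hypothetical, and only the two message strings are copied from the messages quoted in this review.

```python
class ToyDataJointError(Exception):
    """Stand-in for DataJointError in this sketch."""


class ToyTable:
    def __init__(self, key, all_names):
        self.key = key
        self._all_names = list(all_names)
        self.rows = []

    @property
    def names(self):
        # Visible names only; hidden columns are filtered out here.
        return [n for n in self._all_names if not n.startswith("_")]

    def insert1(self, row, ignore_extra_fields=False):
        extra = [k for k in row if k not in self.names]
        if extra and not ignore_extra_fields:
            raise KeyError(f"`{extra[0]}` is not in the table heading")
        # With ignore_extra_fields=True the hidden key is dropped silently.
        self.rows.append({k: v for k, v in row.items() if k in self.names})

    def update1(self, row):
        for name in row:
            if name not in self.names:
                raise ToyDataJointError(f"Attribute `{name}` not found.")
        for stored in self.rows:
            if stored[self.key] == row[self.key]:
                stored.update(row)


table = ToyTable(key="id", all_names=["id", "value", "_job_version"])
table.insert1({"id": 1, "value": 10})
table.insert1({"id": 2, "value": 5, "_job_version": "x"}, ignore_extra_fields=True)
```

The second insert succeeds but stores nothing for `_job_version`, which is the quiet case the matrix warns about.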
Summary
Reworks `reference/specs/table-declaration.md` §3.4 to reflect the design decision in datajoint/datajoint-python#1441: user-defined hidden attributes are not supported, and the parser now returns a clear `DataJointError` (instead of a cryptic `pyparsing.ParseException`) when a definition uses a leading-underscore name.

What changed in §3.4
- The platform-attributes table (`_job_start_time`, `_job_duration`, `_job_version`, `_singleton`) is preserved; these are the actual hidden columns users encounter in fetch results, joins, and describe output.
- A "Why users can't declare them" paragraph gives the rationale: no write path, no `describe()` round-trip, silent filtering on dict restrictions and `insert(ignore_extra_fields=True)`.
- The "User-defined hidden attributes" subsection and the `_params_hash`-as-hidden example are removed. Replaced with a regular-attribute example showing the recommended pattern: declare `params_hash` as a regular column, use `proj()` at the call site if visibility control is needed.

Companion code PR
datajoint/datajoint-python#1441 — replaces the cryptic `pyparsing.ParseException` with the helpful `DataJointError` and adds a unit test asserting that `compile_attribute` rejects leading-underscore names with the new message.
Test plan