Port OTEP-4947 thread-context writer from polarsignals/custom-labels#347

Draft
szegedi wants to merge 4 commits into main from otel-thread-ctx

szegedi commented Jun 9, 2026


Adds a Node.js writer for the OpenTelemetry Thread Local Context Record
(OTEP-4947), ported from the in-development upstream at
polarsignals/custom-labels#16.
The two codebases will diverge again later; for now this is a snapshot of
the current state.

Draft because the upstream OTEP is still in development and the upstream
custom-labels PR isn't merged either — this PR is here for review and
parallel development, not for landing as-is.

What's in this branch

Native addon (bindings/)

  • otel-thread-ctx.cc + otel-thread-ctx.hh: the writer, namespaced as
    dd::OtelThreadCtx::Init(exports) and called from binding.cc.
  • The discovery TLS symbol otel_thread_ctx_nodejs_v1 stays in extern "C" at
    file scope so it's exported by name through the dd_pprof.node dynsym
    table. It's a 32-byte struct holding the four fields a reader needs:
    cped_slot (V8 isolate's ContinuationPreservedEmbedderData slot
    pointer), als_handle (Global<Object> to the writer's
    AsyncLocalStorage), als_identity_hash (JS identity hash for hash-bucket
    narrowing), and undefined_addr (cached per-isolate undefined singleton
    for clean "no context" detection by the reader).
  • Records use a C99 flexible-array-member tail and are right-sized to fit
    the encoded attribute payload, with a 36-byte attrs_data floor (one
    cache line total — matches the OTEP "frugal writer" guidance) and ×2
    geometric growth on append, capped at the OTEP-recommended 612-byte
    attrs_data ceiling.
  • binding.gyp adds -mtls-dialect=gnu2 on x86_64 Linux (required by
    OTEP-4947 for TLSDESC; on arm64 TLSDESC is the only dynamic TLS model
    so no flag is needed).
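
The sizing rule for attrs_data described above can be sketched in a few lines. This is an illustrative model only, not the addon's C++ code: the function names are hypothetical, and it assumes growth doubles repeatedly until the requested size fits or the ceiling is reached.

```ts
// Constants taken from the description: 64-byte cache line and 640-byte
// record, each minus the 28-byte header.
const ATTRS_FLOOR = 36;
const ATTRS_CEILING = 612;

/** Hypothetical helper: initial attrs_data capacity, right-sized to the payload. */
function initialCapacity(payloadLen: number): number {
  return Math.min(ATTRS_CEILING, Math.max(ATTRS_FLOOR, payloadLen));
}

/**
 * Hypothetical helper: capacity after an append that needs `needed` bytes in
 * total. Assumes x2 growth is repeated until the payload fits or the cap is hit.
 */
function grownCapacity(current: number, needed: number): number {
  let cap = Math.max(current, ATTRS_FLOOR);
  while (cap < needed && cap < ATTRS_CEILING) {
    cap = Math.min(ATTRS_CEILING, cap * 2);
  }
  return cap;
}
```

A result of 612 from grownCapacity does not guarantee the payload fits; the caller still has to compare `needed` against the ceiling and drop what does not fit.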

TypeScript layer (ts/src/otel-thread-ctx.ts)

API surface (Linux only; no-op stubs elsewhere):

  • runWithContext(fn, opts) — wraps AsyncLocalStorage.run; scopes the
    context to fn.
  • enterWithContext(opts) — wraps AsyncLocalStorage.enterWith; attaches
    to the current async scope without a callback boundary.
  • clearContext() — detaches any active context (enterWith(undefined)).
    Idempotent.
  • appendAttributes(attributes) — appends attributes to the active
    record. Append-only: existing entries are not overwritten. Either
    updates the record in place (when the encoded bytes fit in the current
    allocation's slack) or reallocates with geometric growth.
  • isContextTruncated() — sticky boolean; returns true if at any point
    in this context's lifetime an attribute had to be dropped because the
    encoded payload would have exceeded the 612-byte cap.
  • makeNamedContext(keys) — returns a NamedContext exposing
    name-addressed variants of all five top-level functions plus
    processContextAttributes, a snapshot of the OTEP-4719 process-context
    entries (schema version, attribute_key_map, plus the V8 layout
    constants — see below) the caller should publish.
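
The scoping semantics of the first three functions follow directly from AsyncLocalStorage. The sketch below is a simplified stand-in, not the code in ts/src/otel-thread-ctx.ts: it stores the plain options object instead of a native record wrapper, and ContextOpts and activeContext are hypothetical names used only for illustration.

```ts
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical shape of the options object; the real type may differ.
interface ContextOpts {
  traceId: Uint8Array;
  spanId: Uint8Array;
  attributes?: Array<string | null | undefined>;
}

const als = new AsyncLocalStorage<ContextOpts | undefined>();

/** Scopes the context to `fn` (and anything async it starts). */
function runWithContext<T>(fn: () => T, opts: ContextOpts): T {
  return als.run(opts, fn);
}

/** Attaches the context to the current async scope, no callback needed. */
function enterWithContext(opts: ContextOpts): void {
  als.enterWith(opts);
}

/** Detaches any active context; calling it again changes nothing. */
function clearContext(): void {
  als.enterWith(undefined);
}

/** Hypothetical inspection helper, for this sketch only. */
function activeContext(): ContextOpts | undefined {
  return als.getStore();
}
```

With this model, runWithContext leaves no context behind once fn returns, whereas enterWithContext stays in effect until clearContext is called or the async scope ends.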

opts.traceId and opts.spanId are raw Uint8Arrays (16 and 8 bytes
respectively; Buffer works because it is a Uint8Array subclass).
opts.attributes is positional: index N is the value for uint8 key index N
on the wire; null, undefined, and holes are skipped. Each value is capped
at 255 UTF-8 bytes (uint8 length prefix), and attrs_data as a whole is
capped at 612 bytes (the OTEP-recommended 640-byte total record minus the
28-byte header).
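
The encoding rules above can be modelled as follows. This is a sketch under stated assumptions, not the addon's encoder: it assumes each entry is laid out as key index, length, then UTF-8 bytes; that over-long values are cut at a code-point boundary; and that an entry that would exceed the 612-byte cap is dropped while later, smaller entries are still tried. The function names are hypothetical.

```ts
const MAX_VALUE_BYTES = 255; // uint8 length prefix
const MAX_ATTRS_BYTES = 612; // 640-byte record minus 28-byte header

/** Cuts `bytes` to at most `max` bytes without splitting a UTF-8 code point. */
function truncateUtf8(bytes: Uint8Array, max: number): Uint8Array {
  if (bytes.length <= max) return bytes;
  let end = max;
  // If the first excluded byte is a continuation byte (10xxxxxx), the cut
  // would land inside a code point, so back up to that code point's lead byte.
  while (end > 0 && (bytes[end]! & 0xc0) === 0x80) end--;
  return bytes.subarray(0, end);
}

/** Hypothetical encoder for the positional attributes array. */
function encodeAttributes(
  attributes: ReadonlyArray<string | null | undefined>
): { bytes: Uint8Array; truncated: boolean } {
  const encoder = new TextEncoder();
  const buf = new Uint8Array(MAX_ATTRS_BYTES);
  let len = 0;
  let truncated = false;
  // forEach skips holes in sparse arrays.
  attributes.forEach((value, index) => {
    if (value === null || value === undefined) return;
    if (index > 255) throw new RangeError('key index must fit in a uint8');
    const utf8 = truncateUtf8(encoder.encode(value), MAX_VALUE_BYTES);
    if (len + 2 + utf8.length > MAX_ATTRS_BYTES) {
      truncated = true; // entry dropped: it would exceed the total cap
      return;
    }
    buf[len++] = index;
    buf[len++] = utf8.length;
    buf.set(utf8, len);
    len += utf8.length;
  });
  return { bytes: buf.slice(0, len), truncated };
}
```

In this model the `truncated` flag corresponds to what isContextTruncated() reports; shortening a single over-long value does not set it.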

Process-context attributes

makeNamedContext(keys).processContextAttributes exposes a frozen
snapshot ready to spread into whatever OTEP-4719 process-context
publisher the application uses:

```js
{
  'threadlocal.schema_version': 'nodejs_v1',
  'threadlocal.attribute_key_map': [...keys],
  // V8 layout constants captured from the V8 headers the addon was
  // compiled against. They let the reader walk our wrapper and V8's
  // OrderedHashMap layout without doing its own V8-internal-symbol
  // lookups for the pointer-compression / sandbox state.
  'threadlocal.nodejs_v1.wrapped_object_offset': 24,
  'threadlocal.nodejs_v1.tagged_size': 8,
}
```
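
Since attribute_key_map publishes the key names in positional order, a name-addressed call can be reduced to the positional form by looking up each name's index. The helper below is a hypothetical sketch of that translation, not the NamedContext implementation; in particular, throwing on an unknown key is an assumption.

```ts
/**
 * Hypothetical helper: converts name-addressed attributes into the positional
 * array expected by opts.attributes, using the same `keys` order that is
 * published as 'threadlocal.attribute_key_map'.
 */
function toPositional(
  keys: readonly string[],
  named: Record<string, string>
): Array<string | undefined> {
  const positional = new Array<string | undefined>(keys.length).fill(undefined);
  for (const name of Object.keys(named)) {
    const index = keys.indexOf(name);
    if (index === -1) {
      // Assumption: unknown names are rejected rather than silently ignored.
      throw new TypeError(`unknown attribute key: ${name}`);
    }
    positional[index] = named[name];
  }
  return positional;
}
```

For example, with keys `['http.route', 'user.id']`, passing `{ 'user.id': '42' }` fills only index 1 and leaves index 0 undefined, so it is skipped on the wire.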

Tests

65-case mocha suite under ts/test/test-otel-thread-ctx.ts covering:
input parsing and validation, on-the-wire record encoding (including
multibyte UTF-8 truncation at the 255-byte per-value cap), the 612-byte
cap and the isContextTruncated flag, in-place append vs geometric
realloc, async propagation, processContextAttributes shape and
immutability, clearContext semantics, and a readelf --dyn-syms
check that the TLS symbol is exported with the right binding /
visibility / type. The whole describe block is skipped on non-Linux.

A second commit adds a scripts/docker/ Dockerfile + launcher and a
test:docker npm script that builds the addon and runs the full test
suite in a Linux container — useful for running the new tests from
macOS dev machines. End-to-end verified at 161 passing (96
existing pprof tests + the new suite).

szegedi added 2 commits June 9, 2026 16:05
Ports the in-development OpenTelemetry thread-context writer that
lives on the otel-thread-ctx-node branch of polarsignals/custom-labels
(szegedi fork) into this project. The two codebases will likely
diverge again later; for now this is a snapshot of the current state.

Structurally:
- bindings/otel-thread-ctx.cc/.hh: the native addon code, namespaced
  in `dd::` and exposed via OtelThreadCtx::Init(exports) called from
  binding.cc. The thread_local otel_thread_ctx_nodejs_v1 discovery
  symbol stays in extern "C" at file scope so it's exported by name
  through the dd_pprof.node dynsym table.
- ts/src/otel-thread-ctx.ts: the runWithContext / enterWithContext /
  makeNamedContext API, loading the native addon via node-gyp-build
  like the rest of this project.
- ts/test/test-otel-thread-ctx.ts: mocha port of the node:test suite.
  Skipped wholesale on non-Linux.
- binding.gyp: adds bindings/otel-thread-ctx.cc to both target source
  lists and the -mtls-dialect=gnu2 cflag on x86_64 Linux (required by
  the OTEP-4947 spec; on arm64 TLSDESC is the only dynamic TLS model
  so no flag is needed).

Verified by mocha against the built dd_pprof.node in a Linux
container (Node 22 with --experimental-async-context-frame):
35 passing.

Mirrors the test:docker mechanism in custom-labels/js: a Dockerfile
under scripts/docker/ extending node:24-bookworm with python3 and
build-essential, plus a launcher script that builds the image
(cached), mounts the repo read-only, copies it into /tmp/work inside
the container, and runs `npm install && npm test`. The host tree is
never modified (no stray node_modules/, build/, out/).

Node 24 is used so the full test suite — including the new OTEP-4947
thread-context tests, which need AsyncContextFrame — runs without
extra Node flags.

Run via `npm run test:docker`.

datadog-prod-us1-3 Bot commented Jun 9, 2026

Pipelines

⚠️ Warnings

🚦 17 Pipeline jobs failed

Build | asan (20)

Build | build / darwin-arm64

Build | build / darwin-x64

View all 17 failed jobs.

🔗 Commit SHA: 64d4d68

github-actions Bot commented Jun 9, 2026

Overall package size

Self size: 2.41 MB
Deduped: 2.77 MB
No deduping: 2.77 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| source-map | 0.7.6 | 185.63 kB | 185.63 kB |
| pprof-format | 2.2.1 | 163.06 kB | 163.06 kB |
| node-gyp-build | 4.8.4 | 13.86 kB | 13.86 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

szegedi added 2 commits June 10, 2026 15:44
- Add a static_assert that offsetof(CtxWrap, record_) ==
  sizeof(node::ObjectWrap), since that offset is part of the reader ABI.
  Restructure CtxWrap so record_, capacity_, and truncated_ live in a
  single public access section: C++ leaves cross-access-control field
  ordering implementation-defined, so splitting them would allow a
  conforming compiler to reorder the bookkeeping fields ahead of
  record_.
- Add an acq_rel signal fence between the pointer swap and free() in
  the reallocate path. The pre-existing release fence only constrains
  prior writes; nothing was stopping the compiler from hoisting free()
  above the publication store, which would let a stopped reader follow
  self->record_ into freed memory.
- Restore the [[unlikely]] annotation on the IsConstructCall() check.
- Misc local cleanups (std::min/max in two spots, assert valid==1 after
  the memcpy instead of redundantly setting it).

The CtxWrap accessor that returns the raw record as a Uint8Array is only
intended for tests and out-of-process-reader development. Naming it
DebugBytes (and exposing it as wrap.debugBytes() on the JS prototype)
makes that explicit at every call site.