Arena allocator for ToT einsum (cases involving outer-Hadamard) #545

Open
zhihao-deng wants to merge 7 commits into master from arena_allocator

@zhihao-deng

Summary

Introduces an "arena" pipeline that pre-allocates storage for ToT einsum, replacing per-cell heap allocations. The pipeline is wired end-to-end through the contraction engine, the einsum dispatch, and the ops themselves. Permutation is not yet supported.

Contents

  • tensor/arena.h — one-shot bump allocator over std::pmr::memory_resource with aliasing shared_ptr slab co-ownership, plus a runtime kill switch (detail::arena_disabled()) for safe fallback.
    • Open question: replace the shared_ptr with a raw pointer?
  • tensor/arena_kernels.h — ToT trivial-op kernels (scale, add, etc.) that allocate inner cells contiguously in the arena slab.
  • tensor/arena_einsum.h — regime-A (outer-Hadamard) ContractionArenaPlan: derives result outer/inner ranges and constructs non-empty inner cells in a single slab; supports left_range / right_range / gemm_result_range inner-shape derivation.
  • expressions/cont_engine.h — threads an optional arena plan into the contraction op; plan is std::moved into op_ and the local optional is reset so later reads see "no plan".
  • einsum/tiledarray.h — hooks regime-A arena dispatch into the einsum entry point.
  • tensor/tensor.h + tile_op/contract_reduce.h — route ToT trivial ops and contract-reduce through the arena kernels when a plan is present.
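
To make the tensor/arena.h item concrete, here is a minimal sketch of a bump allocator over std::pmr::memory_resource whose cells co-own the slab through the aliasing shared_ptr constructor. It is an illustration under stated assumptions, not the code in this PR: BumpArena, claim() and used() are hypothetical names, cells are restricted to trivially destructible types, and the kill switch and slice semantics are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <memory_resource>
#include <new>
#include <type_traits>

// Hypothetical sketch; none of these names come from tensor/arena.h.
class BumpArena {
 public:
  // Takes one slab of `bytes` from `mr` up front; nothing is freed per cell.
  explicit BumpArena(std::size_t bytes,
                     std::pmr::memory_resource* mr =
                         std::pmr::get_default_resource())
      : slab_(make_slab(bytes, mr)), capacity_(bytes) {}

  // Carves n value-initialized objects of T out of the slab. The returned
  // pointer shares the slab's control block (aliasing constructor), so the
  // slab outlives the arena for as long as any cell is still referenced.
  template <typename T>
  std::shared_ptr<T> claim(std::size_t n) {
    static_assert(std::is_trivially_destructible_v<T>,
                  "cells are never destroyed individually");
    static_assert(alignof(T) <= alignof(std::max_align_t),
                  "the slab is only max_align_t-aligned");
    // Round the bump offset up to T's alignment.
    const std::size_t start =
        (offset_ + alignof(T) - 1) / alignof(T) * alignof(T);
    if (start > capacity_ || n > (capacity_ - start) / sizeof(T))
      throw std::bad_alloc{};
    T* first = reinterpret_cast<T*>(slab_.get() + start);
    std::uninitialized_value_construct_n(first, n);
    offset_ = start + n * sizeof(T);
    return std::shared_ptr<T>(slab_, first);  // aliasing constructor
  }

  std::size_t used() const { return offset_; }

 private:
  // Wraps the raw slab in a shared_ptr whose deleter returns it to `mr`.
  static std::shared_ptr<std::byte> make_slab(std::size_t bytes,
                                              std::pmr::memory_resource* mr) {
    assert(mr != nullptr);
    void* raw = mr->allocate(bytes, alignof(std::max_align_t));
    return std::shared_ptr<std::byte>(
        static_cast<std::byte*>(raw), [mr, bytes](std::byte* p) {
          mr->deallocate(p, bytes, alignof(std::max_align_t));
        });
  }

  std::shared_ptr<std::byte> slab_;
  std::size_t capacity_;
  std::size_t offset_ = 0;
};
```

In this sketch, two successive claim<double>(4) calls on a fresh arena return adjacent ranges, and each returned pointer adds one owner to the slab. That shared control block is the cost the raw-pointer question above would trade away: a raw pointer drops the reference counting but requires the slab owner to outlive every cell.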

Tests

  • tests/arena.cpp — allocator + slice/claim semantics
  • tests/arena_kernels.cpp — ToT trivial-op kernels
  • tests/arena_tot_trivial.cpp — tensor-level routing
  • tests/arena_einsum_unit_suite.cpp — regime-A plan + dispatch
  • tests/arena_sizeof_invariant_suite.cpp — layout invariants
  • tests/cases/case_hec_{e,ec,h,scale}.cpp + case_4d_e.cpp — end-to-end contraction-engine cases (Hadamard / contraction / scale shapes)

- arena_sizeof_invariant_suite: drop platform-specific absolute baselines
  (328/16/248 were Apple-arm64/libc++ only); keep relative
  ImplLayoutAllocator == ImplLayoutMaster invariant + monostate static_asserts.
- cont_engine: reset arena_plan_ after std::move into op_ so later reads
  see "no plan" rather than a moved-from optional.
- arena_kernels: one-line intent note on trivial kernels' tight packing.
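
The cont_engine note relies on a detail of std::optional: moving from an engaged optional leaves it engaged, holding a moved-from value, so has_value() still returns true. A minimal sketch of the move-then-reset pattern follows. Plan, Op and Engine are hypothetical stand-ins; only the member names arena_plan_ and op_ are taken from the description above.

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real plan and op types.
struct Plan {
  std::vector<int> cells;
};

struct Op {
  std::optional<Plan> plan;
};

struct Engine {
  std::optional<Plan> arena_plan_;
  Op op_;

  // Hands the plan to the op, then disengages the local optional so later
  // has_value() checks report "no plan" instead of a moved-from Plan.
  void init_op() {
    op_.plan = std::move(arena_plan_);
    arena_plan_.reset();
  }
};

// Shows why the reset is needed: the source of a move stays engaged.
inline bool engaged_after_move() {
  std::optional<Plan> src = Plan{{1, 2, 3}};
  std::optional<Plan> dst = std::move(src);
  return dst.has_value() && src.has_value();
}
```

Without the reset() call, code that branches on arena_plan_.has_value() after init_op() would take the "plan present" path and read a moved-from Plan.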