Local Aware IBM#1378
Conversation
… are only locally aware
|
Claude Code Review Head SHA: ee0bc0c Files changed:
Findings: [Correctness — Out-of-bounds array access in
patch_ib(:) = patch_ib_gbl(1:num_aware_ibs) ! 1:54000 from a 50000-element arrayAccessing elements 50001–54000 of [MPI Correctness — Global allreduce replaced by 2-hop neighborhood reduce] The previous [Memory Management / GPU Correctness —
allocate (patch_ib(num_ib_patches_max))and the resize in deallocate (patch_ib)
...
allocate (patch_ib(num_aware_ibs))both use bare Fortran statements instead of |
Claude Code ReviewPR #1378 — Local Aware IBM This is a draft PR. The concept (replacing global allreduce with neighborhood-local communication for IBM forces) is sound and the scalability motivation is clear. Several correctness issues need to be addressed before this is merge-ready. Critical Issues1. The new function wraps its entire ownership logic in 2. The guard 3. Out-of-bounds potential in In 3D with 27 neighbor cells, 4. Analytic MIBM code generation uses local loop index, not global patch ID The template still emits Important Issues5. if (bounds_error) error stop "Ghost Point and Image Point on Different Processors. Exiting"
6.
7. Post-process The PR removes the 8. normal_axis = axis/sqrt(sum(axis)) ! BUG: missing **2Should be Suggestions
Strengths
|
Automated Code ReviewSummary: 6 critical issues, 3 important issues. (Draft PR) Critical Issues1.
if (bounds_error) error stop "Ghost Point and Image Point on Different Processors. Exiting"
2. Missing
3. Missing
4. Dead expression in pressure correction loop — result silently discarded (confidence: 100%)
q_prim_vf(eqn_idx%E)%sf(j, k, l) + pres_IP/(1._wp - ...)This is a bare expression — not an assignment. The pressure field is never updated. For q_prim_vf(eqn_idx%E)%sf(j, k, l) = q_prim_vf(eqn_idx%E)%sf(j, k, l) + pres_IP/(...)5. MPI tag accumulation across phases in The back-propagation Fypp loop does not reset 6. The Important Issues7.
8. De-duplication scheme in force accumulation untested for The snapshot-based force de-duplication (subtract previous snapshot, add new) is sound for radius=1 but relies on each IB appearing at most once per received message. For radius > 1 with non-linear neighborhood topology, duplicate accumulation is possible. Needs explicit testing. 9. Minor design concern for the radius > 1 case — the accumulation loop structure may not generalize cleanly beyond the tested radius=1 case. Suggestions
Strengths
|
Description
The current code has an issue scaling much past 10k particles due to limitations in the MPIAllReduceSum in the IB force computation. This PR attempts to alleviate this by limiting the number of IBs any given rank can be aware of to its neighbors. This turns the AllReduce compute to a MPI neighbor computation, removing the communication bottlneck. To support this, a massive overhaul of IB ownership between ranks was required.
Type of change
Testing
TBD
Checklist
AI code reviews
Reviews are not triggered automatically. To request a review, comment on the PR:
@coderabbitai review— incremental review (new changes only)@coderabbitai full review— full review from scratch/review— Qodo review/improve— Qodo code suggestions@claude full review— Claude full review (also triggers on PR open/reopen/ready)claude-full-review— Claude full review via label