Test/2026 05 12 lark bot reply chain regressions by YueZh127 · Pull Request #651 · aevatarAI/aevatar

YueZh127 · 2026-05-14T07:21:57Z

Summary

Part of #633.

This change adds focused regression coverage for the lark-bot reply chain on feature/lark-bot, covering the current follow-up scope from:

[test] Fill in actor-semantics regression tests for AgentRunGAgent and the conversation->run handoff #635: AgentRunGAgent terminal / stale / duplicate regressions
[test] Fill in concurrent-closeout regression tests for TurnStreamingReplySink #636: TurnStreamingReplySink targeted race regressions
[test][guard] Fill in reply generator / tool closeout regression tests and add a minimal structural guard #637: dispatcher seam guard + generator warning / closeout basics

What changed

#635 AgentRunGAgent regression coverage

Added basic combination regressions around terminal and cleanup behavior:

ignore cleanup for non-terminal runs
ignore cleanup when AgentRunCleanupRequested.RunId does not match the terminal run
do not dispatch duplicate LlmReplyReadyEvent after a terminal failed fallback reply has already been produced and delivered

These assertions are aligned with the current contract: a terminal failure after a successful fallback delivery remains ReplyProduced + ReplyDispatched + ProducedTerminalState=Failed, rather than Status=Failed.
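
The cleanup and duplicate-dispatch rules above can be sketched as two small predicates. This is a language-neutral illustration in Python, not the repository's C# tests: `RunState`, `TerminalState`, `should_apply_cleanup`, and `should_dispatch_reply_ready` are hypothetical names, and the real actor state is assumed to carry more fields than shown here.

```python
from dataclasses import dataclass
from enum import Enum


class TerminalState(Enum):
    NONE = "none"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class RunState:
    # Hypothetical, simplified stand-in for the actor's run state.
    run_id: str
    terminal_state: TerminalState = TerminalState.NONE
    reply_produced: bool = False
    reply_dispatched: bool = False


def should_apply_cleanup(state: RunState, requested_run_id: str) -> bool:
    """Cleanup applies only to a terminal run whose id matches the request."""
    if state.terminal_state is TerminalState.NONE:
        return False  # non-terminal run: ignore the cleanup request
    return state.run_id == requested_run_id  # mismatched RunId: ignore


def should_dispatch_reply_ready(state: RunState) -> bool:
    """A reply already produced and dispatched must not be dispatched again."""
    return not (state.reply_produced and state.reply_dispatched)
```

Under these assumptions, a run that failed terminally after its fallback reply was delivered keeps `reply_produced` and `reply_dispatched` set alongside a failed terminal state, so a second ready event is suppressed.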

#636 TurnStreamingReplySink targeted race regressions

Added two sink-level race regressions:

Dispose() while FinalizeAsync() is waiting for drain unblocks correctly and does not dispatch the stashed final flush
if a deferred timer-driven flush dispatch fails, a later delta can still recover and publish the latest accumulated text once

This keeps the sink coverage focused on real race-prone paths instead of broad coverage expansion.
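
The first race (dispose while finalize waits for drain) can be illustrated with a minimal asyncio model. This is a sketch under stated assumptions, not the actual TurnStreamingReplySink: `SketchSink` and `dispose_during_finalize` are hypothetical, and the real sink's drain and timer mechanics are not described in this PR.

```python
import asyncio


class SketchSink:
    """Hypothetical, simplified model of a streaming reply sink."""

    def __init__(self):
        self.dispatched = []              # texts that were actually sent
        self._drained = asyncio.Event()   # set when in-flight work is done
        self._disposed = False

    def dispose(self):
        # Disposing releases any waiter so finalize cannot hang forever.
        self._disposed = True
        self._drained.set()

    async def finalize(self, final_text):
        await self._drained.wait()
        if self._disposed:
            return False                  # drop the stashed final flush
        self.dispatched.append(final_text)
        return True


async def dispose_during_finalize():
    sink = SketchSink()
    task = asyncio.create_task(sink.finalize("final"))
    await asyncio.sleep(0)                # let finalize start waiting
    sink.dispose()
    # The timeout turns a regression (finalize hanging) into a test failure.
    flushed = await asyncio.wait_for(task, timeout=1.0)
    return flushed, sink.dispatched
```

The two properties asserted in the regression map onto the return value: finalize completes rather than hanging, and nothing is dispatched after disposal.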

#637 dispatcher seam + generator basics

Added:

a minimal architecture/source-text guard that keeps ConversationGAgent on the IChannelLlmReplyRunDispatcher seam and prevents direct references to concrete run/inbox runtime types
generator coverage for:
- warning behavior when SkillRegistry is present without IRemoteSkillFetcher
- placeholder + tool-follow-up path, with direct proof that a second request containing a tool message is issued

Also tightened naming/intent so the tests describe the current verified contract, not a stronger target-state contract.
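
A source-text guard of the kind described above can be as small as a pattern scan over the file contents. The following Python sketch is illustrative only: `check_seam` is a hypothetical helper, and the forbidden type names are placeholders because the concrete run/inbox runtime types are not named in this description.

```python
import re

# Placeholder names: the real concrete runtime types are not listed here.
FORBIDDEN_TYPE_NAMES = ("ConcreteRunRuntime", "ConcreteInboxRuntime")
REQUIRED_SEAM = "IChannelLlmReplyRunDispatcher"


def check_seam(source_text: str) -> list[str]:
    """Return a list of violations found in the given source text."""
    violations = []
    # The file must still mention the dispatcher seam interface.
    if not re.search(rf"\b{re.escape(REQUIRED_SEAM)}\b", source_text):
        violations.append(f"missing seam reference: {REQUIRED_SEAM}")
    # The file must not name any concrete runtime type directly.
    for name in FORBIDDEN_TYPE_NAMES:
        if re.search(rf"\b{re.escape(name)}\b", source_text):
            violations.append(f"direct reference to concrete type: {name}")
    return violations
```

A text scan like this is cheap but shallow: it cannot see aliases, reflection, or types reached through `using` directives, which is why the audit document records it as a minimal guard rather than a compile-level rule.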

Audit doc update

Updated the reply-chain audit document to reflect the current test reality:

ChatRuntime terminal chunk behavior is still a target-state question, not an accepted current contract
the dispatcher seam guard is currently a minimal source-text guard, not a Roslyn/compile-level rule
the new sink regressions are recorded as the right next-step coverage for this branch

Verification

test/Aevatar.GAgents.ChannelRuntime.Tests, filtered to AgentRunGAgentTests | ConversationReplyGeneratorTests | TurnStreamingReplySinkTests
test/Aevatar.Architecture.Tests, filtered to the dispatcher seam guard test
bash tools/ci/test_stability_guards.sh

Results:

channel runtime targeted tests: passed
architecture targeted test: passed
stability guard: passed

Notes

This change intentionally does not modify ChatRuntime production behavior.
It only adds/adjusts tests and audit documentation against the currently accepted behavior on this branch.


codecov · 2026-05-14T07:39:06Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.50%. Comparing base (806b181) to head (f603372).
⚠️ Report is 67 commits behind head on feature/lark-bot.

@@                 Coverage Diff                 @@
##           feature/lark-bot     #651     +/-   ##
===================================================
  Coverage             72.50%   72.50%             
===================================================
  Files                  1286     1296     +10     
  Lines                 95060    96507   +1447     
  Branches              12428    12632    +204     
===================================================
+ Hits                  68920    69972   +1052     
- Misses                21179    21485    +306     
- Partials               4961     5050     +89

Flag	Coverage Δ
ci	`72.50% <ø> (+<0.01%)`	⬆️

see 48 files with indirect coverage changes


YueZh127 added 4 commits May 13, 2026 14:50

Add lark-bot reply-chain coverage audit

4b9fedc

Merge remote-tracking branch 'origin/feature/lark-bot' into test/2026…

97abfc1

…-05-12_lark-bot-reply-chain-regressions

Add actor handoff regression tests for lark reply chain

9462fa1

Add lark-bot reply chain regression tests

f603372

YueZh127 requested a review from eanzhao May 14, 2026 07:21

YueZh127 requested review from jason-aelf and louis4li as code owners May 14, 2026 07:21

YueZh127 wants to merge 4 commits into feature/lark-bot from test/2026-05-12_lark-bot-reply-chain-regressions
