Wait for consensus start before answering region requests #17546

Open
jt2594838 wants to merge 4 commits into master from check_consensus_before_answering_region_request

@jt2594838
Contributor

Summary

  • Add a lightweight polling utility (Await / ConditionAwaiter) in node-commons that provides a fluent API for waiting until a condition becomes true, with configurable timeout and poll interval.
  • Introduce DataNodeContext inner class in DataNode to expose consensus initialization state (isAllConsensusStarted()), with volatile visibility on the underlying flags.
  • Guard all region management RPCs in DataNodeInternalRPCServiceImpl (create/delete region, change leader, add/remove/delete/reset peer, notify migration) with a waitForConsensusStarted() check that polls up to 30 seconds before rejecting with CONSENSUS_NOT_INITIALIZED.
  • This prevents race conditions where ConfigNode dispatches region requests to a DataNode whose consensus layer hasn't finished initializing after restart.
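
The mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `PollingAwaiter`, `ConsensusState`, and `RegionRequestGuard` are hypothetical stand-ins for `Await` / `ConditionAwaiter`, `DataNodeContext`, and the `waitForConsensusStarted()` check, and the two-flag layout is an assumption based on the summary's mention of "flags".

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the PR's Await / ConditionAwaiter; the real API may differ.
final class PollingAwaiter {
  private final Duration timeout;
  private final Duration pollInterval;

  PollingAwaiter(Duration timeout, Duration pollInterval) {
    this.timeout = timeout;
    this.pollInterval = pollInterval;
  }

  // Returns true as soon as the condition holds, false once the timeout has elapsed.
  boolean await(BooleanSupplier condition) throws InterruptedException {
    final long deadline = System.nanoTime() + timeout.toNanos();
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos <= 0) {
        return false;
      }
      // Sleep for one poll interval, but never past the deadline and never less than 1 ms.
      long sleepMillis = Math.min(pollInterval.toMillis(), remainingNanos / 1_000_000);
      Thread.sleep(Math.max(1, sleepMillis));
    }
  }
}

// Hypothetical stand-in for DataNodeContext; the flag names are assumptions.
final class ConsensusState {
  // volatile so that RPC threads observe writes made by the startup thread.
  private volatile boolean schemaConsensusStarted;
  private volatile boolean dataConsensusStarted;

  void markSchemaConsensusStarted() {
    schemaConsensusStarted = true;
  }

  void markDataConsensusStarted() {
    dataConsensusStarted = true;
  }

  boolean isAllConsensusStarted() {
    return schemaConsensusStarted && dataConsensusStarted;
  }
}

// Hypothetical guard in the spirit of waitForConsensusStarted().
final class RegionRequestGuard {
  enum Result { ACCEPTED, CONSENSUS_NOT_INITIALIZED }

  private final ConsensusState state;
  private final PollingAwaiter awaiter;

  RegionRequestGuard(ConsensusState state, PollingAwaiter awaiter) {
    this.state = state;
    this.awaiter = awaiter;
  }

  // Runs the region operation only if consensus starts within the awaiter's timeout.
  Result handle(Runnable regionOperation) {
    try {
      if (!awaiter.await(state::isAllConsensusStarted)) {
        return Result.CONSENSUS_NOT_INITIALIZED;
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return Result.CONSENSUS_NOT_INITIALIZED;
    }
    regionOperation.run();
    return Result.ACCEPTED;
  }
}
```

With a 30-second timeout, as in the summary, a request that arrives during a restart blocks its RPC thread until the flags flip or the deadline passes, so the poll interval trades wake-up latency against CPU spent re-checking the condition.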

Test plan

  • Unit tests for Await / ConditionAwaiter covering: already-true condition, condition becomes true, timeout, poll delay, exception ignoring, untilAsserted, forever mode (AwaitTest)
  • Unit tests for consensus wait behavior: createSchemaRegion, createDataRegion, deleteRegion, changeRegionLeader all reject with CONSENSUS_NOT_INITIALIZED when consensus is not started (ConsensusWaitTest)
  • Existing DataNodeInternalRPCServiceImplTest updated to pass a mocked DataNodeContext with consensus started
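
For readers unfamiliar with the `untilAsserted` style listed in the test plan, the sketch below shows one way such a mode can behave: the assertion is retried until it stops throwing or the timeout elapses. `AssertionPoller` and its signature are hypothetical and are not taken from the PR's `ConditionAwaiter`.

```java
import java.time.Duration;

// Hypothetical helper illustrating "untilAsserted" semantics; not the PR's implementation.
final class AssertionPoller {
  private AssertionPoller() {}

  // Re-runs the assertion until it passes; rethrows the last failure as the cause on timeout.
  static void untilAsserted(Runnable assertion, Duration timeout, Duration pollInterval) {
    final long deadline = System.nanoTime() + timeout.toNanos();
    AssertionError lastFailure = null;
    while (true) {
      try {
        assertion.run();
        return; // the assertion passed
      } catch (AssertionError e) {
        lastFailure = e; // remember why the latest attempt failed
      }
      if (System.nanoTime() - deadline >= 0) {
        throw new AssertionError("Condition not met within " + timeout, lastFailure);
      }
      try {
        Thread.sleep(Math.max(1, pollInterval.toMillis()));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        throw new IllegalStateException("Interrupted while waiting", e);
      }
    }
  }
}
```

Keeping the last `AssertionError` as the cause means a timed-out test still reports which expectation failed, rather than only that the wait expired.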

@codecov

codecov Bot commented Apr 23, 2026

Codecov Report

❌ Patch coverage is 70.45455% with 39 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.82%. Comparing base (acaabb8) to head (51b6674).
⚠️ Report is 3 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ol/thrift/impl/DataNodeInternalRPCServiceImpl.java | 55.31% | 21 Missing ⚠️ |
| ...che/iotdb/commons/concurrent/ConditionAwaiter.java | 86.56% | 9 Missing ⚠️ |
| ...e/iotdb/db/service/DataNodeInternalRPCService.java | 0.00% | 5 Missing ⚠️ |
| ...ain/java/org/apache/iotdb/db/service/DataNode.java | 50.00% | 4 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17546      +/-   ##
============================================
+ Coverage     39.80%   39.82%   +0.01%     
  Complexity      312      312              
============================================
  Files          5142     5147       +5     
  Lines        347882   348088     +206     
  Branches      44404    44430      +26     
============================================
+ Hits         138489   138618     +129     
- Misses       209393   209470      +77     
