[codex] fix ANN OpenMP build budget and add concurrency test#61313
[codex] fix ANN OpenMP build budget and add concurrency test#61313airborne12 merged 2 commits intoapache:masterfrom
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
TPC-H: Total hot run time: 26750 ms |
TPC-DS: Total hot run time: 168033 ms |
6589fdd to
e4c203c
Compare
|
run buildall |
TPC-H: Total hot run time: 27171 ms |
TPC-DS: Total hot run time: 169028 ms |
|
run buildall |
TPC-H: Total hot run time: 27368 ms |
TPC-DS: Total hot run time: 169081 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
airborne12
left a comment
There was a problem hiding this comment.
LGTM. Clean and correct fix for the OMP thread budget bug.
Core fix: Properly uses get_omp_threads_limit() instead of raw config::omp_threads_limit (which defaults to -1), and adds condition_variable blocking to coordinate concurrent builders.
Minor suggestions (non-blocking):
- Consider
notify_oneinstead ofnotify_allin the destructor for slightly better wakeup precision. - The unit test covers the strict limit=1 boundary well; a follow-up test with limit>1 could verify the halving policy under normal concurrency.
## Summary This PR fixes ANN index build OpenMP thread budgeting and adds a BE unit test for the concurrency cap. ## Problem `ScopedOmpThreadBudget` computed available budget from `config::omp_threads_limit` directly. With the default `omp_threads_limit = -1`, each builder effectively reserved 1 thread and did not follow the documented auto behavior (80% of CPU cores). In addition, concurrent builders could overrun the intended global limit because there was no wait/coordination when the budget was exhausted. ## Root Cause In `faiss_ann_index.cpp`: - The constructor used `config::omp_threads_limit` directly instead of the auto-resolved limit from `get_omp_threads_limit()`. - No blocking mechanism existed when all OpenMP budget was already in use. ## Fix - Use `get_omp_threads_limit()` as the actual global limit for budgeting. - Add a condition variable to block builders until at least one OpenMP slot is available. - Keep the existing policy of reserving up to half of remaining budget (minimum 1). - Add concise comments to explain wait/wakeup behavior. ## Test - Added `VectorSearchTest.OmpThreadBudgetNeverExceedsLimit` in `be/test/storage/index/ann/faiss_vector_index_test.cpp`. - The test sets `config::omp_threads_limit = 1`, runs multiple concurrent ANN `add()` builds, samples `ann_index_build_index_threads`, and asserts peak usage never exceeds 1 and finally returns to 0. ## Validation - Local compilation/test run was intentionally skipped per request.
…61313) ## Summary This PR fixes ANN index build OpenMP thread budgeting and adds a BE unit test for the concurrency cap. ## Problem `ScopedOmpThreadBudget` computed available budget from `config::omp_threads_limit` directly. With the default `omp_threads_limit = -1`, each builder effectively reserved 1 thread and did not follow the documented auto behavior (80% of CPU cores). In addition, concurrent builders could overrun the intended global limit because there was no wait/coordination when the budget was exhausted. ## Root Cause In `faiss_ann_index.cpp`: - The constructor used `config::omp_threads_limit` directly instead of the auto-resolved limit from `get_omp_threads_limit()`. - No blocking mechanism existed when all OpenMP budget was already in use. ## Fix - Use `get_omp_threads_limit()` as the actual global limit for budgeting. - Add a condition variable to block builders until at least one OpenMP slot is available. - Keep the existing policy of reserving up to half of remaining budget (minimum 1). - Add concise comments to explain wait/wakeup behavior. ## Test - Added `VectorSearchTest.OmpThreadBudgetNeverExceedsLimit` in `be/test/storage/index/ann/faiss_vector_index_test.cpp`. - The test sets `config::omp_threads_limit = 1`, runs multiple concurrent ANN `add()` builds, samples `ann_index_build_index_threads`, and asserts peak usage never exceeds 1 and finally returns to 0. ## Validation - Local compilation/test run was intentionally skipped per request.
Summary
This PR fixes ANN index build OpenMP thread budgeting and adds a BE unit test for the concurrency cap.
Problem
ScopedOmpThreadBudgetcomputed available budget fromconfig::omp_threads_limitdirectly. With the defaultomp_threads_limit = -1, each builder effectively reserved 1 thread and did not follow the documented auto behavior (80% of CPU cores). In addition, concurrent builders could overrun the intended global limit because there was no wait/coordination when the budget was exhausted.Root Cause
In
faiss_ann_index.cpp:config::omp_threads_limitdirectly instead of the auto-resolved limit fromget_omp_threads_limit().Fix
get_omp_threads_limit()as the actual global limit for budgeting.Test
VectorSearchTest.OmpThreadBudgetNeverExceedsLimitinbe/test/storage/index/ann/faiss_vector_index_test.cpp.config::omp_threads_limit = 1, runs multiple concurrent ANNadd()builds, samplesann_index_build_index_threads, and asserts peak usage never exceeds 1 and finally returns to 0.Validation