refactor(engine): pass Python worker startup arguments by name #5597
yangzhang75 wants to merge 2 commits into
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5597      +/-   ##
============================================
+ Coverage     52.38%   52.39%   +0.01%
+ Complexity     2484     2480       -4
============================================
  Files          1070     1070
  Lines         41359    41369      +10
  Branches       4441     4441
============================================
+ Hits          21666    21677      +11
+ Misses        18427    18423       -4
- Partials       1266     1269       +3
Could you assign a reviewer for this PR? @chenlica
Yicong-Huang left a comment
Thanks for the change. We need to add tests to guard the behavior.
@Yicong-Huang I added unit tests for the mapping and missing-key behavior in the latest commit, and CI is green. Could you take another look? Thanks!
@pytest.mark.parametrize("missing_key", sorted(_full_config().keys()))
def test_main_raises_keyerror_when_a_field_is_missing(missing_key):
    """A missing/renamed key fails loudly rather than being silently misassigned."""
    config = _full_config()
    del config[missing_key]
    with (
        mock.patch.object(entry, "StorageConfig"),
        mock.patch.object(entry, "PythonWorker"),
        mock.patch.object(entry, "init_loguru_logger"),
    ):
        with pytest.raises(KeyError):
            entry.main(json.dumps(config))
I think we need to guard it a bit tighter, so that we notice as soon as drift happens.
Please include tests for:
- extra configs -> should fail
- wrong types -> should fail
- wrong order -> should succeed, since JSON/dict is used
- duplicate configs -> should fail on the Scala side (do we have tests for the Scala side yet?)
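One way the cases above might be pinned down on the Python side is a strict validator in front of the entry point. This is only a sketch under stated assumptions: `EXPECTED_FIELDS`, `parse_worker_config`, and the three key names are hypothetical stand-ins, not the PR's actual 19 keys or the real `main` implementation.

```python
import json

# Hypothetical schema: the real worker config has 19 keys; these are illustrative only.
EXPECTED_FIELDS = {"worker_id": str, "output_port": int, "logger_level": str}


def _reject_duplicates(pairs):
    """object_pairs_hook that fails on repeated keys instead of keeping the last one."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate config key: {key}")
        seen[key] = value
    return seen


def parse_worker_config(raw: str) -> dict:
    """Parse a JSON config and fail loudly on any drift from EXPECTED_FIELDS."""
    config = json.loads(raw, object_pairs_hook=_reject_duplicates)
    missing = EXPECTED_FIELDS.keys() - config.keys()
    if missing:
        raise KeyError(f"missing config keys: {sorted(missing)}")
    extra = config.keys() - EXPECTED_FIELDS.keys()
    if extra:
        raise ValueError(f"unexpected config keys: {sorted(extra)}")
    for key, expected_type in EXPECTED_FIELDS.items():
        # Exact type match, so a bool is not accepted where an int is expected.
        if type(config[key]) is not expected_type:
            raise TypeError(f"{key} must be {expected_type.__name__}")
    return config
```

Key order never enters the checks, so a reordered payload parses to the same dict. The duplicate-key check here only covers the receiving side; it would not replace a test on the Scala side that builds the payload.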
What changes were proposed in this PR?
Passes Python worker startup configuration by name instead of by argv position, as proposed in the issue.
PythonWorkflowWorker (JVM) previously built ~19 positional command-line arguments, and texera_run_python_worker.py unpacked them positionally. Because the two sides agreed only by index, adding/removing/reordering one argument could silently misassign values.
The set of keys written on the Scala side and read on the Python side is identical (19 keys). No behavior change otherwise.
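The failure mode described above can be sketched in a few lines. The field names and values below are hypothetical and much smaller than the real argument list; they only illustrate why index-based agreement drifts silently while name-based lookup does not.

```python
import json


def unpack_by_position(argv):
    # The receiver assumes a fixed order: worker_id, output_port, logger_level.
    worker_id, output_port, logger_level = argv
    return {
        "worker_id": worker_id,
        "output_port": output_port,
        "logger_level": logger_level,
    }


def unpack_by_name(raw_json):
    config = json.loads(raw_json)
    # Indexing by key raises KeyError if the sender renamed or dropped a field.
    return {
        "worker_id": config["worker_id"],
        "output_port": config["output_port"],
        "logger_level": config["logger_level"],
    }


# The sender swaps two arguments: positional unpacking accepts them without complaint.
swapped = unpack_by_position(["w1", "INFO", "5005"])

# The same reordering is harmless when each value travels with its name.
named = unpack_by_name(
    json.dumps({"logger_level": "INFO", "output_port": "5005", "worker_id": "w1"})
)
```

In this sketch `swapped` ends up with the log level stored under `output_port`, while `named` keeps every value under its own key regardless of the order in which the sender wrote them.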
Any related issues, documentation, discussions?
Closes #5547
How was this PR tested?
- WorkflowExecutionService/compile (amber) succeeds.
- scalafmtCheckAll passes; scalafix rules (RemoveUnused, ProcedureSyntax) are satisfied (the new import is used).
- py_compile.
Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.8)