Skip to content

refactor(config): unify default-data-transfer-batch-size into one config key#5545

Open
Ma77Ball wants to merge 5 commits into
apache:mainfrom
Ma77Ball:consolidate-batch-size-config-5544
Open

refactor(config): unify default-data-transfer-batch-size into one config key#5545
Ma77Ball wants to merge 5 commits into
apache:mainfrom
Ma77Ball:consolidate-batch-size-config-5544

Conversation

@Ma77Ball

@Ma77Ball Ma77Ball commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this PR?

The default data-transfer batch size was controlled by two separate config keys, each with its own env var, both defaulting to 400:

  • network-buffering.default-data-transfer-batch-size (NETWORK_BUFFERING_DEFAULT_DATA_TRANSFER_BATCH_SIZE), read by the backend (ApplicationConfig.defaultDataTransferBatchSize).
  • gui.workflow-workspace.default-data-transfer-batch-size (GUI_WORKFLOW_WORKSPACE_DEFAULT_DATA_TRANSFER_BATCH_SIZE), read by GuiConfig and sent to the frontend via ConfigResource.

Having two keys for one value meant an operator could set one env var and forget the other, leaving the backend and GUI out of sync.

This PR makes the network-buffering key the single source of truth:

  • ConfigResource now surfaces ApplicationConfig.defaultDataTransferBatchSize directly.
  • Removed the duplicate guiWorkflowWorkspaceDefaultDataTransferBatchSize field from GuiConfig.
  • Removed the duplicate key (and its env override) from gui.conf.

The frontend is unchanged: the JSON field name (defaultDataTransferBatchSize) it consumes stays the same.

Any related issues, documentation, discussions?

Closes #5544

How was this PR tested?

  • sbt Config/compile ConfigService/compile both succeed.
  • Repo-wide grep confirms no remaining references to the old key or GUI_WORKFLOW_WORKSPACE_DEFAULT_DATA_TRANSFER_BATCH_SIZE (outside generated dist/ artifacts).
  • The backend reader of the canonical key is untouched, so existing tests (e.g. NetworkOutputBufferSpec) are unaffected.
  • Also tested that the application performs as normal on all services (e.g., running workflows, uploading data, creating workflows, etc..)

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.8)

@github-actions github-actions Bot added common platform Non-amber Scala service paths labels Jun 7, 2026
@codecov-commenter

codecov-commenter commented Jun 7, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 60.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.36%. Comparing base (07ca5d4) to head (ec0d725).

Files with missing lines Patch % Lines
...pache/texera/service/resource/ConfigResource.scala 0.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5545      +/-   ##
============================================
- Coverage     52.38%   52.36%   -0.02%     
+ Complexity     2484     2479       -5     
============================================
  Files          1070     1070              
  Lines         41359    41361       +2     
  Branches       4441     4441              
============================================
- Hits          21666    21660       -6     
+ Misses        18427    18426       -1     
- Partials       1266     1275       +9     
Flag Coverage Δ *Carryforward flag
access-control-service 64.61% <ø> (ø)
agent-service 33.76% <ø> (ø) Carriedforward from 07ca5d4
amber 53.25% <ø> (-0.06%) ⬇️
computing-unit-managing-service 1.65% <ø> (ø)
config-service 53.73% <0.00%> (-2.33%) ⬇️
file-service 38.21% <ø> (ø)
frontend 46.94% <100.00%> (+0.02%) ⬆️
pyamber 90.72% <ø> (ø) Carriedforward from 07ca5d4
python 90.75% <ø> (ø) Carriedforward from 07ca5d4
workflow-compiling-service 58.69% <ø> (ø)

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment on lines 50 to 53
@GET
@RolesAllowed(Array("REGULAR", "ADMIN"))
@Path("/gui")
def getGuiConfig: Map[String, Any] =

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's not put engine configs under gui. Please create a new endpoint for engine configs.
e.g, /amber

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved defaultDataTransferBatchSize out of /gui into a new /amber endpoint, and updated the frontend to fetch it there.

import jakarta.annotation.security.{PermitAll, RolesAllowed}
import jakarta.ws.rs.core.MediaType
import jakarta.ws.rs.{GET, Path, Produces}
import org.apache.texera.amber.config.ApplicationConfig

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config service should not depend on amber. we need to separate them. otherwise config service will require to be compiled with amber package.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ApplicationConfig actually lives in common/config, not the amber module, so config-service does not depend on amber (the package is just named amber). I added a short comment at the import to make that clear. Let me know if you'd still prefer it organized differently.

@github-actions github-actions Bot added the frontend Changes related to the frontend GUI label Jun 9, 2026
@Ma77Ball Ma77Ball requested a review from Yicong-Huang June 9, 2026 19:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

common frontend Changes related to the frontend GUI platform Non-amber Scala service paths

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Consolidate duplicate default-data-transfer-batch-size config keys into one canonical key + env var

3 participants