Skip to content

Add instrumentation for memory profiling in Python SDK#38853

Open
tvalentyn wants to merge 17 commits into
apache:masterfrom
tvalentyn:profiler_rebase
Open

Add instrumentation for memory profiling in Python SDK#38853
tvalentyn wants to merge 17 commits into
apache:masterfrom
tvalentyn:profiler_rebase

Conversation

@tvalentyn

@tvalentyn tvalentyn commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

This change adds instrumentation launch Python SDK Harness with custom profiling agents and adds memory Profiling capabilities with off-the-shelf profilers.

Design: https://s.apache.org/beam-python-memory-profiling

Requires these dependencies in the runtime environment (included in Beam default containers):

To enable memory profiling with Memray:

python pipeline.py  --runner=DataflowRunner   --profiler_agent=memray

By default, binary profiles are all uploaded to the <TEMP_LOCATION>/profiles and we are attempting to do basic profile postprocessing on the worker by creating memray flamegraphs.

Additional optional params:

  • To use native mode (for profiling extensions) or other memray run options, pass them like so: --profiler_extra_arg="--native". For more information, see https://bloomberg.github.io/memray/run.html.
  • To use aggregated mode, cap profiler duration (worker will be terminated+restarted): --profiler_extra_arg="--aggregate" --profiler_stop_after_sec=600
  • To use custom location for saving profiles: --profile_location=gs://your-bucket/profiles
  • To configure profile uploading and post-processing intervals: --profile_upload_interval_sec=60 --profile_postprocess_interval_sec=60

To enable memory profiling with TCMalloc:

python pipeline.py --runner=DataflowRunner --profiler_agent=tcmalloc

To supply additional envirionment variables, pass them individually with this option: --profiler_extra_env_vars="HEAP_PROFILE_TIME_INTERVAL=600". For more information, see: https://gperftools.github.io/gperftools/heapprofile.html

Other options:

  • To stop after first process crash: --profiler_stop_after_crash . This might help address any instability associated with enabling a profiler.

fixes: #20298


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.

@codecov

codecov Bot commented Jun 9, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 255 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.71%. Comparing base (d9117f4) to head (d1f3b01).
⚠️ Report is 11 commits behind head on master.

Files with missing lines Patch % Lines
sdks/python/container/profiler.go 0.00% 199 Missing ⚠️
sdks/python/container/boot.go 0.00% 56 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #38853      +/-   ##
============================================
- Coverage     57.77%   57.71%   -0.06%     
  Complexity    12969    12969              
============================================
  Files          2509     2510       +1     
  Lines        260525   260768     +243     
  Branches      10658    10658              
============================================
- Hits         150516   150507       -9     
- Misses       104318   104570     +252     
  Partials       5691     5691              
Flag Coverage Δ
go 28.64% <0.00%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@tvalentyn

Copy link
Copy Markdown
Contributor Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces profiling capabilities to the Apache Beam Python SDK worker harness, adding pipeline options to configure profiling agents (like memray and tcmalloc) and implementing background tasks in the Go bootloader to manage profiling, GCS uploads, and post-processing. Feedback on these changes highlights several critical improvements: replacing a potentially failing syscall.Kill with Go's standard process signaling, resolving an ARM64 compatibility issue by avoiding hardcoded absolute paths for tcmalloc, executing memray post-processing sequentially to prevent worker OOMs, implementing a cleanup mechanism for local profile files to avoid disk exhaustion, and fixing a goroutine/ticker leak by handling context cancellation.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread sdks/python/container/boot.go
Comment thread sdks/python/container/profiler.go
Comment thread sdks/python/container/profiler.go Outdated
Comment thread sdks/python/container/profiler.go
Comment thread sdks/python/container/profiler.go Outdated
@tvalentyn tvalentyn changed the title Add memory profiler for Python Add instrumentation for memory profiling for Python SDK Jun 10, 2026
@tvalentyn tvalentyn changed the title Add instrumentation for memory profiling for Python SDK Add instrumentation for memory profiling in Python SDK Jun 10, 2026
@tvalentyn tvalentyn marked this pull request as ready for review June 10, 2026 02:09
@gemini-code-assist

Copy link
Copy Markdown
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@tvalentyn

Copy link
Copy Markdown
Contributor Author

R: @jrmccluskey

@tvalentyn tvalentyn assigned tvalentyn and jrmccluskey and unassigned tvalentyn Jun 10, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment assign set of reviewers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve memory profiling for users of Portable Beam Python

2 participants