Improve handling of FASTAPI Asyncio in API #4924

Open
JC-wk wants to merge 6 commits into microsoft:main from JC-wk:asyncio-prevent-garbage-collection

@JC-wk JC-wk commented Jun 5, 2026


Resolves #4923

What is being addressed

This pull request resolves a reliability issue where the background Service Bus message processors (DeploymentStatusUpdater and AirlockStatusUpdater) could be silently garbage-collected. It refactors task lifecycle management to use application-scoped state, cancels tasks safely during shutdown, and improves background worker observability.

Problem Description

Previously, background workers were instantiated and launched using:

asyncio.create_task(deploymentStatusUpdater.receive_messages())
asyncio.create_task(airlockStatusUpdater.receive_messages())

In Python, the asyncio event loop only maintains weak references to tasks. Discarding the return value of asyncio.create_task() without keeping a strong reference elsewhere left the task instances vulnerable to garbage collection during runtime, leading to silent processing failures.
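
As a minimal sketch of the general remedy (not the code in this PR), the snippet below keeps a strong reference to each task in a set and drops it once the task finishes. The names `background_tasks`, `spawn`, and `worker` are hypothetical and exist only for illustration.

```python
import asyncio

# Hypothetical registry for illustration; the PR keeps its set on app.state.
background_tasks: set = set()


async def worker(name: str) -> str:
    # Stand-in for a real receive loop.
    await asyncio.sleep(0.01)
    return f"{name} done"


def spawn(coro, name: str) -> asyncio.Task:
    task = asyncio.create_task(coro, name=name)
    background_tasks.add(task)                        # strong reference keeps the task alive
    task.add_done_callback(background_tasks.discard)  # release the reference on completion
    return task


async def main() -> list:
    spawn(worker("a"), "worker-a")
    spawn(worker("b"), "worker-b")
    pending = list(background_tasks)   # snapshot, since the set shrinks as tasks finish
    results = await asyncio.gather(*pending)
    await asyncio.sleep(0)             # give any remaining done callbacks a chance to run
    return results


results = asyncio.run(main())
```

The set is only a holder for references; any object that outlives the task would serve the same purpose.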

Additionally, the obvious fix of holding tasks in a module-level set raises concerns of its own, and the original code had an observability gap:

  • No Lifecycle Scoping: Global mutable state (e.g. a module-level set) risks contamination and races across multiple test runs or application reloads.
  • Unsafe Mutation: When tasks remove themselves from the set via a done callback, iterating over the set directly can trigger RuntimeError: Set changed size during iteration if the loop yields control (for example by awaiting) mid-iteration.
  • Swallowed Exceptions: If a background loop crashed, the exception could go unlogged and unnoticed, or its traceback could be dropped.
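
The unsafe-mutation concern can be reproduced in isolation. The following is an illustrative sketch, not code from the PR: it iterates a live set whose tasks discard themselves on completion, and awaits inside the loop so the done callbacks get a chance to run mid-iteration.

```python
import asyncio


async def main() -> str:
    live = set()
    for _ in range(3):
        task = asyncio.create_task(asyncio.sleep(0))
        live.add(task)
        task.add_done_callback(live.discard)  # each task removes itself when done

    try:
        for task in live:   # iterating the live set directly
            await task      # yields to the event loop, so discard callbacks run
    except RuntimeError as exc:
        return str(exc)
    return "no error"


message = asyncio.run(main())
print(message)
```

A loop body that never awaits does not give the callbacks a chance to run, so the error depends on where control is yielded; iterating a copy removes the dependency altogether.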

Solution

Refactored lifespan in api_app/main.py to implement a correct and robust asyncio background worker pattern:

1. App-Scoped State Tracking

Instead of a global module-level set, background tasks are now stored on the FastAPI app.state object, so their lifetime is tied to the application instance:

app.state.background_tasks = set()

This ensures task contexts are cleanly isolated per application instance.
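
A rough sketch of how app-scoped tracking might look inside a lifespan context manager follows. To stay dependency-free it uses a `SimpleNamespace` stand-in for the FastAPI application, and `receive_messages` is a placeholder for the real receive loop; both are assumptions for illustration. With FastAPI, a function of this shape would be passed as `FastAPI(lifespan=lifespan)`.

```python
import asyncio
from contextlib import asynccontextmanager
from types import SimpleNamespace


async def receive_messages() -> None:
    # Placeholder for a long-running Service Bus receive loop.
    while True:
        await asyncio.sleep(3600)


@asynccontextmanager
async def lifespan(app):
    # Each application instance gets its own set of tracked tasks.
    app.state.background_tasks = set()
    task = asyncio.create_task(receive_messages(), name="deployment-status-updater")
    app.state.background_tasks.add(task)
    try:
        yield
    finally:
        tasks = list(app.state.background_tasks)  # snapshot before cancelling
        for t in tasks:
            t.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)


async def demo():
    # Stand-ins for two independent FastAPI app instances (assumption).
    app_one = SimpleNamespace(state=SimpleNamespace())
    app_two = SimpleNamespace(state=SimpleNamespace())
    async with lifespan(app_one), lifespan(app_two):
        await asyncio.sleep(0)
        one, two = app_one.state.background_tasks, app_two.state.background_tasks
        return one.isdisjoint(two), (len(one), len(two))


isolated, counts = asyncio.run(demo())
```

Because the set lives on each instance's state, two apps created in the same process (as in a test suite) never see each other's tasks.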

2. Task Naming and Observability

Tasks are created with human-readable names for improved debuggability and stack tracing:

  • name="deployment-status-updater"
  • name="airlock-status-updater"
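
As a small illustration of what naming buys, the sketch below creates two named tasks and lists the names of everything running on the loop. The `idle` coroutine is a hypothetical placeholder for the real workers.

```python
import asyncio


async def idle() -> None:
    # Placeholder for a long-running worker.
    await asyncio.sleep(3600)


async def main() -> list:
    tasks = [
        asyncio.create_task(idle(), name="deployment-status-updater"),
        asyncio.create_task(idle(), name="airlock-status-updater"),
    ]
    current = asyncio.current_task()
    # Names of all unfinished tasks on this loop, excluding main() itself.
    names = sorted(t.get_name() for t in asyncio.all_tasks() if t is not current)

    for t in tasks:
        t.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return names


names = asyncio.run(main())
print(names)
```

Without an explicit name, asyncio assigns generic names such as Task-2, which say nothing about which worker a log line or task repr refers to.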

3. Safe, Controlled Shutdown

During teardown (after yield), tasks are copied to a list before cancellation to prevent mutation-during-iteration errors:

tasks = list(app.state.background_tasks)
for task in tasks:
    task.cancel()
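
The snapshot approach can be exercised on its own. In this sketch (hypothetical names, not the PR's code) the live set shrinks while cancellation is in progress, yet iteration stays safe because it runs over the copied list.

```python
import asyncio


async def worker() -> None:
    await asyncio.sleep(3600)


async def main():
    live = set()
    for i in range(3):
        task = asyncio.create_task(worker(), name=f"worker-{i}")
        live.add(task)
        task.add_done_callback(live.discard)  # tasks remove themselves when done
    await asyncio.sleep(0)                    # let the workers start

    snapshot = list(live)                     # copy before iterating
    for task in snapshot:
        task.cancel()
        await asyncio.sleep(0)                # yielding is safe: we iterate the copy
    await asyncio.gather(*snapshot, return_exceptions=True)
    return [t.cancelled() for t in snapshot], len(live)


cancelled, remaining = asyncio.run(main())
```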

4. Traceback Preservation on Failures

Exceptions raised by background tasks are retrieved and logged using exc_info=result to preserve full tracebacks for debugging:

results = await asyncio.gather(*tasks, return_exceptions=True)
for task, result in zip(tasks, results):
    if isinstance(result, Exception) and not isinstance(result, asyncio.CancelledError):
        logger.error(f"Task {task.get_name()} failed", exc_info=result)
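
To show the effect of this filtering, here is a self-contained sketch with one task that crashes and one that is cancelled. The logger name, the `ListHandler` class, and the two coroutines are hypothetical and exist only to capture and inspect the log output.

```python
import asyncio
import logging


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


logger = logging.getLogger("background-workers")  # hypothetical logger name
handler = ListHandler()
logger.addHandler(handler)


async def crashes() -> None:
    raise ValueError("receiver failed")


async def runs_forever() -> None:
    await asyncio.sleep(3600)


async def main() -> None:
    tasks = [
        asyncio.create_task(crashes(), name="deployment-status-updater"),
        asyncio.create_task(runs_forever(), name="airlock-status-updater"),
    ]
    await asyncio.sleep(0)  # let crashes() raise before shutdown begins
    for task in tasks:
        task.cancel()       # has no effect on the task that already finished
    results = await asyncio.gather(*tasks, return_exceptions=True)
    for task, result in zip(tasks, results):
        if isinstance(result, Exception) and not isinstance(result, asyncio.CancelledError):
            logger.error("Task %s failed", task.get_name(), exc_info=result)


asyncio.run(main())
```

Only the crashed task produces a log record, and its traceback travels with the record. On Python 3.8 and later, asyncio.CancelledError derives from BaseException rather than Exception, so the second isinstance check is redundant there but harmless.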

Changes

Testing

  • Verified the changes by running the local test suite using pytest.
  • All 673 test cases completed successfully with no regressions.

How is this addressed

  • Refactored lifespan in api_app/main.py
  • Updated CHANGELOG.md
  • Incremented the API version

@JC-wk JC-wk requested a review from a team as a code owner June 5, 2026 12:33
Copilot AI review requested due to automatic review settings June 5, 2026 12:33
github-actions Bot commented Jun 5, 2026


Unit Test Results

673 tests: 673 ✅ passed, 0 💤 skipped, 0 ❌ failed
1 suite, 1 file, 8s ⏱️

Results for commit a51b3eb.

♻️ This comment has been updated with latest results.

Copilot AI left a comment


Pull request overview

This pull request improves reliability of long-running background Service Bus workers in the FastAPI API by managing their lifecycle within the application lifespan, aiming to prevent silent task garbage collection and to make shutdown behavior more controlled/observable.

Changes:

  • Track background worker tasks in app.state to keep strong references scoped to the FastAPI application lifecycle.
  • Add controlled shutdown logic that cancels tracked tasks, awaits completion, and logs failures.
  • Add an unreleased changelog entry describing the fix.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
api_app/main.py | Refactors FastAPI lifespan to track, name, cancel, and gather background tasks during shutdown.
CHANGELOG.md | Adds an Unreleased BUG FIXES entry for the background task lifecycle fix.


Development

Successfully merging this pull request may close these issues.

API Background tasks created with asyncio.create_task could be garbage collected