feat: add FastAPI middleware for per-request emissions tracking #1203
davidberenstein1957 wants to merge 7 commits into
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #1203 +/- ##
==========================================
+ Coverage 88.88% 89.17% +0.28%
==========================================
Files 45 50 +5
Lines 4302 4509 +207
==========================================
+ Hits 3824 4021 +197
- Misses 478 488 +10
Hello, thanks for this. There is a problem with your branch: it contains many changes that are already merged. Can you rebase?
f44740f to 15c0ccf
@benoit-cty I have reached out to some people at FastAPI to ask whether they would be interested in a quick review :) For visibility, we could also consider deploying it as a standalone integration, but let's see if people like it.
Ship optional codecarbon[fastapi] integration with CodeCarbonMiddleware, configurable response headers, route-based task naming, and lifespan helper for shared app-level tracking. Co-authored-by: Cursor <cursoragent@cursor.com>
Add tests for app tracking mode, deprecated include_emissions_header, on_request_complete callbacks, header preset edge cases, and routing formatters so Codecov patch coverage meets the PR threshold. Co-authored-by: Cursor <cursoragent@cursor.com>
…iltering Update the FastAPI middleware to support include and exclude patterns for request tracking, allowing users to specify which endpoints to measure. Refactor routing helpers for improved clarity and add support for deferred measurement. Update documentation to reflect new features and usage examples. Co-authored-by: Cursor <cursoragent@cursor.com>
Refactor FastAPI middleware and routing code for improved readability by adjusting line breaks and indentation. Remove the product telemetry link from the documentation navigation. Update test cases for consistency in formatting and structure.
Refactor the test for FastAPI middleware to handle deferred task execution using a thread pool. This change enhances the test's reliability by ensuring that asynchronous tasks are properly awaited in a separate thread, improving overall test coverage and stability.
Refactor the exception handling in gpu_amd.py to specifically catch AttributeError when importing amdsmi. Update the warning message to provide clearer guidance on ensuring proper configuration of amdsmi for AMD GPU metrics.
eabd526 to bb6995c
Document latency for sync vs deferred logging on a MiniLM embedder workload and add a reproducible scripts/benchmark_fastapi_middleware.py runner. Co-authored-by: Cursor <cursoragent@cursor.com>
SaboniAmine left a comment
Thanks David, this is a great PR! I left a few questions. I also saw that you have already prepared a benchmark script: do you have any numbers or graphs already computed to share?
| HeaderConfig = Union[bool, str, Sequence[str], Mapping[str, str], None] | ||
| HeaderFormatter = Callable[[EmissionsData, Request], Mapping[str, str]] | ||
|
|
||
| FIELD_UNITS: dict[str, str] = { |
Can this be turned into an Enum? I have the feeling it could be leveraged elsewhere for labels, and it could be maintained in the core package rather than only in the FastAPI integration. Thoughts?
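To make the suggestion concrete, here is a minimal sketch of what an Enum-backed version of `FIELD_UNITS` might look like. Only the `wue` entry (`l-per-kwh`) is visible in the diff; the class name `FieldUnit`, the helper `unit_for`, and the other two members are assumptions made purely for illustration, not the PR's actual values.

```python
from enum import Enum
from typing import Optional


class FieldUnit(str, Enum):
    """Hypothetical unit suffixes keyed by field name."""

    # Taken from the diff context above.
    WUE = "l-per-kwh"
    # Illustrative assumptions only; not taken from the PR.
    EMISSIONS = "kg"
    ENERGY_CONSUMED = "kwh"


def unit_for(field: str) -> Optional[str]:
    """Return the unit suffix for a lowercase field name, or None if unknown."""
    try:
        return FieldUnit[field.upper()].value
    except KeyError:
        return None
```

Because the members subclass `str`, they still compare equal to the plain strings used today, so existing dictionary-style call sites could migrate gradually.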
| "wue": "l-per-kwh", | ||
| } | ||
|
|
||
| HEADER_PRESETS: dict[str, dict[str, str]] = { |
Same here: a collection of Enums would make sense, no?
| return f"X-CodeCarbon-{title}{suffix}" | ||
|
|
||
|
|
||
| def resolve_header_mapping(config: HeaderConfig) -> dict[str, str]: |
I'm pretty sure this defensive method could be rewritten in a much more straightforward (and clearer) way, as long as the unit tests demonstrate resilience to each corner case :)
| } | ||
| ) | ||
|
|
||
| HTTP_METHODS = frozenset( |
Same, this could benefit from being an Enum.
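As a rough illustration of that idea, the sketch below defines the methods once as an Enum and derives the frozenset from it. The names `HttpMethod` and `is_http_method` and the exact member list are assumptions, since the contents of the PR's `HTTP_METHODS` set are not shown here. On Python 3.11+, the standard library's `http.HTTPMethod` might serve the same purpose.

```python
from enum import Enum


class HttpMethod(str, Enum):
    """Hypothetical enumeration of HTTP methods (member list is an assumption)."""

    GET = "GET"
    POST = "POST"
    PUT = "PUT"
    PATCH = "PATCH"
    DELETE = "DELETE"
    HEAD = "HEAD"
    OPTIONS = "OPTIONS"


# The frozenset is derived from the Enum, so there is a single source of truth.
HTTP_METHODS = frozenset(member.value for member in HttpMethod)


def is_http_method(token: str) -> bool:
    """Case-insensitive membership check against the known methods."""
    return token.upper() in HTTP_METHODS
```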
| request: "Request", | ||
| formatter: Callable[["Request"], str] | None = None, | ||
| ) -> str: | ||
| """Derive a stable label like ``GET /items/{item_id}`` for task-scoped tracking. |
This could lead to a task being overridden if the same resource is accessed by two parallel or sequential requests. Maybe we could add a random component here, if the goal of this method is to provide a name for the codecarbon task concept.
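One possible shape for that, sketched under assumptions: keep the stable `GET /items/{item_id}` label and append a short random suffix so two requests to the same route never share a task name. The helper `unique_task_name` and the ` #` separator are hypothetical and not part of the PR.

```python
import uuid


def unique_task_name(method: str, route_path: str) -> str:
    """Build a per-request task name from a stable label plus a random suffix."""
    label = f"{method.upper()} {route_path}"
    # Eight hex characters from a UUID4 keep the name short while making
    # collisions between concurrent requests very unlikely.
    return f"{label} #{uuid.uuid4().hex[:8]}"


def stable_label(task_name: str) -> str:
    """Recover the route-level label, e.g. for aggregating per-route totals."""
    return task_name.rsplit(" #", 1)[0]
```

Splitting the suffix off again keeps per-route aggregation possible while each individual request gets its own task.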
| from collections.abc import Awaitable, Callable, Iterable | ||
| from typing import Any | ||
|
|
||
| try: |
This defensive import isn't needed: Starlette is installed with FastAPI as a requirement for this plugin to work, so you can assume it will be present when the end user executes this code.
| async def _start_request_tracker(self) -> EmissionsTracker: | ||
| return await asyncio.to_thread(self._create_and_start_tracker) | ||
|
|
||
| async def _stop_request_tracker( |
Stopping the tracker asynchronously will have an impact on the measured value, since the value is read synchronously before the OutputData object is emitted. Let's discuss this live; I faced the same issue when dealing with the vLLM integration.
| response_headers: HeaderConfig | None = None, | ||
| include_emissions_header: bool = False, | ||
| header_formatter: HeaderFormatter | None = None, | ||
| task_name_formatter: Callable[[Request], str] | None = None, |
|
|
||
| Use **`request`** unless you have measured a need for a shared tracker. For production APIs, prefer **`app`** mode with a lifespan handler and `save_to_file=False` to avoid per-request tracker startup cost. | ||
|
|
||
| ## Performance |
Could we have a benchmark against raw FastAPI, and against another telemetry tool?
Yes, I'm thinking of Logfire, but it could be any other package injected as a middleware.
|
|
||
|
|
||
| @asynccontextmanager | ||
| async def lifespan(app: FastAPI): |
I'm not sure about the explicit lifespan declaration; could it be merged into the add_codecarbon_middleware method?
My understanding is that the developer implementing the web server with FastAPI would like to keep ownership of this lifespan function, and I'm not sure how to stack multiple constraints with this formalism. Do we have examples of how similar libraries manage this?
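On the stacking question, one commonly used pattern is to compose several lifespan context managers with `contextlib.AsyncExitStack`, so the application author keeps their own lifespan and the integration's lifespan is layered alongside it. The sketch below uses only the standard library; `compose_lifespans` is a hypothetical helper, not something provided by FastAPI or codecarbon, and it deliberately ignores any state a lifespan might yield.

```python
from contextlib import AsyncExitStack, asynccontextmanager
from typing import Any, AsyncContextManager, AsyncIterator, Callable

# A lifespan takes the app and returns an async context manager.
Lifespan = Callable[[Any], AsyncContextManager[Any]]


def compose_lifespans(*lifespans: Lifespan) -> Lifespan:
    """Combine lifespans: enter them in order, exit them in reverse order."""

    @asynccontextmanager
    async def combined(app: Any) -> AsyncIterator[None]:
        async with AsyncExitStack() as stack:
            for lifespan in lifespans:
                # Each context is registered on the stack, so a failure in a
                # later startup still unwinds the ones already entered.
                await stack.enter_async_context(lifespan(app))
            yield

    return combined
```

Under this assumption, a user could pass something like `compose_lifespans(my_lifespan, codecarbon_lifespan)` as the app's `lifespan` argument and retain full ownership of their own startup and shutdown logic.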
Summary
- `codecarbon[fastapi]` integration with `CodeCarbonMiddleware` to measure CO₂ emissions per HTTP request across all FastAPI/Starlette routes
- Configurable response headers (… `header_formatter` callback) so clients can read emissions, energy, duration, and more from response headers
- `create_codecarbon_lifespan` for shared app-level tracking, route-based task naming, path exclusions, and two tracking modes (`request` for concurrency-safe per-request trackers, `app` for lower-overhead serialized measurement)
Test plan
- `uv run pytest tests/integrations/ -v`: 17 tests pass
- `uv run --extra fastapi uvicorn examples.fastapi_middleware:app --reload` then `curl -i localhost:8000/predict`
- `uv run task docs`
Usage
See docs/how-to/fastapi.md for full configuration options.
Made with Cursor