fix(mcp): use DetachedStdioTransport to fix Chromium orphan process leak #173

Open: hernandez42 wants to merge 1 commit into OpenBMB:main from hernandez42:fix/mcp-detached-transport-chromium-leak

Summary

Fix Chromium/Playwright orphan process leak caused by incorrect process group semantics in McpClient.

Root Cause

StdioClientTransport spawns tsx with detached=false (the default), so tsx stays a member of PilotDeck's process group (PGID = PilotDeck PID). When McpClient.close() calls kill(-pid, SIGKILL):

  • A negative PID addresses the process group whose PGID equals tsx's PID
  • tsx is not a group leader (its PGID is PilotDeck's PID), so no such group exists and the signal reaches nothing
  • Result: tsx → node playwright-mcp → Chromium survive as orphans
  • Over time: 30+ orphaned Chromium processes accumulate → OOM
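
The failure mode above can be shown in isolation. The following is a minimal sketch, not code from this PR: signalGroup and fakeKill are hypothetical names, and the kill function is injected so the behaviour can be demonstrated without spawning real processes. It relies on the POSIX rule that a negative PID addresses the process group with that ID, and it models the process table with a plain set of existing PGIDs (PIDs 100 and 200 are made up).

```typescript
// Hypothetical helper for illustration; not code from this PR.
type KillFn = (pid: number, signal: string) => unknown;

// Sends SIGKILL to the process group whose PGID equals `leaderPid`.
// Returns false when no such group exists (the ESRCH case).
function signalGroup(leaderPid: number, kill: KillFn): boolean {
  if (!Number.isInteger(leaderPid) || leaderPid <= 1) {
    // kill(0) and kill(-1) have special, much broader meanings; refuse them.
    throw new RangeError(`refusing to signal group ${leaderPid}`);
  }
  try {
    kill(-leaderPid, "SIGKILL"); // negative PID = "the group with this PGID"
    return true;
  } catch (err) {
    if ((err as { code?: string }).code === "ESRCH") return false;
    throw err;
  }
}

// Fake kill modelling a process table: only the PGIDs in `groups` exist.
function fakeKill(groups: Set<number>): KillFn {
  return (pid, _signal) => {
    if (pid < 0 && groups.has(-pid)) return true;
    throw Object.assign(new Error("kill ESRCH"), { code: "ESRCH" });
  };
}

// Assume PilotDeck is PID 100 and leads group 100; tsx is PID 200.
const nonDetached = fakeKill(new Set([100])); // tsx shares group 100
const detached = fakeKill(new Set([100, 200])); // tsx leads its own group 200

console.log(signalGroup(200, nonDetached)); // false: nothing was signalled
console.log(signalGroup(200, detached)); // true: group 200 exists
```

In real code the second argument would be Node's process.kill, which throws an error with code ESRCH when the target does not exist.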

Fix

  1. New DetachedStdioTransport.ts: Drop-in replacement for StdioClientTransport that spawns the child with detached: true. Now the child becomes its own process group leader (PID = PGID), so kill(-pid, SIGKILL) atomically wipes the entire tree.

  2. McpClient.ts:
    • Replace StdioClientTransport with DetachedStdioTransport
    • Lazy _transportPid capture after transport.start()
    • Atomic process-group SIGKILL in close() (no graceful SIGTERM cascade)

  3. BackgroundTaskRuntime.ts: Use killChildProcessGroup() instead of child.kill() in stop() for consistent process-group semantics.
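
The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: detachedSpawnOptions, TransportLike and GroupKillingClient are hypothetical names, the stdio layout is an assumption, and the kill function is injected (in real code it would be process.kill) so the sketch stays free of side effects. The options object is what one would pass as the third argument to Node's child_process.spawn.

```typescript
// Illustrative sketch only; all names below are hypothetical.
type Env = Record<string, string | undefined>;

interface DetachedSpawnOptions {
  detached: true;
  stdio: ["pipe", "pipe", "inherit"];
  env: Env;
}

// Options for child_process.spawn(command, args, options).
// On POSIX, detached: true makes the child a process group leader (PGID == PID).
function detachedSpawnOptions(parentEnv: Env, extraEnv: Env = {}): DetachedSpawnOptions {
  return {
    detached: true,
    stdio: ["pipe", "pipe", "inherit"], // assumed layout: stdin/stdout piped for the protocol
    env: { ...parentEnv, ...extraEnv }, // server-specific values override inherited ones
  };
}

type GroupKillFn = (pid: number, signal: string) => unknown;

interface TransportLike {
  pid: number | undefined; // only known once the child has been spawned
  start(): Promise<void>;
}

class GroupKillingClient {
  private transportPid: number | undefined = undefined;
  private readonly transport: TransportLike;
  private readonly kill: GroupKillFn;

  constructor(transport: TransportLike, kill: GroupKillFn) {
    this.transport = transport;
    this.kill = kill;
  }

  async connect(): Promise<void> {
    await this.transport.start();
    this.transportPid = this.transport.pid; // lazy capture: undefined before start()
  }

  // Sends one SIGKILL to the whole group; returns false if there was nothing to kill.
  close(): boolean {
    const pid = this.transportPid;
    if (pid === undefined) return false;
    this.transportPid = undefined; // make repeated close() calls harmless
    try {
      this.kill(-pid, "SIGKILL");
      return true;
    } catch (err) {
      if ((err as { code?: string }).code === "ESRCH") return false; // group already gone
      throw err;
    }
  }
}
```

Skipping SIGTERM means the child processes get no chance to clean up, which matches the "no graceful SIGTERM cascade" choice described in step 2; a SIGTERM-then-SIGKILL sequence would be the usual alternative when graceful shutdown matters.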

Changes

| File | Change |
| --- | --- |
| src/mcp/client/DetachedStdioTransport.ts | NEW: detached: true transport with proper env inheritance |
| src/mcp/client/McpClient.ts | Use DetachedStdioTransport, atomic PGID kill in close() |
| src/task/runtime/BackgroundTaskRuntime.ts | killChildProcessGroup in stop() |

Severity

High — causes memory leak (Chromium processes never exit), eventually triggers OOM and kills PilotDeck.

Root cause: StdioClientTransport spawns tsx with default detached=false,
so tsx stays in PilotDeck's process group. When McpClient.close() calls
kill(-pid, SIGKILL), it addresses a group whose PGID is tsx's PID; no such
group exists, so the signal reaches nothing and the Chromium grandchild
processes survive as orphans.

Fix: Replace StdioClientTransport with DetachedStdioTransport which spawns
the child with detached:true. Now PID == PGID, so kill(-pid, SIGKILL)
atomically wipes the entire process tree (tsx → node playwright-mcp → Chromium).

Also update BackgroundTaskRuntime.stop() to use killChildProcessGroup()
instead of child.kill() for consistent process-group semantics.

Changes:
- src/mcp/client/DetachedStdioTransport.ts: NEW — drop-in replacement for
  StdioClientTransport with detached:true + proper env inheritance
- src/mcp/client/McpClient.ts: use DetachedStdioTransport, lazy _transportPid
  capture, atomic process-group SIGKILL in close()
- src/task/runtime/BackgroundTaskRuntime.ts: killChildProcessGroup in stop()