Skip to content

P2-T2: Self-healing stale socket and PID file recovery#124

Merged
SoundBlaster merged 8 commits intomainfrom
feature/P2-T2-stale-socket-recovery
Mar 1, 2026
Merged

P2-T2: Self-healing stale socket and PID file recovery#124
SoundBlaster merged 8 commits intomainfrom
feature/P2-T2-stale-socket-recovery

Conversation

@SoundBlaster
Copy link
Owner

Description

Fixes a critical (P0) issue where a broker daemon crash leaves stale broker.sock and broker.pid files on disk. The proxy's _spawn_broker_if_needed previously checked only socket_path.exists() and skipped spawning if the socket file was present — even if nothing was listening. This silently blocked all future broker-mode sessions until the user manually deleted the files.

Changes:

  • src/mcpbridge_wrapper/broker/proxy.py — Replace plain exists() guard with a connect-based liveness check. If socket.connect() raises OSError (covers ConnectionRefusedError, TOCTOU race, and non-socket files), both stale files are removed and spawn proceeds normally.
  • src/mcpbridge_wrapper/broker/daemon.py — Register atexit.register(self._cleanup_files) after successful startup so socket/PID files are removed on abnormal interpreter exits (unhandled exceptions, sys.exit()), in addition to the existing SIGTERM/SIGINT handlers.

Tests:

  • Updated test_spawn_noop_when_socket_exists to mock socket.connect() success (live broker scenario).
  • Added TestBrokerProxyStaleSocket (3 tests): stale socket triggers spawn, stale files are removed, live socket skips spawn.
  • Added TestBrokerDaemonAtExit (1 test): atexit.register called with _cleanup_files after start().

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Refactoring
  • CI/CD improvement

Quality Gates

  • make test - All tests pass with ≥90% coverage (719 passed, 5 skipped; 91.78% coverage)
  • make lint - No linting errors (ruff check src/ — all checks passed)
  • make format - Code is properly formatted
  • make typecheck - Type checking passes
  • make doccheck - Documentation is synced with DocC (if docs changed)

Documentation Sync

  • Documentation changes are synced with DocC catalog (or N/A) — no docs changed

Testing

  • Added/updated tests for new functionality
  • All tests pass locally
  • Manually tested the changes

Checklist

  • Code follows the project's style guidelines
  • Self-review completed
  • Comments added for complex code
  • Documentation updated (if needed)
  • No new warnings generated
  • PR title is descriptive

@chatgpt-codex-connector
Copy link

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@SoundBlaster SoundBlaster merged commit b85d360 into main Mar 1, 2026
10 checks passed
@SoundBlaster SoundBlaster deleted the feature/P2-T2-stale-socket-recovery branch March 1, 2026 02:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant