feat(sandbox): stale session detection & lazy reconnect #39

Merged
usirin merged 15 commits into main from umut/stale-sessions
Feb 16, 2026
@usirin usirin commented Feb 16, 2026

Summary

  • Add connected: Record<string, boolean> to protocol state/layout messages, remove SessionExitMessage
  • Backend: derive connection status from this.terminals, keep buffers/channels on disconnect, broadcast layout update
  • Backend: lazy reconnect — when user types into a dead pane, reconnectTerminal() wakes the sandbox
  • Frontend: track paneConnected state, thread through PaneLayout, render "Disconnected — press any key to reconnect" overlay on stale panes

Test plan

  • Open wormhole, create terminal session
  • Wait ~10min for sandbox to sleep (or manually kill sandbox container)
  • Verify pane shows "Disconnected" overlay instead of appearing alive
  • Press any key in disconnected pane — should reconnect and resume
  • Verify turbo run typecheck passes (confirmed locally)

🤖 Generated with Claude Code

usirin and others added 9 commits February 15, 2026 17:34
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…sage

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update design doc: channels kept assigned across disconnect (not
released), enabling seamless keystroke-triggered reconnect.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

claude bot commented Feb 16, 2026

Reviewed commit: ea8b3b1

Critical Issues

1. Race Condition in Reconnection Logic (WormholeServer.ts:105-108)

Severity: High

The reconnecting Set is meant to prevent concurrent reconnections for the same ptyId, but the !this.reconnecting.has(ptyId) check happens here while the ptyId is only added later, inside reconnectTerminal(). Rapid keystrokes handled in that window can each trigger their own reconnect attempt.

// Line 105-108
} else if (!this.reconnecting.has(ptyId)) {
    // Terminal disconnected — lazy reconnect
    await this.reconnectTerminal(ptyId, payload);
}

Issue: The check and the this.reconnecting.add(ptyId) (line 556) are not atomic. If two keystrokes arrive before the first reconnectTerminal reaches line 556, both will pass the check.

Impact: Multiple concurrent sandbox wake requests, wasted resources, possible state corruption if both succeed.

Recommendation: Add the ptyId to the reconnecting Set in the same synchronous step as the check, before calling reconnectTerminal, or use a different synchronization primitive.
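A minimal sketch of the first option, assuming nothing about WormholeServer beyond what is quoted above. ReconnectGuard, tryReconnect, and the wake callback are hypothetical names used only for illustration, and forwarding the triggering keystroke (payload) is omitted. The point is that the has() check and the add() run back to back with no await between them, so a second keystroke cannot slip in.

```typescript
// Hypothetical stand-in for the sandbox wake-up call; not the project's API.
type Wake = (ptyId: string) => Promise<void>;

class ReconnectGuard {
  private readonly reconnecting = new Set<string>();
  private readonly wake: Wake;

  constructor(wake: Wake) {
    this.wake = wake;
  }

  /** Returns true if this call started a reconnect, false if one is already in flight. */
  tryReconnect(ptyId: string): boolean {
    if (this.reconnecting.has(ptyId)) return false;
    // Claim the slot before any await: check and add happen in one synchronous step.
    this.reconnecting.add(ptyId);
    void this.run(ptyId);
    return true;
  }

  private async run(ptyId: string): Promise<void> {
    try {
      await this.wake(ptyId);
    } catch {
      // A real server would log the failure and broadcast the new status here.
    } finally {
      // Release the slot so a later keystroke can retry.
      this.reconnecting.delete(ptyId);
    }
  }
}
```

In the handler quoted above, the else-if branch would then reduce to a single tryReconnect(ptyId) call, and the add at line 556 could be dropped.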


2. Missing Resize Propagation for Disconnected Panes (WormholeServer.ts:364-368)

Severity: Medium

handlePaneResize updates paneSizes but only forwards the resize to the terminal if it's connected:

private async handlePaneResize(msg: Protocol.PaneResizeMessage): Promise<void> {
    this.paneSizes.set(msg.paneId, {cols: msg.cols, rows: msg.rows});
    const termWs = this.terminals.get(msg.paneId);
    if (!termWs) return;  // ← silently drops resize
    termWs.send(JSON.stringify({type: "resize", cols: msg.cols, rows: msg.rows}));
}

Issue: If a pane is disconnected when the user resizes it, the new dimensions are saved to paneSizes but never applied to the terminal. When reconnection happens later (via reconnectTerminal), it uses the saved size, which is correct. However, if the terminal reconnects through a different path (e.g., reconnectAllTerminals on page reload), the terminal might not get the updated dimensions if the resize message was lost.

Impact: Low in practice, since reconnectTerminal uses the saved size. Still, it is worth documenting this behavior or broadcasting a layout update when a resize is dropped.
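One way to make the behavior independent of the reconnect path is to replay the stored size whenever a terminal socket is attached. The sketch below is an assumption-laden illustration, not the server's code: PaneSizeTracker, attach, detach, and TerminalSocket are hypothetical, and only the resize message shape is taken from the snippet above.

```typescript
interface PaneSize { cols: number; rows: number; }
// Hypothetical minimal view of a terminal WebSocket.
interface TerminalSocket { send(data: string): void; }

function resizeMessage(size: PaneSize): string {
  return JSON.stringify({type: "resize", cols: size.cols, rows: size.rows});
}

class PaneSizeTracker {
  private readonly paneSizes = new Map<string, PaneSize>();
  private readonly terminals = new Map<string, TerminalSocket>();

  /** Always records the size; forwards it only if a terminal is attached. */
  resize(paneId: string, size: PaneSize): boolean {
    this.paneSizes.set(paneId, size);
    const socket = this.terminals.get(paneId);
    if (!socket) return false; // deferred until the next attach
    socket.send(resizeMessage(size));
    return true;
  }

  /** Called from every (re)connect path; replays the latest known size. */
  attach(paneId: string, socket: TerminalSocket): void {
    this.terminals.set(paneId, socket);
    const size = this.paneSizes.get(paneId);
    if (size) socket.send(resizeMessage(size));
  }

  detach(paneId: string): void {
    this.terminals.delete(paneId);
  }
}
```

Because the replay lives in attach, any path that reattaches a socket picks up the latest dimensions without relying on the original resize message.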


3. Unbounded Memory Growth in paneSizes Map (WormholeServer.ts)

Severity: Medium

paneSizes is never cleaned up when panes are closed:

private async handlePaneClose(msg: Protocol.PaneCloseMessage): Promise<void> {
    // ... closes terminal, updates layout, releases channel ...
    // But never calls: this.paneSizes.delete(msg.paneId)
}

Issue: Every created pane adds an entry to paneSizes, but handlePaneClose doesn't remove it. Over a long session with many pane open/close cycles, this accumulates stale entries.

Impact: Memory leak in the Durable Object. Given the small size of each entry (2 numbers), this is low-severity but should be fixed.

Recommendation: Add this.paneSizes.delete(msg.paneId) in handlePaneClose.


4. Default Connection Status is Wrong (PaneLayout.tsx:82)

Severity: Low

const isConnected = paneConnected[window.key] ?? true;

Issue: When paneConnected[window.key] is undefined (e.g., race during initial state load), the default is true, meaning the pane appears connected even if the server hasn't sent status yet.

Impact: Brief visual inconsistency on load. If a pane is actually disconnected but the frontend hasn't received connected status yet, it won't show the overlay.

Recommendation: Consider defaulting to false or undefined and showing a loading state, or ensure the backend always includes all panes in the connected record (which it does via buildConnectedRecord, so this is mostly a defensive check).
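If a loading state is preferred, the lookup could distinguish three cases instead of collapsing undefined into true. This is only a sketch: PaneStatus and paneStatus are hypothetical names, and how PaneLayout would render "unknown" is left open.

```typescript
type PaneStatus = "connected" | "disconnected" | "unknown";

// Maps the server-provided record to an explicit status.
// A missing key means the server has not reported on this pane yet.
function paneStatus(
  paneConnected: Readonly<Record<string, boolean | undefined>>,
  key: string,
): PaneStatus {
  const value = paneConnected[key];
  if (value === undefined) return "unknown";
  return value ? "connected" : "disconnected";
}
```

The component could then show the overlay only for "disconnected" and a neutral placeholder, or nothing, for "unknown".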


Performance Considerations

5. Repeated broadcastLayoutUpdate Calls

Every disconnect/reconnect/error broadcasts a full layout update to all clients. For a session with many terminals disconnecting simultaneously (e.g., sandbox hibernation affecting multiple tabs), this could send many redundant broadcasts.

Recommendation: Consider debouncing broadcastLayoutUpdate or batching connection status changes.
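A rough sketch of the batching idea, under the assumption that broadcastLayoutUpdate takes no arguments and reads the current state when it runs. CoalescedBroadcaster, markDirty, and the injected schedule function are hypothetical. The scheduler is injected so the delay policy (a timer, a microtask) stays a caller decision.

```typescript
// Hypothetical: how a deferred flush gets scheduled.
type Schedule = (flush: () => void) => void;

class CoalescedBroadcaster {
  private pending = false;
  private readonly broadcast: () => void;
  private readonly schedule: Schedule;

  constructor(broadcast: () => void, schedule: Schedule) {
    this.broadcast = broadcast;
    this.schedule = schedule;
  }

  /** Request a broadcast; calls made before the flush collapse into one. */
  markDirty(): void {
    if (this.pending) return;
    this.pending = true;
    this.schedule(() => {
      // Reset first so changes made during the broadcast schedule a new flush.
      this.pending = false;
      this.broadcast();
    });
  }
}
```

With a scheduler such as (flush) => { setTimeout(flush, 50); }, several terminals dropping in the same window would produce one layout update instead of one per terminal. The 50 ms figure is an arbitrary placeholder.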


6. JSON.stringify for Focus Comparison (PaneLayout.tsx:81)

const isFocused = JSON.stringify(path) === JSON.stringify(focus);

This runs on every render for every pane. For large layouts, this is inefficient.

Recommendation: Use a proper deep equality check or a path string key. (Note: This is pre-existing code, not introduced in this PR, but worth flagging.)
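For illustration, a direct comparison could look like the following, assuming path and focus are flat arrays of numbers or strings. The actual types in PaneLayout.tsx are not shown in this PR, so Path and pathsEqual are hypothetical.

```typescript
// Assumed shape: a path is a flat list of child indices or keys.
type Path = ReadonlyArray<string | number>;

// Element-wise comparison without allocating intermediate strings.
function pathsEqual(a: Path, b: Path): boolean {
  if (a.length !== b.length) return false;
  return a.every((segment, index) => segment === b[index]);
}
```

Unlike the JSON.stringify version, this short-circuits on the first mismatch. If focus can be null or paths can nest, the sketch would need adjusting.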


Security

No security issues identified. Input validation is handled at the protocol layer (Effect Schema).


Recommendations Summary

  1. Fix race condition in reconnection check (critical)
  2. Clean up paneSizes on pane close (memory leak)
  3. Consider default connection status logic (minor UX)
  4. Add retry limit for reconnect (as noted in TASKS.md — good follow-up)

Overall, the implementation is solid. The lazy reconnect design is clean and the protocol changes are well-structured. Main concern is the race condition in the reconnection trigger path.

usirin and others added 6 commits February 15, 2026 17:54
…sions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@usirin usirin merged commit f867a5b into main Feb 16, 2026
5 checks passed