Skip to content

Conversation

@jmcphers
Copy link
Collaborator

This change does not fix any known problems, but is intended to help us get better diagnostics on issues like #8000 wherein the supervisor appears to be timing out at startup. Specifically, it seems possible that the supervisor is actually exited in these cases or is taking extra long to start, so:

  • make the timeout configurable so that if there actually is a situation in which it takes a super long time for first boot (AV scan? insanely slow hardware? something else?) it is survivable
  • move the close listener handler to before we open the terminal, so that if the terminal closes immediately after starting we'll catch it
  • periodically while polling, check to see if the process is alive; it's possible it is exiting but the terminal isn't closing or the close listener isn't getting triggered

@jmcphers jmcphers requested a review from sharon-wang October 20, 2025 18:26
@github-actions
Copy link

github-actions bot commented Oct 20, 2025

E2E Tests 🚀
This PR will run tests tagged with: @:critical

readme  valid tags

Copy link
Member

@sharon-wang sharon-wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! One suggestion to change comments/log messages with "10 seconds" to instead refer to the startupTimeout setting


// ECONNREFUSED (for TCP) and ENOENT (for sockets) are normal
// conditions during startup; the server isn't ready yet. Keep
// trying up to 10 seconds from the time we got a process ID
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few places where "10 seconds" is referenced, including an error message Timed out waiting for connection file to be ` + `created at ${connectionFile} after 10 seconds Shall we change this to "startupTimeout seconds"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh thanks, I should have fixed those up! Addressed in 96ec61f.

// Consult configuration to see if we should show this terminal
const config = vscode.workspace.getConfiguration('kernelSupervisor');
const showTerminal = config.get<boolean>('showTerminal', false);
const startupTimeout = config.get<number>('startupTimeout', 10) * 1000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be useful to log these config settings to the output pane?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Done in 96ec61f.

@jmcphers jmcphers merged commit 9f589c5 into main Oct 22, 2025
12 checks passed
@jmcphers jmcphers deleted the feature/kallichore-startup-diagnose branch October 22, 2025 16:19
@github-actions github-actions bot locked and limited conversation to collaborators Oct 22, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants