Skip to content

Add webMethods IS lifecycle support#227

Open
aurbroszniowski wants to merge 9 commits intoTerracotta-OSS:masterfrom
aurbroszniowski:IS-support
Open

Add webMethods IS lifecycle support#227
aurbroszniowski wants to merge 9 commits intoTerracotta-OSS:masterfrom
aurbroszniowski:IS-support

Conversation

@aurbroszniowski
Copy link
Copy Markdown
Member

@aurbroszniowski aurbroszniowski commented Oct 14, 2025

This PR adds the support of IS
Added internal classes to handle the new functionality (WebMIsServerInstance, WebMIsInstall, WebMIsTopology, WebMIsServerState, WebMIsServer, WebMIsConfigurationContext, WebMIs)

WebMIs is the face of the API (equivalent of Tsa), with start(), stop(), install(), uninstall()

this is reflected along the core code, like AgentController, Agent

Distribution120Controller has been added to support the import-export tools (only available in 12.0)

The License has been updated to be able to contain Terracotta or IS license.

A FileTrailer class has been added to monitor the IS log file and know when IS is started/stopped, because unlike Terracotta the startup script is spawning the process and is not blocking.

Additionally,zt-process-killer has been removed because it uses WMIC on windows to kill processes, which is now not available by default. ProcessUtil.java is based now on ProcessHandle, with fallback code using OS specific process kill commands.

@aurbroszniowski aurbroszniowski force-pushed the IS-support branch 8 times, most recently from a57a21f to bb7a9d1 Compare October 16, 2025 09:15
@aurbroszniowski aurbroszniowski force-pushed the IS-support branch 5 times, most recently from 02e8b4a to d2ff169 Compare November 12, 2025 16:45
@aurbroszniowski aurbroszniowski force-pushed the IS-support branch 2 times, most recently from 719ddb4 to 450f696 Compare November 17, 2025 14:25
@mathieucarbou-ibm mathieucarbou-ibm self-requested a review November 21, 2025 12:51
}

public void setupLocalInstall(License license, String kitInstallationPath, boolean offline, TerracottaCommandLineEnvironment env) {
// TODO REFACTOR with below setupLocalInstall method according to parameter Distribution
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, what do you mean by that ? support kitInstallationPath ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean merge the two setupLocalInstall methods because one needs the kitInstallationPath, the other not, but part of their logic is similar. I made it as a TODO because the merge is not trivial and didn't want to introduce more changes in that PR but I would fix that for the next PR.


while (!Thread.currentThread().isInterrupted()) {
long fileLength = raf.length();
if (fileLength < filePointer) {
Copy link
Copy Markdown
Member

@mathieucarbou-ibm mathieucarbou-ibm Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when this can happen ?
because if it is log rotation, then the file descriptor is changing anyway right ? So the raf is not valid anymore ?
And if the length (or content) changes, should the filepointer be set to 0 instead to read from start ?

Copy link
Copy Markdown
Member Author

@aurbroszniowski aurbroszniowski Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • When fileLength < filePointer occurs: filePointer tracks how many bytes we’ve already streamed. If someone truncates the same file (e.g., uses : > logfile), the underlying file shrinks while our handle stays open, so the next raf.length() drops below the last pointer value. In that case, that guard rewinds the pointer to the new end so we don’t go past EOF.
    If rotation renames the file and a fresh log is created at the same path, the RandomAccessFile continues to follow the original renamed file because the descriptor stays bound
    to the old inode. You won’t see the new log until you reopen the file; this class doesn’t handle that case at all. So the if (fileLength < filePointer) block isn’t for rotation-by-rename; it’s only for truncation on the same file.
  • Should we reset to 0 instead? Not if the file was truncated in-place. Setting filePointer = fileLength (line 50) keeps us at the end, waiting for new data, which matches standard tail -f semantics after truncation.
    Resetting to zero would replay old content whenever someone truncates the log (or truncate-to-zero during rotation), leading to duplicate lines in the output. We’d only want to reset to zero if we explicitly reopen a new file instance.

if (fileLength > filePointer) {
raf.seek(filePointer);
String rawLine;
while ((rawLine = raf.readLine()) != null) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that when length > fp, a full line (or several complete lines) will be available to read ? I.e. if some lines are partially written to the file, what is happening ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if (fileLength > filePointer) guard only tells us that the file has grown.
RandomAccessFile.readLine() reads until it encounters \n, \r, or \r\n. If it hits EOF first but has already consumed some characters, it returns that partial string (only returns null when EOF is reached immediately without reading anything). So if a writer appends part of a log line but doesn’t flush a newline yet, this tailer will still emit that fragment, and when the writer later completes the line with more bytes + newline, the tailer will emit a second “line” for the remainder.

Comment on lines +93 to +101
while (System.nanoTime() < deadline) {
if (!handle.isAlive()) return true;
try {
Thread.sleep(50);
} catch (InterruptedException ie) {
Thread.currentThread().interrupt();
break;
}
}
Copy link
Copy Markdown
Member

@mathieucarbou-ibm mathieucarbou-ibm Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could just wait for the handle to complete with a timeout with
handle.onExit().get(timeout.toNanos(), TimeUnit.NANOS...)
(simply use the completable future and check its states)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right, it would probably work but is it worth the complexity ? Besides it would use extra resource on the ProcessMonitor while that's only a short-time window check

return !handle.isAlive();
}

private static int runCommand(long timeout, TimeUnit unit, String... cmd)
Copy link
Copy Markdown
Member

@mathieucarbou-ibm mathieucarbou-ibm Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am wondering about the reliability of the timeout here for the commands... These are all spawning processes and on test VM that are very heavily loaded, starting a new process might be slow. So I am wondering if we should not just wait without any timeout, just catching interrupt exception in case our thread gets interrupted by the test (and in this case we destroy the processes if we are interrupted)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the timeout would make the test vulnerable to a stuck OS command: a stalled taskkill or kill could block the entire test run. If the tiny helper processes can't finish quickly even on loaded machines, the VM config should better be reviewed, or if there are legitimate timeoutes, we could bump the timeout values.
I believe that the InterruptedException is already handled: if the calling thread is interrupted, we destroy the helper process (line 113) and propagate.

Copy link
Copy Markdown
Member

@mathieucarbou-ibm mathieucarbou-ibm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just have a few comments (see above)

@aurbroszniowski aurbroszniowski force-pushed the IS-support branch 4 times, most recently from 5c7bb55 to a8966b4 Compare December 1, 2025 10:15
@aurbroszniowski aurbroszniowski force-pushed the IS-support branch 3 times, most recently from b63d678 to 023041e Compare December 15, 2025 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants