Add webMethods IS lifecycle support#227
Add webMethods IS lifecycle support#227aurbroszniowski wants to merge 9 commits intoTerracotta-OSS:masterfrom
Conversation
a57a21f to
bb7a9d1
Compare
02e8b4a to
d2ff169
Compare
719ddb4 to
450f696
Compare
client-internal/src/main/java/org/terracotta/angela/client/WebMIs.java
Outdated
Show resolved
Hide resolved
common/src/main/java/org/terracotta/angela/common/distribution/Distribution43Controller.java
Outdated
Show resolved
Hide resolved
common/src/main/java/org/terracotta/angela/common/distribution/DistributionController.java
Outdated
Show resolved
Hide resolved
common/src/main/java/org/terracotta/angela/common/distribution/Distribution120Controller.java
Outdated
Show resolved
Hide resolved
common/src/main/java/org/terracotta/angela/common/distribution/Distribution107Controller.java
Outdated
Show resolved
Hide resolved
common/src/main/java/org/terracotta/angela/common/distribution/Distribution107Controller.java
Show resolved
Hide resolved
common/src/main/java/org/terracotta/angela/common/distribution/Distribution107Controller.java
Outdated
Show resolved
Hide resolved
common/src/main/java/org/terracotta/angela/common/util/Pids.java
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| public void setupLocalInstall(License license, String kitInstallationPath, boolean offline, TerracottaCommandLineEnvironment env) { | ||
| // TODO REFACTOR with below setupLocalInstall method according to parameter Distribution |
There was a problem hiding this comment.
Hi, what do you mean by that ? support kitInstallationPath ?
There was a problem hiding this comment.
I mean merge the two setupLocalInstall methods because one needs the kitInstallationPath, the other not, but part of their logic is similar. I made it as a TODO because the merge is not trivial and didn't want to introduce more changes in that PR but I would fix that for the next PR.
common/src/main/java/org/terracotta/angela/common/distribution/Distribution.java
Outdated
Show resolved
Hide resolved
|
|
||
| while (!Thread.currentThread().isInterrupted()) { | ||
| long fileLength = raf.length(); | ||
| if (fileLength < filePointer) { |
There was a problem hiding this comment.
when this can happen ?
because if it is log rotation, then the file descriptor is changing anyway right ? So the raf is not valid anymore ?
And if the length (or content) changes, should the filepointer be set to 0 instead to read from start ?
There was a problem hiding this comment.
- When fileLength < filePointer occurs: filePointer tracks how many bytes we’ve already streamed. If someone truncates the same file (e.g., uses : > logfile), the underlying file shrinks while our handle stays open, so the next raf.length() drops below the last pointer value. In that case, that guard rewinds the pointer to the new end so we don’t go past EOF.
If rotation renames the file and a fresh log is created at the same path, the RandomAccessFile continues to follow the original renamed file because the descriptor stays bound
to the old inode. You won’t see the new log until you reopen the file; this class doesn’t handle that case at all. So the if (fileLength < filePointer) block isn’t for rotation-by-rename; it’s only for truncation on the same file. - Should we reset to 0 instead? Not if the file was truncated in-place. Setting filePointer = fileLength (line 50) keeps us at the end, waiting for new data, which matches standard tail -f semantics after truncation.
Resetting to zero would replay old content whenever someone truncates the log (or truncate-to-zero during rotation), leading to duplicate lines in the output. We’d only want to reset to zero if we explicitly reopen a new file instance.
| if (fileLength > filePointer) { | ||
| raf.seek(filePointer); | ||
| String rawLine; | ||
| while ((rawLine = raf.readLine()) != null) { |
There was a problem hiding this comment.
Are we sure that when length > fp, a full line (or several complete lines) will be available to read ? I.e. if some lines are partially written to the file, what is happening ?
There was a problem hiding this comment.
The if (fileLength > filePointer) guard only tells us that the file has grown.
RandomAccessFile.readLine() reads until it encounters \n, \r, or \r\n. If it hits EOF first but has already consumed some characters, it returns that partial string (only returns null when EOF is reached immediately without reading anything). So if a writer appends part of a log line but doesn’t flush a newline yet, this tailer will still emit that fragment, and when the writer later completes the line with more bytes + newline, the tailer will emit a second “line” for the remainder.
| while (System.nanoTime() < deadline) { | ||
| if (!handle.isAlive()) return true; | ||
| try { | ||
| Thread.sleep(50); | ||
| } catch (InterruptedException ie) { | ||
| Thread.currentThread().interrupt(); | ||
| break; | ||
| } | ||
| } |
There was a problem hiding this comment.
I think you could just wait for the handle to complete with a timeout with
handle.onExit().get(timeout.toNanos(), TimeUnit.NANOS...)
(simply use the completable future and check its states)
There was a problem hiding this comment.
you're right, it would probably work but is it worth the complexity ? Besides it would use extra resource on the ProcessMonitor while that's only a short-time window check
| return !handle.isAlive(); | ||
| } | ||
|
|
||
| private static int runCommand(long timeout, TimeUnit unit, String... cmd) |
There was a problem hiding this comment.
i am wondering about the reliability of the timeout here for the commands... These are all spawning processes and on test VM that are very heavily loaded, starting a new process might be slow. So I am wondering if we should not just wait without any timeout, just catching interrupt exception in case our thread gets interrupted by the test (and in this case we destroy the processes if we are interrupted)
There was a problem hiding this comment.
Removing the timeout would make the test vulnerable to a stuck OS command: a stalled taskkill or kill could block the entire test run. If the tiny helper processes can't finish quickly even on loaded machines, the VM config should better be reviewed, or if there are legitimate timeoutes, we could bump the timeout values.
I believe that the InterruptedException is already handled: if the calling thread is interrupted, we destroy the helper process (line 113) and propagate.
mathieucarbou-ibm
left a comment
There was a problem hiding this comment.
I just have a few comments (see above)
5c7bb55 to
a8966b4
Compare
b63d678 to
023041e
Compare
023041e to
d1aeb05
Compare
d1aeb05 to
5b88152
Compare
This PR adds the support of IS
Added internal classes to handle the new functionality (WebMIsServerInstance, WebMIsInstall, WebMIsTopology, WebMIsServerState, WebMIsServer, WebMIsConfigurationContext, WebMIs)
WebMIs is the face of the API (equivalent of Tsa), with start(), stop(), install(), uninstall()
this is reflected along the core code, like AgentController, Agent
Distribution120Controller has been added to support the import-export tools (only available in 12.0)
The License has been updated to be able to contain Terracotta or IS license.
A FileTrailer class has been added to monitor the IS log file and know when IS is started/stopped, because unlike Terracotta the startup script is spawning the process and is not blocking.
Additionally,zt-process-killer has been removed because it uses WMIC on windows to kill processes, which is now not available by default. ProcessUtil.java is based now on ProcessHandle, with fallback code using OS specific process kill commands.