Investigate issue #3302 driver behavior when upsd aborts #3368
jimklimov wants to merge 17 commits into networkupstools:master
❌ Build nut 2.8.4.4369-master failed (commit 1c5d56839b by @jimklimov)
Force-pushed from fd80697 to 97eca57.
✅ Build nut 2.8.4.4370-master completed (commit 0a7e64ee19 by @jimklimov)
…proctag() is called) [networkupstools#3302, networkupstools#3368] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…etworkupstools#3302, networkupstools#3368] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…rkupstools#3302, networkupstools#3368] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…gs [networkupstools#3302, networkupstools#3368] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…ally and flip to specified upsname later [networkupstools#3302, networkupstools#3368] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…sing setproctag() [networkupstools#3302, networkupstools#3368] Did not work for parallel scanning threads where it would be most useful, because they are in same process space... Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…pthreads so far [networkupstools#3302, networkupstools#3368] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
❌ Build nut 2.8.4.4371-master failed (commit 0f2f5925f5 by @jimklimov)
✅ Build nut 2.8.4.4373-master completed (commit dd1c3aa017 by @jimklimov)
NOTE: After #3363 it seems that
UPDATE: Older Windows builds did similarly (tested with 2.8.4.1572-1572+g69e282b3b+v2.8.5+rc5 and a small swarm of 50 drivers, to be under 64 connections):
Older Linux build (2.8.4.1541.9-1550+g7cd79ab73, with 3 dummy devices from NIT):
…proctag() is called) [networkupstools#3302, networkupstools#3368] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
Force-pushed from 5ca8690 to cf14d94.
…etworkupstools#3302, networkupstools#3368] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…rkupstools#3302, networkupstools#3368] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…gs [networkupstools#3302, networkupstools#3368] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
A ZIP file with standard source tarball and another tarball with pre-built docs for commit 5911559 is temporarily available: NUT-tarballs-PR-3368.zip.
Rebased after offloading relatively neutral but massive changes into master branch via PRs linked above.
✅ Build nut 2.8.4.4478-master completed (commit 23564cd710 by @jimklimov)
✅ Build nut 2.8.4.4479-master completed (commit 15dcee8e68 by @jimklimov)
✅ Build nut 2.8.4.4487-master completed (commit 5971c6bf53 by @jimklimov)
✅ Build nut 2.8.5.4494-master completed (commit 92e7c3b437 by @jimklimov)
✅ Build nut 2.8.5.4495-master completed (commit 5a4d3aeedc by @jimklimov)
Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…etworkupstools#3302] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…d) once [networkupstools#3302] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…once [networkupstools#3302] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…ools#3302, networkupstools#3368] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…rrno() not plain upslogx() [networkupstools#3302] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
… failed to open existing NAMED_PIPE and so move on [networkupstools#3302] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…-check before retry - POSIX part [networkupstools#3302] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…-check before retry - also for WIN32 [networkupstools#3302] Also revised WaitForSingleObject() result checking - there has to be a chance to succeed ;) Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…isconnect() [networkupstools#3368] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…upstools#3368] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…_disconnect() implementation [networkupstools#3302] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…iling CreateNamedPipe() [networkupstools#3302] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…ing error codes; document the methods [networkupstools#3302] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…) for faults in NSS setup [networkupstools#3379, networkupstools#3331] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…te() [networkupstools#3379, networkupstools#3331] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
Start by poking upsdrvctl for both WIN32 and POSIX builds...

Includes code from PR #3367 to try reproducing the issue.

UPDATE: Maybe specific to dummy-ups. Reproduced for standalone starts of the driver program directly, for one driver via upsdrvctl (note: the latter does not seem to propagate the exit code and returns 0, at least on Windows; it probably should indicate an error), and for a swarm of drivers via upsdrvctl (which also exits with code 0 even if all drivers died abruptly). Sometimes it took several starts of upsd to be killed a few seconds later.

In all these cases the final words were like:

upsd sometimes logs the clean-up:

The dummy-ups side seems to always end with the same entering parse_data_file() call (and exit code 127) after failing to write to the server:

I don't think I've reproduced nor ruled out the problem on non-Windows builds yet.
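
To illustrate the exit-code concern noted above (a launcher returning 0 even though its drivers died), here is a minimal sketch of how per-driver exit codes could be folded into a single launcher status. The helper name aggregate_exit_codes() is hypothetical and is not taken from the NUT code base; it only assumes the launcher has already collected each child's exit code into an array.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, NOT part of NUT: fold the exit codes of several
 * child driver processes into one exit code for the launcher.
 * Returns EXIT_SUCCESS only if every child exited with 0 (an empty
 * list also counts as success); otherwise returns EXIT_FAILURE. */
int aggregate_exit_codes(const int *codes, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (codes[i] != 0) {
            /* At least one driver failed, e.g. with exit code 127 */
            return EXIT_FAILURE;
        }
    }

    return EXIT_SUCCESS;
}
```

With something along these lines, a launcher could end with `return aggregate_exit_codes(codes, count);` so that a swarm in which any driver exited with 127 yields a non-zero status instead of 0.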
Per GDB and added debug-logging traces, it seems to crash around malloc() calls, whether in PCONF context init or in vupslog() a bit before it gets there: