
Clients do not know STARTTLS setup actually failed until they send a query #3387

@jimklimov

This PR also introduces better NSS error reporting (older methods did not always work) and generally more legible logging messages in upsd:

...
Tue Mar 31 10:05:49 UTC 2026 [INFO] [testcase_sandbox_start_upsd_alone] Query listing from UPSD by UPSC (driver not running yet)
   3.083398     [D2:1188807:upsd] mainloop: polling returned 1 hits
   3.083456     [D3:1188807:upsd] mainloop: Incoming connection from SERVER [FD 3]
   3.083481     [D2:1188807:upsd] Connect from 127.0.0.1
...
   3.085990     [D2:1188807:upsd] mainloop: polling returned 1 hits
   3.086050     [D3:1188807:upsd] mainloop: Incoming data from CLIENT [127.0.0.1, FD 7]
   3.086072     [D6:1188807:upsd] Entering check_command: STARTTLS
   3.086116     [D6:1188807:upsd] check_command: Calling command handler for STARTTLS
   3.086228     [D2:1188807:upsd] write: [destfd=7] [len=12] [OK STARTTLS]
...
   3.119002     Client 127.0.0.1 did not provide any certificate while we require one.
   3.119050     [D1:1188495:upsd] nss_error -12285 (SSL_ERROR_NO_CERTIFICATE) in net_starttls / SSL_ForceHandshake : Unable to find the certificate or key necessary for authentication.
...
   3.156396     [D2:1188807:upsd] mainloop: polling returned 1 hits
   3.156411     [D3:1188807:upsd] mainloop: Disconnect CLIENT [127.0.0.1, FD 7] due to POLLHUP POLLERR: Success
   3.156422     [D2:1188807:upsd] Disconnect from 127.0.0.1

On the client side, however, the error is much less visible:

   0.000127     [D1:1188516:upsc] Starting NUT client: upsc
   0.000139     [D1:1188516:upsc:libupsclient] upscli_init_default_connect_timeout: upscli_default_connect_timeout=10.000000 sec assigned from: envvar_secs
   0.000155     [D1:1188516:upsc:localhost:12345] upsname='(null)' hostname='localhost' port='12345'
   0.000781     [D1:1188516:upsc:libupsclient:localhost:12345] NUT_QUIET_INIT_SSL='false' value was not recognized, ignored
   0.000839     Init SSL without certificate database
   0.002457     Connecting in SSL to 'localhost' (no certificate name specified)
   0.056755     Do not intend to authenticate server localhost
   0.056933     Self-certificate name not configured
   0.057112     SSL handshake done successfully with server localhost
   0.057123     Connected to NUT server localhost in SSL
   0.057126     Certificate verification (by client) is disabled
   0.057133     [D1:1188516:upsc:localhost:12345] Calling list_upses()
   0.057296     [D1:1188516:upsc:libupsclient:localhost:12345] upscli_disconnect: We logged out, and server did not reply in a short time frame
   0.057370     [D1:1188516:upsc:localhost:12345] list_upses: got code -1, upserror 37
   0.057378     Error: SSL error #-5978, message too short to be displayed
   0.057627     [D1:1188516:upsc:localhost:12345] clean_exit: finished, exiting
[FATAL] [testcase_sandbox_start_upsd_alone] upsd does not respond on port 12345 (1):
Tue Mar 31 10:01:46 UTC 2026 [INFO] Stopping test daemons
   3.129906     mainloop: Interrupted system call
   3.230119     Signal 15: exiting

I was under the impression that the server would tell the client what the trouble is (maybe with a plaintext ERR ... code?). I am not sure what #-5978 was; maybe a misinterpretation of plaintext bytes in what is supposed to be a TLS channel? Then again, with security involved, maybe even that would be too much, and dropping the connection is the better choice. This needs to be investigated later => #3331

Maybe we should accept the attempt with any cert or lack thereof, just to drop it gracefully?

Originally posted by @jimklimov in #3368 (comment)

The core issue is that the client sends "STARTTLS", the server replies "OK STARTTLS" after some basic checks, and the communication then continues in the encrypted channel (or its setup breaks off). There is no other plaintext or trust-anything channel to pass back errors about why the communication failed (e.g. "I do not trust your cert!").
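To illustrate where the last plaintext error opportunity sits, here is a minimal Python sketch of the client side of that exchange. It is not the libupsclient implementation: `request_starttls`, `StartTLSRefused`, and the `rfile`/`wfile` parameters are hypothetical names for this example, and the transport is assumed to be a pair of binary file-like objects over the still-unencrypted connection.

```python
import io


class StartTLSRefused(Exception):
    """The server did not answer STARTTLS with 'OK STARTTLS' (hypothetical type)."""


def request_starttls(rfile, wfile):
    """Send STARTTLS and interpret the single plaintext reply line.

    rfile/wfile are binary file-like objects wrapping the connection
    before any TLS handshake has started (an assumption of this sketch).
    """
    wfile.write(b"STARTTLS\n")
    wfile.flush()
    reply = rfile.readline().decode("ascii", errors="replace").strip()
    if reply == "OK STARTTLS":
        # From here on, every byte on the wire belongs to the TLS handshake.
        # The server has no plaintext way left to explain a later failure,
        # such as a missing or untrusted client certificate.
        return
    if reply.startswith("ERR"):
        # This is the only point where a readable reason can reach the client.
        raise StartTLSRefused(reply)
    raise StartTLSRefused("unexpected reply: %r" % reply)
```

In the failing run quoted above, the server had already answered "OK STARTTLS" by the time it rejected the client, so a function like this would have returned normally and the rejection could only show up later.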

Maybe some post-setup dialog should happen, so that we do not end up in a situation where e.g. upscli_sslinit succeeded and list_upses crashed. Instead, the incomplete dialog setup would be reported at the time it matters, as part of an upscli_sslinit that actually failed.
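One possible shape of such a post-setup dialog, sketched in Python under stated assumptions: after the TLS handshake reports success, the client sends one harmless query and only treats the session as established once any reply arrives. `confirm_tls_session`, `TLSSetupIncomplete`, and the `send_line`/`recv_line` callables are hypothetical names for this example. `VER` is an existing NUT protocol command, but using it as a probe here is an assumption of the sketch, not something upscli_sslinit is known to do.

```python
class TLSSetupIncomplete(Exception):
    """The encrypted channel broke before the first reply (hypothetical type)."""


def confirm_tls_session(send_line, recv_line, probe="VER"):
    """Round-trip one harmless query right after the TLS handshake.

    send_line(text) and recv_line() are hypothetical callables that write
    and read one protocol line over the already-wrapped connection.
    recv_line() is assumed to return an empty string on end of stream.
    """
    try:
        send_line(probe)
        reply = recv_line()
    except (OSError, EOFError) as exc:
        # ssl.SSLError and ConnectionResetError are both OSError subclasses,
        # so a torn-down channel is attributed to TLS setup, not to the query.
        raise TLSSetupIncomplete(
            "connection failed right after the TLS handshake") from exc
    if not reply:
        raise TLSSetupIncomplete(
            "server closed the connection right after the TLS handshake")
    # Any reply line, even an ERR one, proves the encrypted channel works.
    return reply
```

With a check like this inside the SSL initialization step, a server-side rejection would surface as a failed initialization rather than as an unrelated-looking error from the first real query. It costs one extra round trip per connection, and it does not tell the client why the server dropped it.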

I am not sure whether this needs an extension of the NUT network protocol or can be done within its confines (e.g. optional post-dialog support?).

Labels

SSL/NSS: Issues and PRs about SSL, TLS and other crypto-related matters
impacts-release-2.8.4: Issues reported against NUT release 2.8.4 (maybe vanilla or with minor packaging tweaks)
impacts-release-2.8.5: Issues reported against NUT release 2.8.5 (maybe vanilla or with minor packaging tweaks)
service/daemon start/stop: General subject for starting and stopping NUT daemons (drivers, server, monitor); also BG/FG/Debug
