
Embed an HTTP Server in ci-operator#4958

Open
danilo-gemoli wants to merge 1 commit into openshift:main from
danilo-gemoli:feat/ci-operator/embed-http-server

Conversation


@danilo-gemoli danilo-gemoli commented Feb 19, 2026

Run an HTTP server in ci-operator that will host routes for the lease proxy server, see #4877.
No routes are added yet, but the http.ServeMux is added to Config so that it can be used by stepLeaseProxyServer.

Summary by CodeRabbit

  • New Features

    • HTTP server lifecycle management is now integrated into the CI operator, enabling concurrent server operation alongside primary operator functions with enhanced configuration propagation to downstream components.
  • Improvements

    • Enhanced runtime error handling with aggregation of multiple errors across various processing stages, enabling better error visibility and comprehensive diagnostic information reporting.

@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode


openshift-ci bot commented Feb 19, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danilo-gemoli

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 19, 2026

coderabbitai bot commented Feb 19, 2026

Walkthrough

Adds an HTTP server lifecycle to the operator Run flow (startup, mux injection into Config, and graceful shutdown), changes Run to return an aggregated error slice, and updates configuration and podspec to expose/use the HTTP server IP environment variable constant.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Operator Run & HTTP server: cmd/ci-operator/main.go | Introduces HTTP server startup tied to Run's context, injects srvMux into cfg.HTTPServerMux, defers server close with aggregated error capture, changes Run() signature to func (o *options) Run() (errs []error), and replaces many early returns with appends to errs. |
| Config struct: pkg/defaults/config.go | Adds net/http import and a new exported field HTTPServerMux *http.ServeMux to Config for downstream components to register handlers. |
| API constants: pkg/api/constant.go | Adds exported constant CIOperatorHTTPServerIPEnvVarName = "HTTP_SERVER_IP" alongside existing port constant. |
| Podspec env usage: pkg/prowgen/podspec.go | Replaces literal "HTTP_SERVER_IP" env var name with the new api.CIOperatorHTTPServerIPEnvVarName constant in the podspec generation helper. |

Sequence Diagram(s)

sequenceDiagram
    participant Run as "Operator Run"
    participant HTTP as "HTTP Server\n(http.Server + ServeMux)"
    participant Components as "Downstream Components\n(config consumers)"
    participant Cleanup as "Cleanup/Shutdown"

    Run->>HTTP: create ServeMux, construct http.Server with BaseContext(ctx)
    Run->>Components: set cfg.HTTPServerMux = srvMux
    Run->>HTTP: start server (goroutine) -> ListenAndServe on IP:Port
    Components->>HTTP: register handlers on ServeMux
    HTTP-->>Run: unexpected server error -> cancel main context / append to errs
    Run->>Cleanup: defer server.Close(), aggregate close errors into errs
    Run->>Run: execute initialization, graph, steps -> append errors to errs
    Run->>Cleanup: return accumulated errs

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Test Structure And Quality | ❓ Inconclusive | PR modifies core ci-operator functionality including HTTP server lifecycle and Run() signature changes, but test file modifications cannot be confirmed from available git information. | Verify whether cmd/ci-operator/main_test.go and related test files were modified to cover new HTTP server functionality and changed Run() return signature. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Embed an HTTP Server in ci-operator' clearly and concisely summarizes the main change: adding HTTP server functionality to the ci-operator component. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Stable And Deterministic Test Names | ✅ Passed | PR modifies only source files with no test files added or modified, so no test names to evaluate for stability. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@danilo-gemoli

@coderabbitai review


coderabbitai bot commented Feb 19, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/ci-operator/main.go`:
- Around line 1076-1093: The code is checking the wrong error variable (errs)
after nodes.TopologicalSort() and calculateGraph(); change those checks to
inspect sortErrs and calculateGraphErrs respectively: after calling
nodes.TopologicalSort() check if sortErrs != nil, append a descriptive error
(e.g. results.ForReason("building_graph").ForError(errors.New("could not sort
nodes"))) and then append sortErrs into errs and return; similarly after
calculateGraph(stepList) check if calculateGraphErrs != nil, append
calculateGraphErrs into errs and return. Update the error-append logic around
stepList/sortErrs and graph/calculateGraphErrs so graph errors are not silently
ignored.
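
The finding above boils down to checking the error value a call just returned rather than the accumulated `errs` slice. A minimal sketch of the corrected pattern, where `topologicalSort` and `buildGraph` are hypothetical stand-ins for the real `nodes.TopologicalSort()` call site:

```go
package main

import (
	"errors"
	"fmt"
)

// topologicalSort is a hypothetical stand-in that, like the real call,
// returns its own error slice alongside the sorted steps.
func topologicalSort(fail bool) ([]string, []error) {
	if fail {
		return nil, []error{errors.New("cycle detected")}
	}
	return []string{"src", "test"}, nil
}

// buildGraph checks sortErrs, the result of the call it just made.
// Checking the accumulated errs here instead would let a failed sort
// go unnoticed whenever errs was still empty.
func buildGraph(fail bool) (steps []string, errs []error) {
	stepList, sortErrs := topologicalSort(fail)
	if sortErrs != nil {
		errs = append(errs, errors.New("could not sort nodes"))
		errs = append(errs, sortErrs...)
		return nil, errs
	}
	return stepList, nil
}

func main() {
	_, errs := buildGraph(true)
	for _, err := range errs {
		fmt.Println(err)
	}
}
```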


@coderabbitai coderabbitai bot left a comment


🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@cmd/ci-operator/main.go`:
- Around line 1076-1093: The code is checking the wrong error variable (errs)
after TopologicalSort and calculateGraph, so sortErrs and calculateGraphErrs are
ignored; update the TopologicalSort handling to check if sortErrs != nil (append
results.ForReason("building_graph").ForError(errors.New("could not sort nodes"))
plus sortErrs to errs and return) and similarly check if calculateGraphErrs !=
nil (append calculateGraphErrs to errs and return) after calling
calculateGraph(stepList), referencing variables stepList, sortErrs,
calculateGraphErrs, nodes.TopologicalSort(), and calculateGraph() to locate
where to change.

@danilo-gemoli danilo-gemoli force-pushed the feat/ci-operator/embed-http-server branch from a9eefa0 to c552404 Compare February 19, 2026 13:38

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
cmd/ci-operator/main.go (1)

988-996: Consider using srv.Shutdown() for graceful HTTP server shutdown.

srv.Close() immediately closes all active connections without waiting for in-flight requests to complete. If the lease proxy server (mentioned in PR objectives) has requests in progress, they will be abruptly terminated.

Using srv.Shutdown(ctx) with a timeout context would allow active requests to complete gracefully before closing.

♻️ Suggested change for graceful shutdown
 defer func() {
 	logrus.Infof("Ran for %s", time.Since(start).Truncate(time.Second))
 	o.metricsAgent.Stop()
 	if srv != nil {
-		if err := srv.Close(); err != nil {
+		shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 5*time.Second)
+		defer shutdownCancel()
+		if err := srv.Shutdown(shutdownCtx); err != nil {
 			errs = append(errs, fmt.Errorf("close http server: %w", err))
 		}
 	}
 }()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/ci-operator/main.go` around lines 988 - 996, The defer currently calls
srv.Close() which force-closes active connections; replace this with a graceful
shutdown using srv.Shutdown(ctx): create a timeout context (e.g.,
context.WithTimeout) before calling srv.Shutdown(ctx), call
o.metricsAgent.Stop() after Shutdown completes, and append any Shutdown error to
errs (similar to the existing Close error handling). Ensure you cancel the
context and handle/format the error as fmt.Errorf("shutdown http server: %w",
err) so in-flight requests are given time to finish.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@cmd/ci-operator/main.go`:
- Around line 988-996: The defer currently calls srv.Close() which force-closes
active connections; replace this with a graceful shutdown using
srv.Shutdown(ctx): create a timeout context (e.g., context.WithTimeout) before
calling srv.Shutdown(ctx), call o.metricsAgent.Stop() after Shutdown completes,
and append any Shutdown error to errs (similar to the existing Close error
handling). Ensure you cancel the context and handle/format the error as
fmt.Errorf("shutdown http server: %w", err) so in-flight requests are given time
to finish.


openshift-ci bot commented Feb 19, 2026

@danilo-gemoli: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/images | c552404 | link | true | /test images |
| ci/prow/breaking-changes | c552404 | link | false | /test breaking-changes |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.


2 participants