
OTA-1863: implement separate deployments with multi-layer autoscaling#1061

Open
fao89 wants to merge 1 commit into openshift:master from fao89:feature/separate-deployments-autoscaling-resilience

fao89 (Member) commented Feb 13, 2026

  • Split Cincinnati into independent graph-builder and policy-engine pods
  • Fix KEDA incident vulnerability by using base metrics instead of recording rules
  • Add HPA fallback autoscaling for resilience when KEDA is unavailable
  • Enable 10-15x faster recovery with optimized startup probes (5s vs 300s)
  • Switch from localhost to Kubernetes DNS service communication
  • Add comprehensive incident prevention alerts and monitoring
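
For readers less familiar with the HPA fallback mentioned in the list above, the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler can be sketched as follows. This is only an illustration of the general formula, not the configuration in this PR: the metric values, replica bounds, and the function name `desired_replicas` are all assumptions, and the 0.1 tolerance is the documented upstream default rather than something stated here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the documented HPA rule:
    desired = ceil(current_replicas * current_metric / target_metric).
    All inputs are hypothetical; the PR page does not show real targets.
    """
    ratio = current_metric / target_metric
    # Within the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Load is 1.5x the target, so 2 replicas become 3.
    print(desired_replicas(2, 90, 60, min_replicas=2, max_replicas=10))
```

The clamp at the end is what keeps a fallback autoscaler from scaling past its configured ceiling during a traffic spike.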

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Fabricio Aguiar <fabricio.aguiar@gmail.com>

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED

coderabbitai bot commented Feb 13, 2026

Walkthrough

The Justfile is modified to implement a multi-service readiness workflow that waits for two pods (graph-builder and policy-engine) to achieve Ready status instead of a single pod, and introduces internal and Kubernetes DNS connectivity tests between the services.
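
The two-pod readiness wait described in the walkthrough could look roughly like the sketch below. It is not the Justfile recipe from this PR: the label selectors, the namespace, and the choice of `kubectl` (rather than `oc`) are assumptions made for illustration, since the page does not show the actual recipe.

```python
import subprocess

# Hypothetical label selectors; the real ones are not shown on this page.
SELECTORS = {
    "graph-builder": "app=cincinnati-graph-builder",
    "policy-engine": "app=cincinnati-policy-engine",
}


def wait_command(selector, namespace, timeout_s=300):
    """Build a `kubectl wait` invocation that blocks until matching pods are Ready."""
    return [
        "kubectl", "wait", "--for=condition=Ready", "pod",
        "-l", selector, "-n", namespace, f"--timeout={timeout_s}s",
    ]


def wait_for_all(namespace, run=subprocess.run):
    """Wait for both services in turn; `run` is injectable so the flow can be dry-run."""
    for selector in SELECTORS.values():
        run(wait_command(selector, namespace), check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    wait_for_all("cincinnati-e2e", run=lambda cmd, check: print(" ".join(cmd)))
```

Waiting on each pod separately means a failure message names the service that never became Ready, which a single combined check would not.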

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Multi-service readiness flow (Justfile) | Replaces single Cincinnati pod readiness check with dual pod readiness checks for graph-builder and policy-engine. Adds pod name retrieval for both services and introduces internal connectivity test (localhost:8081 on policy-engine) and inter-service DNS test (cincinnati-graph-builder:8080 from policy-engine pod). Retains external route access test. |
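
To make the difference between the internal test and the inter-service DNS test concrete, here is a small sketch of how the two targets might be built. The host names and ports come from the summary above. The `http` scheme, the `cluster.local` cluster domain, and the helper names `service_url` and `connectivity_targets` are assumptions, not details taken from the PR.

```python
def service_url(service, port, namespace=None, cluster_domain="cluster.local"):
    """Return a base URL for a Kubernetes Service.

    With no namespace, the short name resolves only from pods in the same
    namespace; with one, the fully qualified form is used. The scheme and the
    cluster domain are assumed defaults.
    """
    if namespace is None:
        host = service
    else:
        host = f"{service}.{namespace}.svc.{cluster_domain}"
    return f"http://{host}:{port}"


def connectivity_targets(namespace=None):
    """Targets probed from inside the policy-engine pod, per the summary above."""
    return {
        "policy-engine (internal)": "http://localhost:8081",
        "graph-builder (via DNS)": service_url("cincinnati-graph-builder", 8080, namespace),
    }


if __name__ == "__main__":
    for name, url in connectivity_targets().items():
        print(f"{name}: {url}")
```

Once the two services run in separate pods, `localhost` reaches only the policy-engine itself, so the graph-builder has to be addressed through its Service name.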

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

No actionable comments were generated in the recent review. 🎉



openshift-ci bot (Contributor) commented Feb 13, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fao89


@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 13, 2026
@fao89 fao89 force-pushed the feature/separate-deployments-autoscaling-resilience branch from b811fac to 6fb76df Compare February 13, 2026 18:13

openshift-ci bot (Contributor) commented Feb 13, 2026

@fao89: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/cargo-test | 6fb76df | link | true | /test cargo-test |


@fao89 fao89 changed the title from "feat: implement separate deployments with multi-layer autoscaling" to "OTA-1863: implement separate deployments with multi-layer autoscaling" Feb 18, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 18, 2026

openshift-ci-robot commented Feb 18, 2026

@fao89: This pull request references OTA-1863 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

