Skip to content

Conversation

mysiki
Copy link

@mysiki mysiki commented Mar 28, 2025

Add support of VCluster (https://www.vcluster.com/) (resolve my issue : #415)

VCluster (VC) work on top of physical cluster with workload sync between virtual cluster and physical cluster. During the sync, VC rename the pod to avoid naming collision. VC only sync the pod, not the other stuff like RS, Deployment, ... The sync is done on the physical namespace where vcluster is present.

Summarize design :

Physical cluster Namespace Physical cluster pod name Virtual cluster Namespace Virtual cluster pod name
vcluster-foo myappfront-1234-x-myappns-x-vclusterfooname myappns myappfront-1234
vcluster-foo myappback-123-x-myappns-x-vclusterfooname myappns myappback-123
vcluster-foo otherapp-1234-x-otherappns-x-vclusterfooname otherappns otherapp-1234
vcluster-foo otherapp-4567-x-otherappns-x-vclusterfooname otherappns otherapp-4567
vcluster-foo otherappwithreallylongnammmmme-1234567891234567891234-x-othera-f5fef5e6fe otherappns otherappwithreallylongnammmmme-1234567891234567891234

The renaming is done following algo present in added function get_vcluster_pod_real_name.

To reduce pod number and cluster load, usually metrics are read read from physical cluster (avoid to have all exporter pod inside VC).

Because of VC only sync pod, KRR need to run connected to the VC (to detect DP, STS,..). But the pod name and pod namespace in prometheus are the one present in physical cluster.

This PR add 2 new arguments that are mandatory to calculate the pod name and pod namespace for prom query and function to use it.
I also add debug logger on prom query (can be remove if you think is useless :) )

Readme is update

Test done with success on my VC (with VC args) and without (like before this change).

@CLAassistant
Copy link

CLAassistant commented Mar 28, 2025

CLA assistant check
All committers have signed the CLA.

@mysiki mysiki force-pushed the feature/vcluster branch from b281597 to 8f5cbeb Compare March 28, 2025 15:51
@mysiki
Copy link
Author

mysiki commented May 21, 2025

some news ?

Use f-strings, rename variables, add doc strings
Copy link

coderabbitai bot commented Aug 12, 2025

Walkthrough

Introduces VCluster support across configuration, CLI, and Prometheus query builders. Adds helper methods for vcluster pod/namespace mapping, updates CPU/Memory loaders to use them and log PromQL, adds README documentation, and makes a trivial requirements.txt newline fix.

Changes

Cohort / File(s) Summary of changes
VCluster config & CLI
robusta_krr/core/models/config.py, robusta_krr/main.py
Added Config fields vcluster_name and vcluster_namespace. Exposed corresponding Typer CLI options and threaded them into Config.
Prometheus metric base helpers (VCluster)
robusta_krr/core/integrations/prometheus/metrics/base.py
Added hashlib import and two methods: get_vcluster_pod_real_name and get_pod_namespace to map vcluster pod/namespace to host-cluster equivalents.
CPU metrics: vcluster support + logging
robusta_krr/core/integrations/prometheus/metrics/cpu.py
Switched pod selectors to vcluster-aware names, computed namespace via helper, and logged constructed PromQL queries.
Memory metrics: vcluster support + logging
robusta_krr/core/integrations/prometheus/metrics/memory.py
Applied same vcluster-aware pod/namespace handling and PromQL logging across memory loaders; added module logger.
Docs updates
README.md
Added VCluster usage section (including sample command), plus minor whitespace/formatting tweaks.
Dependency housekeeping
requirements.txt
Added trailing newline; no dependency changes.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant CLI as KRR CLI
  participant Config
  participant Loader as Metric Loader
  participant Prom as Prometheus

  User->>CLI: run krr ... --vcluster-name --vcluster-namespace
  CLI->>Config: build Config(vcluster_name, vcluster_namespace, ...)
  CLI->>Loader: instantiate with settings
  Loader->>Loader: get_vcluster_pod_real_name()/get_pod_namespace()
  Loader->>Prom: query PromQL (vcluster-aware)
  Prom-->>Loader: timeseries
  Loader-->>CLI: results
  CLI-->>User: output
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~15–25 minutes

Suggested reviewers

  • aantn
  • arikalon1
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
README.md (2)

17-17: Minor: Fix redundant spacing in HTML tag.

There's an extra space in the paragraph alignment attribute.

-  <p align="center">
+  <p align="center">

408-422: Fix grammar and punctuation issues in VCluster documentation.

The documentation has a few style issues that should be addressed for clarity and consistency.

-KRR supports VCluster software when Prometheus is outside of the VCluster (on physical cluster or centralized). Because of VCluster pod renaming, you need to provide :
+KRR supports VCluster software when Prometheus is outside the VCluster (on physical cluster or centralized). Because of VCluster pod renaming, you need to provide:

-Other parameters like namespace selector, pod selector etc work as expected.
+Other parameters like namespace selector, pod selector, etc. work as expected.
robusta_krr/core/integrations/prometheus/metrics/base.py (1)

296-296: Fix typo in docstring.

There's a typo in the docstring.

-        string: the pod namepace in the host cluster.
+        string: the pod namespace in the host cluster.
robusta_krr/core/integrations/prometheus/metrics/memory.py (1)

82-82: Remove unnecessary trailing whitespace.

There's trailing whitespace on the line that should be removed for consistency.

-    
+
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 54ab6ef and 8222344.

📒 Files selected for processing (7)
  • README.md (3 hunks)
  • requirements.txt (1 hunks)
  • robusta_krr/core/integrations/prometheus/metrics/base.py (2 hunks)
  • robusta_krr/core/integrations/prometheus/metrics/cpu.py (6 hunks)
  • robusta_krr/core/integrations/prometheus/metrics/memory.py (8 hunks)
  • robusta_krr/core/models/config.py (1 hunks)
  • robusta_krr/main.py (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
robusta_krr/core/integrations/prometheus/metrics/cpu.py (1)
robusta_krr/core/integrations/prometheus/metrics/base.py (3)
  • get_vcluster_pod_real_name (265-285)
  • get_pod_namespace (287-302)
  • get_prometheus_cluster_label (80-89)
robusta_krr/core/integrations/prometheus/metrics/memory.py (3)
robusta_krr/core/integrations/prometheus/metrics/base.py (4)
  • get_vcluster_pod_real_name (265-285)
  • get_pod_namespace (287-302)
  • get_prometheus_cluster_label (80-89)
  • get_query (92-105)
robusta_krr/core/integrations/prometheus/metrics/cpu.py (3)
  • get_query (15-33)
  • get_query (45-66)
  • get_query (76-94)
robusta_krr/core/models/objects.py (1)
  • K8sObjectData (38-107)
🪛 LanguageTool
README.md

[style] ~411-~411: This phrase is redundant. Consider using “outside”.
Context: ...ts VCluster software when Prometheus is outside of the VCluster (on physical cluster or ce...

(OUTSIDE_OF)


[style] ~416-~416: In American English, abbreviations like “etc.” require a period.
Context: ...s like namespace selector, pod selector etc work as expected. ```sh krr simple --v...

(ETC_PERIOD)

🔇 Additional comments (12)
requirements.txt (1)

56-57: LGTM! File formatting improvement.

Adding the trailing newline follows standard conventions and improves file consistency. This is a cosmetic improvement with no impact on functionality.

robusta_krr/core/models/config.py (1)

85-88: LGTM! Configuration fields for VCluster support.

The new configuration fields are properly typed as optional strings with appropriate defaults. The placement after the internal _logging_console attribute follows the existing code organization pattern.

README.md (1)

408-422: No duplicate VCluster documentation found
The grep search only returns a single <details><summary>VCluster</summary> block in README.md (lines 408–422). There isn’t a second VCluster section to remove—please ignore this suggestion.

Likely an incorrect or invalid review comment.

robusta_krr/core/integrations/prometheus/metrics/base.py (3)

7-7: LGTM! Added hashlib import for VCluster pod name hashing.

The import is correctly added and will be used for the SHA-256 hashing functionality in the VCluster pod name generation.


287-302: LGTM! VCluster namespace mapping implementation.

The namespace mapping logic is straightforward and correct - it returns the host cluster namespace when in a VCluster context, otherwise returns the original namespace.


265-285: Review VCluster Pod Naming Algorithm Needs Manual Confirmation

I didn’t find any references or examples of the exact “host cluster” pod-name format in our repo or docs, so please:

  • Cross-check this implementation against VCluster’s official naming scheme (e.g. in the VCluster documentation or source).
  • Add a comment or link to the precise doc/source you used for reference.
  • Consider adding a small unit test covering:
    • A name under 63 chars (no hash), and
    • A name over 63 chars (truncation + 10-char SHA-256 suffix).

This will ensure our truncation logic (52 chars + “-” + first 10 chars of SHA-256) perfectly mirrors VCluster’s behavior.

robusta_krr/core/integrations/prometheus/metrics/memory.py (5)

4-6: LGTM! Added logging support for memory metrics.

The logging import and logger initialization follows the same pattern used in the CPU metrics file, providing consistency across the codebase.


15-30: LGTM! Consistent VCluster integration for MemoryLoader.

The implementation correctly follows the established pattern from CPU metrics:

  • Uses get_vcluster_pod_real_name() for pod selector construction
  • Uses get_pod_namespace() for namespace mapping
  • Introduces prom_query variable with debug logging
  • Maintains the same query structure with VCluster-aware parameters

38-56: LGTM! VCluster integration for MaxMemoryLoader.

The changes are consistent with the pattern established in other loaders, properly integrating VCluster support for maximum memory usage metrics.


63-82: LGTM! VCluster integration for MemoryAmountLoader.

The implementation is consistent with other memory loaders and properly handles VCluster pod/namespace mapping for memory data point counting.


91-122: LGTM! VCluster integration for MaxOOMKilledMemoryLoader.

The VCluster integration is properly implemented for the OOM-killed memory metrics. The complex query structure with joins is maintained while correctly using the VCluster-aware pod selectors and namespaces.

robusta_krr/main.py (1)

388-390: Config wiring for vcluster fields looks good

The new CLI options are correctly propagated into Config. Assuming Config accepts Optional[str] for these fields, this is aligned.

Comment on lines +16 to +33
pods_selector = "|".join(self.get_vcluster_pod_real_name(pod.name, object.namespace) for pod in object.pods)
pods_namespace = self.get_pod_namespace(object.namespace)
cluster_label = self.get_prometheus_cluster_label()
return f"""
prom_query = f"""
max(
rate(
container_cpu_usage_seconds_total{{
namespace="{object.namespace}",
pod=~"{pods_selector}",
container="{object.container}"
{cluster_label}
}}[{step}]
)
) by (container, pod, job)
"""
rate(
container_cpu_usage_seconds_total{{
namespace="{pods_namespace}",
pod=~"{pods_selector}",
container="{object.container}"
{cluster_label}
}}[{step}]
)
) by (container, pod, job)
"""
logger.debug(f"{prom_query}")

return prom_query
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Escape pod names for regex and anchor matches; minor logging nit

Unescaped pod names in a regex matcher can overmatch (e.g., dots in names). Anchor the regex to avoid partial matches and escape pod names. Also, avoid unnecessary f-string in logger.debug.

Apply this diff:

-from .base import PrometheusMetric, QueryType
+from .base import PrometheusMetric, QueryType
 import logging
+import re
@@
-        pods_selector = "|".join(self.get_vcluster_pod_real_name(pod.name, object.namespace) for pod in object.pods)
+        pods_selector = "|".join(
+            re.escape(self.get_vcluster_pod_real_name(pod.name, object.namespace))
+            for pod in object.pods
+        )
+        pods_selector = f"^({pods_selector})$" if pods_selector else r"^$"
@@
-                            pod=~"{pods_selector}",
+                            pod=~"{pods_selector}",
@@
-        logger.debug(f"{prom_query}")
+        logger.debug(prom_query)
@@
-            pods_selector = "|".join(self.get_vcluster_pod_real_name(pod.name, object.namespace) for pod in object.pods)
+            pods_selector = "|".join(
+                re.escape(self.get_vcluster_pod_real_name(pod.name, object.namespace))
+                for pod in object.pods
+            )
+            pods_selector = f"^({pods_selector})$" if pods_selector else r"^$"
@@
-                                pod=~"{pods_selector}",
+                                pod=~"{pods_selector}",
@@
-            logger.debug(f"{prom_query}")
+            logger.debug(prom_query)
@@
-        pods_selector = "|".join(self.get_vcluster_pod_real_name(pod.name, object.namespace) for pod in object.pods)
+        pods_selector = "|".join(
+            re.escape(self.get_vcluster_pod_real_name(pod.name, object.namespace))
+            for pod in object.pods
+        )
+        pods_selector = f"^({pods_selector})$" if pods_selector else r"^$"
@@
-                        pod=~"{pods_selector}",
+                        pod=~"{pods_selector}",
@@
-        logger.debug(f"{prom_query}")
+        logger.debug(prom_query)

Notes:

  • Using re.escape ensures literal matching of pod names in the regex.
  • Anchoring with ^(...)$ prevents partial matches.
  • When no pods exist, '^$' ensures no match rather than a potentially surprising catch-all.

Also applies to: 46-67, 77-95

Comment on lines +323 to +334
vcluster_namespace: str = typer.Option(
None,
"--vcluster-namespace",
help="The vcluster namespace on physical cluster",
rich_help_panel="VCluster Settings",
),
vcluster_name: str = typer.Option(
None,
"--vcluster-name",
help="The vcluster name on physical cluster",
rich_help_panel="VCluster Settings",
),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Make vcluster CLI options Optional[str] and clarify help about paired usage

Both options default to None but are annotated as str. Align to Optional[str] and clarify help that both must be provided together when using VCluster to avoid misconfiguration.

Apply this diff:

-                vcluster_namespace: str = typer.Option(
+                vcluster_namespace: Optional[str] = typer.Option(
                     None,
                     "--vcluster-namespace",
-                    help="The vcluster namespace on physical cluster",
+                    help="The vcluster namespace on the physical cluster (required when using --vcluster-name).",
                     rich_help_panel="VCluster Settings",
                 ),
-                vcluster_name: str = typer.Option(
+                vcluster_name: Optional[str] = typer.Option(
                     None,
                     "--vcluster-name",
-                    help="The vcluster name on physical cluster",
+                    help="The vcluster name on the physical cluster (required when using --vcluster-namespace).",
                     rich_help_panel="VCluster Settings",
                 ),

As a follow-up, add a paired-args validation right before constructing Config to prevent partial configuration:

# Enforce that either both vcluster_* are provided or neither
if bool(vcluster_name) ^ bool(vcluster_namespace):
    raise click.BadOptionUsage(
        "--vcluster-name/--vcluster-namespace",
        "Both --vcluster-name and --vcluster-namespace must be provided together when using VCluster."
    )
🤖 Prompt for AI Agents
robusta_krr/main.py around lines 323 to 334, the CLI options vcluster_namespace
and vcluster_name are typed as str but default to None and lack clear paired-use
help; change their type annotations to Optional[str] and update their help text
to state that both must be provided together when using VCluster, and add a
validation immediately before constructing Config that checks for a partial set
(XOR) of these two values and raises click.BadOptionUsage with a message that
both --vcluster-name and --vcluster-namespace must be provided together.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants