[PULSE-223] feat: Implement metrics collection for Plane Enterprise #174
base: master
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,165 @@ | ||
| {{- if .Values.metrics.enabled }} | ||
|
Contributor
Remove trailing blank line at end of file. Line 165 (EOF) has unnecessary blank lines. YAMLlint reports: "too many blank lines (1 > 0)". Apply this fix:
-{{- end }}
-
+{{- end }}
Tools: YAMLlint (1.37.1) [error] 1-1: syntax error: expected the node content, but found '-' (syntax) |
||
| apiVersion: v1 | ||
| kind: ConfigMap | ||
| metadata: | ||
| namespace: {{ .Release.Namespace }} | ||
| name: {{ .Release.Name }}-metrics-config | ||
| labels: | ||
| app.kubernetes.io/name: plane-enterprise | ||
| app.kubernetes.io/instance: {{ .Release.Name }} | ||
| app.kubernetes.io/component: metrics-agent | ||
| data: | ||
| agent-config.yaml: | | ||
| # OpenTelemetry Agent Configuration for Plane Enterprise | ||
| # Collects Kubernetes metrics and Plane API metrics | ||
| receivers: | ||
| # Kubernetes metrics collection | ||
| prometheus: | ||
| config: | ||
| global: | ||
| scrape_interval: {{ .Values.metrics.agent.scrape_interval | default "300s" }} | ||
| evaluation_interval: {{ .Values.metrics.agent.scrape_interval | default "300s" }} | ||
| external_labels: | ||
| # Multi-tenant identification | ||
| plane_installation_uuid: "{{ include "plane.metrics.installationUUID" . }}" | ||
| plane_installation_type: "{{ .Values.metrics.installation.type | default "kubernetes" }}" | ||
| plane_version: "{{ .Values.metrics.installation.plane_version | default .Values.planeVersion }}" | ||
| cluster_name: "{{ .Values.metrics.installation.cluster_name | default "default" }}" | ||
| license_domain: "{{ .Values.license.licenseDomain }}" | ||
|
Contributor
🧩 Analysis chain
Script executed:
#!/bin/bash
# First, examine the metrics-config.yaml file to understand full context
cat -n charts/plane-enterprise/templates/config-secrets/metrics-config.yaml
Repository: makeplane/helm-charts
<shell_result>
stdout:
</shell_result>
Script executed:
#!/bin/bash
# Search for privacy, compliance, and data handling documentation
find . -type f \( -name "README*" -o -name "PRIVACY*" -o -name "COMPLIANCE*" -o -name "*.md" \) | head -20
rg -i "privacy|gdpr|ccpa|pii|sensitive|data.*retention" --type markdown --type txt -A2 -B2
Repository: makeplane/helm-charts
<shell_result>
stdout:
</shell_result>
Script executed:
#!/bin/bash
# Check README for metrics telemetry documentation
cat charts/plane-enterprise/README.md | head -200
# Also search for license_domain references
echo "=== Searching for license_domain references ==="
rg -n "license_domain|licenseDomain" -B2 -A2
Repository: makeplane/helm-charts
<shell_result>
stdout:
</shell_result>
Script executed:
#!/bin/bash
# Search for any issues, comments, or discussions about metrics privacy/PII
rg -n "TODO|FIXME|XXX|HACK" charts/plane-enterprise/templates/config-secrets/metrics-config.yaml
# Search for any configuration that would allow disabling specific labels
rg -n "metrics.*label|external.*label|telemetry.*privacy" --type yaml -A2 -B2
# Check if there's any documentation about metrics telemetry
find . -type f -name "*.md" -exec grep -l "metrics\|telemetry" {} \;
Repository: makeplane/helm-charts
<shell_result>
stdout:
</shell_result>
Confirm license domain is exported to external telemetry without privacy controls or documentation. The concern is valid.
The label should be:
|
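One possible privacy control for the concern above, sketched under assumptions: emit the `license_domain` label only when the operator opts in. The flag `metrics.installation.exportLicenseDomain` is hypothetical and does not exist in the chart as submitted; the other lines mirror the `external_labels` block in the diff.

```yaml
external_labels:
  plane_installation_uuid: "{{ include "plane.metrics.installationUUID" . }}"
  {{- /* Hypothetical opt-in flag; not part of the chart under review. */}}
  {{- if .Values.metrics.installation.exportLicenseDomain }}
  license_domain: "{{ .Values.license.licenseDomain }}"
  {{- end }}
```

With the flag unset, the label is omitted from exported metrics. Documenting the flag in `values.yaml` would also address the missing-documentation part of the comment.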
||
| scrape_configs: | ||
| # cAdvisor metrics (container resource usage for this namespace only) | ||
| - job_name: 'kubernetes-cadvisor' | ||
| scheme: https | ||
| tls_config: | ||
| ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt | ||
| insecure_skip_verify: true | ||
| bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token | ||
| kubernetes_sd_configs: | ||
| - role: node | ||
| relabel_configs: | ||
| - action: labelmap | ||
| regex: __meta_kubernetes_node_label_(.+) | ||
| - target_label: __address__ | ||
| replacement: kubernetes.default.svc:443 | ||
| - source_labels: [__meta_kubernetes_node_name] | ||
| regex: (.+) | ||
| target_label: __metrics_path__ | ||
| replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor | ||
| # Filter metrics to only include containers in our namespace | ||
| metric_relabel_configs: | ||
| - source_labels: [namespace] | ||
| action: keep | ||
| regex: {{ .Release.Namespace }} | ||
| - source_labels: [namespace] | ||
| action: drop | ||
| regex: ^$ | ||
| # Plane API /metrics endpoint - scrape service | ||
| - job_name: 'plane-api' | ||
| kubernetes_sd_configs: | ||
| - role: service | ||
| namespaces: | ||
| names: | ||
| - {{ .Release.Namespace }} | ||
| relabel_configs: | ||
| # Only scrape the API service | ||
| - source_labels: [__meta_kubernetes_service_name] | ||
| action: keep | ||
| regex: {{ .Release.Name }}-api | ||
| # Set metrics path to /metrics | ||
| - target_label: __metrics_path__ | ||
| replacement: /metrics | ||
| # Set port to 8000 (API service port) | ||
| - source_labels: [__address__] | ||
| target_label: __address__ | ||
| regex: ([^:]+)(?::\d+)? | ||
| replacement: $1:8000 | ||
| - action: labelmap | ||
| regex: __meta_kubernetes_service_label_(.+) | ||
| - source_labels: [__meta_kubernetes_namespace] | ||
| action: replace | ||
| target_label: kubernetes_namespace | ||
| - source_labels: [__meta_kubernetes_service_name] | ||
| action: replace | ||
| target_label: kubernetes_service_name | ||
| processors: | ||
| batch: | ||
| timeout: {{ .Values.metrics.agent.batch.timeout | default "60s" }} | ||
| send_batch_size: {{ .Values.metrics.agent.batch.send_batch_size | default 4096 }} | ||
| memory_limiter: | ||
| limit_mib: {{ regexReplaceAll "Mi|Gi" (.Values.metrics.agent.memoryLimit | default "256Mi") "" | int }} | ||
| spike_limit_mib: {{ div (regexReplaceAll "Mi|Gi" (.Values.metrics.agent.memoryLimit | default "256Mi") "") 4 | int }} | ||
| check_interval: 0.1s | ||
| # Resource processing to add tenant identification | ||
| resource: | ||
| attributes: | ||
| - key: plane.installation.uuid | ||
| value: "{{ include "plane.metrics.installationUUID" . }}" | ||
| action: upsert | ||
| - key: plane.installation.type | ||
| value: "{{ .Values.metrics.installation.type | default "kubernetes" }}" | ||
| action: upsert | ||
| - key: plane.installation.cluster | ||
| value: "{{ .Values.metrics.installation.cluster_name | default "default" }}" | ||
| action: upsert | ||
| exporters: | ||
| # Export to remote telemetry server | ||
| {{- if .Values.metrics.telemetry.http_endpoint }} | ||
| {{- $cleanEndpoint := .Values.metrics.telemetry.http_endpoint | trimPrefix "https://" | trimPrefix "http://" | trimSuffix "/v1/traces" | trimSuffix "/v1/metrics" }} | ||
| {{- $endpointWithPort := $cleanEndpoint }} | ||
| {{- if not (contains ":" $cleanEndpoint) }} | ||
| {{- $endpointWithPort = printf "%s:443" $cleanEndpoint }} | ||
| {{- end }} | ||
| otlp/http: | ||
| endpoint: {{ $endpointWithPort }} | ||
| compression: gzip | ||
| tls: | ||
| # TODO: Add production TLS configuration | ||
| insecure: true | ||
| retry_on_failure: | ||
| enabled: true | ||
| initial_interval: 5s | ||
| max_interval: 30s | ||
| max_elapsed_time: 300s | ||
| sending_queue: | ||
| enabled: true | ||
| num_consumers: 4 | ||
| queue_size: 100 | ||
| headers: | ||
| {{- range $key, $value := .Values.metrics.telemetry.headers }} | ||
| {{ $key }}: {{ $value | quote }} | ||
| {{- end }} | ||
| {{- end }} | ||
| {{- if .Values.metrics.agent.debug.enabled }} | ||
| # Debug logging (development only) | ||
| logging: | ||
| loglevel: debug | ||
| sampling_initial: 5 | ||
| sampling_thereafter: 200 | ||
| {{- end }} | ||
| extensions: | ||
| health_check: | ||
| endpoint: 0.0.0.0:13133 | ||
| pprof: | ||
| endpoint: 0.0.0.0:1777 | ||
| service: | ||
| extensions: [health_check, pprof] | ||
| pipelines: | ||
| metrics: | ||
| receivers: [prometheus] | ||
| processors: [memory_limiter, resource, batch] | ||
| exporters: [ | ||
| {{- if .Values.metrics.telemetry.http_endpoint }}otlp/http{{- if .Values.metrics.agent.debug.enabled }},{{- end }}{{- end }} | ||
| {{- if .Values.metrics.agent.debug.enabled }}logging{{- end }} | ||
| ] | ||
| {{- end }} | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,90 @@ | ||
| {{- if .Values.metrics.enabled }} | ||
| apiVersion: apps/v1 | ||
| kind: Deployment | ||
| metadata: | ||
| namespace: {{ .Release.Namespace }} | ||
| name: {{ .Release.Name }}-metrics-agent | ||
| labels: | ||
| app.kubernetes.io/name: plane-enterprise | ||
| app.kubernetes.io/instance: {{ .Release.Name }} | ||
| app.kubernetes.io/component: metrics-agent | ||
| spec: | ||
| replicas: {{ .Values.metrics.agent.replicas | default 1 }} | ||
| selector: | ||
| matchLabels: | ||
| app.kubernetes.io/name: plane-enterprise | ||
| app.kubernetes.io/instance: {{ .Release.Name }} | ||
| app.kubernetes.io/component: metrics-agent | ||
| template: | ||
| metadata: | ||
| labels: | ||
| app.kubernetes.io/name: plane-enterprise | ||
| app.kubernetes.io/instance: {{ .Release.Name }} | ||
| app.kubernetes.io/component: metrics-agent | ||
| annotations: | ||
| checksum/config: {{ include (print $.Template.BasePath "/config-secrets/metrics-config.yaml") . | sha256sum }} | ||
| spec: | ||
| serviceAccountName: {{ .Release.Name }}-srv-account | ||
| containers: | ||
| - name: otel-agent | ||
| image: {{ .Values.metrics.agent.image }}:{{ .Values.metrics.agent.tag }} | ||
| imagePullPolicy: {{ .Values.metrics.agent.imagePullPolicy | default "IfNotPresent" }} | ||
| args: | ||
| - --config=/etc/otel-agent/agent-config.yaml | ||
| ports: | ||
| - name: metrics | ||
| containerPort: 8888 | ||
| protocol: TCP | ||
| - name: health | ||
| containerPort: 13133 | ||
| protocol: TCP | ||
| livenessProbe: | ||
| httpGet: | ||
| path: / | ||
| port: health | ||
| initialDelaySeconds: 30 | ||
| periodSeconds: 30 | ||
| timeoutSeconds: 5 | ||
| failureThreshold: 5 | ||
| readinessProbe: | ||
| httpGet: | ||
| path: / | ||
| port: health | ||
| initialDelaySeconds: 10 | ||
| periodSeconds: 10 | ||
| timeoutSeconds: 3 | ||
| failureThreshold: 3 | ||
| resources: | ||
| limits: | ||
| memory: {{ .Values.metrics.agent.memoryLimit | default "256Mi" }} | ||
| cpu: {{ .Values.metrics.agent.cpuLimit | default "100m" }} | ||
| requests: | ||
| memory: {{ div (regexReplaceAll "Mi|Gi" (.Values.metrics.agent.memoryLimit | default "256Mi") "") 2 | printf "%dMi" }} | ||
| cpu: {{ div (regexReplaceAll "m" (.Values.metrics.agent.cpuLimit | default "100m") "") 2 | printf "%dm" }} | ||
| volumeMounts: | ||
| - name: agent-config | ||
| mountPath: /etc/otel-agent | ||
| readOnly: true | ||
| env: | ||
| - name: PLANE_INSTALLATION_UUID | ||
| value: {{ include "plane.metrics.installationUUID" . | quote }} | ||
| - name: PLANE_INSTALLATION_TYPE | ||
| value: {{ .Values.metrics.installation.type | default "kubernetes" | quote }} | ||
| - name: KUBERNETES_NODE_NAME | ||
| valueFrom: | ||
| fieldRef: | ||
| fieldPath: spec.nodeName | ||
| - name: KUBERNETES_POD_NAME | ||
| valueFrom: | ||
| fieldRef: | ||
| fieldPath: metadata.name | ||
| - name: KUBERNETES_NAMESPACE | ||
| valueFrom: | ||
| fieldRef: | ||
| fieldPath: metadata.namespace | ||
| volumes: | ||
| - name: agent-config | ||
| configMap: | ||
| name: {{ .Release.Name }}-metrics-config | ||
| {{- end }} | ||
|
|
`plane.metrics.installationUUID` generates inconsistent UUIDs across uses when empty
Each call to `include "plane.metrics.installationUUID"` executes `uuidv4` independently when `.Values.metrics.installation.uuid` is empty. Within a single release, this produces three different UUIDs for:
- `PLANE_INSTALLATION_UUID` environment variable (metrics-agent pod)
- `plane_installation_uuid` Prometheus external label (metrics-config)
- `plane.installation.uuid` OTel resource attribute (metrics-config)
This breaks the concept of a single stable installation identifier and makes telemetry correlation impossible. The `values.yaml` comment "Auto-generated if empty" is misleading.
Recommended fix:
- Require `.Values.metrics.installation.uuid` (update comment and defaults), or
- Use `lookup` so the UUID persists across template renders and upgrades.
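A minimal sketch of the `lookup` option, under stated assumptions: the UUID is generated in exactly one template and stored in a small ConfigMap, so later renders read the stored value back instead of calling `uuidv4` again. The ConfigMap name `<release>-metrics-installation` and its `uuid` key are hypothetical and not part of the chart under review.

```yaml
{{- /*
Sketch only. The ConfigMap name and the "uuid" key are hypothetical.
*/ -}}
{{- if .Values.metrics.enabled }}
{{- $name := printf "%s-metrics-installation" .Release.Name }}
{{- /* Prefer an explicitly configured UUID. */}}
{{- $uuid := .Values.metrics.installation.uuid }}
{{- /* Otherwise reuse the value stored by a previous install or upgrade. */}}
{{- $existing := lookup "v1" "ConfigMap" .Release.Namespace $name }}
{{- if and (not $uuid) $existing $existing.data }}
{{- $uuid = index $existing.data "uuid" }}
{{- end }}
{{- /* First install: generate once, in this template only. */}}
{{- if not $uuid }}
{{- $uuid = uuidv4 }}
{{- end }}
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: {{ .Release.Namespace }}
  name: {{ $name }}
data:
  uuid: {{ $uuid | quote }}
{{- end }}
```

Other templates would then read this one value rather than calling the helper: the Deployment could populate `PLANE_INSTALLATION_UUID` through a `configMapKeyRef`, and the collector config could reference it with the collector's `${env:PLANE_INSTALLATION_UUID}` substitution. One caveat: `lookup` returns an empty result under `helm template` and client-side dry runs, so those renders still produce a fresh UUID.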
.Values.metrics.installation.uuid(update comment and defaults), orlookupso the UUID persists across template renders and upgrades.