Commit 637cf11

Fix Grafana dashboards for Broker with honorLabels, remove unnecessary *_created metrics and improve docs (#593)

* Drop _created metrics for broker and proxy
* Enable all metrics by default for broker
* change default dashboard
* Remove messy dashboards
* Enable default dashboards in Grafana
* Add testing values with more aggressive disk cleanup
* Add VictoriaMetrics debugging instructions
* Set honorLabels to true
* Document disabling monitoring
* Set password in testing values
* Fix linting issue detected by kubeconform

1 parent e6f0580 commit 637cf11

6 files changed: +243 -14 lines changed

README.md

Lines changed: 64 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -280,23 +280,53 @@ You can also checkout out the example values file for different deployments.
280280

281281
In order to disable the victoria-metrics-k8s-stack, you can add the following to your `values.yaml`.
282282
Victoria Metrics components can also be disabled and enabled individually if you only need specific monitoring features.
283-
Please refer to the default [`values.yaml`](charts/pulsar/values.yaml).
284283

285284
```yaml
285+
# disable VictoriaMetrics and related components
286286
victoria-metrics-k8s-stack:
287287
enabled: false
288288
victoria-metrics-operator:
289289
enabled: false
290+
vmsingle:
291+
enabled: false
292+
vmagent:
293+
enabled: false
290294
kube-state-metrics:
291295
enabled: false
292296
prometheus-node-exporter:
293297
enabled: false
294298
grafana:
295299
enabled: false
300+
```
301+
Additionally, you'll need to set each component's `podMonitor.enabled` property to `false`.
302+
303+
```yaml
304+
# disable pod monitors
305+
autorecovery:
306+
podMonitor:
307+
enabled: false
308+
bookkeeper:
309+
podMonitor:
310+
enabled: false
311+
oxia:
312+
server:
313+
podMonitor:
314+
enabled: false
315+
coordinator:
316+
podMonitor:
317+
enabled: false
318+
broker:
319+
podMonitor:
320+
enabled: false
321+
proxy:
322+
podMonitor:
323+
enabled: false
324+
zookeeper:
325+
podMonitor:
326+
enabled: false
296327
```
297328

298-
Additionally, you'll need to set each component's `podMonitor` property to `false`. This is shown in some [examples](./examples) and is
299-
verified in some [tests](./.ci/clusters).
329+
This is shown in [examples/values-disable-monitoring.yaml](examples/values-disable-monitoring.yaml).
300330

301331
## Pulsar Manager
302332

@@ -497,6 +527,36 @@ We've done our best to make these charts as seamless as possible,
497527
occasionally troubles do surface outside of our control. We've collected
498528
tips and tricks for troubleshooting common issues. Please examine these first before raising an [issue](https://github.com/apache/pulsar-helm-chart/issues/new/choose), and feel free to add to them by raising a [Pull Request](https://github.com/apache/pulsar-helm-chart/compare)!
499529

530+
### VictoriaMetrics Troubleshooting
531+
532+
The example commands use the Kubernetes namespace `pulsar`; replace it with your deployment's namespace.
533+
534+
#### VictoriaMetrics Web UI
535+
536+
Connect to the `vmsingle` pod to access the web UI.
537+
538+
```shell
539+
kubectl port-forward -n pulsar $(kubectl get pods -n pulsar -l app.kubernetes.io/name=vmsingle -o jsonpath='{.items[0].metadata.name}') 8429:8429
540+
```
541+
542+
Now you can access the UI at http://localhost:8429 and http://localhost:8429/vmui (a UI similar to the one in Prometheus).
543+
544+
#### VictoriaMetrics Scraping debugging UI - Active Targets
545+
546+
Connect to the `vmagent` pod to debug scrape targets.
547+
548+
```shell
549+
kubectl port-forward -n pulsar $(kubectl get pods -n pulsar -l app.kubernetes.io/name=vmagent -o jsonpath='{.items[0].metadata.name}') 8429:8429
550+
```
551+
552+
Now you can access the UI at http://localhost:8429
553+
554+
Active Targets UI
555+
- http://localhost:8429/targets
556+
557+
Scraping Configuration
558+
- http://localhost:8429/config
559+
500560
## Release Process
501561

502-
See [RELEASE.md](RELEASE.md)
562+
See [RELEASE.md](RELEASE.md)
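
For the VictoriaMetrics troubleshooting steps added to README.md above, a scripted health check can complement the Active Targets UI. The sketch below is only an illustration: `count_targets_by_health` is a hypothetical helper, and it assumes that `vmagent` also serves a Prometheus-style JSON response at `/api/v1/targets` containing `"health":"up"` or `"health":"down"` fields, which the README text itself does not state.

```shell
#!/bin/sh
# Sketch only: summarize scrape target health from a Prometheus-style
# targets response read on stdin.
# Assumption: the JSON carries "health":"<state>" fields per target.
count_targets_by_health() {
  # $1 is the health state to count, for example: up or down.
  # grep -o prints one line per match, so wc -l yields the match count.
  grep -o "\"health\": *\"$1\"" | wc -l | tr -d ' '
}
```

With the `vmagent` port-forward from the README running, one possible invocation is `curl -s http://localhost:8429/api/v1/targets | count_targets_by_health down`; a non-zero result would suggest checking the Active Targets page for details.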

charts/pulsar/templates/_monitor.tpl

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -55,6 +55,8 @@ spec:
5555
scheme: http
5656
interval: {{ $valuesPath.podMonitor.interval }}
5757
scrapeTimeout: {{ $valuesPath.podMonitor.scrapeTimeout }}
58+
# Set honor labels to true to allow overriding namespace label with Pulsar's namespace label
59+
honorLabels: true
5860
{{- if index $root.Values "victoria-metrics-k8s-stack" "enabled" }}
5961
relabelConfigs:
6062
{{- else }}
@@ -71,14 +73,26 @@ spec:
7173
- sourceLabels: [__meta_kubernetes_pod_name]
7274
action: replace
7375
targetLabel: kubernetes_pod_name
74-
{{- with $valuesPath.podMonitor.metricRelabelings }}
76+
{{- if or $valuesPath.podMonitor.metricRelabelings (and $valuesPath.podMonitor.dropUnderscoreCreatedMetrics (index $valuesPath.podMonitor.dropUnderscoreCreatedMetrics "enabled")) }}
7577
{{- if index $root.Values "victoria-metrics-k8s-stack" "enabled" }}
7678
metricRelabelConfigs:
7779
{{- else }}
7880
metricRelabelings:
7981
{{- end }}
82+
{{- if and $valuesPath.podMonitor.dropUnderscoreCreatedMetrics (index $valuesPath.podMonitor.dropUnderscoreCreatedMetrics "enabled") }}
83+
# Drop metrics that end with _created, auto-created by metrics library to match OpenMetrics format
84+
- sourceLabels: [__name__]
85+
{{- if and (hasKey $valuesPath.podMonitor.dropUnderscoreCreatedMetrics "excludePatterns") $valuesPath.podMonitor.dropUnderscoreCreatedMetrics.excludePatterns }}
86+
regex: "(?!{{ $valuesPath.podMonitor.dropUnderscoreCreatedMetrics.excludePatterns | join "|" }}).*_created$"
87+
{{- else }}
88+
regex: ".*_created$"
89+
{{- end }}
90+
action: drop
91+
{{- end }}
92+
{{- with $valuesPath.podMonitor.metricRelabelings }}
8093
{{ toYaml . | indent 8 }}
8194
{{- end }}
95+
{{- end }}
8296
selector:
8397
matchLabels:
8498
{{- include "pulsar.matchLabels" $root | nindent 6 }}
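
To make the drop rule in this template easier to follow, the sketch below mimics how the `regex` value is assembled from `excludePatterns`. `build_drop_regex` is a hypothetical helper written for illustration and is not part of the chart. Note that the exclusion form relies on a `(?!...)` lookahead; if the scraper compiles relabeling regexes with Go's `regexp` package, which does not support lookahead, that form may be rejected, so it is worth verifying before relying on `excludePatterns`.

```shell
#!/bin/sh
# Sketch of the regex string the template would render for the drop rule.
# build_drop_regex is a hypothetical helper, not part of the chart.
build_drop_regex() {
  if [ "$#" -eq 0 ]; then
    # No exclusions configured: drop every metric name ending in _created.
    printf '%s\n' '.*_created$'
  else
    # Join the exclusion patterns with "|", as the template's join "|" does.
    joined=$(IFS='|'; printf '%s' "$*")
    printf '(?!%s).*_created$\n' "$joined"
  fi
}
```

Calling `build_drop_regex pulsar_topic_load_times_created` prints the pattern that the template would produce for the commented-out broker example in `values.yaml`.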

charts/pulsar/templates/broker-configmap.yaml

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -76,7 +76,21 @@ data:
7676

7777
# Broker settings
7878
clusterName: {{ template "pulsar.cluster.name" . }}
79+
80+
# Enable all metrics by default
7981
exposeTopicLevelMetricsInPrometheus: "true"
82+
exposeConsumerLevelMetricsInPrometheus: "true"
83+
exposeProducerLevelMetricsInPrometheus: "true"
84+
exposeManagedLedgerMetricsInPrometheus: "true"
85+
exposeManagedCursorMetricsInPrometheus: "true"
86+
exposeBundlesMetricsInPrometheus: "true"
87+
exposePublisherStats: "true"
88+
exposePreciseBacklogInPrometheus: "true"
89+
replicationMetricsEnabled: "true"
90+
splitTopicAndPartitionLabelInPrometheus: "true"
91+
aggregatePublisherStatsByProducerName: "true"
92+
bookkeeperClientExposeStatsToPrometheus: "true"
93+
8094
numHttpServerThreads: "8"
8195
statusFilePath: "{{ template "pulsar.home" . }}/logs/status"
8296

charts/pulsar/values.yaml

Lines changed: 46 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -975,6 +975,16 @@ broker:
975975
enabled: true
976976
interval: 60s
977977
scrapeTimeout: 60s
978+
# Removes metrics that end with _created suffix
979+
# These metrics are automatically generated by the Prometheus client library to comply with OpenMetrics format
980+
# and aren't currently used. Disable this if you need to use these metrics or add an exclusion pattern when
981+
# a specific metric is needed.
982+
dropUnderscoreCreatedMetrics:
983+
enabled: true
984+
# Optional regex pattern to exclude specific metrics from being dropped
985+
# excludePatterns:
986+
# - pulsar_topic_load_times_created
987+
# Custom metric relabelings to apply to all metrics
978988
metricRelabelings:
979989
# - action: labeldrop
980990
# regex: cluster
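
As a quick illustration of what `dropUnderscoreCreatedMetrics.enabled: true` removes by default, the sketch below approximates the `.*_created$` rule with POSIX extended regular expressions. `would_drop` is a hypothetical helper, and the metric names in the usage note are sample inputs only. It assumes the relabeling regex has to match the whole metric name, which `grep -x` imitates here; the `excludePatterns` variant is not covered because POSIX ERE has no lookahead.

```shell
#!/bin/sh
# Sketch: approximate the default drop rule with POSIX ERE.
# Assumption: the relabeling regex must match the entire metric name,
# which the -x flag (whole-line match) imitates.
would_drop() {
  # $1 is a metric name; exit status 0 means the rule would drop it.
  printf '%s\n' "$1" | grep -Eqx '.*_created'
}
```

Under these assumptions, `would_drop pulsar_topic_load_times_created` succeeds, while `would_drop pulsar_topic_load_times_count` fails because the name does not end in `_created`.
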
@@ -1222,6 +1232,16 @@ proxy:
12221232
enabled: true
12231233
interval: 60s
12241234
scrapeTimeout: 60s
1235+
# Removes metrics that end with _created suffix
1236+
# These metrics are automatically generated by the Prometheus client library to comply with OpenMetrics format
1237+
# and aren't currently used. Disable this if you need to use these metrics or add an exclusion pattern when
1238+
# a specific metric is needed.
1239+
dropUnderscoreCreatedMetrics:
1240+
enabled: true
1241+
# Optional regex pattern to exclude specific metrics from being dropped
1242+
# excludePatterns:
1243+
# - pulsar_proxy_new_connections_created
1244+
# Custom metric relabelings to apply to all metrics
12251245
metricRelabelings:
12261246
# - action: labeldrop
12271247
# regex: cluster
@@ -1534,16 +1554,32 @@ victoria-metrics-k8s-stack:
15341554
sidecar:
15351555
dashboards:
15361556
enabled: false
1557+
# grafana.ini settings
1558+
grafana.ini:
1559+
analytics:
1560+
check_for_updates: false
1561+
dashboards:
1562+
default_home_dashboard_path: /var/lib/grafana/dashboards/pulsar/overview.json
1563+
15371564
## Configure Pulsar dashboards for Grafana
15381565
dashboardProviders:
15391566
dashboardproviders.yaml:
15401567
apiVersion: 1
15411568
providers:
1569+
- name: 'default'
1570+
orgId: 1
1571+
folder: ''
1572+
type: file
1573+
disableDeletion: true
1574+
editable: true
1575+
allowUiUpdates: true
1576+
options:
1577+
path: /var/lib/grafana/dashboards/default
15421578
- name: oxia
15431579
orgId: 1
15441580
folder: Oxia
15451581
type: file
1546-
disableDeletion: false
1582+
disableDeletion: true
15471583
editable: true
15481584
allowUiUpdates: true
15491585
options:
@@ -1552,12 +1588,20 @@ victoria-metrics-k8s-stack:
15521588
orgId: 1
15531589
folder: Pulsar
15541590
type: file
1555-
disableDeletion: false
1591+
disableDeletion: true
15561592
editable: true
15571593
allowUiUpdates: true
15581594
options:
15591595
path: /var/lib/grafana/dashboards/pulsar
15601596
dashboards:
1597+
default:
1598+
victoriametrics:
1599+
gnetId: 10229
1600+
revision: 38
1601+
datasource: VictoriaMetrics
1602+
kubernetes:
1603+
gnetId: 14205
1604+
datasource: VictoriaMetrics
15611605
oxia:
15621606
oxia-containers:
15631607
url: https://raw.githubusercontent.com/lhotari/pulsar-grafana-dashboards/master/oxia/oxia-containers.json
@@ -1576,10 +1620,6 @@ victoria-metrics-k8s-stack:
15761620
pulsar:
15771621
bookkeeper-compaction:
15781622
url: https://raw.githubusercontent.com/lhotari/pulsar-grafana-dashboards/master/pulsar/bookkeeper-compaction.json
1579-
bookkeeper-read-cache:
1580-
url: https://raw.githubusercontent.com/lhotari/pulsar-grafana-dashboards/master/pulsar/bookkeeper-read-cache.json
1581-
bookkeeper-read-use:
1582-
url: https://raw.githubusercontent.com/lhotari/pulsar-grafana-dashboards/master/pulsar/bookkeeper-read-use.json
15831623
bookkeeper:
15841624
url: https://raw.githubusercontent.com/lhotari/pulsar-grafana-dashboards/master/pulsar/bookkeeper.json
15851625
broker-cache-by-broker:
@@ -1612,13 +1652,10 @@ victoria-metrics-k8s-stack:
16121652
url: https://raw.githubusercontent.com/lhotari/pulsar-grafana-dashboards/master/pulsar/proxy.json
16131653
sockets:
16141654
url: https://raw.githubusercontent.com/lhotari/pulsar-grafana-dashboards/master/pulsar/sockets.json
1615-
tenant:
1616-
url: https://raw.githubusercontent.com/lhotari/pulsar-grafana-dashboards/master/pulsar/tenant.json
16171655
topic:
16181656
url: https://raw.githubusercontent.com/lhotari/pulsar-grafana-dashboards/master/pulsar/topic.json
16191657
zookeeper:
16201658
url: https://raw.githubusercontent.com/lhotari/pulsar-grafana-dashboards/master/pulsar/zookeeper.json
1621-
16221659
## Node exporter component
16231660
prometheus-node-exporter:
16241661
enabled: true

examples/values-disable-monitoring.yaml

Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,58 @@
1+
#
2+
# Licensed to the Apache Software Foundation (ASF) under one
3+
# or more contributor license agreements. See the NOTICE file
4+
# distributed with this work for additional information
5+
# regarding copyright ownership. The ASF licenses this file
6+
# to you under the Apache License, Version 2.0 (the
7+
# "License"); you may not use this file except in compliance
8+
# with the License. You may obtain a copy of the License at
9+
#
10+
# http://www.apache.org/licenses/LICENSE-2.0
11+
#
12+
# Unless required by applicable law or agreed to in writing,
13+
# software distributed under the License is distributed on an
14+
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
15+
# KIND, either express or implied. See the License for the
16+
# specific language governing permissions and limitations
17+
# under the License.
18+
#
19+
20+
# disable monitoring
21+
victoria-metrics-k8s-stack:
22+
enabled: false
23+
victoria-metrics-operator:
24+
enabled: false
25+
vmsingle:
26+
enabled: false
27+
vmagent:
28+
enabled: false
29+
kube-state-metrics:
30+
enabled: false
31+
prometheus-node-exporter:
32+
enabled: false
33+
grafana:
34+
enabled: false
35+
36+
# disable pod monitors
37+
autorecovery:
38+
podMonitor:
39+
enabled: false
40+
bookkeeper:
41+
podMonitor:
42+
enabled: false
43+
oxia:
44+
server:
45+
podMonitor:
46+
enabled: false
47+
coordinator:
48+
podMonitor:
49+
enabled: false
50+
broker:
51+
podMonitor:
52+
enabled: false
53+
proxy:
54+
podMonitor:
55+
enabled: false
56+
zookeeper:
57+
podMonitor:
58+
enabled: false
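
To confirm that a values file like the one above really removes all scrape resources, the rendered manifests can be inspected before installing. The sketch below is a hedged example: `count_monitor_kinds` is a hypothetical helper, and it assumes the chart renders pod monitors as `PodMonitor` or `VMPodScrape` resources, since the resource kinds are not shown in this diff.

```shell
#!/bin/sh
# Sketch only: count monitoring scrape resources in rendered manifests.
# Assumption: pod monitors render as "kind: PodMonitor" or
# "kind: VMPodScrape"; these kinds do not appear in this diff.
count_monitor_kinds() {
  # Reads rendered YAML on stdin and prints the number of matching lines.
  # "|| true" keeps the exit status at 0 when grep finds no matches.
  grep -Ec '^kind:[[:space:]]*(PodMonitor|VMPodScrape)[[:space:]]*$' || true
}
```

Assuming the command is run from the repository root with chart dependencies already fetched, and using `pulsar` as an arbitrary release name, `helm template pulsar charts/pulsar -f examples/values-disable-monitoring.yaml | count_monitor_kinds` should print `0` if every pod monitor is disabled.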

examples/values-testing.yaml

Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
1+
#
2+
# Licensed to the Apache Software Foundation (ASF) under one
3+
# or more contributor license agreements. See the NOTICE file
4+
# distributed with this work for additional information
5+
# regarding copyright ownership. The ASF licenses this file
6+
# to you under the Apache License, Version 2.0 (the
7+
# "License"); you may not use this file except in compliance
8+
# with the License. You may obtain a copy of the License at
9+
#
10+
# http://www.apache.org/licenses/LICENSE-2.0
11+
#
12+
# Unless required by applicable law or agreed to in writing,
13+
# software distributed under the License is distributed on an
14+
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
15+
# KIND, either express or implied. See the License for the
16+
# specific language governing permissions and limitations
17+
# under the License.
18+
#
19+
20+
# disabled AntiAffinity
21+
affinity:
22+
anti_affinity: false
23+
24+
victoria-metrics-k8s-stack:
25+
grafana:
26+
adminPassword: verysecureword123
27+
28+
bookkeeper:
29+
configData:
30+
# more aggressive disk cleanup
31+
journalMaxSizeMB: "256"
32+
majorCompactionInterval: "600"
33+
minorCompactionInterval: "300"
34+
compactionRateByEntries: "5000"
35+
gcWaitTime: "60000"
36+
37+
broker:
38+
configData:
39+
# more aggressive disk cleanup
40+
managedLedgerMinLedgerRolloverTimeMinutes: "1"
41+
managedLedgerMaxLedgerRolloverTimeMinutes: "5"
42+
# configure deletion of inactive topics
43+
brokerDeleteInactiveTopicsMaxInactiveDurationSeconds: "86400"
44+
45+
proxy:
46+
replicaCount: 1
