content/en/gpu_monitoring/setup.md (5 additions & 5 deletions)
@@ -8,15 +8,15 @@ This page provides instructions on setting up Datadog's GPU Monitoring on your i
 To begin using Datadog's GPU Monitoring, your environment must meet the following criteria:

-- You are a Datadog user with active Datadog infrastructure hosts. If not, see the following guides:
+- You have a Datadog account and are running the Datadog Agent on your infrastructure hosts. If not, see the following guides:
   - [Install the Datadog Agent on Kubernetes][1]
   - [Install the Datadog Agent on Docker][7]
   - [Install the Datadog Agent on non-containerized Linux][8]
 - The NVIDIA drivers are installed on the hosts. If using Kubernetes, the NVIDIA device plugin for Kubernetes is installed ([directly][3], or through [NVIDIA GPU Operator][4])

 #### Minimum version requirements

-- **Datadog Agent**: version 7.72.2
+- **Datadog Agent**: v7.72.2
 - **Operating system**: Linux
   - (Optional) For advanced eBPF metrics, Linux kernel version 5.8
 - **NVIDIA driver**: version 450.51
@@ -26,10 +26,10 @@ If using Kubernetes, the following requirements must be met:
 - [**Datadog Operator**][5]: version 1.18, _or_ [**Datadog Helm chart**][6]: version 3.137.3
 - **Kubernetes**: 1.22 with PodResources API active

-## Set up GPU Monitoring on a uniform Kubernetes cluster or non-Kubernetes environment
+## Set up GPU Monitoring on a uniform Kubernetes or non-Kubernetes environment

 The following instructions are the basic steps to set up GPU Monitoring in the following environments:
-- In a Kubernetes cluster where **all** the nodes have GPU devices
+- In a Kubernetes cluster where **all** nodes have GPU devices
 - In a non-Kubernetes environment, such as Docker or non-containerized Linux.

 {{< tabs >}}
@@ -260,7 +260,7 @@ Additionally, to enable advanced eBPF-based metrics such as GPU core utilization
 ## Set up GPU Monitoring on a mixed Kubernetes cluster

-In a mixed cluster, some nodes have GPU devices while other nodes do not. Due to the way that the NVIDIA device plugin for Kubernetes works, the agent needs to have a specific runtime class to access the GPU devices that is only present on GPU nodes. Therefore, two separate DaemonSets, one for the runtime class in GPU nodes, and another for non-GPU nodes, are required.
+In a mixed cluster, some nodes have GPU devices while other nodes do not. Due to the way that the NVIDIA device plugin for Kubernetes works, the Agent needs to have a specific runtime class to access the GPU devices that is only present on GPU nodes. Therefore, two separate DaemonSets, one for the runtime class in GPU nodes, and another for non-GPU nodes, are required.

 The recommended method to set up the Agent in this case is using the Datadog Operator, version 1.20 or greater, which provides features to make this setup easier. However, for compatibility reasons instructions are also provided for Helm installations or for older versions of the Datadog Operator.
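
Not part of this change, but as a rough aid for the minimum version requirements listed in the first hunk, the comparison can be sketched in Python. The version numbers are taken from the page. The component keys, `parse_version`, and `meets_minimum` are hypothetical names chosen for illustration, and the sketch assumes the installed version strings have already been collected by some other means.

```python
import re

# Minimum versions copied from the requirements list in the diff.
# The dictionary keys are illustrative labels, not Datadog identifiers.
MINIMUMS = {
    "datadog_agent": "7.72.2",
    "linux_kernel_ebpf": "5.8",
    "nvidia_driver": "450.51",
    "kubernetes": "1.22",
}


def parse_version(text):
    """Parse the leading dotted number of a version string into a tuple of ints.

    Accepts forms such as '7.72.2', 'v1.22', or '5.15.0-91-generic'.
    """
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(component, installed):
    """Return True if the installed version is at least the listed minimum."""
    minimum = parse_version(MINIMUMS[component])
    found = parse_version(installed)
    # Pad with zeros so that '5.8' and '5.8.0' compare as equal.
    width = max(len(minimum), len(found))
    minimum += (0,) * (width - len(minimum))
    found += (0,) * (width - len(found))
    return found >= minimum


if __name__ == "__main__":
    print(meets_minimum("datadog_agent", "7.72.2"))
    print(meets_minimum("nvidia_driver", "450.36.06"))
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where `"7.9"` sorts after `"7.72"` lexicographically.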