docs/profiling.md: 3 additions & 0 deletions
@@ -363,6 +363,9 @@ The following options are available for GPU profiling:
*`gpu_pm_sample_interval_us`: Sets the sampling interval in microseconds
for CUPTI PM sampling. Defaults to `500`.
*`gpu_pm_sample_buffer_size_per_gpu_mb`: Sets the system memory buffer size per device in MB for CUPTI PM sampling. Defaults to 64MB. The maximum supported value is 4GB.
+*`gpu_num_chips_to_profile_per_task`: Specifies the number of GPU devices to
+profile per task. If set to 0, all available GPUs will be profiled. This can
+be used to decrease the trace collection size.
*`gpu_dump_graph_node_mapping`: If enabled, dumps CUDA graph node
mapping information into the trace. Defaults to `False`.
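
As a rough illustration of how the options in this hunk fit together, the sketch below gathers them into a plain dictionary and checks the limits the text states. `build_gpu_profile_options` and `MAX_PM_BUFFER_MB` are hypothetical names, not part of the profiler. The sketch also assumes that `0` is a reasonable default for `gpu_num_chips_to_profile_per_task` (the diff does not state one) and that "4GB" means 4096 MB. How the dictionary is handed to the profiler is not shown in this excerpt, so that step is left out.

```python
# Hypothetical helper, not part of the profiler API: collects the GPU
# profiling options described above and validates the documented limits.

# Assumption: the documented "4GB" maximum is interpreted as 4 * 1024 MB.
MAX_PM_BUFFER_MB = 4 * 1024


def build_gpu_profile_options(
    num_chips_per_task=0,           # 0 means "profile all available GPUs"
    pm_sample_interval_us=500,      # documented default
    pm_buffer_mb=64,                # documented default
    dump_graph_node_mapping=False,  # documented default
):
    """Return a dict keyed by the option names used in the docs."""
    if num_chips_per_task < 0:
        raise ValueError("gpu_num_chips_to_profile_per_task must be >= 0")
    if pm_sample_interval_us <= 0:
        raise ValueError("gpu_pm_sample_interval_us must be positive")
    if not 0 < pm_buffer_mb <= MAX_PM_BUFFER_MB:
        raise ValueError(
            "gpu_pm_sample_buffer_size_per_gpu_mb must be in (0, 4096]"
        )
    return {
        "gpu_num_chips_to_profile_per_task": num_chips_per_task,
        "gpu_pm_sample_interval_us": pm_sample_interval_us,
        "gpu_pm_sample_buffer_size_per_gpu_mb": pm_buffer_mb,
        "gpu_dump_graph_node_mapping": dump_graph_node_mapping,
    }


# Profile only two GPUs per task to keep the collected trace smaller.
options = build_gpu_profile_options(num_chips_per_task=2)
print(options)
```

Restricting the device count this way is the use case the new option targets: fewer profiled GPUs per task means less trace data to collect.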