8 changes: 7 additions & 1 deletion docs/profiling.md
@@ -362,7 +362,13 @@ The following options are available for GPU profiling:
[NVIDIA's CUPTI documentation](https://docs.nvidia.com/cupti/main/main.html#metrics-table).
* `gpu_pm_sample_interval_us`: Sets the sampling interval in microseconds
for CUPTI PM sampling. Defaults to `500`.
-* `gpu_pm_sample_buffer_size_per_gpu_mb`: Sets the system memory buffer size per device in MB for CUPTI PM sampling. Defaults to 64MB. The maximum supported value is 4GB.
+* `gpu_pm_sample_buffer_size_per_gpu_mb`: Sets the system memory buffer size
+per device in MB for CUPTI PM sampling. Defaults to 64MB. The maximum
+supported value is 4GB.
+* `gpu_num_chips_to_profile_per_task`: Specifies the number of GPU devices to
+profile per task. If not specified, set to 0, or set to an invalid value,
+all available GPUs will be profiled. This can be used to decrease the trace
+collection size.
* `gpu_dump_graph_node_mapping`: If enabled, dumps CUDA graph node
mapping information into the trace. Defaults to `False`.
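
As an illustration of how the options in this hunk fit together, here is a minimal sketch that collects them into a plain dictionary. The helper `build_gpu_profile_options` and the constant `MAX_PM_BUFFER_MB` are hypothetical names introduced for this example, and reading "4GB" as 4096 MB is an assumption. How the dictionary reaches the profiler (for example through an advanced-configuration mapping on the profiler options object) is also assumed rather than shown.

```python
# Hypothetical helper: gathers the GPU profiling options described above
# into a dict. Only the option names and defaults come from the docs.

MAX_PM_BUFFER_MB = 4 * 1024  # assumption: the documented 4GB limit read as 4096 MB


def build_gpu_profile_options(
    pm_sample_interval_us=500,      # documented default
    pm_buffer_mb=64,                # documented default (64MB per device)
    num_chips=0,                    # 0 (or unset/invalid) means "profile all GPUs"
    dump_graph_node_mapping=False,  # documented default
):
    """Return a dict of GPU profiling options after basic range checks."""
    if pm_sample_interval_us <= 0:
        raise ValueError("gpu_pm_sample_interval_us must be positive")
    if not 0 < pm_buffer_mb <= MAX_PM_BUFFER_MB:
        raise ValueError(
            f"gpu_pm_sample_buffer_size_per_gpu_mb must be in (0, {MAX_PM_BUFFER_MB}]"
        )
    return {
        "gpu_pm_sample_interval_us": pm_sample_interval_us,
        "gpu_pm_sample_buffer_size_per_gpu_mb": pm_buffer_mb,
        "gpu_num_chips_to_profile_per_task": num_chips,
        "gpu_dump_graph_node_mapping": dump_graph_node_mapping,
    }


# Profile only two GPUs per task with a larger PM sampling buffer.
options = build_gpu_profile_options(pm_buffer_mb=128, num_chips=2)
```

Limiting `gpu_num_chips_to_profile_per_task` is the option aimed at reducing trace size. The range check on the buffer size is a convenience of this sketch, not documented profiler behavior.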
