PytatoPyOpenCLArrayContext: add support for kernel profiling#311
PytatoPyOpenCLArrayContext: add support for kernel profiling#311
Conversation
5f2d45d to
1ce6dce
Compare
1ce6dce to
2c2e431
Compare
There was a problem hiding this comment.
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
arraycontext/impl/pytato/compile.py:643
- Add tests that verify kernel profiling support works as expected when profile_kernels is enabled.
if self.actx.profile_kernels:
arraycontext/impl/pytato/init.py:299
- Add tests for the new profiling methods (e.g., _wait_and_transfer_profile_events, get_profiling_data_for_kernel, reset_profiling_data_for_kernel, tabulate_profiling_data) to ensure that profiling data is collected and summarized correctly.
self.profile_kernels = profile_kernels
alexfikl
left a comment
There was a problem hiding this comment.
Got excited about the profiling so I skimmed a bit and left some comments, hope that's ok!
arraycontext/impl/pytato/__init__.py
Outdated
| """Reset profiling data for kernel `kernel_name`.""" | ||
| self.profile_results.pop(kernel_name, None) | ||
|
|
||
| def tabulate_profiling_data(self) -> pytools.Table: |
There was a problem hiding this comment.
This looks more like a helper function than a class method?
There was a problem hiding this comment.
Dunno - this is currently the main way of "getting" the complete profiling data.
There was a problem hiding this comment.
Isn't the main way of getting it actx.profile_results and/or actx.get_profile_for_kernel? This is just a handy way to display it (probably one of many).
There was a problem hiding this comment.
Sure, it's just a convenience function for the user. We have the same method for eager profiling: https://github.com/illinois-ceesd/mirgecom/blob/41c6e567d44258557d95ef8453c3b82c4c89ab55/mirgecom/profiling.py#L216
There was a problem hiding this comment.
I agree with @alexfikl that this should be a function, not a method.
Co-authored-by: Alexandru Fikl <alexfikl@gmail.com>
arraycontext/impl/pytato/__init__.py
Outdated
| """Reset profiling data for kernel `kernel_name`.""" | ||
| self.profile_results.pop(kernel_name, None) | ||
|
|
||
| def tabulate_profiling_data(self) -> pytools.Table: |
There was a problem hiding this comment.
I agree with @alexfikl that this should be a function, not a method.
arraycontext/impl/pytato/__init__.py
Outdated
| """Holds a profile event that has not been collected by the profiler yet.""" | ||
|
|
||
| cl_event: cl._cl.Event | ||
| translation_unit: Any |
There was a problem hiding this comment.
Do we want the translation_unit in here? I'm not sold.
There was a problem hiding this comment.
Changed it to just the name of the translation unit in 7216ccf
arraycontext/impl/pytato/__init__.py
Outdated
| allocator=self.allocator, | ||
| **bound_arguments) | ||
|
|
||
| self.add_profiling_event(evt, pt_prg) |
There was a problem hiding this comment.
Only call the function if profiling is enabled, to avoid incurring the function call penalty. (Also below.)
arraycontext/impl/pytato/__init__.py
Outdated
| allocator=self.allocator, | ||
| **bound_arguments) | ||
|
|
||
| self.add_profiling_event(evt, pt_prg) |
There was a problem hiding this comment.
This is inaccurate: one loopy TranslationUnit will enqueue multiple kernels, so you'll need to either
- collect multiple events, or
- enqueue a marker before and after.
(Also below.)
There was a problem hiding this comment.
I implemented the "marker" strategy in 7216ccf
99cc47a to
55537cc
Compare
7f24976 to
1123c16
Compare
1fbf1ad to
03dfa33
Compare
arraycontext/impl/pytato/__init__.py
Outdated
| def get_profiling_data_for_kernel(self, kernel_name: str) \ | ||
| -> MultiCallKernelProfile: | ||
| """Return profiling data for kernel *kernel_name*.""" | ||
| self._wait_and_transfer_profile_events() | ||
|
|
||
| time = 0 | ||
| num_calls = 0 | ||
|
|
||
| if kernel_name in self.profile_results: | ||
| knl_results = self.profile_results[kernel_name] | ||
|
|
||
| num_calls = len(knl_results) | ||
|
|
||
| for r in knl_results: | ||
| time += r | ||
|
|
||
| return MultiCallKernelProfile(num_calls, time) | ||
|
|
There was a problem hiding this comment.
What I'd like from the user interface here is just minimal access to the raw data and a reset primitive, without aggregation, and without exposing the before-wait/after-wait distinction. Everything else belongs in external utility routines, IMO.
There was a problem hiding this comment.
I tried to minimize the API in 6bdb85b. What do you think?
arraycontext/impl/pytato/utils.py
Outdated
| # }}} | ||
|
|
||
|
|
||
| def tabulate_profiling_data(profile_results: dict[str, MultiCallKernelProfile])\ |
There was a problem hiding this comment.
Added the (so far only) interface to the doc in 6bdb85b.
arraycontext/impl/pytato/__init__.py
Outdated
| time += r | ||
|
|
||
| return MultiCallKernelProfile(num_calls, time) | ||
|
|
There was a problem hiding this comment.
Make sure to add the final interface to the class documentation.
There was a problem hiding this comment.
Added the (so far only) interface to the doc in 6bdb85b.
Co-authored-by: Andreas Klöckner <inform@tiker.net>
6a878e2 to
94d9d16
Compare
e986f53 to
ea691a4
Compare
ea691a4 to
fedb836
Compare
|
Thx! |
TODOs:
Please squash