Named after the Greek goddess of memory, Mneme preserves and replays the essence of your application's execution, allowing developers to revisit, analyze, and refine specific moments in code with precision.
Mneme is a tool that records the execution of a GPU (CUDA/HIP) kernel and replays that kernel as an independent executable.
For full usage instructions, tutorials, and API reference, please visit the Documentation.
- Record: Capture GPU kernels from large applications into isolated replayable units.
- Replay: Execute captured kernels independently without the original application context.
- Tune: Optimize kernel parameters (block size, grid size) and compiler passes using Python tools like Optuna.
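
To illustrate the kind of search the Tune step performs, here is a minimal sketch in plain Python. It does not use Mneme's API: `synthetic_replay_time` is a hypothetical stand-in for timing a replayed kernel, and its cost model is a made-up assumption chosen only so the example runs anywhere. The exhaustive loop is likewise a simplification; a tuner such as Optuna would sample the same parameters instead of enumerating them.

```python
def launch_config(num_elements: int, block_size: int) -> tuple[int, int]:
    """Return (grid_size, block_size) so that grid_size * block_size >= num_elements."""
    # Integer ceiling division: enough blocks to give every element one thread.
    grid_size = (num_elements + block_size - 1) // block_size
    return grid_size, block_size


def synthetic_replay_time(grid_size: int, block_size: int) -> float:
    """Hypothetical stand-in for the measured runtime of a replayed kernel.

    This is NOT a Mneme function. The toy model charges a fixed cost per
    block and a penalty for moving away from an assumed sweet spot of 256
    threads per block.
    """
    per_block_cost = 0.001 * grid_size
    block_penalty = 0.01 * abs(block_size - 256)
    return per_block_cost + block_penalty


def tune_block_size(num_elements: int, candidates: list[int]) -> dict:
    """Try every candidate block size and keep the fastest configuration."""
    best = None
    for block_size in candidates:
        grid_size, _ = launch_config(num_elements, block_size)
        elapsed = synthetic_replay_time(grid_size, block_size)
        if best is None or elapsed < best["time"]:
            best = {"block_size": block_size, "grid_size": grid_size, "time": elapsed}
    return best


if __name__ == "__main__":
    print(tune_block_size(1_000_000, [32, 64, 128, 256, 512, 1024]))
```

In a real setup, the stand-in timing function would be replaced by a measurement of the replayed kernel, which is what makes tuning cheap: only the isolated kernel runs for each candidate, not the whole application.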
We welcome all kinds of contributions: new features, bug fixes, documentation edits; it's all great!
To contribute, open a pull request with develop as the destination branch.
Mneme is released under the Apache License (Version 2.0) with LLVM exceptions. For more details, please see the LICENSE.
LLNL-CODE-2000766
If you use this software, please cite it as below:
@inproceedings{parasyris2023scalable,
title={Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay},
author={Parasyris, Konstantinos and Georgakoudis, Giorgis and Rangel, Esteban and Laguna, Ignacio and Doerfert, Johannes},
booktitle={Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis},
pages={1--14},
year={2023}
}