[Cpp API Compatibility] Sync c10 CUDA stream state with Paddle's GPUContext stream #78652
@@ -191,10 +191,10 @@ CUDAStream getStreamFromExternal(cudaStream_t ext_stream,
 #endif

 /**
- * Set the current CUDA stream for the device of the given stream in the
- * calling thread.
+ * Set the current CUDA stream for the device of the given stream.
  *
- * Implements per-thread, per-device current stream semantics.
+ * Keeps the compat c10 stream state aligned with Paddle's GPUContext so
+ * Paddle stream guards and c10 callers observe the same current stream.
Comment on lines +196 to +197
- * Keeps the compat c10 stream state aligned with Paddle's GPUContext so
- * Paddle stream guards and c10 callers observe the same current stream.
+ * This updates Paddle's current stream state through the shared GPUContext
+ * stored in DeviceContextPool for the target device so Paddle stream guards
+ * and c10 callers observe the same current stream.
+ *
+ * Semantics: this is not a PyTorch-style per-thread current-stream setting.
+ * The change is effectively process-wide for the given device because other
+ * threads using the same device may observe the updated current stream.
+ *
+ * Thread-safety: callers must not assume thread-local isolation. Concurrent
+ * calls that change the current stream for the same device can affect one
+ * another, so external synchronization may be required.
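To make the semantic difference in the suggested comment concrete, here is a minimal sketch contrasting a thread-local "current stream" with one stored in shared, process-wide state. Everything in it is hypothetical: `StreamId`, `tls_current_stream`, `shared_current_stream`, and the helper functions are illustrative stand-ins, not Paddle or c10 APIs, and the mutex only models the external synchronization the comment says may be required.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a stream handle; not a Paddle or c10 type.
using StreamId = int;

// Per-thread model: every thread has its own current stream.
thread_local StreamId tls_current_stream = 0;

// Shared-context model: one current stream per device for the whole process.
std::mutex shared_mutex;
StreamId shared_current_stream = 0;

void set_shared_stream(StreamId s) {
  std::lock_guard<std::mutex> lock(shared_mutex);
  shared_current_stream = s;
}

StreamId get_shared_stream() {
  std::lock_guard<std::mutex> lock(shared_mutex);
  return shared_current_stream;
}

struct Observed {
  StreamId thread_local_view;  // what the caller sees in the per-thread model
  StreamId shared_view;        // what the caller sees in the shared model
};

// A worker thread sets both "current streams" to 7; the function then
// reports what the calling thread observes afterwards.
Observed observe_after_worker_sets_stream() {
  std::thread worker([] {
    tls_current_stream = 7;  // touches only the worker's thread-local copy
    set_shared_stream(7);    // visible to every thread
  });
  assert(worker.joinable());
  worker.join();
  return {tls_current_stream, get_shared_stream()};
}
```

In this sketch the calling thread's thread-local value is unaffected by the worker, while the shared value changes under it. That second behavior is the one the suggested comment warns about.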
CUDAStream.cpp now stores and constructs phi::CUDAStream objects (std::unique_ptr<phi::CUDAStream> and std::make_unique<phi::CUDAStream>(...)), but this translation unit only includes context_pool.h and gpu_context.h, which forward-declare phi::CUDAStream. It does not include the definition from paddle/phi/core/cuda_stream.h. This will fail to compile because phi::CUDAStream is an incomplete type where std::make_unique and the std::unique_ptr destructor are instantiated. Add the include that provides the phi::CUDAStream definition.
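The incomplete-type issue can be illustrated with a small single-file sketch. The names below (`demo::Stream`, `demo::StreamHolder`) are hypothetical stand-ins, not the real phi::CUDAStream or the code in this PR. The point is that a forward declaration is enough to declare a `std::unique_ptr` member, but the full class definition must be visible wherever `std::make_unique` is called and wherever the owning destructor is defined.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins; not the real phi::CUDAStream or c10 code.
namespace demo {

class Stream;  // forward declaration only, as a lightweight header might give

// Declaring a unique_ptr<Stream> member works with an incomplete type,
// provided the constructor and destructor are defined where Stream is complete.
class StreamHolder {
 public:
  explicit StreamHolder(int id);
  ~StreamHolder();
  int id() const;

 private:
  std::unique_ptr<Stream> stream_;
};

// The full definition. In a real project this would come from including
// the header that defines the class.
class Stream {
 public:
  explicit Stream(int id) : id_(id) {}
  int id() const { return id_; }

 private:
  int id_;
};

// Stream is complete from here on, so make_unique and the unique_ptr
// deleter can be instantiated.
StreamHolder::StreamHolder(int id) : stream_(std::make_unique<Stream>(id)) {}
StreamHolder::~StreamHolder() = default;

int StreamHolder::id() const {
  assert(stream_ != nullptr);
  return stream_->id();
}

}  // namespace demo
```

If the three out-of-line member definitions were moved above the `Stream` class body, the `std::make_unique` call and the defaulted destructor would see only the forward declaration and would not compile. That is the failure mode described in the comment.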