-
Notifications
You must be signed in to change notification settings - Fork 127
cl_ext_buffer_device_address #1159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
bashbaug
merged 5 commits into
KhronosGroup:main
from
pjaaskel:cl_ext_buffer_device_address
Feb 25, 2025
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
9b18682
cl_ext_buffer_device_address
dbc5b7e
BDA: Removed CL_MEM_DEVICE_SHARED_ADDRESS_EXT as unneeded.
7a28bf3
cl_ext_buffer_device_address to 1.0.0
5015843
cl_ext_buffer_device_address: Revision 1.0.1
8d13bfa
cl_ext_buffer_device_address: Revision 1.0.2
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,80 @@ | ||
| // Copyright 2024 The Khronos Group Inc. | ||
| // SPDX-License-Identifier: CC-BY-4.0 | ||
|
|
||
| include::{generated}/meta/{refprefix}cl_ext_buffer_device_address.txt[] | ||
|
|
||
| === Other Extension Metadata | ||
|
|
||
| *Last Modified Date*:: | ||
| 2025-02-04 | ||
| *IP Status*:: | ||
| No known IP claims. | ||
| *Contributors*:: | ||
| - Pekka Jääskeläinen, Intel + | ||
| - Karol Herbst, Red Hat + | ||
| - Ben Ashbaugh, Intel + | ||
| - Kevin Petit, Arm + | ||
| - Henry Linjamäki, Intel + | ||
|
|
||
| === Description | ||
|
|
||
| This extension provides access to raw device pointers for cl_mem buffers | ||
| without requiring a shared virtual address space between the host and | ||
| the device. | ||
|
|
||
| ==== Background | ||
|
|
||
| Shared Virtual Memory (SVM) introduced in OpenCL 2.0 is the first feature | ||
| that enables raw pointers in the OpenCL standard. Its coarse-grain | ||
| variant is relatively simple to implement on various platforms in terms of | ||
| coherency requirements, but it requires mapping the buffer's address range | ||
| to the host virtual address space. | ||
| However, various higher-level heterogeneous APIs present a memory allocation | ||
| routine which can allocate device-only memory and provide raw addresses to | ||
| it without guarentees of system-wide uniqueness. For example, minimal | ||
| implementations of OpenMP's omp_target_alloc() and CUDA/HIP's | ||
| cudaMalloc()/hipMalloc() do not require a shared address space between the host and the device. | ||
|
|
||
| Host-device unified addressing might not be a major implementation issue in | ||
| systems which can provide virtual memory across the platform, but might | ||
| bring challenges in cases where the device presents a global memory with | ||
| a disjoint address space (that can also be a physical memory address space) or, | ||
| for example, when a barebone embedded system lacks virtual memory support altogether. | ||
| This extension is targeted to complement the OpenCL SVM extension by providing | ||
| an additional lower-end step in the spectrum of type of pointers/buffers OpenCL | ||
| can allocate. | ||
|
|
||
| === New Command | ||
|
|
||
| * {clSetKernelArgDevicePointerEXT} | ||
|
|
||
| === New Types | ||
|
|
||
| * {cl_mem_device_address_EXT} | ||
|
|
||
| === New Enums | ||
|
|
||
| * {cl_mem_properties_TYPE} | ||
| ** {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} | ||
| * {cl_mem_info_TYPE} | ||
| ** {CL_MEM_DEVICE_ADDRESS_EXT} | ||
| * {cl_kernel_exec_info_TYPE} | ||
| ** {CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT} | ||
|
|
||
| === Version History | ||
|
|
||
| * Revision 1.0.0, 2025-01-15 | ||
| ** Initial version for detailed review. | ||
| * Revision 1.0.1, 2025-01-28 | ||
| ** Made it explicit that passing illegal pointers is legal as long as they are | ||
| not referenced. Removed CL_INVALID_ARG_VALUE as a possible error in | ||
| clSetKernelArgDevicePointerEXT() as there are no illegal pointer | ||
| cases when calling this function. Return CL_INVALID_OPERATION for | ||
| clGetMemObjectInfo() if the pointer is not a buffer device pointer. | ||
| clSetKernelExecInfo() and clSetKernelArgDevicePointerEXT() now only | ||
| error out if no devices in the context associated with kernel support | ||
| device pointers. | ||
| * Revision 1.0.2, 2025-02-04 | ||
| ** Converted the clSetKernelArgDevicePointerEXT() address parameter to | ||
| a value instead of a pointer to the value. | ||
|
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -595,6 +595,35 @@ include::{generated}/api/version-notes/CL_MEM_DEVICE_HANDLE_LIST_KHR.asciidoc[] | |||||
| {CL_MEM_DEVICE_HANDLE_LIST_END_KHR_anchor}) to associate with the | ||||||
| external memory handle. | ||||||
| endif::cl_khr_external_memory[] | ||||||
|
|
||||||
| ifdef::cl_ext_buffer_device_address[] | ||||||
|
|
||||||
| | {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT_anchor} | ||||||
|
|
||||||
| include::{generated}/api/version-notes/CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT.asciidoc[] | ||||||
| | {cl_bool_TYPE} | ||||||
| | When set to {CL_TRUE}, specifies that the buffer must have a single fixed | ||||||
| device-side address for its lifetime, and the address can be queried via {clGetMemObjectInfo}. | ||||||
|
|
||||||
| Each device in the context can have their own (fixed) device-side address and | ||||||
| a copy of the created buffer which are synchronized | ||||||
| implicitly by the runtime. | ||||||
|
|
||||||
| The flag might imply that the buffer will be "pinned" permanently to | ||||||
| a device's memory, but might not be necessarily so, as long as the address | ||||||
| range of the buffer remains constant. | ||||||
|
|
||||||
| The device addresses of sub-buffers derived from {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} | ||||||
| allocated buffers can be computed by adding the sub-buffer origin to the | ||||||
| device-specific start address. | ||||||
|
|
||||||
| If the device supports SVM and {clCreateBufferWithProperties} is called with a pointer | ||||||
| returned by {clSVMAlloc} as its _host_ptr_ argument, and {CL_MEM_USE_HOST_PTR} is | ||||||
| set in its _flags_ argument, the device-side address is guaranteed to match | ||||||
| the _host_ptr_. | ||||||
|
|
||||||
| endif::cl_ext_buffer_device_address[] | ||||||
|
|
||||||
| |==== | ||||||
|
|
||||||
| ifdef::cl_khr_external_memory[] | ||||||
|
|
@@ -662,6 +691,12 @@ ifdef::cl_khr_external_memory[] | |||||
| {CL_MEM_DEVICE_HANDLE_LIST_KHR} is specified as part of _properties_. | ||||||
| ** if _properties_ includes more than one external memory handle. | ||||||
| endif::cl_khr_external_memory[] | ||||||
| ifdef::cl_ext_buffer_device_address[] | ||||||
| * {CL_INVALID_OPERATION} | ||||||
| ** If _properties_ includes {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} and there | ||||||
| are no devices in the context that support the {cl_ext_buffer_device_address_EXT} | ||||||
| extension. | ||||||
| endif::cl_ext_buffer_device_address[] | ||||||
|
|
||||||
| [[memory-flags-table]] | ||||||
| .List of supported memory flag values | ||||||
|
|
@@ -6463,6 +6498,20 @@ include::{generated}/api/version-notes/CL_MEM_D3D11_RESOURCE_KHR.asciidoc[] | |||||
| returns the _resource_ argument specified when _memobj_ was created. | ||||||
| endif::cl_khr_d3d11_sharing[] | ||||||
|
|
||||||
| ifdef::cl_ext_buffer_device_address[] | ||||||
| | {CL_MEM_DEVICE_ADDRESS_EXT_anchor} | ||||||
|
|
||||||
| include::{generated}/api/version-notes/CL_MEM_DEVICE_ADDRESS_EXT.asciidoc[] | ||||||
| | {cl_mem_device_address_EXT_TYPE}[] | ||||||
| | If _memobj_ was created using {clCreateBufferWithProperties} with | ||||||
| the {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} property set to {CL_TRUE}, | ||||||
| returns a list of device addresses for the buffer, one for each | ||||||
| device in the context in the same order as the list of devices | ||||||
| passed to {clCreateContext}. | ||||||
|
|
||||||
| endif::cl_ext_buffer_device_address[] | ||||||
|
|
||||||
|
|
||||||
| |==== | ||||||
|
|
||||||
| // refError | ||||||
|
|
@@ -6472,6 +6521,11 @@ successfully. | |||||
| Otherwise, it returns one of the following errors: | ||||||
|
|
||||||
| * {CL_INVALID_MEM_OBJECT} if _memobj_ is a not a valid memory object. | ||||||
| ifdef::cl_ext_buffer_device_address[] | ||||||
| * {CL_INVALID_OPERATION} is returned for the {CL_MEM_DEVICE_ADDRESS_EXT} query if | ||||||
| the {cl_ext_buffer_device_address_EXT} is not supported or if the | ||||||
| buffer was not allocated with {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT}. | ||||||
| endif::cl_ext_buffer_device_address[] | ||||||
| * {CL_INVALID_VALUE} if _param_name_ is not one of the supported values, or | ||||||
| if the size in bytes specified by _param_value_size_ is less than size of | ||||||
| the return type specified in the | ||||||
|
|
@@ -10778,6 +10832,48 @@ Otherwise, it returns one of the following errors: | |||||
| required by the OpenCL implementation on the host. | ||||||
| -- | ||||||
|
|
||||||
| ifdef::cl_ext_buffer_device_address[] | ||||||
| [open,refpage='clSetKernelArgDevicePointerEXT',desc='Set a device pointer as the argument value for a specific argument of a kernel.',type='protos'] | ||||||
| -- | ||||||
| To set a device pointer as the argument value for a specific argument of a | ||||||
| kernel, call the function | ||||||
|
|
||||||
| include::{generated}/api/protos/clSetKernelArgDevicePointerEXT.txt[] | ||||||
| include::{generated}/api/version-notes/clSetKernelArgDevicePointerEXT.asciidoc[] | ||||||
|
|
||||||
| * _kernel_ is a valid kernel object. | ||||||
| * _arg_index_ is the argument index. | ||||||
| Arguments to the kernel are referred by indices that go from 0 for the | ||||||
| leftmost argument to _n_ - 1, where _n_ is the total number of arguments | ||||||
| declared by a kernel. | ||||||
| * _arg_value_ is the device pointer that should be used as the argument value for | ||||||
| argument specified by _arg_index_. | ||||||
| The device pointer specified is the value used by all API calls that enqueue | ||||||
| _kernel_ ({clEnqueueNDRangeKernel} and {clEnqueueTask}) until the argument | ||||||
| value is changed by a call to {clSetKernelArgDevicePointerEXT} for _kernel_. | ||||||
| The device pointer can only be used for arguments that are declared to be a | ||||||
| pointer to `global` memory allocated with {clCreateBufferWithProperties} with | ||||||
pjaaskel marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
| the {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} property. The pointer value specified as | ||||||
| the argument value can be the pointer to the beginning of the buffer or any offset into | ||||||
| the buffer region. The device pointer value must be naturally aligned according to | ||||||
| the argument's type. It should be noted that it's legal to pass invalid | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
| pointers as the value (similarly to C/C++ function calls with pointer arguments) as | ||||||
| long as the kernel doesn't dereference the pointer. | ||||||
|
|
||||||
| {clSetKernelArgDevicePointerEXT} returns {CL_SUCCESS} if the argument was set | ||||||
| successfully. Otherwise, it returns one of the following errors: | ||||||
|
|
||||||
| * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. | ||||||
| * {CL_INVALID_OPERATION} if no devices in the context associated with _kernel_ support | ||||||
| the {cl_ext_buffer_device_address_EXT} extension. | ||||||
| * {CL_INVALID_ARG_INDEX} if _arg_index_ is not a valid argument index. | ||||||
| * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required | ||||||
| by the OpenCL implementation on the device. | ||||||
| * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources | ||||||
| required by the OpenCL implementation on the host. | ||||||
| -- | ||||||
| endif::cl_ext_buffer_device_address[] | ||||||
|
|
||||||
| [open,refpage='clSetKernelExecInfo',desc='Set additional execution information for a kernel.',type='protos'] | ||||||
| -- | ||||||
| To set additional execution information for a kernel, call the function | ||||||
|
|
@@ -10844,6 +10940,19 @@ include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM | |||||
| If {clSetKernelExecInfo} has not been called with a value for | ||||||
| {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM}, the default value is | ||||||
| {CL_TRUE}. | ||||||
|
|
||||||
| ifdef::cl_ext_buffer_device_address[] | ||||||
| | {CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT_anchor} | ||||||
|
|
||||||
| include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT.asciidoc[] | ||||||
| | {cl_mem_device_address_EXT_TYPE}[] | ||||||
| | Device pointers must reference locations contained entirely within | ||||||
| buffers that are passed to kernel as arguments, or that are passed | ||||||
| through the execution information. Non-argument device pointers accessed | ||||||
| by the kernel must be specified by passing pointers to those buffers | ||||||
| via this {clSetKernelExecInfo} option. | ||||||
| endif::cl_ext_buffer_device_address[] | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm guessing we don't want to require this be called before or after argument setting. We probably ought to test both. |
||||||
|
|
||||||
| |==== | ||||||
|
|
||||||
| // refError | ||||||
|
|
@@ -10853,7 +10962,16 @@ successfully. | |||||
| Otherwise, it returns one of the following errors: | ||||||
|
|
||||||
| * {CL_INVALID_KERNEL} if _kernel_ is a not a valid kernel object. | ||||||
| * {CL_INVALID_OPERATION} if no devices in the context associated with _kernel_ support SVM. | ||||||
| * {CL_INVALID_OPERATION} for {CL_KERNEL_EXEC_INFO_SVM_PTRS} and | ||||||
| {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} if no devices in | ||||||
| the context associated with _kernel_ support SVM. | ||||||
| ifdef::cl_ext_buffer_device_address[] | ||||||
| * {CL_INVALID_OPERATION} for {CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT} if no | ||||||
| device in the context associated with _kernel_ support the {cl_ext_buffer_device_address_EXT} | ||||||
| extension. | ||||||
| endif::cl_ext_buffer_device_address[] | ||||||
| * {CL_INVALID_VALUE} if _param_name_ is not valid, if _param_value_ is | ||||||
| `NULL` or if the size specified by _param_value_size_ is not valid. | ||||||
| * {CL_INVALID_OPERATION} if _param_name_ is | ||||||
| {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} and _param_value_ is {CL_TRUE} | ||||||
| and no devices in the context associated with _kernel_ support fine-grain | ||||||
|
|
||||||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.