Description
Why is `memory_order_seq_cst` used instead of `memory_order_release`/`memory_order_acquire` when both `CLK_GLOBAL_MEM_FENCE` and `CLK_LOCAL_MEM_FENCE` are present? Isn't release (respectively acquire) enough when both address spaces are specified?
llvm-project/amd/device-libs/opencl/src/workgroup/wgbarrier.cl
Lines 25 to 26 in 6581f3d

    flags == (CLK_GLOBAL_MEM_FENCE|CLK_LOCAL_MEM_FENCE) ?
        memory_order_seq_cst : memory_order_release,
llvm-project/amd/device-libs/opencl/src/workgroup/wgbarrier.cl
Lines 32 to 33 in 6581f3d

    flags == (CLK_GLOBAL_MEM_FENCE|CLK_LOCAL_MEM_FENCE) ?
        memory_order_seq_cst : memory_order_acquire,
Then there is also `CLK_IMAGE_MEM_FENCE`, which `atomic_work_item_fence` treats the same way as `CLK_GLOBAL_MEM_FENCE`. With it set, however, `work_group_barrier` no longer uses `memory_order_seq_cst`, even when all three flags are specified, because the exact-equality test on `flags` no longer matches. Is `seq_cst` strangely no longer needed in that case?
llvm-project/amd/device-libs/opencl/src/misc/awif.cl
Lines 82 to 84 in 6581f3d

    // global or image is set, but not local -> fence only global memory.
    if ((flags & CLK_LOCAL_MEM_FENCE) == 0) {
        IMPL_ATOMIC_WORK_ITEM_FENCE(, "global");
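
To make the inconsistency concrete, here is a minimal host-side C sketch of the order selection quoted from `wgbarrier.cl`. It is not the device-libs code: the `HYP_*` flag values, the `hyp_order` type, and the function names are hypothetical stand-ins chosen for illustration, and only the shape of the condition (exact equality with global|local) is taken from the snippets above.

```c
#include <assert.h>

/* Hypothetical flag values for illustration only; the real
   CLK_*_MEM_FENCE values are defined by the OpenCL implementation. */
enum {
    HYP_LOCAL_MEM_FENCE  = 0x1,
    HYP_GLOBAL_MEM_FENCE = 0x2,
    HYP_IMAGE_MEM_FENCE  = 0x4
};

/* Hypothetical stand-in for the two memory orders the barrier picks from. */
typedef enum { ORDER_RELEASE, ORDER_SEQ_CST } hyp_order;

/* Same shape as the quoted condition: seq_cst only when flags is
   exactly (global | local), release for every other combination. */
hyp_order barrier_release_order(unsigned flags)
{
    return flags == (HYP_GLOBAL_MEM_FENCE | HYP_LOCAL_MEM_FENCE)
               ? ORDER_SEQ_CST
               : ORDER_RELEASE;
}

/* Walks through the cases discussed in the question. */
void check_cases(void)
{
    /* A single address space: release. */
    assert(barrier_release_order(HYP_GLOBAL_MEM_FENCE) == ORDER_RELEASE);
    assert(barrier_release_order(HYP_LOCAL_MEM_FENCE) == ORDER_RELEASE);

    /* Exactly global | local: seq_cst. */
    assert(barrier_release_order(HYP_GLOBAL_MEM_FENCE | HYP_LOCAL_MEM_FENCE)
           == ORDER_SEQ_CST);

    /* Adding the image flag breaks the equality test, so the order
       drops back to release even though more is being fenced. */
    assert(barrier_release_order(HYP_GLOBAL_MEM_FENCE | HYP_LOCAL_MEM_FENCE |
                                 HYP_IMAGE_MEM_FENCE)
           == ORDER_RELEASE);
}
```

Under these assumptions, a strict superset of the global|local flags selects a weaker order than global|local alone, which is the behavior being asked about.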