[CPU][x86] add support for sink input of SDPA #32374
Signed-off-by: HU Yuan2 <[email protected]>
f95b9c7 to 013e6f5
The sink input is an optional parameter. If we decide to pass it from the SDPA CPU node into the kernels, it is better to pass it explicitly, either as a valid pointer/PlainTensor or as nullptr/PlainTensor(), so that the kernels cannot misuse it.
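A minimal sketch of the convention suggested above, written against a hypothetical `TensorHandle` type rather than the real `ov::intel_cpu::PlainTensor`. The helper names `make_sink_handle` and `kernel_uses_sink` are also assumptions made for illustration. The idea is that the node always passes the argument and an empty handle means "no sink input", so a kernel never has to guess.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor handle; NOT the real ov::intel_cpu::PlainTensor.
// Requires C++14 or later (aggregate with default member initializers).
struct TensorHandle {
    const float* data = nullptr;
    std::size_t size = 0;
    // An empty handle converts to false, a valid one to true.
    explicit operator bool() const { return data != nullptr; }
};

// Hypothetical node-side helper: builds the handle once, so every kernel call
// receives either a valid handle or an explicitly empty one.
inline TensorHandle make_sink_handle(const std::vector<float>* sink_values) {
    if (sink_values == nullptr || sink_values->empty()) {
        return TensorHandle{};
    }
    return TensorHandle{sink_values->data(), sink_values->size()};
}

// Hypothetical kernel-side check: the sink argument is always present in the
// signature, and an empty handle means "this SDPA node has no sink input".
inline bool kernel_uses_sink(const TensorHandle& sink_input) {
    return static_cast<bool>(sink_input);
}
```

With this shape, the "no sink" case is spelled out at the call site as `TensorHandle{}` instead of being implied by an omitted or defaulted argument.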
src/plugins/intel_cpu/src/nodes/kernels/scaled_attn/mha_single_token.cpp
src/plugins/intel_cpu/src/nodes/kernels/scaled_attn/softmax_kernel.hpp
Continue fixing clang and compilation issues
src/plugins/intel_cpu/src/shape_inference/custom/scaled_attn.cpp
LGTM
Hi @maxnick, could you please review this? Thanks!
-    bool quant_key_by_channel);
+    bool quant_key_by_channel,
+    const ov::intel_cpu::PlainTensor& sink_input);
Please add a short comment about the purpose of this sink input and what should be set if there is no sink input.
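As an illustration of the kind of comment requested here, the sketch below documents a sink parameter on a small reference routine. It is not the code from `softmax_kernel.hpp`. The function name `softmax_with_sink` is hypothetical, and the semantics are an assumption: the sink is treated as one extra logit that joins the softmax normalization but produces no output weight.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference routine, not the kernel from this PR.
//
// scores: attention logits for one query row; overwritten with softmax weights.
// sink:   optional extra logit for the current head. ASSUMPTION: it takes part
//         in the running max and in the denominator but yields no output weight.
//         Pass nullptr when there is no sink input; the result is then a plain
//         softmax over `scores`.
inline void softmax_with_sink(std::vector<float>& scores, const float* sink) {
    if (scores.empty()) {
        return;
    }
    // Subtract the maximum (including the sink, if any) for numerical stability.
    float max_val = *std::max_element(scores.begin(), scores.end());
    if (sink != nullptr) {
        max_val = std::max(max_val, *sink);
    }
    // The sink only enlarges the denominator.
    float denom = (sink != nullptr) ? std::exp(*sink - max_val) : 0.0f;
    for (float& s : scores) {
        s = std::exp(s - max_val);
        denom += s;
    }
    for (float& s : scores) {
        s /= denom;
    }
}
```

Under this assumption, the weights sum to less than 1 whenever a sink is supplied, because part of the probability mass is absorbed by the sink term.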
I have added the comment. Please take a look, thanks.
The PR is waiting for performance checks.
Details:
Tickets: