ggml_vec_dot_q8_q8 mma optimize #6

shalinib-ibm · 2025-05-28T10:29:59Z

Implement dot product of two vectors using MMA.
Tested the change with different batch sizes.
With batch_size = 4, 8, 16, 32, llama-batched-bench gives results similar to base. After batch_size = 64, performance degrades.
llama-bench also gives results similar to base. No perf gain.
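For context on what is being optimized, here is a minimal scalar sketch of a block-quantized int8 dot product. It is not the code from this PR. The struct name `block_q8_sketch`, the function name `vec_dot_q8_q8_ref`, and the use of a `float` scale are assumptions made for illustration (ggml's Q8_0 blocks hold 32 int8 values with an fp16 scale). An MMA version would presumably replace the inner integer accumulation with matrix-multiply-assist instructions and keep the per-block scaling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QK 32 /* values per block; assumed to mirror ggml's Q8_0 block size */

/* Hypothetical simplified block. ggml stores the scale as fp16;
 * a plain float is used here to keep the sketch portable. */
typedef struct {
    float  d;       /* per-block scale */
    int8_t qs[QK];  /* quantized values */
} block_q8_sketch;

/* Scalar reference: sum over blocks of d_x * d_y * sum_i(qx[i] * qy[i]).
 * n is the number of elements and must be a multiple of QK. */
static float vec_dot_q8_q8_ref(size_t n,
                               const block_q8_sketch *x,
                               const block_q8_sketch *y) {
    assert(n % QK == 0);
    const size_t nb = n / QK;
    float sum = 0.0f;
    for (size_t b = 0; b < nb; ++b) {
        int32_t isum = 0; /* integer accumulator for one block */
        for (size_t i = 0; i < QK; ++i) {
            isum += (int32_t) x[b].qs[i] * (int32_t) y[b].qs[i];
        }
        sum += x[b].d * y[b].d * (float) isum;
    }
    return sum;
}
```

A reference like this is mainly useful as a correctness baseline: the output of an optimized kernel can be compared against it on the same inputs before any benchmarking.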


Tested this change with different batch sizes. With batch_size = 4, 8, 16, 32, llama-batched-bench gives results similar to base. After batch_size = 64, performance degrades. llama-bench also gives results similar to base. Not much perf gain. Signed-off-by: Shalini Salomi Bodapati <[email protected]>
