SIMD Acceleration of Truth-Table Operations with AVX2 by costamag · Pull Request #142 · lsils/kitty

costamag · 2025-05-13T14:13:46Z

This PR introduces vectorized bitwise operations using 256-bit AVX2 registers. Each register processes four 64-bit words in parallel, enabling efficient computation across truth tables. These operations should be preferred over traditional scalar implementations for sufficiently large truth tables.

The cutoff (in number of bits) above which AVX2 becomes advantageous varies with the compiler, so it has to be identified by benchmarking. Tests show consistent speedups for 10-input `static_truth_table` and 12-input `dynamic_truth_table` instances.
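One way such a threshold could be located is sketched below: time a scalar and a vector callable at increasing table sizes and report the first size at which the vector one wins. This is only an assumption about how a measurement might be set up, not the benchmarking code from this PR. The names `words_for_vars`, `best_time_ns`, and `find_cutoff_vars` are hypothetical.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <limits>

// Number of 64-bit words needed to store a truth table over `num_vars` inputs.
inline std::size_t words_for_vars(unsigned num_vars) {
  return num_vars <= 6 ? 1 : (std::size_t{1} << (num_vars - 6));
}

// Run `fn` `reps` times and return the fastest single run in nanoseconds.
// Taking the minimum reduces the influence of scheduling noise.
template <typename Fn>
std::int64_t best_time_ns(Fn&& fn, int reps) {
  std::int64_t best = std::numeric_limits<std::int64_t>::max();
  for (int r = 0; r < reps; ++r) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    const auto ns = static_cast<std::int64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count());
    if (ns < best) {
      best = ns;
    }
  }
  return best;
}

// Return the smallest number of variables in [lo, hi] for which `candidate`
// (e.g. a vectorized operation) is measured faster than `baseline` (e.g. the
// scalar one). Both callables take the number of words to process.
// Returns hi + 1 if the candidate never wins in the tested range.
template <typename Baseline, typename Candidate>
unsigned find_cutoff_vars(Baseline&& baseline, Candidate&& candidate,
                          unsigned lo, unsigned hi, int reps) {
  for (unsigned n = lo; n <= hi; ++n) {
    const std::size_t words = words_for_vars(n);
    const auto t_base = best_time_ns([&] { baseline(words); }, reps);
    const auto t_cand = best_time_ns([&] { candidate(words); }, reps);
    if (t_cand < t_base) {
      return n;
    }
  }
  return hi + 1;
}
```

The result depends on the machine, the compiler, and the optimization flags, so a value measured in one build should not be assumed to carry over to another.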

Acknowledgment: Thanks to @Michal-Atlas for suggesting the use of Single Instruction, Multiple Data (SIMD) instructions to accelerate truth-table operations.

Remark: I also experimented with a pop-count implementation based on the AVX2 Harley–Seal algorithm described in “Faster Population Counts Using AVX2 Instructions” (arXiv:1611.07612). However, on machines with sufficiently large L1 caches, the scalar version still outperforms it. A faster pop-count remains an open optimization target.
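For context, a scalar population count over the words of a truth table can be as simple as the sketch below. It is a hypothetical baseline written for illustration (the name `count_ones_scalar` is an assumption) and is not the implementation benchmarked in this PR. It uses `std::bitset<64>::count`, which compilers typically lower to a hardware pop-count instruction when one is enabled for the target.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sum the number of set bits over all 64-bit words, one word at a time.
std::uint64_t count_ones_scalar(const std::vector<std::uint64_t>& words) {
  std::uint64_t total = 0;
  for (const std::uint64_t w : words) {
    total += std::bitset<64>(w).count();
  }
  return total;
}
```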

costamag · 2025-05-14T11:23:30Z

Hi @msoeken. I modified the functions to better align with the code structure in operations.hpp. Provided that the tests pass, this PR is ready for your review.


costamag · 2025-05-16T16:32:05Z

I will open a clean PR after refactoring.

Andrea Costamagna added 3 commits May 13, 2025 15:59

feat: SIMD operations

f5defdf

fix: corrected changelog

a7f76f7

docs: removed typo

d24e138

costamag changed the title ~~SIMD-based optimization of truth-tables with AVX2~~ SIMD Acceleration of Truth-Table Operations with AVX2 May 13, 2025

Andrea Costamagna added 2 commits May 14, 2025 13:17

perf: improve dispatch symmetry and conform fallback to scalar version.

8e911b7

test: add conservative lower bound on expected performance gains

1c636d2

costamag and others added 6 commits May 15, 2025 15:22

Merge branch 'lsils:master' into simd-opto

d7d6b6a

fix: added benchmarking functionality for machine-dependent evaluation of scalar vs. vector operation

c4406bd

test: larger static truth table

bdcca6a

feat: automatic benchmarking to choose between vector and scalar

ba07e6b

fix: typo in simd_operations tests

63174c1

Merge branch 'lsils:master' into simd-opto

a46ae3c

costamag closed this May 16, 2025

costamag wants to merge 11 commits into lsils:master from costamag:simd-opto