
SIMD Acceleration of Truth-Table Operations with AVX2 #142

Closed
costamag wants to merge 11 commits into lsils:master from costamag:simd-opto


@costamag
Collaborator

@costamag costamag commented May 13, 2025

This PR introduces vectorized bitwise operations using 256-bit AVX2 registers. Each register holds four 64-bit words, so a single instruction processes four truth-table words in parallel. For sufficiently large truth tables, these operations should be preferred over the traditional scalar implementations.
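As a rough illustration of the idea, and not the code in this PR, a word-wise AND over two truth tables could look like the sketch below. The function name `and_words` and the `std::vector<std::uint64_t>` storage are assumptions made for the example. The AVX2 path is compiled only when the compiler defines `__AVX2__` (for instance with `-mavx2`); otherwise the scalar loop handles every word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined( __AVX2__ )
#include <immintrin.h>
#endif

// Hypothetical helper (not the PR's API): bitwise AND of two truth tables
// stored as vectors of 64-bit words.
inline std::vector<std::uint64_t> and_words( const std::vector<std::uint64_t>& a,
                                             const std::vector<std::uint64_t>& b )
{
  assert( a.size() == b.size() );
  std::vector<std::uint64_t> out( a.size() );
  std::size_t i = 0;
#if defined( __AVX2__ )
  // Process four 64-bit words (one 256-bit register) per iteration.
  for ( ; i + 4 <= a.size(); i += 4 )
  {
    const __m256i va = _mm256_loadu_si256( reinterpret_cast<const __m256i*>( a.data() + i ) );
    const __m256i vb = _mm256_loadu_si256( reinterpret_cast<const __m256i*>( b.data() + i ) );
    _mm256_storeu_si256( reinterpret_cast<__m256i*>( out.data() + i ), _mm256_and_si256( va, vb ) );
  }
#endif
  // Scalar loop: handles the remaining words, or all words without AVX2.
  for ( ; i < a.size(); ++i )
  {
    out[i] = a[i] & b[i];
  }
  return out;
}
```

The unaligned load and store intrinsics are used so that the sketch makes no assumption about how the word storage is aligned.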

The cutoff (in number of bits) at which AVX2 becomes advantageous may vary with the compiler, so it has to be identified by benchmarking. Tests show consistent speedups for 10-input static_truth_tables and 12-input dynamic_truth_tables.
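For orientation, a truth table over n variables has 2^n bits, so a 10-input table occupies 16 words (four 256-bit registers) and a 12-input table occupies 64 words. A minimal timing harness for locating the cutoff on a given compiler might look like the sketch below. The names `num_words`, `and_scalar`, and `avg_ns` are hypothetical and not part of the PR, and a serious measurement would need warm-up runs and more careful statistics.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of 64-bit words needed to store a truth table over num_vars variables.
inline std::size_t num_words( unsigned num_vars )
{
  assert( num_vars <= 32 ); // keep the sketch within sane memory bounds
  return num_vars <= 6 ? std::size_t{ 1 } : ( std::size_t{ 1 } << ( num_vars - 6 ) );
}

// Scalar reference kernel: out = a & b, word by word.
inline void and_scalar( const std::vector<std::uint64_t>& a,
                        const std::vector<std::uint64_t>& b,
                        std::vector<std::uint64_t>& out )
{
  for ( std::size_t i = 0; i < a.size(); ++i )
  {
    out[i] = a[i] & b[i];
  }
}

// Average wall-clock time in nanoseconds of one call to op( a, b, out ).
template<class Op>
double avg_ns( Op op, unsigned num_vars, std::size_t reps )
{
  std::vector<std::uint64_t> a( num_words( num_vars ), 0xAAAAAAAAAAAAAAAAULL );
  std::vector<std::uint64_t> b( a.size(), 0xCCCCCCCCCCCCCCCCULL );
  std::vector<std::uint64_t> out( a.size() );
  volatile std::uint64_t sink = 0; // discourages the optimizer from dropping the work

  const auto start = std::chrono::steady_clock::now();
  for ( std::size_t r = 0; r < reps; ++r )
  {
    op( a, b, out );
    sink = out[0];
  }
  const auto stop = std::chrono::steady_clock::now();
  (void)sink;
  return std::chrono::duration<double, std::nano>( stop - start ).count() / static_cast<double>( reps );
}
```

Calling `avg_ns( and_scalar, n, reps )` and the same harness with a SIMD kernel for a range of n, then comparing the two timings, gives an estimate of the size at which the vectorized version starts to win for that compiler and machine.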

Acknowledgment: Thanks to @Michal-Atlas for suggesting the use of Single Instruction, Multiple Data (SIMD) instructions to accelerate truth-table operations.

Remark: I also experimented with a pop-count implementation based on the AVX2 Harley–Seal algorithm described in “Faster Population Counts Using AVX2 Instructions” (arXiv:1611.07612). However, on machines with sufficiently large L1 caches, the scalar version still outperforms it. A faster pop-count remains an open optimization target.
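For context, the building block of the Harley–Seal approach is a carry-save adder (CSA) that compresses three words into a "high" and a "low" word, so that three population counts reduce to two. The scalar sketch below illustrates only that step. It is neither the AVX2 variant from the paper nor the code I benchmarked, and the names `popcount64`, `csa`, and `popcount3` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Portable SWAR population count of a single 64-bit word.
inline unsigned popcount64( std::uint64_t x )
{
  x = x - ( ( x >> 1 ) & 0x5555555555555555ULL );
  x = ( x & 0x3333333333333333ULL ) + ( ( x >> 2 ) & 0x3333333333333333ULL );
  x = ( x + ( x >> 4 ) ) & 0x0F0F0F0F0F0F0F0FULL;
  return static_cast<unsigned>( ( x * 0x0101010101010101ULL ) >> 56 );
}

struct csa_result
{
  std::uint64_t high; // carry bits, each worth 2
  std::uint64_t low;  // sum bits, each worth 1
};

// Carry-save adder: a bitwise full adder applied to three words at once.
inline csa_result csa( std::uint64_t a, std::uint64_t b, std::uint64_t c )
{
  const std::uint64_t u = a ^ b;
  return { ( a & b ) | ( u & c ), u ^ c };
}

// Population count of three words using two popcounts instead of three.
inline unsigned popcount3( std::uint64_t a, std::uint64_t b, std::uint64_t c )
{
  const csa_result r = csa( a, b, c );
  return 2u * popcount64( r.high ) + popcount64( r.low );
}
```

The full algorithm chains such adders over blocks of words so that the expensive popcount step runs far less often than once per word.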

@costamag costamag changed the title from "SIMD-based optimization of truth-tables with AVX2" to "SIMD Acceleration of Truth-Table Operations with AVX2" May 13, 2025
@costamag
Collaborator Author

Hi @msoeken. I modified the functions to better align with the code structure in operations.hpp. Provided that the tests pass, this PR is ready for your review.

@costamag
Collaborator Author

I will open a clean PR after refactoring.

@costamag costamag closed this May 16, 2025
