
Conversation

@svenweb
Contributor

@svenweb svenweb commented Nov 27, 2025

Hello,

This PR speeds up Length.cpp by 1.13x to 2.83x by adding SIMD directives that let the compiler emit SIMD vector instructions. The directives are guarded by #ifdef HAVE_OPEN_SIMD, so if the preprocessor macro HAVE_OPEN_SIMD is not defined, Length.cpp compiles and runs as before.
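For readers who have not seen this guard pattern, here is a minimal sketch of what it could look like. It is not the code from this PR: `PointXY` and `lineLength` are hypothetical names used only for illustration, and only `HAVE_OPEN_SIMD` and the `#pragma omp simd reduction(+:len)` line come from the discussion.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2D coordinate; not the GEOS CoordinateXY class.
struct PointXY {
    double x;
    double y;
};

// Sum of the segment lengths along a polyline.
double lineLength(const std::vector<PointXY>& pts)
{
    double len = 0.0;
    const std::size_t n = pts.size();
    if (n < 2) {
        return len;
    }
#ifdef HAVE_OPEN_SIMD
    // Only seen by the compiler when the build defines HAVE_OPEN_SIMD.
    // The reduction clause lets partial sums of len be combined in any order.
    #pragma omp simd reduction(+:len)
#endif
    for (std::size_t i = 1; i < n; ++i) {
        const double dx = pts[i].x - pts[i - 1].x;
        const double dy = pts[i].y - pts[i - 1].y;
        len += std::sqrt(dx * dx + dy * dy);
    }
    return len;
}
```

When the macro is not defined, the preprocessor drops the pragma and the loop is compiled as ordinary C++, which is the fallback behaviour described above.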

On an x86 Linux machine (Intel i9, 32 GiB DRAM), the gain varied with the size of the line (number of points) and with whether the line was stored as a CoordinateSequence or as a vector of CoordinateXY.

| Points | Vector Gain | CoordinateSequence Gain |
| ---: | ---: | ---: |
| 10 | -- | -- |
| 100 | -- | -- |
| 1,000 | 1.67x | 1.67x |
| 10,000 | 2.83x | 2.33x |
| 100,000 | 1.83x | 1.96x |
| 1,000,000 | 1.75x | 1.48x |
| 10,000,000 | 1.24x | 1.13x |

The speed and throughput testing script I used:
myLengthTest.cpp

I built the code and ran ctest on both the x86 Linux machine and an M1 Mac; all tests passed.

Thank you,
Sven

@pramsey
Member

pramsey commented Nov 27, 2025

What would cause HAVE_OPEN_SIMD to be set though? Shouldn't there be an accompanying check in cmake or is HAVE_OPEN_SIMD just something intrinsic to some compilers?

@pramsey
Member

pramsey commented Nov 27, 2025

Reading on this simd directive, it sounds like in general compilers are already vectorizing pretty automatically. Does your code change (removing the pt0->pt1 assignment) without the simd directive end up vectorized anyways?

@gregbaker

> in general compilers are already vectorizing pretty automatically

This code can't be fully vectorized because the compiler is obliged to do the additions in the order specified to preserve any rounding error to be exactly what you asked for. Effectively it must implement (((l0+l1)+l2)+l3)+l4. The pragma gives it permission to treat the + as commutative and associative, allowing the automatic vectorization to happen.
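To make the rounding argument concrete, here is a small sketch that compares a strict left-to-right sum with a sum split across four independent partial accumulators, which is roughly the shape a 4-lane vectorized reduction takes. The function names are hypothetical and this is only an illustration of why the reordering is not a free transformation.

```cpp
#include <cstddef>
#include <vector>

// Strict left-to-right sum: (((v0 + v1) + v2) + v3) + ...
double sumInOrder(const std::vector<double>& v)
{
    double s = 0.0;
    for (double x : v) {
        s += x;
    }
    return s;
}

// Four independent partial sums combined at the end, similar in spirit
// to what a 4-lane SIMD reduction does.
double sumFourLanes(const std::vector<double>& v)
{
    double lane[4] = {0.0, 0.0, 0.0, 0.0};
    const std::size_t n = v.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        lane[0] += v[i];
        lane[1] += v[i + 1];
        lane[2] += v[i + 2];
        lane[3] += v[i + 3];
    }
    double s = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    for (; i < n; ++i) {
        s += v[i];  // leftover elements when n is not a multiple of 4
    }
    return s;
}
```

With IEEE 754 doubles and no fast-math flags, the input `{1e16, 1, 1, 1, -1e16, 1, 1, 1}` sums to 3 in order (each `+ 1` is lost to rounding next to `1e16`) but to 6 with four lanes, so the two orders are observably different and the compiler may not switch between them unless it is told it can.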

Removed SIMD pragma directives for length calculation.
@svenweb
Contributor Author

svenweb commented Nov 28, 2025

Hi @pramsey ,

> Reading on this simd directive, it sounds like in general compilers are already vectorizing pretty automatically. Does your code change (removing the pt0->pt1 assignment) without the simd directive end up vectorized anyways?

Yes, even without the simd directive or pragma, my code change lets the compiler vectorize the multiplication and square root in the line-length method, giving up to a 2x increase in speed and throughput.

Adding the HAVE_OPEN_SIMD directive and #pragma omp simd reduction(+:len) allows the compiler to also vectorize the addition, as @gregbaker described, which increases performance further. This Compiler Explorer example shows the addition being vectorized when the #pragma and the simd directive are set.

To realize performance gains without having to set HAVE_OPEN_SIMD, I have updated my commit and removed the HAVE_OPEN_SIMD directive and the #pragma. The only change now is removing the loop-carried dependency of the pt0 -> pt1 assignment, which allows the compiler to auto-vectorize most of the loop.
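As a rough illustration of the kind of change described here, the sketch below contrasts a loop that carries the previous point forward in local variables with one that reads both endpoints by index. It is not the actual Length.cpp code: `PointXY`, `lengthCarried` and `lengthIndexed` are hypothetical names, and whether a given compiler vectorizes either form depends on the compiler and flags.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2D coordinate; not the GEOS CoordinateXY class.
struct PointXY {
    double x;
    double y;
};

// Carried form: each iteration depends on x0/y0 written by the previous one.
double lengthCarried(const std::vector<PointXY>& pts)
{
    if (pts.size() < 2) {
        return 0.0;
    }
    double len = 0.0;
    double x0 = pts[0].x;
    double y0 = pts[0].y;
    for (std::size_t i = 1; i < pts.size(); ++i) {
        const double x1 = pts[i].x;
        const double y1 = pts[i].y;
        const double dx = x1 - x0;
        const double dy = y1 - y0;
        len += std::sqrt(dx * dx + dy * dy);
        x0 = x1;  // previous point handed to the next iteration
        y0 = y1;
    }
    return len;
}

// Indexed form: both endpoints are read from the array, so the only
// value carried between iterations is the running sum len.
double lengthIndexed(const std::vector<PointXY>& pts)
{
    if (pts.size() < 2) {
        return 0.0;
    }
    double len = 0.0;
    for (std::size_t i = 1; i < pts.size(); ++i) {
        const double dx = pts[i].x - pts[i - 1].x;
        const double dy = pts[i].y - pts[i - 1].y;
        len += std::sqrt(dx * dx + dy * dy);
    }
    return len;
}
```

Both functions add the same segment lengths in the same order, so they return identical results; the difference is only in how much of each iteration is independent of the one before it. The accumulation into `len` remains sequential in both, which matches the point above that the addition itself is not reordered without the pragma.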

Thank you!
Sven

@pramsey
Member

pramsey commented Nov 28, 2025

I'm fine w/ this in principle. Can you explain to me how HAVE_OPEN_SIMD would get set? By the compiler? Does CMake need any special detection?

@svenweb
Contributor Author

svenweb commented Dec 1, 2025

HAVE_OPEN_SIMD would have to be set by a dedicated detection step in CMake. My experience with CMake is limited, but I think the logic would look something like this:

  • Check if compiler/toolchain supports basic OpenMP with FindOpenMP
  • If supported, set CMake OpenMP compiler flags

However, detecting support for #pragma omp simd specifically could require more checks, as different compilers support the SIMD subset of OpenMP unevenly; see, for example, "clang has limited support for vectorization".

Should I open a separate issue and keep looking into adding a reliable SIMD capability detection step to the CMake build?

Thanks!

@pramsey
Member

pramsey commented Dec 1, 2025

Yes, I'll merge this and you can research compiler feature detection.

@pramsey pramsey merged commit c2a1d40 into libgeos:main Dec 1, 2025
27 checks passed

3 participants