wheeheee
Member

@wheeheee wheeheee commented Sep 18, 2025

Benchmarks show that the previous hack for larger N (19 <= N <= 66), which was effective from Julia 1.9 and 1.10 onwards, stops working in 1.12. It appears that this can be salvaged by moving the stores around. Doing so mostly restores the previous level of performance for stateless filts of vectors, but the stateful and array versions may still be slightly slower in 1.12.

Also reduces SMALL_FILT_VECT_CUTOFF from 19 to 18.
Benchmarks included below:

julia> using DSP, BenchmarkTools

julia> out = zeros(10_000);

julia> @benchmark filt!($out, b, a, x) setup=(x = rand(10_000); b = rand(15); a = 1)
### PR ###

BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  12.600 μs … 105.800 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     12.800 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   13.801 μs ±   3.269 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▄ ▂▆▂  ▁       ▁▁▁▁                                         ▁
  ██▆████▆█▇▆▇▆▆▆▇█████▇█▇▇▆▆▆▄▄▅▄▅▃▃▆▇█▆▆▆▇▆▇▆▅▆▅▆▅▁▄▄▁▃▄▁▆▁▆ █
  12.6 μs       Histogram: log(frequency) by time      28.9 μs <

 Memory estimate: 176 bytes, allocs estimate: 2.

### master ###

 Range (min … max):  12.500 μs … 103.800 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     13.000 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   13.971 μs ±   3.395 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▇▃  █▄▁  ▂          ▂▁▁   ▁▁                                ▂
  ███▆▄███▇▇██▅▆▆▅▆▅▆▇▇███▇▇▇███▇▇▆▆▆▅▅▄▅▃▁▄▅▇▅▅▆▆█▇▅▅▄▆█▇██▇▆ █
  12.5 μs       Histogram: log(frequency) by time      25.6 μs <

 Memory estimate: 176 bytes, allocs estimate: 2.

julia> @benchmark filt!($out, b, a, x) setup=(x = rand(10_000); b = rand(30); a = 1)
### PR ###

BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  17.200 μs … 130.300 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     17.400 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   18.544 μs ±   3.109 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ██▂     ▆▅ ▁  ▃▅        ▃             ▁  ▁                   ▂
  ███▆▆▆▁▅█████▇██▇▇▆▅▄▅▆▅██▅▇█▇▇██▇██▇███▇██▇█▇▆▆▆▆▇▇▇▅▄▄▆▄▆▇ █
  17.2 μs       Histogram: log(frequency) by time      27.4 μs <

 Memory estimate: 288 bytes, allocs estimate: 2.

### master ###

 Range (min … max):  40.100 μs … 140.900 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     40.400 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   42.599 μs ±   5.875 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █  ▁ ▆▅▃▁▁  ▃▁                                               ▁
  ██▅███████▇████▇▆▆▅▆▆▅▅▅▅▄▄▁▅▅▄▄▁▁▃▁▁▃▁▁▁▅▃▁▅▄▃▃▃▁▄▁▁▄▄▃▁▄██ █
  40.1 μs       Histogram: log(frequency) by time      75.2 μs <

 Memory estimate: 288 bytes, allocs estimate: 2.

julia> @benchmark filt!($out, b, a, x) setup=(x = rand(10_000); b = rand(50); a = 1)
### PR ###

BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  26.200 μs … 147.000 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     26.300 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   27.524 μs ±   4.005 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▁      ▄▅     ▂▂▃▁                                          ▁
  ████▄▅▆▆███▇█▇██████▇▇▅▇▆▇██▇▇▅▅▆▄▅▅▄▄▄▄▃▃▃▄▄▄▄▁▃▃▄▁█▃▁▃▄▄▇▆ █
  26.2 μs       Histogram: log(frequency) by time      40.2 μs <

 Memory estimate: 480 bytes, allocs estimate: 2.

### master ###

 Range (min … max):  94.300 μs … 324.500 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     95.200 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   98.593 μs ±  10.699 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▃▁▁▇▃▁▁▁▄▁  ▃▁                                              ▁
  ████████████▇███▇▆▇▆▆▅▅▇▆▅▃▃▅▅▄▄▃▃▄▂▄▃▃▃▄▄▇▅▄▄▃▃▆▄▄▄▆▄▃▄▂▄▅▅ █
  94.3 μs       Histogram: log(frequency) by time       145 μs <

 Memory estimate: 480 bytes, allocs estimate: 2.

julia> @benchmark filt!($out, b, a, x) setup=(x = rand(10_000); b = rand(66); a = 1)
### PR ###

BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  26.300 μs … 198.400 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     26.800 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   28.160 μs ±   4.524 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇█▆▂▃▂▂▃▁▅▆▃▁▁ ▂▂▃▃▂▁       ▁                                ▂
  █████████████████████▇██▇▆████▆▇▅▇▅▆▅▂▄▅▃▅▄▄▅▄▄▅▄▄▄▆▅▅▃▂▂▄▄▃ █
  26.3 μs       Histogram: log(frequency) by time      40.6 μs <

 Memory estimate: 576 bytes, allocs estimate: 2.

### master ###

 Range (min … max):  157.500 μs …  2.324 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     164.300 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   168.404 μs ± 27.859 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▄█▆█▅▄▂▂▂▄▂▄▂▁▁                                             ▂
  ██████████████████▆▆▆▆▆▇▆▅▆▆▄▆▃▄▆▅▄▄▄▄▄▄▄▁▁▄▄▅▆▆▆██▆▆▆▅▇▆▆█▆ █
  158 μs        Histogram: log(frequency) by time       254 μs <

 Memory estimate: 576 bytes, allocs estimate: 2.
julia> versioninfo()
Julia Version 1.12.0-rc2
Commit 72cbf019d0 (2025-09-06 12:00 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 8 × 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
  WORD_SIZE: 64
  LLVM: libLLVM-18.1.7 (ORCJIT, tigerlake)
  GC: Built with stock GC
Threads: 8 default, 1 interactive, 8 GC (on 8 virtual cores)
Environment:
  JULIA_CONDAPKG_BACKEND = Null
  JULIA_DEPOT_PATH = Q:\.julia;
  JULIA_NUM_THREADS = auto
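
The benchmarks above all call filt! with a = 1, i.e. a pure FIR filter, so each output sample is a dot product of b with the most recent inputs. For anyone who wants to sanity-check outputs while rearranging the stores, here is a minimal reference sketch in plain Julia. The name `fir_reference` is hypothetical and is not part of DSP.jl; it assumes 1-based vectors and zero initial state, and it makes no attempt to be fast.

```julia
# Hypothetical helper, not part of DSP.jl: naive direct-form FIR filter.
# Computes y[n] = sum over k of b[k] * x[n - k + 1], treating x as zero
# before its first sample (zero initial state).
# Assumes b and x are ordinary 1-based vectors.
function fir_reference(b::AbstractVector, x::AbstractVector)
    T = promote_type(eltype(b), eltype(x))
    y = zeros(T, length(x))
    for n in 1:length(x)
        acc = zero(T)
        # Only taps that land on existing input samples contribute.
        for k in 1:min(n, length(b))
            acc += b[k] * x[n - k + 1]
        end
        y[n] = acc
    end
    return y
end
```

With DSP loaded, comparing `filt(b, 1, x)` against `fir_reference(b, x)` using `≈` for b of length 15, 30, 50 and 66 should agree up to floating-point rounding, since the summation order may differ.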

I suppose we should bump the version for this patch?

codecov bot commented Sep 18, 2025

Codecov Report

❌ Patch coverage is 95.45455% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 98.11%. Comparing base (7fc3842) to head (82b34c7).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/Filters/filt.jl | 93.75% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #642      +/-   ##
==========================================
- Coverage   98.13%   98.11%   -0.03%     
==========================================
  Files          19       19              
  Lines        3277     3289      +12     
==========================================
+ Hits         3216     3227      +11     
- Misses         61       62       +1     

@wheeheee wheeheee changed the title (voodoo) Patch 1.12 performance for _small_filt_fir! (voodoo) Patch _small_filt_fir! performance for 1.12 Sep 18, 2025
@wheeheee
Member Author

These commits should fix the StoreSI version. The check uses VERSION > v"1.12-" so that it also applies to 1.12 release candidates, which makes testing easier.
So far, it seems to work fine with Julia 1.10.
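
As a quick illustration of why the comparison is written against v"1.12-" rather than v"1.12", the sketch below shows how Julia orders prerelease version numbers. The constant name `USE_REORDERED_STORES` is hypothetical and only stands in for whatever branch the PR actually takes.

```julia
# v"1.12-" is the lowest possible prerelease of 1.12.0, so it sorts
# below every 1.12 DEV build, release candidate and final release.
# Hypothetical flag name; the real code may structure this differently.
const USE_REORDERED_STORES = VERSION > v"1.12-"

# Release candidates are covered by the prerelease lower bound ...
@assert v"1.12.0-rc2" > v"1.12-"
# ... but would be missed by a comparison against the final release.
@assert !(v"1.12.0-rc2" >= v"1.12.0")

# The final release and later versions take the new path as well.
@assert v"1.12.0" > v"1.12-"
@assert v"1.13.0" > v"1.12-"

# Older versions such as 1.10 and 1.11 keep the previous code path.
@assert !(v"1.10.0" > v"1.12-")
@assert !(v"1.11.6" > v"1.12-")
```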