Increase benchmark sampling 5x for CI stability #187

Merged
NullVoxPopuli merged 10 commits into ember-tooling:main from NullVoxPopuli-ai-agent:increase-benchmark-sampling
Mar 20, 2026

Conversation

@NullVoxPopuli-ai-agent

Summary

  • Adds a Node.js module loader hook (tests/bench-sampler.mjs + tests/bench-sampler-hooks.mjs) that intercepts mitata's lib.mjs at load time and multiplies k_min_cpu_time and k_min_samples by 5
  • No node_modules files are modified on disk; the hook rewrites the source in memory during module loading
  • Both pnpm bench and pnpm bench:compare (used by CI) now use the higher sampling
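
As a rough illustration of the approach in the bullets above, a `load` hook can pass every module through untouched except the one whose constants need scaling. This is a minimal sketch, not the contents of tests/bench-sampler-hooks.mjs: the helper name `scaleConstants`, the regular expression, and the assumption that the constants are declared as single-line `const NAME = <expr>;` statements are all illustrative.

```javascript
// Hypothetical helper: multiply named constants inside a module's source text.
// Assumes each constant is declared on one line as `const NAME = <expr>;`.
function scaleConstants(source, names, factor) {
  let out = source;
  for (const name of names) {
    const decl = new RegExp(`(\\bconst\\s+${name}\\s*=\\s*)([^;\\n]+);`);
    out = out.replace(
      decl,
      (_match, head, expr) => `${head}(${expr.trim()}) * ${factor};`
    );
  }
  return out;
}

// Shape of a Node.js `load` hook. Only the targeted file is rewritten, and
// only in memory: nothing is written back to node_modules.
async function load(url, context, nextLoad) {
  const result = await nextLoad(url, context);
  if (!url.endsWith('/mitata/src/lib.mjs') || result.source == null) {
    return result;
  }
  const text =
    typeof result.source === 'string'
      ? result.source
      : new TextDecoder().decode(result.source);
  return {
    ...result,
    source: scaleConstants(text, ['k_min_cpu_time', 'k_min_samples'], 5),
  };
}
```

In a real hooks module `load` would be exported, and a small entry file passed to `node --import` would install it by calling `register()` from `node:module`.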

Test plan

  • Verified pnpm bench runs successfully with the loader hook
  • Verified node_modules/mitata/src/lib.mjs is unchanged after running
  • CI benchmark comparison workflow runs with increased sampling

🤖 Generated with Claude Code

Uses a Node.js module loader hook to intercept mitata's lib.mjs at load
time and increase k_min_cpu_time and k_min_samples by 5x, without
modifying node_modules on disk.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@NullVoxPopuli
Member

🏎️ Benchmark Comparison

   Benchmark    Control (avg)   Experiment (avg)       Δ
   ──────────   ─────────────   ────────────────   ─────
🟢 gts small          1.41 ms            1.32 ms   -6.7%
🟢 gts medium         5.36 ms            5.25 ms   -2.2%
   gts large         11.90 ms           12.08 ms   +1.5%
🟢 gjs small          1.04 ms            1.01 ms   -2.4%
🟠 gjs medium         4.10 ms            4.19 ms   +2.4%
   gjs large          9.66 ms            9.74 ms   +0.8%
   hbs small        147.14 µs          145.07 µs   -1.4%
   hbs medium         1.98 ms            1.99 ms   +0.4%
   hbs large         10.33 ms           10.29 ms   -0.4%

🟢 faster · 🔴 slower · 🟠 slightly slower · ⚪ within 2%

Full mitata output
```
clk: ~2.57 GHz
cpu: AMD EPYC 7763 64-Core Processor
runtime: node 24.14.0 (x64-linux)

benchmark                   avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------- -------------------------------
gts small (control)            1.41 ms/iter   1.08 ms
                     (822.43 µs … 30.25 ms)   8.16 ms
                        (4.08 kb … 7.36 mb) 531.44 kb

gts small (experiment)         1.32 ms/iter   1.03 ms
                     (784.25 µs … 30.18 ms)   6.40 ms
                       (28.20 kb … 7.09 mb) 526.11 kb

summary
  gts small (experiment)
   1.07x faster than gts small (control)

------------------------------------------- -------------------------------
gts medium (control)           5.36 ms/iter   5.43 ms
                       (3.12 ms … 34.83 ms)  28.86 ms
                      (29.81 kb … 14.75 mb)   2.31 mb

gts medium (experiment)        5.25 ms/iter   5.10 ms
                       (3.05 ms … 34.24 ms)  29.02 ms
                     (229.98 kb … 13.21 mb)   2.30 mb

[…]
                       (1.21 ms … 22.96 ms)  17.77 ms
                       (41.61 kb … 4.15 mb)   1.79 mb

hbs medium (experiment)        1.99 ms/iter   1.37 ms
                       (1.24 ms … 23.94 ms)  17.81 ms
                      (560.34 kb … 3.40 mb)   1.79 mb

summary
  hbs medium (control)
   1x faster than hbs medium (experiment)

------------------------------------------- -------------------------------
hbs large (control)           10.33 ms/iter   9.51 ms
                       (7.15 ms … 33.60 ms)  25.69 ms
                        (6.51 mb … 9.94 mb)   8.04 mb

hbs large (experiment)        10.29 ms/iter   9.65 ms
                       (7.22 ms … 33.76 ms)  24.40 ms
                       (6.81 mb … 10.48 mb)   8.04 mb

summary
  hbs large (experiment)
   1x faster than hbs large (control)
```

- Enable .gc('inner') on all bench calls — forces GC between iterations
  and subtracts GC time from measurements, eliminating GC-spike noise
- Switch PR comment summary from avg to p50 (median), matching the CLI
  formatter — median is far more robust to outliers from CPU scaling
  and system noise on shared CI runners

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
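
To make the avg-versus-p50 point in the commit message above concrete, here is a small self-contained sketch. The sample values are invented for illustration (five iterations around 1 ms plus one 30 ms spike); they are not taken from any run in this thread.

```javascript
// Arithmetic mean: every outlier pulls it upward.
function mean(xs) {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Median (p50): the middle of the sorted samples, unaffected by how
// extreme a single outlier is.
function median(xs) {
  const sorted = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Invented samples in milliseconds; the last one mimics a GC or scheduler spike.
const samplesMs = [1.0, 1.1, 1.0, 0.9, 1.0, 30.0];

console.log('mean  :', mean(samplesMs).toFixed(2), 'ms'); // dominated by the spike
console.log('median:', median(samplesMs).toFixed(2), 'ms'); // stays near 1 ms
```

With these numbers the mean lands near 5.8 ms while the median stays at 1 ms, which is why a single noisy iteration can move an avg-based Δ by several percent without moving a p50-based one.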
@NullVoxPopuli
Member

grr

🏎️ Benchmark Comparison

   Benchmark    Control (p50)   Experiment (p50)       Δ
   ──────────   ─────────────   ────────────────   ─────
🟢 gts small          9.66 ms            9.45 ms   -2.2%
🟢 gts medium        22.60 ms           20.61 ms   -8.8%
🟢 gts large         33.97 ms           31.62 ms   -6.9%
🔴 gjs small          8.43 ms            8.94 ms   +6.1%
🟢 gjs medium        16.39 ms           15.61 ms   -4.8%
   gjs large         26.45 ms           26.38 ms   -0.3%
   hbs small        704.81 µs          691.00 µs   -2.0%
🟢 hbs medium        11.56 ms           11.14 ms   -3.6%
   hbs large         27.71 ms           28.16 ms   +1.6%

🟢 faster · 🔴 slower · 🟠 slightly slower · ⚪ within 2%

Full mitata output
```
clk: ~3.37 GHz
cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
runtime: node 24.14.0 (x64-linux)

benchmark                   avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------- -------------------------------
gts small (control)            9.86 ms/iter  12.97 ms
                       (2.64 ms … 18.31 ms)  17.80 ms
                    gc(44.94 ms … 70.10 ms) 875.01 kb (475.12 kb … 1.60 mb)

gts small (experiment)         9.06 ms/iter  12.61 ms
                       (2.66 ms … 16.79 ms)  16.76 ms
                    gc(44.69 ms … 71.25 ms) 837.25 kb (218.52 kb … 1.38 mb)

summary
  gts small (experiment)
   1.09x faster than gts small (control)

------------------------------------------- -------------------------------
gts medium (control)          22.23 ms/iter  25.12 ms
                      (14.08 ms … 30.00 ms)  29.67 ms
                    gc(45.45 ms … 57.61 ms)   2.84 mb (1.99 mb … 4.52 mb)

gts medium (experiment)       20.25 ms/iter  22.63 ms
                      (13.00 ms … 26.97 ms)  26.17 ms
                    gc(45.68 ms … 59.56 ms)   2.65 mb (2.19 mb … 4.65 mb)

[…]
                       (4.91 ms … 14.53 ms)  14.46 ms
                    gc(41.13 ms … 56.76 ms)   2.08 mb (1.65 mb … 3.54 mb)

hbs medium (experiment)       11.19 ms/iter  13.20 ms
                       (5.01 ms … 16.51 ms)  15.35 ms
                    gc(41.54 ms … 56.76 ms)   2.03 mb (1.98 mb … 2.66 mb)

summary
  hbs medium (experiment)
   1x faster than hbs medium (control)

------------------------------------------- -------------------------------
hbs large (control)           28.04 ms/iter  30.06 ms
                      (19.87 ms … 36.19 ms)  35.29 ms
                    gc(44.98 ms … 59.76 ms)   9.03 mb (8.23 mb … 9.93 mb)

hbs large (experiment)        28.30 ms/iter  30.23 ms
                      (21.52 ms … 36.01 ms)  35.96 ms
                    gc(43.63 ms … 59.30 ms)   9.02 mb (8.62 mb … 10.06 mb)

summary
  hbs large (control)
   1.01x faster than hbs large (experiment)
```

Instead of patching mitata internals via ESM loader hooks, run each
parse call in a loop (1000/200/100 for small/medium/large) so that
individual GC spikes get diluted within each sample. This is simpler
and more effective than increasing mitata's sampling constants.

Also reverts .gc('inner'), which was counterproductive: the 44-71 ms
GC run between iterations trashed CPU caches and increased variance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
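
The looping idea in the commit message above can be sketched as follows. This is an illustration only: `batched` and `spikeInflation` are hypothetical names, and the 1 ms per-call cost and 50 ms spike are assumed round numbers chosen to sit inside the 44-71 ms GC range mentioned above.

```javascript
// Wrap a function so that one benchmark sample runs it `iterations` times.
// The last result is returned so the work is observable to the caller.
function batched(fn, iterations) {
  return () => {
    let last;
    for (let i = 0; i < iterations; i++) {
      last = fn();
    }
    return last;
  };
}

// Relative inflation of a single sample when one spike of `spikeMs` lands
// in a sample made of `iterations` calls costing `perCallMs` each.
function spikeInflation(perCallMs, spikeMs, iterations) {
  return spikeMs / (perCallMs * iterations);
}

// One call per sample: a 50 ms spike makes that sample 50x (5000%) longer.
console.log(spikeInflation(1, 50, 1));
// 1000 calls per sample: the same spike adds only 5% to that sample.
console.log(spikeInflation(1, 50, 1000));
```

The trade-off is visible in the later tables: each reported time becomes the cost of the whole batch (hundreds of milliseconds to seconds), so absolute numbers are no longer per-parse timings, only the Δ column remains directly comparable.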
@NullVoxPopuli
Member

Much closer:

🏎️ Benchmark Comparison

   Benchmark    Control (p50)   Experiment (p50)       Δ
   ──────────   ─────────────   ────────────────   ─────
🟢 gts small       1187.34 ms         1146.50 ms   -3.4%
   gts medium       881.35 ms          864.56 ms   -1.9%
   gts large        987.79 ms          983.24 ms   -0.5%
   gjs small        951.11 ms          957.78 ms   +0.7%
   gjs medium       739.28 ms          746.75 ms   +1.0%
   gjs large        838.20 ms          832.56 ms   -0.7%
   hbs small        132.06 ms          131.35 ms   -0.5%
   hbs medium       393.34 ms          394.06 ms   +0.2%
   hbs large       1026.01 ms         1034.63 ms   +0.8%

🟢 faster · 🔴 slower · 🟠 slightly slower · ⚪ within 2%

Full mitata output
clk: ~3.31 GHz
cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
runtime: node 24.14.0 (x64-linux)

benchmark                   avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------- -------------------------------
gts small (control)             1.20 s/iter    1.27 s       █     █        
                          (1.01 s … 1.55 s)    1.31 s ▅▅    █ ▅   █ ▅  ▅ ▅▅
                    (147.56 mb … 179.38 mb) 156.85 mb ██▁▁▁▁█▁█▁▁▁█▁█▁▁█▁██

gts small (experiment)          1.19 s/iter    1.21 s       █   █          
                          (1.01 s … 1.53 s)    1.29 s ▅     █ ▅▅█ ▅ ▅   ▅ ▅
                    (146.79 mb … 165.75 mb) 153.08 mb █▁▁▁▁▁█▁███▁█▁█▁▁▁█▁█

                             ┌                                            ┐
                             ╷             ┌──────────────┬─────────┐     ╷
         gts small (control) ├─────────────┤              │         ├─────┤
                             ╵             └──────────────┴─────────┘     ╵
                             ╷            ┌─────────────┬───┐           ╷
      gts small (experiment) ├────────────┤             │   ├───────────┤
                             ╵            └─────────────┴───┘           ╵
                             └                                            ┘
                             1.01 s              1.16 s              1.31 s

summary
  gts small (experiment)
   1.01x faster than gts small (control)
                    (144.42 mb … 161.46 mb) 148.15 mb ██▁██▁▁▁▁▁▁█▁█▁▁██▁▁█

                             ┌                                            ┐
                              ╷     ┌───────────────────┬───────┐         ╷
         gjs large (control)  ├─────┤                   │       ├─────────┤
                              ╵     └───────────────────┴───────┘         ╵
                             ╷┌───────────────────────┬────────┐         ╷
      gjs large (experiment) ├┤                       │        ├─────────┤
                             ╵└───────────────────────┴────────┘         ╵
                             └                                            ┘
                             801.16 ms         888.73 ms          976.30 ms

summary
  gjs large (experiment)
   1.01x faster than gjs large (control)

------------------------------------------- -------------------------------
hbs small (control)          143.33 ms/iter 143.23 ms            █         
                    (112.77 ms … 264.59 ms) 146.65 ms ▅ ▅   ▅ ▅  █   ▅▅ ▅▅▅
                    ( 33.71 mb …  37.16 mb)  36.12 mb █▁█▁▁▁█▁█▁▁█▁▁▁██▁███

hbs small (experiment)       141.62 ms/iter 141.42 ms                █    █
                    (112.87 ms … 262.69 ms) 146.13 ms ▅▅  ▅ ▅ ▅  ▅   █ ▅  █
                    ( 33.87 mb …  36.77 mb)  36.04 mb ██▁▁█▁█▁█▁▁█▁▁▁█▁█▁▁█

                             ┌                                            ┐
                             ╷            ┌───────────────────────────┬   ╷
         hbs small (control) ├────────────┤                           │───┤
                             ╵            └───────────────────────────┴   ╵
                             ╷        ┌────────────────────────────┬     ╷
      hbs small (experiment) ├────────┤                            │─────┤
                             ╵        └────────────────────────────┴     ╵
                             └                                            ┘
                             112.77 ms         129.71 ms          146.65 ms

summary
  hbs small (experiment)
   1.01x faster than hbs small (control)

------------------------------------------- -------------------------------
hbs medium (control)         410.81 ms/iter 418.17 ms  ██    █             
                    (369.92 ms … 558.33 ms) 452.19 ms ▅██   ▅█    ▅    ▅  ▅
                    ( 48.94 mb …  95.57 mb)  84.41 mb ███▁▁▁██▁▁▁▁█▁▁▁▁█▁▁█

hbs medium (experiment)      412.74 ms/iter 413.22 ms █  █                 
                    (370.90 ms … 559.78 ms) 456.46 ms █▅ █ ▅ ▅▅ ▅      ▅  ▅
                    ( 48.20 mb …  95.20 mb)  84.18 mb ██▁█▁█▁██▁█▁▁▁▁▁▁█▁▁█

                             ┌                                            ┐
                             ╷ ┌──────────────────┬───┐                 ╷
        hbs medium (control) ├─┤                  │   ├─────────────────┤
                             ╵ └──────────────────┴───┘                 ╵
                              ╷  ┌─────────────────┬┐                     ╷
     hbs medium (experiment)  ├──┤                 │├─────────────────────┤
                              ╵  └─────────────────┴┘                     ╵
                             └                                            ┘
                             369.92 ms         413.19 ms          456.46 ms

summary
  hbs medium (control)
   1x faster than hbs medium (experiment)

------------------------------------------- -------------------------------
hbs large (control)             1.03 s/iter    1.04 s  █                   
                       (974.15 ms … 1.19 s)    1.09 s  █         █         
                    (159.70 mb … 213.33 mb) 172.54 mb ██▁█▁▁▁▁▁███▁▁▁▁▁▁█▁█

hbs large (experiment)          1.04 s/iter    1.05 s                █     
                       (972.18 ms … 1.20 s)    1.07 s ▅▅▅ ▅      ▅▅ ▅█▅   ▅
                    (159.19 mb … 212.75 mb) 172.15 mb ███▁█▁▁▁▁▁▁██▁███▁▁▁█

                             ┌                                            ┐
                              ╷ ┌───────────────────┬┐                    ╷
         hbs large (control)  ├─┤                   │├────────────────────┤
                              ╵ └───────────────────┴┘                    ╵
                             ╷   ┌────────────────────┬────┐        ╷
      hbs large (experiment) ├───┤                    │    ├────────┤
                             ╵   └────────────────────┴────┘        ╵
                             └                                            ┘
                             972.18 ms            1.03 s             1.09 s

summary
  hbs large (control)
   1.01x faster than hbs large (experiment)

Small: 1000→2000, medium: 200→500. The previous run showed gts small
still drifting at -3.4% while medium/large were within 2%. More
iterations per sample further dilute GC and scheduling noise.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@NullVoxPopuli
Member

sus

🏎️ Benchmark Comparison

   Benchmark    Control (p50)   Experiment (p50)       Δ
   ──────────   ─────────────   ────────────────   ─────
🟢 gts small       2132.91 ms         2039.75 ms   -4.4%
🔴 gts medium      2212.93 ms         2370.23 ms   +7.1%
   gts large       1095.88 ms         1110.33 ms   +1.3%
   gjs small       1835.45 ms         1822.92 ms   -0.7%
🟢 gjs medium      1997.86 ms         1935.61 ms   -3.1%
   gjs large        904.84 ms          905.85 ms   +0.1%
   hbs small        281.36 ms          279.46 ms   -0.7%
   hbs medium       996.69 ms          988.20 ms   -0.9%
🟢 hbs large       1052.77 ms         1030.81 ms   -2.1%

🟢 faster · 🔴 slower · 🟠 slightly slower · ⚪ within 2%

Full mitata output
```
clk: ~3.10 GHz
cpu: AMD EPYC 7763 64-Core Processor
runtime: node 24.14.0 (x64-linux)

benchmark                   avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------- -------------------------------
gts small (control)             2.22 s/iter    2.29 s
                          (1.95 s … 2.73 s)    2.40 s
                    (286.61 mb … 326.65 mb) 309.96 mb

gts small (experiment)          2.15 s/iter    2.23 s
                          (1.88 s … 2.68 s)    2.31 s
                    (295.43 mb … 323.14 mb) 312.31 mb

summary
  gts small (experiment)
   1.03x faster than gts small (control)

------------------------------------------- -------------------------------
gts medium (control)            2.28 s/iter    2.33 s
                          (2.07 s … 2.80 s)    2.39 s
                    (344.27 mb … 364.06 mb) 354.16 mb

gts medium (experiment)         2.48 s/iter    2.40 s
                          (2.22 s … 2.99 s)    2.91 s
                    (244.74 mb … 359.36 mb) 320.67 mb

[…]
                       (935.11 ms … 1.23 s)    1.06 s
                    (181.07 mb … 234.72 mb) 193.72 mb

hbs medium (experiment)         1.01 s/iter    1.04 s
                       (936.12 ms … 1.18 s)    1.05 s
                    (187.74 mb … 232.52 mb) 205.45 mb

summary
  hbs medium (control)
   1x faster than hbs medium (experiment)

------------------------------------------- -------------------------------
hbs large (control)             1.07 s/iter    1.09 s
                       (991.87 ms … 1.25 s)    1.15 s
                    (166.53 mb … 214.17 mb) 173.51 mb

hbs large (experiment)          1.06 s/iter    1.09 s
                       (982.06 ms … 1.24 s)    1.15 s
                    (165.13 mb … 215.05 mb) 172.79 mb

summary
  hbs large (experiment)
   1.01x faster than hbs large (control)
```

@NullVoxPopuli-ai-agent
Author

Locally I get this:

clk: ~4.34 GHz
cpu: AMD Ryzen 5 7640U w/ Radeon 760M Graphics
runtime: node 24.14.0 (x64-linux)

benchmark                   avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------- -------------------------------
gts small                       1.62 s/iter    1.64 s          █           
                          (1.49 s … 1.89 s)    1.70 s          █    █      
                    ( 19.75 mb … 334.70 mb) 226.65 mb █▁▁▁▁▁▁▁█████▁█▁▁▁▁▁█

gts medium                      1.61 s/iter    1.61 s      █               
                          (1.52 s … 1.82 s)    1.69 s █    █               
                    ( 17.02 mb … 334.09 mb) 115.52 mb █▁▁▁▁█▁▁████▁▁█▁▁▁▁▁█

gts large                    798.60 ms/iter 824.84 ms █                    
                    (746.98 ms … 863.79 ms) 844.06 ms █  ▅▅   ▅ ▅▅▅   ▅  ▅▅
                    (131.50 mb … 186.60 mb) 165.12 mb █▁▁██▁▁▁█▁███▁▁▁█▁▁██

gjs small                       1.37 s/iter    1.37 s        █             
                          (1.29 s … 1.48 s)    1.44 s        █  █          
                    (263.99 mb … 318.90 mb) 295.27 mb █▁▁▁▁▁███▁██▁▁█▁▁▁▁▁█

gjs medium                      1.40 s/iter    1.42 s         █            
                          (1.32 s … 1.55 s)    1.44 s ▅ ▅     █ ▅ ▅ ▅▅▅▅  ▅
                    (344.20 kb … 297.89 mb) 192.04 mb █▁█▁▁▁▁▁█▁█▁█▁████▁▁█

gjs large                    667.91 ms/iter 681.34 ms █          █         
                    (628.96 ms … 707.84 ms) 698.22 ms █    ▅  ▅  █▅ ▅▅▅   ▅
                    ( 95.76 mb … 159.65 mb) 141.23 mb █▁▁▁▁█▁▁█▁▁██▁███▁▁▁█

hbs small                    150.73 ms/iter 153.12 ms █ █    ██    █████ ██
                    (133.75 ms … 188.95 ms) 156.54 ms █ █    ██    █████ ██
                    ( 29.89 mb …  73.36 mb)  66.04 mb █▁█▁▁▁▁██▁▁▁▁█████▁██

hbs medium                   609.58 ms/iter 614.75 ms █     █              
                    (586.76 ms … 643.40 ms) 639.88 ms █▅   ▅█  ▅▅▅  ▅     ▅
                    (185.75 mb … 235.76 mb) 204.97 mb ██▁▁▁██▁▁███▁▁█▁▁▁▁▁█

hbs large                    693.60 ms/iter 698.59 ms    ██                
                    (657.53 ms … 760.45 ms) 757.98 ms ▅▅ ██ ▅▅▅ ▅         ▅
                    (164.19 mb … 171.37 mb) 165.86 mb ██▁██▁███▁█▁▁▁▁▁▁▁▁▁█

@NullVoxPopuli-ai-agent
Author

Locally (with pnpm bench:compare against main):

   Benchmark    Control (p50)   Experiment (p50)       Δ
   ──────────   ─────────────   ────────────────   ─────
🟠 gts small       1780.45 ms         1828.19 ms   +2.7%
🟢 gts medium      2008.25 ms         1967.50 ms   -2.0%
⚪ gts large        909.93 ms          927.65 ms   +1.9%
🟢 gjs small       1630.47 ms         1586.89 ms   -2.7%
⚪ gjs medium      1654.66 ms         1674.45 ms   +1.2%
⚪ gjs large        773.01 ms          782.74 ms   +1.3%
⚪ hbs small        218.63 ms          221.51 ms   +1.3%
⚪ hbs medium       816.90 ms          814.66 ms   -0.3%
🟢 hbs large        857.20 ms          798.76 ms   -6.8%

🟢 faster · 🔴 slower · 🟠 slightly slower · ⚪ within 2%

- New run-bench.sh wrapper: applies taskset -c 0 (CPU pinning) and
  nice -n -20 (high priority, when running as root) for pnpm bench
- bench-compare.mjs: also uses nice -n -20 when running as root
- local-bench-summary.sh: offers to apply CPU governor and boost
  fixes interactively instead of just warning

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
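
For readers unfamiliar with the two tools named in the commit message above, the following sketch shows how such a command line might be assembled from Node.js. It is not the contents of run-bench.sh or bench-compare.mjs: `benchCommand` and the `tests/bench.mjs` path are hypothetical. `taskset` comes from util-linux, so this only applies on Linux, and a negative niceness such as `-20` is normally only granted to root.

```javascript
// Build an argv array that optionally pins the process to CPU 0 and
// optionally raises its scheduling priority before launching node.
function benchCommand(script, { pinCpu = true, highPriority = false } = {}) {
  const argv = [];
  if (pinCpu) argv.push('taskset', '-c', '0'); // restrict to one core
  if (highPriority) argv.push('nice', '-n', '-20'); // highest priority
  argv.push('node', script);
  return argv;
}

// Only request high priority when running as root (uid 0); getuid is
// not available on every platform, hence the guard.
const isRoot = typeof process.getuid === 'function' && process.getuid() === 0;

// 'tests/bench.mjs' is a placeholder path for illustration.
console.log(benchCommand('tests/bench.mjs', { highPriority: isRoot }).join(' '));
```

The resulting array can be handed to `child_process.spawn(argv[0], argv.slice(1), { stdio: 'inherit' })` to run the benchmark under those constraints.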
@NullVoxPopuli-ai-agent
Author

CI results (fork PR so it logged instead of commenting):

   Benchmark    Control (p50)   Experiment (p50)       Δ
   ──────────   ─────────────   ────────────────   ─────
🟢 gts small       2170.87 ms         2011.74 ms   -7.3%
   gts medium      2189.03 ms         2149.99 ms   -1.8%
   gts large       1038.21 ms         1032.35 ms   -0.6%
   gjs small       1750.23 ms         1761.16 ms   +0.6%
   gjs medium      1878.35 ms         1843.58 ms   -1.9%
   gjs large        851.10 ms          839.80 ms   -1.3%
   hbs small        272.10 ms          277.21 ms   +1.9%
   hbs medium       951.07 ms          960.49 ms   +1.0%
   hbs large        964.87 ms          963.10 ms   -0.2%

🟢 faster · 🔴 slower · 🟠 slightly slower · ⚪ within 2%

8/9 within 2% on a shared CI runner. gts small is the only outlier at -7.3% — likely still some GC noise on the smallest/fastest fixture even with 2000 iterations.

Wraps each parse call with mitata's do_not_optimize() to ensure V8
doesn't dead-code eliminate any part of the return value.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@NullVoxPopuli-ai-agent
Author

Local results with do_not_optimize (QEMU/KVM VM, CPU pinned, no sudo):

   Benchmark    Control (p50)   Experiment (p50)       Δ
   ──────────   ─────────────   ────────────────   ─────
⚪ gts small       1911.44 ms         1943.01 ms   +1.7%
🟢 gts medium      2043.34 ms         2001.45 ms   -2.1%
⚪ gts large        942.90 ms          948.63 ms   +0.6%
⚪ gjs small       1657.04 ms         1647.85 ms   -0.6%
⚪ gjs medium      1801.37 ms         1766.68 ms   -1.9%
🟢 gjs large        821.40 ms          784.83 ms   -4.5%
⚪ hbs small        219.81 ms          220.76 ms   +0.4%
⚪ hbs medium       769.12 ms          782.72 ms   +1.8%
⚪ hbs large        849.42 ms          856.81 ms   +0.9%

🟢 faster · 🔴 slower · 🟠 slightly slower · ⚪ within 2%

7/9 within 2% — even in a VM without root priority.

NullVoxPopuli and others added 2 commits March 20, 2026 11:50
Instead of requiring the user to run the whole command as root (which
loses PATH and can't find node/pnpm), use 'sudo nice -n -20 node ...'
so only the nice call is elevated. Activates automatically when
passwordless sudo is available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract shared formatTime, deltaEmoji, parsePairs into bench-utils.mjs
- Simplify run-bench.sh to just CPU pinning (no sudo/nice complexity)
- Simplify bench-compare.mjs to just CPU pinning
- Replace interactive sudo prompt in local-bench-summary.sh with a
  "tips for more reliable results" section printed after results

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
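
The shared helpers named in the commit message above are not shown in this thread, so the following is only a guess at what formatTime and deltaEmoji could look like, reverse-engineered from the comparison tables and their legend. Several details are assumptions: times are taken to be in nanoseconds, the 5% boundary between 🟠 and 🔴 is invented (the tables only show +2.7% as 🟠 and +6.1% as 🔴), and `percentDelta` and `formatDelta` are hypothetical extras. parsePairs is not sketched because nothing here reveals its input format.

```javascript
// Format a duration given in nanoseconds (assumed unit) the way the
// tables print it: microseconds below 1 ms, otherwise milliseconds.
function formatTime(ns) {
  if (ns < 1e6) return `${(ns / 1e3).toFixed(2)} µs`;
  return `${(ns / 1e6).toFixed(2)} ms`;
}

// Percentage change of experiment relative to control (negative = faster).
function percentDelta(control, experiment) {
  return ((experiment - control) / control) * 100;
}

// Signed, one-decimal rendering such as "-3.4%" or "+0.7%".
function formatDelta(pct) {
  return `${pct > 0 ? '+' : ''}${pct.toFixed(1)}%`;
}

// Legend: 🟢 faster · 🔴 slower · 🟠 slightly slower · ⚪ within 2%.
// `slow` (5%) is an assumed threshold, not one confirmed by the source.
function deltaEmoji(pct, { noise = 2, slow = 5 } = {}) {
  if (Math.abs(pct) <= noise) return '⚪';
  if (pct < 0) return '🟢';
  return pct <= slow ? '🟠' : '🔴';
}

// Reproduces the "gts small" row from the "Much closer" comment.
const pct = percentDelta(1187.34, 1146.5);
console.log(deltaEmoji(pct), formatDelta(pct));
```

Keeping the classification on the unrounded percentage matters near the boundary: a change of -1.96% prints as "-2.0%" yet still falls inside the 2% band.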
NullVoxPopuli and others added 2 commits March 20, 2026 12:16
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@NullVoxPopuli
Member

it's at least better locally

NullVoxPopuli merged commit 266b1e8 into ember-tooling:main Mar 20, 2026
34 checks passed

2 participants