HACK: Implement Numba VM with caching of individual nodes #1604

ricardoV94 · 2025-09-02T15:52:04Z

Disclaimer: this is still 100% on hack status, and I don't understand half of the things I did

When we tried #811 it was obvious that numba compile times were prohibitive.

This PR tries a different approach (still at the hacking stage), of using a mode more like the CVM, where individual nodes are jit_compiled, but the whole graph (i.e., the "VM") is not. This allows reusing pre-compiled/cached nodes across different functions, bringing the compilation cost down.

It requires interfacing with the numba cache locator to direct it to cached objects, which requires defining our own cache keys. Numba usually uses the file line position and contents as the cache key, but this doesn't work for dynamically generated files (at least not if stored in a random temp file) nor really for nested functions like those built for Elemwise, Also some Ops are string-generated, and others are regular python functions with globals which numba can usually cache. All this has to be re-examined.

We are also not calling njit on inner Ops (the store_core_outputs / ScalarOp) of Elemwise, but instead doing register_jittable. This was needed for caching to work, because if we njit a function we always get a new object and once serialized the numba cache key will differ, whereas register_jitable overloads the function but returns it unchanged, which doesn't change the cache key.

This requires us to move the jit away from the dispatch functionality.

Results:

Second pass over tests/tensor/rewriting/test_basic.py (to allow compiling everything first):
2s with C_VM backend
54s with Numba backend
34s with Numba VM without cache
4s with Numba VM with cache

We're finally approaching the speed of the previous backend (at least for single function compilation + eval). Probably could get it there with more optimizing, but a small slowdown is acceptable.

TODO:

We are still writing python strings to the filesystem to compile them, this is probably not needed as explored in Cache numba stuff #1326 (last commit?)
We have to compile some functions that don't really need so we can cache it, such as with Elemwise. This is related to https://numba.discourse.group/t/caching-redefined-functions/3057 but I don't yet have a clear picture.
Proper cache keys, I just hacked some quick things. Perhaps use the source code of the generated functions?
- Composite key is certainly broken
- Cache whole FunctionGraph, this would avoid recompiling identical graphs in the regular Numba mode, not just NumbaCVM (it's also needed for correct cache of Composite/Blockwise/Scan,OpFromGraph (i.e., anything with inner Ops)).
Figure out what happens with Ops that run with object mode?
Handle functions with pointers / large constants that can't traditionally be cached (not sure what's happening now). Related to cache=True failures with locally defined functions numba/numba#10098
Benchmark slowdown from the "VM" approach in realistic functions. Consider using/adapting CVM to orchestrate the calls to the individuals nodes (would need to use the thunk approach). Right now the VM is the python source code generated by the outermost unjitted FunctionGraph

review-notebook-app · 2025-09-02T15:52:10Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

jorenham

left some comments; hope you don't mind

jorenham · 2025-09-05T11:04:36Z

pytensor/link/numba/dispatch/basic.py

-    kwargs.setdefault("cache", config.numba__cache)
-    kwargs.setdefault("no_cpython_wrapper", True)
-    kwargs.setdefault("no_cfunc_wrapper", True)
+    kwargs.setdefault("cache", True)


Note that the numba currently can't detect changes in other modules, which could lead to outdated caches. I'm guessing that's why they've decided to make it opt-in

Not sure what you mean? So far numba caching has been fine, to the extent that it was actually used (not much). It plays a much bigger role in this PR approach, but we are also pretty-much customizing it's behavior completely.

I was talking about this note from the docs (https://numba.readthedocs.io/en/stable/user/jit.html#cache):

Caching of compiled functions has several known limitations:

The caching of compiled functions is not performed on a function-by-function basis. The cached function is the the main jit function, and all secondary functions (those called by the main function) are incorporated in the cache of the main function.

Cache invalidation fails to recognize changes in functions defined in a different file. This means that when a main jit function calls functions that were imported from a different module, a change in those other modules will not be detected and the cache will not be updated. This carries the risk that “old” function code might be used in the calculations.

Global variables are treated as constants. The cache will remember the value of the global variable at compilation time. On cache load, the cached function will not rebind to the new value of the global variable.

Thanks for the pointer!

Most relevant is point 2, but I don't think that's a big problem right now. Caching wasn't really working until now for our outer functions (FunctionGraph, Elemwise, Blockwise, RV), and the inner functions for which caching was working aren't using any shared functionality other than intrinsics like to_fixed_tuple.

After this PR we'll move to our own caching system, and define the caching keys ourselves for the function we jit, and we can avoid that gotcha. Perhaps we have to do it for every function, not just the ones we get from codegen. Or we add the pytensor version as part of the key...

Points 1-3 aren't an issue for us AFAICT

jorenham · 2025-09-05T11:07:23Z

pytensor/link/numba/linker.py

+        else:
+            from pytensor.link.numba.dispatch.basic import numba_njit

-        jitted_fn = numba_njit(fn, no_cpython_wrapper=False, no_cfunc_wrapper=False)
-        return jitted_fn
+            jitted_fn = numba_njit(fn, no_cpython_wrapper=False, no_cfunc_wrapper=False)
+            return jitted_fn


I suppose there's no need for this else branch 🤷🏻

It's a stylistic choice. Should be enforced by ruff I guess

Feel free to open an issue to add https://docs.astral.sh/ruff/rules/superfluous-else-return/ to our rules.

Feel free to open an issue to add docs.astral.sh/ruff/rules/superfluous-else-return to our rules.

Nah you're right; it's personal preference. I guess I've just been brainwashed by the default ruff style haha.

jorenham · 2025-09-05T11:08:16Z

pytensor/link/utils.py

+        with open(filename, "wb") as f:
+            f.write(src.encode())


Suggested change

with open(filename, "wb") as f:

f.write(src.encode())

filename.write_bytes(src.encode())

pytensor/link/utils.py

ricardoV94 · 2025-09-05T17:11:28Z

left some comments; hope you don't mind

I don't, but it's still too early for that sort of feedback. I'm just thinkering around at this point.

ricardoV94 · 2025-09-14T10:58:43Z

pytensor/link/numba/dispatch/scalar.py

    )

-    signature = create_numba_signature(node, force_scalar=True)
+    # signature = create_numba_signature(node, force_scalar=True)


This was causing eager compilation during dispatch, nullifying any caching benefits

ricardoV94 · 2025-09-14T10:59:39Z

pytensor/link/numba/dispatch/vectorize_codegen.py

            output_core_shapes,
        )

+        core_signature = typingctx.resolve_function_type(


This causes compilation of the inner function, so I moved it to the codegen, otherwise we had to pay the cost, even if we could cache it? Not sure. But even if not, I guess we generally want to lazy compile by default?

And make that the default backend

jorenham reviewed Sep 5, 2025

View reviewed changes

ricardoV94 force-pushed the numba_cache branch 2 times, most recently from aff2a9a to 978b701 Compare September 14, 2025 10:30

ricardoV94 changed the title ~~Implement Numba VM~~ Implement Numba VM with caching of individual nodes Sep 14, 2025

ricardoV94 changed the title ~~Implement Numba VM with caching of individual nodes~~ HACK: Implement Numba VM with caching of individual nodes Sep 14, 2025

ricardoV94 commented Sep 14, 2025

View reviewed changes

ricardoV94 force-pushed the numba_cache branch from 978b701 to 3677f33 Compare September 30, 2025 12:36

ricardoV94 added 10 commits October 2, 2025 13:51

Add error message in Numba implementation of SpecifyShape

079a51a

Make Numba the default backend

675e7e5

Implement Numba VM with caching

fb57684

And make that the default backend

Cache more Ops

b98cf4f

.wip

8dd4b48

.More hacking around

3692b3a

.More hacking around

022e187

New bench function

5798bf6

Benchmark radon function

b578284

Fix non-vm NUMBA

5bb8c9b

ricardoV94 force-pushed the numba_cache branch from 8709393 to 5bb8c9b Compare October 2, 2025 11:51

ricardoV94 added 2 commits October 6, 2025 18:45

.more caching

4f1585b

.fix shit

44a4ab7

ricardoV94 mentioned this pull request Oct 7, 2025

Trust me numba: you can use this cache #1637

Draft

6 tasks

	with open(filename, "wb") as f:
	f.write(src.encode())
	filename.write_bytes(src.encode())

HACK: Implement Numba VM with caching of individual nodes #1604

Are you sure you want to change the base?

HACK: Implement Numba VM with caching of individual nodes #1604

Uh oh!

Conversation

ricardoV94 commented Sep 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

TODO:

Uh oh!

review-notebook-app bot commented Sep 2, 2025

Uh oh!

jorenham left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ricardoV94 Sep 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ricardoV94 Sep 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ricardoV94 commented Sep 5, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ricardoV94 commented Sep 2, 2025 •

edited

Loading

ricardoV94 Sep 30, 2025 •

edited

Loading

ricardoV94 Sep 14, 2025 •

edited

Loading