@xal-0
Member

@xal-0 xal-0 commented Nov 11, 2025

Ports our RTDyLD memory manager to JITLink in order to avoid memory use regressions after switching to JITLink everywhere (#60031). This is a direct port: finalization must happen all at once, because it invalidates all allocation `wr_ptr`s. I decided it wasn't worth associating `OnFinalizedFunction` callbacks with each block, since blocks are large enough to make it extremely likely that all in-flight allocations land in the same block; everything must be relocated before finalization can happen.

I plan to add support for DualMapAllocator on ARM64 macOS, as well as an alternative for executable memory later. For now, we fall back to the old MapperJITLinkMemoryManager.
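
To illustrate the "finalize everything at once" behavior described above, here is a minimal sketch. It is not the code in this PR: `BatchFinalizingAllocator` and all of its members are hypothetical names, a plain `std::vector` stands in for a real memory mapping, and no page permissions are changed. It only shows the bookkeeping: callbacks are queued until the last in-flight allocation is done, and they run after the lock has been released.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical sketch; names are illustrative, not Julia's or LLVM's API.
class BatchFinalizingAllocator {
public:
    using FinalizedCallback = std::function<void()>;

    explicit BatchFinalizingAllocator(std::size_t block_size)
        : block_(block_size) {}

    // Bump-allocate from the single shared block and return a write pointer.
    // Returns nullptr when the block cannot hold `size` more bytes.
    std::uint8_t *allocate(std::size_t size) {
        std::lock_guard<std::mutex> guard(lock_);
        if (size > block_.size() - used_)
            return nullptr;
        std::uint8_t *wr_ptr = block_.data() + used_;
        used_ += size;
        ++in_flight_;
        return wr_ptr;
    }

    // Call once per allocation, after it has been written and relocated.
    // The callback is queued; the batch is finalized only when no
    // allocations remain in flight. Returns true if this call finalized.
    bool finalize(FinalizedCallback on_finalized) {
        std::vector<FinalizedCallback> ready;
        {
            std::lock_guard<std::mutex> guard(lock_);
            assert(in_flight_ > 0 && "finalize without matching allocate");
            pending_.push_back(std::move(on_finalized));
            if (--in_flight_ != 0)
                return false; // others are still writing through wr_ptrs
            // A real manager would change permissions here, which is the
            // step that invalidates every outstanding write pointer.
            ++finalize_count_;
            ready.swap(pending_);
        }
        // The lock is released before callbacks run, so a callback may
        // call back into the allocator without deadlocking.
        for (auto &callback : ready)
            callback();
        return true;
    }

    // Number of batch finalizations performed so far.
    std::size_t finalize_count() {
        std::lock_guard<std::mutex> guard(lock_);
        return finalize_count_;
    }

private:
    std::mutex lock_;
    std::vector<std::uint8_t> block_;
    std::size_t used_ = 0;
    std::size_t in_flight_ = 0;
    std::size_t finalize_count_ = 0;
    std::vector<FinalizedCallback> pending_;
};
```

With two allocations in flight, the first `finalize` call only queues its callback; the second runs both callbacks and counts as a single finalization.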

@xal-0 xal-0 added the performance Must go faster label Nov 11, 2025
@xal-0 xal-0 force-pushed the jitlink-cgmemmgr branch 3 times, most recently from 5d5362e to 0b04319 Compare November 11, 2025 20:07

Release JLJITLinkMemoryManager lock when calling FinalizedCallbacks
@giordano
Member

About JITLink everywhere, are you planning to address llvm/llvm-project#63236? That caused us some pain when we switched aarch64-linux.

@xal-0
Member Author

xal-0 commented Nov 11, 2025

That's what this change addresses, though the fallback to MapperJITLinkMemoryManager/InProcessMemoryMapper still triggers on macOS because of the way DualMapAllocator creates R-X mappings. That issue should disappear entirely in a subsequent pull request.

I thought I'd defer that work both because it's separable and because there is a better option on macOS that I am working on now: we can use the new APRR (aka fast permission restrictions). That lets us create special RWX mappings and toggle whether each thread sees them as R-X or RW-, independently and without any system calls like mprotect.

@giordano
Member

Cool, it wasn't too clear to me what the memory use regressions mentioned in the first post were, thanks for explaining it!
