Add haxe.GcFinalizer with handle-based API #12766

Merged
Simn merged 6 commits into HaxeFoundation:development from jdonaldson:gc-finalizer
Apr 12, 2026

@jdonaldson
Member

@jdonaldson jdonaldson commented Mar 8, 2026

Summary

  • Adds haxe.GcFinalizer<T> — a cross-platform API for GC finalization callbacks, modeled after JS FinalizationRegistry
  • When a registered target object is garbage-collected, the callback receives a held value (not the dying object itself), which is safe since the object may be in an invalid state
  • Adds haxe.ICloseable — a general-purpose interface for releasable resources (follows IThreadCallbackHandle precedent)
  • Companion to Add haxe.ds.WeakRef and expand WeakMap target coverage #12765 (WeakRef/WeakMap) — independent and can be merged separately

API

// General-purpose interface
interface ICloseable {
    function close():Void;
}

class GcFinalizer<T> {
    public function new(callback:T->Void);
    public function register(target:{}, heldValue:T):ICloseable;
}

register() returns an ICloseable handle. Calling close() cancels the finalization. No token-based unregister — JS users who need batch unregister can use js.lib.FinalizationRegistry directly.

Target implementations

Target | Mechanism | Handle close
JS | Wraps native FinalizationRegistry | h.unregister(syntheticToken)
Python | weakref.finalize | fin.detach()
C++ | cpp.vm.Gc.setFinalizer with registration data on target | reg.cancelled = true
Eval | eval.vm.Gc.finalise with closure-captured state | reg.cancelled = true
Lua | Proxy table with __gc metamethod (Lua 5.2+) | proxy.cb = nil
JVM | WeakReference + ReferenceQueue polling | reg.cancelled = true
Others | Throws NotImplementedException |

Changes from initial version

Redesigned per review feedback: replaced token-based register(target, heldValue, ?unregisterToken) + unregister(token) with handle-based register(target, heldValue):ICloseable. This eliminates ObjectMap dependency from all non-JS targets and prevents silent token leak issues.
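
A minimal sketch of why the handle-based shape needs no lookup table. It is not code from this PR: `Slot` and `SlotHandle` are hypothetical names, and `ICloseable` is redeclared locally only so the snippet is self-contained. The handle holds a direct reference to its registration state, so cancelling needs no token-to-registration map and nothing is left behind if the user never cancels.

```haxe
// Local stand-in mirroring the interface shown in the API section.
interface ICloseable {
	function close():Void;
}

// Hypothetical per-registration state (name is illustrative only).
class Slot<T> {
	public var active = true;
	public final heldValue:T;

	public function new(heldValue:T) {
		this.heldValue = heldValue;
	}
}

// Hypothetical handle: it points straight at its slot, so no map lookup
// is needed to find what should be cancelled.
class SlotHandle<T> implements ICloseable {
	final slot:Slot<T>;

	public function new(slot:Slot<T>) {
		this.slot = slot;
	}

	public function close():Void {
		slot.active = false;
	}
}

class Main {
	static function main() {
		final slot = new Slot("socket #1");
		final handle:ICloseable = new SlotHandle(slot);
		handle.close();
		trace(slot.active); // false: the registration is cancelled
	}
}
```

If the handle is dropped without being closed, the slot simply stays active until its target is collected; there is no separate entry that could outlive it.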

Test plan

  • Smoke tests: construct, register, close handle don't crash (all supported targets)
  • GC-dependent tests: callback fires after forced GC, close() prevents callback (cpp, eval)
  • Unsupported targets throw NotImplementedException
  • Lua, JVM, Python, eval tests pass locally
  • CI validates JS, cpp, and remaining targets

Provides a cross-platform API for GC finalization callbacks.
When a registered target object is collected, a user-provided
callback is invoked with a held value (not the dying object).

Supported targets: js, python, cpp, eval, lua, jvm.
Unsupported targets throw NotImplementedException.
@Simn
Member

Simn commented Mar 8, 2026

Interesting! Just the other day I was looking for something like this in the context of threads.

I don't like this tokenMap approach though. The usage of ObjectMap always alerts me with regard to portability, but the current implementation also doesn't appear to handle the case where I provide a token when registering but never unregister it. Doesn't that just keep the token in the map forever?

I generally prefer when functions return you an interface handle that you can close() instead because in that case we can blame the user if something leaks, and we don't need that entire machinery that tracks registrations. I was thinking about having a general haxe.ICloseable for this purpose. There's currently one in sys/thread/ThreadCallback.hx which could be used for that. Yes this feels a bit Java, but I do think it's the best approach for this kind of stuff.

Do you think that would work?

@back2dos
Member

back2dos commented Mar 8, 2026

I was thinking about having a general haxe.ICloseable for this purpose.

That would be great. Although personally, I would say "close" is not the best choice for a general name, or even the specific one. One doesn't "open" a thread callback. Common alternatives include "cancel" (C#'s CancellationToken, Swift's Task, Kotlin's Job, tink's CallbackLink), "abort" (JavaScript's AbortSignal + AbortController). There's also "dispose", "kill", "stop", "cleanup" in various places. So many choices!

@Simn
Member

Simn commented Mar 8, 2026

My workaround for that terminology problem is that I understand it as closing the handle, which is a common term. What exactly that means then depends on what the handle actually handles.

Replace register(target, heldValue, ?unregisterToken) + unregister(token)
with register(target, heldValue):ICloseable. Calling close() on the
returned handle cancels the finalization.

This removes ObjectMap dependency from all non-JS targets and eliminates
silent token leak issues. Add haxe.ICloseable as a general-purpose
interface following IThreadCallbackHandle precedent.
@jdonaldson jdonaldson changed the title from "Add haxe.GcFinalizer (FinalizationRegistry pattern)" to "Add haxe.GcFinalizer with handle-based API" Mar 8, 2026
@jdonaldson
Member Author

jdonaldson commented Mar 8, 2026

I like to follow YAGNI principles, but I think it's "neighborly" to offer an interface as a sort of contract for this behavior. We don't necessarily need to care about how the garbage collection actually happens, as long as it follows the contract from our point of view. Also, our platforms don't consistently provide all the features we need here, so I think it's prudent to offer a mechanism that supports it cleanly.

@back2dos
Member

back2dos commented Mar 9, 2026

My workaround for that terminology problem is that I understand it as closing the handle, which is a common term. What exactly that means then depends on what the handle actually handles.

Is a workaround really the best choice here? Names, like jokes, work better when they don't rely on explanations ;)

@jdonaldson
Member Author

I don't have any real skin in the game on naming here, never had to do something like that. We could put the interface somewhere out of the way, I bet only AI will go digging for it.

@Simn
Member

Simn commented Mar 10, 2026

I don't see the controversy here. You register a callback and doing so gives you a handle, which you can close, which in turn cancels/detaches/unregisters/aborts/stops/kills the callback. Out of all these, the only acceptable ones would be detach and unregister because the others suggest to me that an ongoing callback execution could be interrupted.

We'll have to fix the C++ implementation because it doesn't work, and even if it did work that code doesn't look right... maybe @Aidan63 can help with that.

Contributor

@Aidan63 Aidan63 left a comment

Seems like it's getting confused about the {} type and the erased T parameter. I doubt anyone's used fromStaticFunction like that (and I'm not sure they should), but I can take a look anyway.

I've taken a general look at this and it seems... madly convoluted. I think I understand what the cpp implementation is trying to do, even if just about nothing in it does or would work. In general I'm still not entirely sure why there's this token system: what is it trying to solve that a simpler function finalise<T>(obj:T, callback:T->Void):ICloseable doesn't?

@tobil4sk
Member

My workaround for that terminology problem is that I understand it as closing the handle, which is a common term. What exactly that means then depends on what the handle actually handles.

If close makes sense in the context of handles, then maybe it's better to call it haxe.IHandle to make that semantic connection clear?

@Simn
Member

Simn commented Mar 10, 2026

My workaround for that terminology problem is that I understand it as closing the handle, which is a common term. What exactly that means then depends on what the handle actually handles.

If close makes sense in the context of handles, then maybe it's better to call it haxe.IHandle to make that semantic connection clear?

That sounds like a wonderful idea which should make everyone happy. By having a common interface like that we allow a more unified way of dealing with this which we can also adapt for the thread callback handles and the coro scheduler handles. Plus it's easier to pronounce than closeable.

I also just double-checked that you can redefine methods in interfaces:

[Screenshot of a code example; image not preserved]

Rename haxe.ICloseable to haxe.IHandle per reviewer consensus
(tobil4sk suggested, Simn endorsed). The name makes the semantic
connection clear: calling close() on a handle.

Stub out the cpp GcFinalizer with NotImplementedException. The
previous implementation had correctness issues flagged by Aidan63:
dynamic callback invocation allocates during GC finalization
(crashes hxcpp), and Reflect.setField on arbitrary {} objects
won't work on typed cpp classes. Simn asked Aidan63 to help
provide a proper cpp implementation.
@jdonaldson
Member Author

Pushed changes addressing the feedback:

Rename: ICloseable → IHandle per @tobil4sk's suggestion and @Simn's endorsement.

C++ implementation: Stubbed out with NotImplementedException. @Aidan63's review was right — the previous implementation had fundamental issues (dynamic callback invocation allocating during GC finalization, Reflect.setField on arbitrary {} objects, type mismatch with fromStaticFunction). Since @Simn asked for Aidan63's help with the cpp implementation, I've removed the broken code rather than try to paper over it.

Re: the API design question ("why not just function finalise<T>(obj:T, callback:T->Void):ICloseable?"):

The held-value pattern comes from JS's FinalizationRegistry. The reason: the target object may already be in an invalid/partially-collected state when the finalizer runs, so passing it to the callback is unsafe. The held value is a separate object guaranteed to be alive. This is the same rationale MDN documents for FinalizationRegistry.

(Side note on spelling: the codebase uses American finalize/Finalizer for its own APIs — cpp.Finalizable, Finalization.ml, etc. — while faithfully wrapping OCaml's British Gc.finalise in the eval target. GcFinalizer follows the established convention.)

@jdonaldson
Member Author

@Aidan63 I've addressed the atomics feedback — cancelled is now an AtomicBool with compareExchange(false, true) in close() on both eval and JVM. Also nulling out reg.callback and reg.heldValue after the finalizer fires on eval (JVM's pollQueue already did this).

For the cpp implementation, I wanted to ask your advice since you know hxcpp internals. The core constraint is that we can't allocate or do GC operations inside an hxcpp finalizer callback. The options I've been considering:

  1. Zombie queue via GCSetFinalizer + GCDoNotKill: Registration proxy gets a native finalizer that calls GCDoNotKill on itself, pushing it to the zombie queue. register() polls GCGetNextZombie() lazily to fire callbacks on the main/safe side.

  2. GCSetFinalizer + atomic flag: The native finalizer just sets an atomic ready flag (no allocation). register() polls all registrations checking that flag, fires callbacks for ready ones.

  3. IFinalizable / RunFinalizers: Registration implements IFinalizable.finalize(), hxcpp calls it directly — but unclear how to tie the registration's lifetime to the target:{} object's collection.

Which pattern would you recommend? Or is there a better approach I'm not seeing? The key requirement is: when target is collected, we need to safely invoke callback(heldValue) from non-GC context.

@jdonaldson
Member Author

Also, FWIW, keep in mind CPP already throws a NotImplemented exception. I was going to take a stab at those in a separate PR, but if it helps keep context together, we can do it all here.

@Aidan63
Contributor

Aidan63 commented Mar 14, 2026

I think the zombie queue sounds best, that method also seems easiest to switch over to something fancier down the line. I'm thinking eventually hxcpp might have a dedicated finalisation queue and thread, when the GC would call the finaliser it instead puts the object onto a queue and the finaliser thread would take items off the queue and invoke the callback concurrently with user haxe code.

I had a quick look at immix.cpp and there are already four queues it iterates over which store various finalisers! So there might be some rationalisation to be done here.

@jdonaldson
Member Author

Update: cpp implementation + target:{} type analysis

The {} type is correct for this API

Investigated whether target:{} needs to change based on @Aidan63's concern about reflection. Key findings:

  • {} as a type parameter constraint already exists throughout the stdlib for exactly this purpose: ObjectMap<K:{}, V>, WeakMap<K:{}, V>, AtomicObject<T:{}> — all use <T:{}> to mean "must be a reference type, not a primitive"
  • On cpp, type parameters erase to Dynamic regardless (no @:generic needed for our use case)
  • The reflection concern was about the old approach that wrote fields onto target. The new implementation never touches target's fields
  • There's no Haxe type that expresses "is a heap-allocated GC-managed object" — {} is the closest semantic match and is the established pattern

cpp implementation: WeakRef + polling (zombie queue pattern)

Pushed a cpp implementation using cpp.vm.WeakRef — same pattern as the JVM WeakReference + ReferenceQueue approach:

  • Each Registration holds a WeakRef<Dynamic> to the target (doesn't prevent GC)
  • pollQueue() checks each weak ref — if get() returns null, target was collected → invoke callback
  • Polling happens on each register() call (same as JVM)
  • Uses AtomicBool + compareExchange for cancelled flag (consistent with eval/JVM)
  • Also cleans up cancelled registrations eagerly during polling
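
The polling steps above can be sketched roughly as follows. This is not the code from the PR: `PolledRegistration` and `PollingFinalizer` are hypothetical names, and the `isAlive` probe is an assumed stand-in for checking `weakRef.get() != null`, so the snippet runs on any target without forcing a GC. A plain `Bool` replaces the `AtomicBool` to keep the sketch short.

```haxe
// Hypothetical registration record. `isAlive` is an assumed stand-in for
// `weakRef.get() != null`, so collection can be simulated deterministically.
class PolledRegistration<T> {
	public final isAlive:() -> Bool;
	public final heldValue:T;
	public var cancelled = false; // the PR uses an AtomicBool here

	public function new(isAlive:() -> Bool, heldValue:T) {
		this.isAlive = isAlive;
		this.heldValue = heldValue;
	}

	public function close():Void {
		cancelled = true;
	}
}

class PollingFinalizer<T> {
	final callback:T->Void;
	var pending:Array<PolledRegistration<T>> = [];

	public function new(callback:T->Void) {
		this.callback = callback;
	}

	public function register(isAlive:() -> Bool, heldValue:T):PolledRegistration<T> {
		pollQueue(); // lazy: earlier registrations are only examined here
		final reg = new PolledRegistration(isAlive, heldValue);
		pending.push(reg);
		return reg;
	}

	function pollQueue():Void {
		final stillPending:Array<PolledRegistration<T>> = [];
		final ready:Array<PolledRegistration<T>> = [];
		for (reg in pending) {
			if (reg.cancelled) continue; // prune closed registrations
			if (reg.isAlive()) stillPending.push(reg) else ready.push(reg);
		}
		pending = stillPending;
		// Callbacks run here, outside any GC finalizer context.
		for (reg in ready) callback(reg.heldValue);
	}
}

class Main {
	static function main() {
		final fired:Array<String> = [];
		final fin = new PollingFinalizer<String>(name -> fired.push(name));
		var aAlive = true;
		var bAlive = true;
		fin.register(() -> aAlive, "a");
		final b = fin.register(() -> bAlive, "b");
		aAlive = false; // simulate collection of a's target
		bAlive = false; // b's target is collected too...
		b.close();      // ...but its handle was closed first
		trace(fired.length); // 0: nothing has polled yet
		fin.register(() -> true, "c"); // this call runs the poll
		trace(fired); // contains only "a"
	}
}
```

The demo also shows the laziness of this approach: the callback for "a" does not run when its target dies, only on the next `register()` call.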

Why WeakRef instead of setFinalizer:

  • setFinalizer requires cpp.Callable (static function pointer) — can't capture registration state
  • Finalizer callbacks can't allocate — dynamic dispatch and Reflect.field might allocate internally
  • WeakRef polling avoids all GC-safety concerns entirely
  • Same tradeoff as JVM: callbacks are lazy (fire on next register() call), not immediate

Limitations to discuss:

  • Callbacks only fire when pollQueue() runs (triggered by register()). If the user registers once and never again, collected targets won't trigger callbacks. A public poll() method could address this but would diverge from the base API.
  • @:unreflective types should still work as targets since we never call Reflect.setField on them.

@Aidan63 — does this approach look reasonable as a starting point? You mentioned hxcpp might eventually get a dedicated finalization queue/thread, which could replace the polling with event-driven notification later.

cpp: WeakRef-based polling (same pattern as JVM's ReferenceQueue) avoids
all GC-safety concerns — no allocations during finalization, no static
function pointer constraints.

eval/JVM: Use AtomicBool with compareExchange for cancelled flag, null
out callback and heldValue after processing to prevent retention.

Tests: Add cpp to supported target conditionals.
@Aidan63
Contributor

Aidan63 commented Mar 17, 2026

Following the jvm and eval approach seems like a good idea until I get a chance to look at the callable issue or a hxcpp impl.

The atomic bools also look like a good change, and I think it might be pretty simple to take that slightly further as well. E.g. if we set the atomic bool to true in the handle's close, we could then null out all fields on the registration, which might help with the limitation that the user needs to call poll regularly for cleanup to happen on some targets.

E.g. in the handle's close:

function close() {
    if (false == reg.cancelled.compareExchange(false, true)) {
        // Null / clear whatever we want in the registration.
    }
}

Then, instead of calling load, the callback-invoking code should also try to set the cancelled bool to true. If it succeeds, the user has not cancelled the registration, so the fields are still intact and the callback can run.

if (false == reg.cancelled.compareExchange(false, true)) {
    reg.callback(reg.heldValue);
}

I think that logic makes sense, but someone else might want to give it a look over.

Both Handle.close() and callback invocation now race via
compareExchange to claim ownership of the registration. The winner
eagerly nulls callback/heldValue, ensuring fields are freed
immediately regardless of which side fires first.

Applies to eval, JVM, and cpp implementations.
@back2dos
Member

You register a callback and doing so gives you a handle, which you can close, which in turn cancels/detaches/unregisters/aborts/stops/kills the callback.

I thought about this for a while, and I'm still not a fan. The terminology is a bit fuzzy here, but I would say that a "handle":

  • indirectly references a resource that is actually acquired outside the running process (typically from the kernel), like a file
  • allows operating on said resource, e.g. read, write etc. ... closing is not an operation; it is a necessary overhead that approaches like RAII and its derivatives actually attempt to hide

Deregistering a callback is a safe, fast in-process operation. Closing an open file requires a kernel call as the absolute minimum, which means a context switch and a possible failure. If it happens to be NFS, then this can take tens if not hundreds of milliseconds and failure is a rather likely outcome.

Why not call it Registration? Can be used for callbacks, filters, plugins etc.

Out of all these, the only acceptable ones would be detach and unregister because the others suggest to me that an ongoing callback execution could be interrupted.

Not really, since haxe.Timer::stop doesn't do that either. And the reason is quite simple: the few runtimes that still even allow preemptive interruption of ongoing execution strongly discourage it. Because if it acquired mutexes or resources, those are never released. This is not an expectation to contend with. Perhaps there's still a good reason to prefer detach/unregister, but this ain't it ;)

@jdonaldson
Member Author

@Aidan63 Thanks for the suggestion — this is already in place across all three targets. Both close() and the callback/polling paths use compareExchange(false, true) as the gate, and the winner nulls out callback and heldValue immediately. The cpp implementation also has an extra cancelled.load() check in pollQueue() to eagerly prune registrations that were closed but whose target hasn't been collected yet.
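
A minimal sketch of that gate, for readers following along. It is not the code from the PR: `ClaimedRegistration`, `tryClaim` and `fire` are hypothetical names. It assumes `haxe.atomic.AtomicBool`, whose `compareExchange(expected, replacement)` returns the previous value, and so only compiles on targets with atomics support.

```haxe
import haxe.atomic.AtomicBool;

// Hypothetical registration record; names are illustrative only.
class ClaimedRegistration<T> {
	final claimed = new AtomicBool(false);

	public var callback:Null<T->Void>;
	public var heldValue:Null<T>;

	public function new(callback:T->Void, heldValue:T) {
		this.callback = callback;
		this.heldValue = heldValue;
	}

	// True for exactly one caller: the one that flips false -> true.
	function tryClaim():Bool {
		return claimed.compareExchange(false, true) == false;
	}

	// Handle side: cancel and drop references right away.
	public function close():Void {
		if (tryClaim()) {
			callback = null;
			heldValue = null;
		}
	}

	// Finalization/polling side: run the callback at most once.
	public function fire():Void {
		if (tryClaim()) {
			final cb = callback;
			final value = heldValue;
			callback = null;
			heldValue = null;
			if (cb != null) cb(value);
		}
	}
}

class Main {
	static function main() {
		final log:Array<String> = [];

		final a = new ClaimedRegistration<String>(s -> log.push(s), "a");
		a.fire();  // wins the claim and runs the callback
		a.close(); // loses the claim: no effect
		a.fire();  // loses the claim: callback does not run twice

		final b = new ClaimedRegistration<String>(s -> log.push(s), "b");
		b.close(); // wins the claim: cancels
		b.fire();  // loses the claim: callback never runs

		trace(log); // contains only "a"
	}
}
```

Whichever side wins the exchange is the only one that touches the fields afterwards, which is why both paths can null them without a lock.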

@jdonaldson
Member Author

@back2dos hazarding an opinion here... Handle does feel more closely related to resource references for the kernel, but "registration" feels mismatched too in some way, at least for coro (we don't really "register" a coroutine).

Naming things always sucks!

@Simn
Member

Simn commented Mar 22, 2026

Libuv calls everything a handle too and has both timer and async handles, which is a good argument from authority. Let's stick to this so we don't get lost in endless naming discussions.

@back2dos
Member

Libuv calls everything a handle too and has both timer and async handles, which is a good argument from authority. Let's stick to this so we don't get lost in endless naming discussions.

Libuv calls these uv_handle. That is very specific, while whatever is happening here is not. My issue is less with the name, but that I don't really understand what the category is, that's being "generalized" over. Simple, practical question: should Input and Output implement this general interface, or not?

@Simn
Member

Simn commented Mar 22, 2026

Simple, practical question: should Input and Output implement this general interface, or not?

I'd have to think about all the possible scenarios, but my inclination would be that Output should while Input should not. That's probably what I would suggest if we created these types right now because it's conceptually sound to have writers tell us when they're done writing. I have no plans to mess with these two at the moment though!

What's being generalized is the notion of "here's a thing that you can close when you're done with it".

@jdonaldson
Member Author

jdonaldson commented Mar 24, 2026

This is definitely a grey area. close() on a finalizer is fast and in-process. A close() on an Output could block, fail, require flushing... involve syscall... That sort of uncertainty undermines a unified naming abstraction IMHO. (This is just me thinking with my meat brain, Claude seems to have no problem either way).

@Simn
Member

Simn commented Mar 24, 2026

I asked my Claude too:


The naming question: IHandle vs ICloseable vs Registration vs something else

I side with IHandle. Here's my reasoning:

Against ICloseable: back2dos's critique is valid — "close" implies an open/close lifecycle, which doesn't fit callbacks or coroutine handles. The asymmetry is weird ("you opened it?").

Against Registration: Simn's right that it doesn't generalize — you don't "register" a coroutine, timer, etc. It's too domain-specific for what's meant to be a shared interface.

For IHandle: back2dos argues handles imply kernel resources, but I think that's overly pedantic. The term "handle" in everyday programming usage is simply "an opaque reference you can use to control something". close() on a handle reads naturally: you're releasing your grip on the thing. The key semantic is "here's a thing, close it when done." libuv's precedent is solid.

back2dos's key question ("should Input/Output implement it?") is actually the strongest argument — if yes, then the interface needs to be loose enough for I/O which has failure modes, and close() already means both "release a GC registration" and "flush+syscall on a stream". That semantic overload is a real concern.

My suggestion: Keep IHandle with close(), but treat it explicitly as a cancellation/release interface, not an I/O interface. Don't retrofit it onto Input/Output — that's a separate concern. The interface is deliberately narrow: it means "you got this object from registering/scheduling something, and you can close it to undo that". That's coherent and well-precedented.

The endless naming discussion is itself the biggest argument for just picking IHandle and committing.

@jdonaldson
Member Author

I like this because now we can just blame Claude instead of piling it on Simn like usual.

@jdonaldson
Member Author

@Simn If you're happy with where this landed, would you mind approving? CI is green and I believe all the feedback has been addressed.

@Simn Simn merged commit 14becd8 into HaxeFoundation:development Apr 12, 2026
50 checks passed