-
-
Notifications
You must be signed in to change notification settings - Fork 33.2k
diagnostics_channel: fix race condition with diagnostics_channel and GC #59910
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
diagnostics_channel: fix race condition with diagnostics_channel and GC #59910
Conversation
Failed to start CI⚠ No approving reviews found ✘ Refusing to run CI on potentially unsafe PRhttps://github.com/nodejs/node/actions/runs/17791170306 |
Secondary pointdiagnostics_channel has always been subject to GC uintuitiveness, and some of its API was thus deprecated because of it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just one non-blocking suggestion.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #59910 +/- ##
========================================
Coverage 88.28% 88.29%
========================================
Files 702 702
Lines 206904 207030 +126
Branches 39808 39829 +21
========================================
+ Hits 182665 182789 +124
+ Misses 16256 16246 -10
- Partials 7983 7995 +12
🚀 New features to boost your workflow:
|
e84ab55
to
a40e16b
Compare
Nice spot.
The issue raised here relates to the behaviour of the channel API as a whole, not specifically to that part of the API that was subject to the deprecation notice. (Indeed, the deprecation notice's previous "safe" suggestion of using the top-level subscribe functions is equally affected.) I don't think that the rationale for DEP0163's revocation is affected by this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might misread some code, I just believe we might be able to use SafeWeakRef
directly and not increment and decrement at all in here after the change. I believe that is effectively the same. That would simplify the code quite a bit.
The channel object may never strongly exist in user space, hence the rationale for |
@Renegade334 I mean the following: diff --git a/lib/diagnostics_channel.js b/lib/diagnostics_channel.js
index 312bd258f58..6d6e392446d 100644
--- a/lib/diagnostics_channel.js
+++ b/lib/diagnostics_channel.js
@@ -17,6 +17,7 @@ const {
ReflectApply,
SafeFinalizationRegistry,
SafeMap,
+ SafeWeakRef,
SymbolHasInstance,
} = primordials;
@@ -31,30 +32,25 @@ const {
const { triggerUncaughtException } = internalBinding('errors');
-const { WeakReference } = require('internal/util');
-
-// Can't delete when weakref count reaches 0 as it could increment again.
// Only GC can be used as a valid time to clean up the channels map.
class WeakRefMap extends SafeMap {
#finalizers = new SafeFinalizationRegistry((key) => {
- this.delete(key);
+ if (!this.has(key)) {
+ this.delete(key);
+ }
});
set(key, value) {
this.#finalizers.register(value, key);
- return super.set(key, new WeakReference(value));
+ return super.set(key, new SafeWeakRef(value));
}
get(key) {
- return super.get(key)?.get();
- }
-
- incRef(key) {
- return super.get(key)?.incRef();
+ return super.get(key)?.deref();
}
- decRef(key) {
- return super.get(key)?.decRef();
+ has(key) {
+ return !!this.get(key);
}
}
@@ -101,7 +97,6 @@ class ActiveChannel {
validateFunction(subscription, 'subscription');
this._subscribers = ArrayPrototypeSlice(this._subscribers);
ArrayPrototypePush(this._subscribers, subscription);
- channels.incRef(this.name);
}
unsubscribe(subscription) {
@@ -113,15 +108,12 @@ class ActiveChannel {
this._subscribers = before;
ArrayPrototypePushApply(this._subscribers, after);
- channels.decRef(this.name);
maybeMarkInactive(this);
return true;
}
bindStore(store, transform) {
- const replacing = this._stores.has(store);
- if (!replacing) channels.incRef(this.name);
this._stores.set(store, transform);
}
@@ -132,7 +124,6 @@ class ActiveChannel {
this._stores.delete(store);
- channels.decRef(this.name);
maybeMarkInactive(this);
return true;
+++ b/test/fixtures/source-map/output/source_map_assert_source_line.snapshot
@@ -7,7 +7,7 @@ AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:
*
*
*
- at TracingChannel.traceSync (node:diagnostics_channel:322:14)
+ at TracingChannel.traceSync (node:diagnostics_channel:313:14)
*
*
* This passes all tests and I believe this is correct, since we never actually use the increment / decrement counter in this code. I didn't check: do we have a similar issue with domain? That is also using the increment / decrement counter at the moment. |
The incRef was needed because we want the channel object itself to be possible to garbage collect if there are no references, however we also need the subscribe to hold it alive for as long as a subscriber is present. The reason for this is that one can do |
If I understand your message @Qard, the reason of having a WeakReference is to be safe of this type of things: dc.channel('mychannel').subscribe(() => { /* my function */ }) // no references to the channel
// => later. after some Garbage collections
dc.channel('mychannel').hasSubscriptions // without the ref count, this can be false and the subscription can be lost |
Yes, without the incRef the second |
Thanks, that makes sense! Should we add another test that covers this case as well? That way it would be more obvious. The example by @uurien would just need a gc call in-between. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
It would be nice to get the other test we spoke about and I am fine either way about adding the has
method or not.
I'm going to add another test to this use case
I'm also changing it to |
a40e16b
to
4ddedfd
Compare
I've addressed the changes, it'd be nice if some maintainer could run the CI. Thanks! |
When a garbage collector is executed, the callback of
FinalizationRegistry
it is not executed synchronously with the GC, it is executed later, in the next event loop.That means that there is a corner case in the
WeakRefMap
object in diagnostics channel. Eventually could happen that an event is GC and created again before the execution of the callback ofFinalizationRegistry
. When this happens, the key object is deleted from theWeakRefMap
even when it has a valid value.This behavior can be reproduced with this code added in tests:
This code fails in
main
branch, but it works as expected in the current branch.