Enhancement for feedback scrape type #161
Conversation
> ### Risks and Mitigation
>
> What are the risks of this proposal and how do we mitigate them? Think broadly. For
I think one of the risks we have to consider is that there might be too many watch events, and there could be too many update calls to the ManifestWork.
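One common way to bound the update calls is to coalesce bursts of watch events behind a dirty flag and flush at most once per interval. The sketch below is illustrative only (the `Coalescer` type and its methods are not part of the agent), assuming a periodic flush loop stands in for the actual status-patch path:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Coalescer collapses bursts of watch events into at most one
// flush per tick, bounding update calls to the ManifestWork.
// All names here are illustrative, not from the actual controller.
type Coalescer struct {
	mu      sync.Mutex
	dirty   bool
	flushes int
}

// Notify records that a watch event arrived; the flush loop will
// pick it up on the next tick.
func (c *Coalescer) Notify() {
	c.mu.Lock()
	c.dirty = true
	c.mu.Unlock()
}

// FlushLoop runs n ticks of the given interval, performing one
// "status update" per tick only if events arrived since the last one.
func (c *Coalescer) FlushLoop(interval time.Duration, n int) {
	for i := 0; i < n; i++ {
		time.Sleep(interval)
		c.mu.Lock()
		if c.dirty {
			c.dirty = false
			c.flushes++ // in the agent this would be one PATCH call
		}
		c.mu.Unlock()
	}
}

func main() {
	c := &Coalescer{}
	// 100 rapid watch events...
	for i := 0; i < 100; i++ {
		c.Notify()
	}
	// ...collapse into a single flush on the next tick.
	c.FlushLoop(10*time.Millisecond, 2)
	fmt.Println(c.flushes)
}
```

The trade-off is latency: intermediate states inside one interval are not individually patched, only the latest observed state.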
Added some more details to this section on how we can limit the updates to the ManifestWork object.
I still need to think through a few more implementation details (e.g. how MWRS can verify whether the resource is Ready so it can stop the informer from watching).
It might also be worth having some metrics defined, e.g. the number of started informers.
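A started-informers gauge could look like the following sketch. In practice the agent would likely use a Prometheus gauge; `expvar` is used here only to keep the example dependency-free, and the metric name is a made-up placeholder:

```go
package main

import (
	"expvar"
	"fmt"
)

// Gauge tracking currently-running dynamic informers. The metric
// name is illustrative, not an existing agent metric.
var startedInformers = expvar.NewInt("work_agent_started_informers")

func informerStarted() { startedInformers.Add(1) }
func informerStopped() { startedInformers.Add(-1) }

func main() {
	informerStarted()
	informerStarted()
	informerStopped()
	fmt.Println(startedInformers.Value())
}
```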
> In the ManifestWork Controller (pkg/work/spoke/controllers/statuscontroller/availablestatus_controller.go)
> introduce a watch-based path alongside the existing poll loop. When syncing the ManifestWork, register an informer for the resource if `feedbackScrapeType` is WATCH.
>
> When there is a change seen on the WATCH type, patch the status conditions for that resource.
There are some technical details we need to consider. For instance, a per-resource watch might be too heavy; we might want a per-resource-type informer maintained in an informer pool. When a ManifestWork wants to watch a certain resource, the agent registers the resource with the pool, incrementing a reference count for that resource type. When a resource is removed from the ManifestWork, or deleted, it should be unregistered from the pool.
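A minimal sketch of such a reference-counted pool, assuming hypothetical start/stop hooks in place of wiring up a real dynamic informer (none of these names are the actual agent API):

```go
package main

import (
	"fmt"
	"sync"
)

// InformerPool maintains one shared informer per resource type,
// reference-counted by the ManifestWorks that need it.
// All names are illustrative, not the actual agent API.
type InformerPool struct {
	mu    sync.Mutex
	refs  map[string]int    // key: resource type, e.g. "apps/v1/deployments"
	stop  map[string]func() // stop hook per running informer
	start func(resourceType string) func()
}

func NewInformerPool(start func(string) func()) *InformerPool {
	return &InformerPool{
		refs:  map[string]int{},
		stop:  map[string]func(){},
		start: start,
	}
}

// Register increments the reference count for a resource type,
// starting a shared informer on first use.
func (p *InformerPool) Register(rt string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.refs[rt]++
	if p.refs[rt] == 1 {
		p.stop[rt] = p.start(rt)
	}
}

// Unregister decrements the count and stops the informer when the
// last ManifestWork referencing the type goes away.
func (p *InformerPool) Unregister(rt string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.refs[rt] == 0 {
		return
	}
	p.refs[rt]--
	if p.refs[rt] == 0 {
		p.stop[rt]()
		delete(p.stop, rt)
		delete(p.refs, rt)
	}
}

func main() {
	running := map[string]bool{}
	pool := NewInformerPool(func(rt string) func() {
		running[rt] = true
		return func() { delete(running, rt) }
	})

	pool.Register("apps/v1/deployments")   // first mw starts the informer
	pool.Register("apps/v1/deployments")   // second mw shares it
	pool.Unregister("apps/v1/deployments") // one reference remains
	fmt.Println(running["apps/v1/deployments"])
	pool.Unregister("apps/v1/deployments") // last reference gone
	fmt.Println(running["apps/v1/deployments"])
}
```

This mirrors how a shared informer factory avoids duplicate watches, but adds explicit teardown once nothing references a type.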
So instead of creating a dynamic informer per resource (with name, namespace, etc.), are you suggesting we create an informer that is generalized to the resource type?
Can you share whether there is existing functionality in informer factories that supports the behavior you mention of registering with the pool and incrementing a reference count?
I do not think there is; I can build some demo code for that. I think a dynamic informer per resource/name/namespace might be too many; a dynamic informer per resource/namespace is probably good enough. Just something that needs to be considered when we implement it.
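The granularity question boils down to how the pool key is scoped. A tiny sketch, assuming a made-up key format, showing the per-resource/namespace middle ground (finer than one cluster-wide informer per type, coarser than one per named object):

```go
package main

import "fmt"

// poolKey scopes a shared informer per resource type and namespace.
// The key format is purely illustrative.
func poolKey(group, version, resource, namespace string) string {
	return fmt.Sprintf("%s/%s/%s|%s", group, version, resource, namespace)
}

func main() {
	// Two Deployments in the same namespace share one informer key...
	a := poolKey("apps", "v1", "deployments", "team-a")
	b := poolKey("apps", "v1", "deployments", "team-a")
	// ...while the same type in another namespace gets its own.
	c := poolKey("apps", "v1", "deployments", "team-b")
	fmt.Println(a == b, a == c)
}
```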
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: annelaucg, qiujian16. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
> In the ManifestWork Controller (pkg/work/spoke/controllers/statuscontroller/availablestatus_controller.go)
> introduce a watch-based path alongside the existing poll loop.
>
> When syncing the ManifestWork, register an informer for the resource if `feedbackScrapeType` is WATCH. If the `feedbackScrapeType` is no longer WATCH, unregister the informer. If the ManifestWork is in a Ready state (Progressing == False and Degraded == False / NotSet), unregister the informer. This will prevent a long-running WATCH from triggering too many ManifestWork changes. If there is a new rollout and the status changes to Progressing, register the informer again.
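The register/unregister decision described above can be sketched as a pure predicate over the scrape type and status conditions. This is a simplified illustration, not the actual controller code; the function name, condition handling, and treatment of Unknown status are all assumptions:

```go
package main

import "fmt"

// Condition mirrors the shape of a status condition; only the
// fields needed for this sketch are included.
type Condition struct {
	Type   string
	Status string // "True", "False", or "Unknown"
}

// shouldWatch reports whether an informer should stay registered
// for a manifest: WATCH must be requested, and the work must not
// yet be Ready (per the proposal, Ready means Progressing == False
// and Degraded == False or not set). Illustrative only.
func shouldWatch(scrapeType string, conds []Condition) bool {
	if scrapeType != "WATCH" {
		return false
	}
	for _, c := range conds {
		if (c.Type == "Progressing" || c.Type == "Degraded") && c.Status == "True" {
			return true // still rolling out or unhealthy: keep watching
		}
	}
	return false // Ready: unregister to avoid long-running watch churn
}

func main() {
	rollingOut := []Condition{{Type: "Progressing", Status: "True"}}
	ready := []Condition{{Type: "Progressing", Status: "False"}}
	fmt.Println(shouldWatch("WATCH", rollingOut)) // keep registered
	fmt.Println(shouldWatch("WATCH", ready))      // unregister
	fmt.Println(shouldWatch("JSONPaths", rollingOut))
}
```

Evaluating this predicate on every ManifestWork sync gives the re-register-on-new-rollout behavior for free: a new rollout flips Progressing back to True, which flips the predicate back to true.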
Could you explain more details about the register/unregister informer part? I have a concern that in some cases, even with this watch informer, we might still not observe the whole status change process of the resource, since it may take several seconds for the informer cache to sync when starting the informer. For example, if the replica number changes from 1->2->3 in 5 seconds, and waiting for the deployment informer cache to sync takes 6 seconds, the result we see will still be 1->3.
Are you thinking of the situation on startup, or in general while the informer is running? My expectation is that the informer should see every change (which I think is the guarantee of the informer) and that the ManifestWork feedbackStatus will get updated with that change. And when the ManifestWork statusFeedback updates, we will propagate this upwards to the MWRS controller.
From my understanding, which step above will be missed because it takes several seconds for the informer cache to sync?
The informer will be registered when a new rollout is started with a new MWRS.spec and observedGeneration for the ManifestWork. The informer will never get unregistered unless the feedbackScrapeType is removed from that resource. However, once the MWRS is marked as Ready, the informer will stop sending add/update/delete updates to the statusFeedback, to prevent triggering a resync at the MWRS controller level.