Proposal for placement sort order and score visibility #143
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,173 @@ | ||
| # Placement Score Visibility | ||
|
|
||
| ## Release Signoff Checklist | ||
|
|
||
| - [ ] Enhancement is `provisional` | ||
| - [ ] Design details are appropriately documented from clear requirements | ||
| - [ ] Test plan is defined | ||
| - [ ] Graduation criteria for dev preview, tech preview, GA | ||
| - [ ] User-facing documentation is created in [website](https://github.com/open-cluster-management-io/open-cluster-management-io.github.io/) | ||
|
|
||
| ## Summary | ||
|
|
||
| This proposal extends existing Placement mechanisms to surface the scores of each cluster | ||
| in PlacementDecision so that users (or future OCM features) can make advanced scheduling decisions | ||
| based on the scored state of clusters. | ||
|
|
||
| In addition, it adds support for different sort strategies on the decisions generated by a Placement. | ||
| By default, decisions will continue to be sorted by "Name". An additional sort strategy "Score" will | ||
| enable sorting decisions numerically in descending order by prioritizer score. | ||
|
|
||
| ## Motivation | ||
|
|
||
| Placement scores are a powerful feature of OCM for scheduling workloads, but there is currently | ||
| no reliable way for users to see what scores were assigned to clusters for a placement. Scores are | ||
| reported in events, but these are ephemeral. Since PlacementDecisions are ordered alphanumerically, | ||
| there is no good way to see the relative scores of clusters in a decision other than recomputing | ||
| them for each cluster in the decision group(s). | ||
|
|
||
| A user may wish to use Placement to find all clusters that match some conditions, but then pick | ||
| only the top N clusters for their workload. While this could be accomplished by setting | ||
| `numberOfClusters: N` on the placement, this means a new placement must be created for every value | ||
| of N. Another use-case would be to distribute workloads out to selected clusters randomly weighted | ||
| by their scores. There is currently no way to do this in OCM without recomputing scores for each cluster | ||
| listed in the PlacementDecisions. Displaying scores in an easy-to-consume manner gives more flexibility | ||
| to the user, and enables a single Placement to be reused for many different scheduling decisions | ||
| (reducing API pressure on the hub). | ||
|
|
||
| ## Proposal | ||
|
|
||
| Add `SortBy` to `PlacementSpec` to configure the sort order for PlacementDecision. This field | ||
| is an enum with values "Name" and "Score". Add `Score` to `ClusterDecision` to reflect the | ||
| latest score of each cluster in the decision. | ||
|
|
||
| ### Design Details | ||
|
|
||
| #### API change | ||
|
|
||
| Field for sorting decisions | ||
|
|
||
| ```go | ||
| type PlacementSpec struct { | ||
| ... | ||
|
|
||
| // SortBy sets the sort order for decisions. | ||
| // It can be "Name", "Score", or ""; an empty value defaults to "Name". | ||
| // If SortBy is "Name", decisions will be ordered alphanumerically by cluster name. | ||
| // If SortBy is "Score", decisions will be ordered numerically in descending order by score, | ||
|
Member: As mentioned above, users need to be made aware that when sortBy is "Score", a score change triggers an update to the placement decision. Since there is currently no control over how often users update the AddOnPlacementScore, a score that changes too frequently will cause the decision result to be updated just as frequently. An appropriate score update frequency needs to be set.
Member: Defining a reschedule interval for the placement (e.g. every 1 minute) might be a way to avoid pressure on the API server, but as discussed above, score updates are uncontrollable, so perhaps the best option for now is to add a notice for users.
Member: It might make sense to define a rate limit, such as MaxUpdatePerMinute.
||
| // then by cluster name in the event of a tie. | ||
| // +optional | ||
| SortBy PlacementSortByType `json:"sortBy,omitempty"` | ||
| } | ||
|
|
||
| // +kubebuilder:validation:Enum=Name;Score | ||
| type PlacementSortByType string | ||
|
|
||
| const ( | ||
| // PlacementSortByName sorts decisions alphanumerically by cluster name | ||
| PlacementSortByName PlacementSortByType = "Name" | ||
| // PlacementSortByScore sorts decisions numerically in descending order by score. | ||
| // If one or more clusters are tied on score they will be sorted by name | ||
| PlacementSortByScore PlacementSortByType = "Score" | ||
| ) | ||
| ``` | ||
|
|
||
| Display score to consumer of decisions | ||
|
|
||
| ```go | ||
| type ClusterDecision struct { | ||
| ... | ||
|
|
||
| // Score is the computed score for the cluster based on configured prioritizers | ||
| Score int64 `json:"score"` | ||
| } | ||
|
|
||
| ``` | ||
|
|
||
| #### Hub implementation | ||
|
|
||
| In `scheduling_controller.go`, update `divideDecisionGroups` to check the value of `SortBy` before | ||
| sorting and splitting clusters into groups. Any pre-determined decision groups set in | ||
| `placement.Spec.DecisionStrategy.GroupStrategy.DecisionGroups` will be individually ordered within their | ||
| named groups by the `SortBy` strategy. Any remaining clusters will be sorted across all remaining | ||
| groups before being split. | ||
|
|
||
| Prioritizer scores will be passed along with the clusters slice so that each cluster's score can | ||
| be looked up when generating `clusterapiv1beta1.ClusterDecision`. | ||
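The ordering rule described above can be sketched as follows. This is a minimal illustration, not the controller code: the local `ClusterDecision` type is a simplified stand-in for the proposed API type, `sortDecisions` is a hypothetical helper, and name ordering is assumed to be Go's plain string comparison.

```go
package main

import (
	"fmt"
	"sort"
)

// ClusterDecision is a simplified stand-in for the proposed API type:
// a cluster name plus its prioritizer score.
type ClusterDecision struct {
	ClusterName string
	Score       int64
}

// sortDecisions (hypothetical helper) orders decisions in place.
// "Score" sorts by score in descending order, breaking ties by name.
// "Name" and "" sort by cluster name in ascending order.
func sortDecisions(decisions []ClusterDecision, sortBy string) {
	sort.SliceStable(decisions, func(i, j int) bool {
		a, b := decisions[i], decisions[j]
		if sortBy == "Score" && a.Score != b.Score {
			return a.Score > b.Score
		}
		return a.ClusterName < b.ClusterName
	})
}

func main() {
	decisions := []ClusterDecision{
		{"cluster-6", 10}, {"cluster-1", 65}, {"cluster-5", 10}, {"cluster-2", 80},
	}
	sortDecisions(decisions, "Score")
	for _, d := range decisions {
		fmt.Println(d.ClusterName, d.Score)
	}
}
```

With named decision groups, the same comparison would presumably be applied once per group rather than across the whole slice.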
|
|
||
| #### Examples | ||
|
|
||
| Sort placement decisions by score | ||
|
|
||
| ```yaml | ||
| apiVersion: cluster.open-cluster-management.io/v1beta1 | ||
| kind: Placement | ||
| metadata: | ||
|   name: h100 | ||
|   namespace: example | ||
| spec: | ||
|   clusterSets: | ||
|     - global | ||
|   predicates: | ||
|     - requiredClusterSelector: | ||
|         labelSelector: | ||
|           matchLabels: | ||
|             gpu.example.com/h100: enabled | ||
|   prioritizerPolicy: | ||
|     configurations: | ||
|       - scoreCoordinate: | ||
|           addOn: | ||
|             resourceName: hardware-availability | ||
|             scoreName: h100 | ||
|           type: AddOn | ||
|         weight: 1 | ||
|   sortBy: Score | ||
| --- | ||
| apiVersion: cluster.open-cluster-management.io/v1beta1 | ||
| kind: PlacementDecision | ||
| metadata: | ||
|   labels: | ||
|     cluster.open-cluster-management.io/decision-group-index: "0" | ||
|     cluster.open-cluster-management.io/decision-group-name: "" | ||
|     cluster.open-cluster-management.io/placement: h100 | ||
|   name: h100-decision-1 | ||
|   namespace: example | ||
|   ownerReferences: | ||
|     - apiVersion: cluster.open-cluster-management.io/v1beta1 | ||
|       controller: true | ||
|       kind: Placement | ||
|       name: h100 | ||
| status: | ||
|   decisions: | ||
|     - clusterName: cluster-2 | ||
|       score: 80 | ||
|     - clusterName: cluster-1 | ||
|       score: 65 | ||
|     - clusterName: cluster-5 | ||
|       score: 10 | ||
|     - clusterName: cluster-6 | ||
|       score: 10 | ||
| ``` | ||
|
|
||
|
|
||
| ### Test Plan | ||
| - test that the default alphanumeric sort order remains the same | ||
| - test that `sortBy: Score` properly orders clusters across decision groups | ||
| - test that named decision groups are sorted individually based on `sortBy` | ||
| - test that decision groups are updated when scores change | ||
|
|
||
| ### Graduation Criteria | ||
| N/A | ||
|
|
||
| ### Upgrade Strategy | ||
| The Placement and PlacementDecision CRDs need to be upgraded on the hub cluster. No changes are required to agents. | ||
|
|
||
| To use score sorting with an existing `Placement`, add `sortBy: Score` to it. | ||
| No changes are required for existing decisions to be populated with scores | ||
| (regardless of `sortBy`). Once the placement controller is updated, all | ||
| `PlacementDecisions` will be updated with prioritizer scores on their next reconciliation. | ||
|
|
||
| ### Version Skew Strategy | ||
| - The new fields are optional, and if not set, the placement decisions will be ordered the same | ||
| as in previous versions. | ||
| - Older versions of the placement controller will ignore the newly added fields | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| title: placement-score-visibility | ||
| authors: | ||
| - "@bhperry" | ||
| reviewers: | ||
| - "@deads2k" | ||
| - "@elgnay" | ||
| - "@zhujian7" | ||
| approvers: | ||
| - "@elgnay" | ||
| creation-date: 2025-05-22 | ||
| last-updated: 2025-05-22 | ||
| status: provisional | ||
| see-also: [] |
One issue with sorting by score is that if the score changes too frequently, the decision result will also be updated frequently. We need to think about an approach to avoid that, or at least add documentation to notify users.
Including that note in the description of `SortBy: Score` perhaps?

What is the main concern with decision results updating frequently? Including score in the decision results will result in the same number of updates (although not reordering without SortBy). In terms of API pressure, OCM already creates events for each score update. Are there expensive reconcilers that run on update to PlacementDecision?