-
Notifications
You must be signed in to change notification settings - Fork 21
feat: add exhaust node resources fault #329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: add exhaust node resources fault #329
Conversation
Signed-off-by: Pree Simphliphan <[email protected]>
sre/roles/faults/tasks/inject_custom_exhaust_node_resources.yaml
Outdated
Show resolved
Hide resolved
sre/roles/faults/tasks/inject_custom_exhaust_node_resources.yaml
Outdated
Show resolved
Hide resolved
rohanarora
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sre/roles/faults/tasks/inject_custom_exhaust_node_resources.yaml
Outdated
Show resolved
Hide resolved
|
Thank you for your comment. Below are what I changed based on what we discussed in last Thursday's meeting.
|
Red-GV
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few changes I'm noticing on this round of the review. Sorry if I didn't mention them earlier.
sre/roles/faults/tasks/inject_custom_exhaust_node_resources.yaml
Outdated
Show resolved
Hide resolved
sre/roles/faults/tasks/inject_custom_exhaust_node_resources.yaml
Outdated
Show resolved
Hide resolved
sre/roles/faults/tasks/inject_custom_exhaust_node_resources.yaml
Outdated
Show resolved
Hide resolved
sre/roles/faults/tasks/inject_custom_exhaust_node_resources.yaml
Outdated
Show resolved
Hide resolved
sre/roles/faults/tasks/inject_custom_exhaust_node_resources.yaml
Outdated
Show resolved
Hide resolved
sre/roles/faults/tasks/remove_custom_exhaust_node_resources.yaml
Outdated
Show resolved
Hide resolved
Red-GV
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one minor thing in the ground truth and I think we're good here.
@rohanarora I'm going to recommend that we table the node labeling for now until we add observability node labels to install the tools onto. That also means we should probably avoid smoke testing this one for now because it might render the cluster inoperable.
Red-GV
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@preespp Not seeing any of the alerts go off for this one. I suggest doing two things for this PR:
- Look at Incident 26/27 for how to enable a ChaosMesh fault rather than manually orchestrating it.
- Make a new fault that removes resource limits on a workload. Technically, you can make two incidents with that. Though I think that runs a bit into #336 with that.
Added a new fault: exhaust-node-resources and a corresponding incident spec (Incident 297) from issue #297