Overview
This proposal outlines how to add support for the io.katacontainers.config.hypervisor.memory_overhead annotation in the OpenShift Sandboxed Containers Operator. This annotation allows external operators to specify VM memory overhead, which is crucial for proper cgroup management and preventing OOM issues.
Background
The Kata Containers runtime will soon support a memory overhead compensation mechanism that requires the io.katacontainers.config.hypervisor.memory_overhead annotation to be set on pods. This annotation works in conjunction with the existing RuntimeClass PodFixed overhead to ensure proper memory accounting.
Current State
The operator currently:
- Creates RuntimeClass objects with PodFixed overhead (CPU and memory)
- Uses hardcoded memory overhead values: 350Mi for kata, 120Mi for kata-remote
- Does not inject any Kata-specific annotations on pods
Proposed Solution
1. Add Memory Overhead Configuration to KataConfigSpec
Add a new field to the KataConfigSpec to allow users to configure memory overhead:
// KataConfigSpec defines the desired state of KataConfig
type KataConfigSpec struct {
	// ... existing fields ...

	// MemoryOverheadMB specifies the memory overhead in MB for Kata containers
	// This value will be used to set the io.katacontainers.config.hypervisor.memory_overhead annotation
	// +optional
	// +kubebuilder:default:=350
	MemoryOverheadMB *int32 `json:"memoryOverheadMB,omitempty"`
}
2. Create a Pod Mutation Webhook
Create a new mutating admission webhook that:
- Intercepts pod creation/update events
- Checks if the pod uses a Kata runtime class
- Injects the memory overhead annotation based on the KataConfig
3. Implementation Details
3.1 Webhook Structure
// +kubebuilder:webhook:path=/mutate-pods-v1,mutating=true,failurePolicy=fail,sideEffects=None,groups="",resources=pods,verbs=create;update,versions=v1,name=mpods.kb.io,admissionReviewVersions=v1
type PodMutator struct {
	client.Client
	Log logr.Logger
}

func (m *PodMutator) Handle(ctx context.Context, req admission.Request) admission.Response {
	// Implementation to inject memory overhead annotation:
	// decode the pod from req, call injectMemoryOverheadAnnotation, and return a patch response.
	return admission.Allowed("") // placeholder until the injection logic is wired in
}
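The handler ultimately has to return a JSON Patch that sets the annotation. As a rough illustration of what that patch could look like, the stdlib-only sketch below builds the operations by hand. The names patchOp and buildOverheadPatch are hypothetical; in the operator itself the patch would more likely be generated by the admission library from the original and mutated pod rather than assembled manually.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const overheadAnnotation = "io.katacontainers.config.hypervisor.memory_overhead"

// patchOp is one RFC 6902 JSON Patch operation (hypothetical helper type).
type patchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// buildOverheadPatch returns the operations that set the annotation on a pod
// whose current annotations are given in existing.
func buildOverheadPatch(existing map[string]string, overheadMB int32) []patchOp {
	value := fmt.Sprintf("%d", overheadMB)
	if len(existing) == 0 {
		// The annotations object may be absent, so add the whole map.
		return []patchOp{{
			Op:    "add",
			Path:  "/metadata/annotations",
			Value: map[string]string{overheadAnnotation: value},
		}}
	}
	// "add" on an existing object member replaces its value. The key contains
	// no "/" or "~", so no JSON Pointer escaping is needed here.
	return []patchOp{{
		Op:    "add",
		Path:  "/metadata/annotations/" + overheadAnnotation,
		Value: value,
	}}
}

func main() {
	out, err := json.Marshal(buildOverheadPatch(nil, 350))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Handling the missing annotations map separately matters because an "add" to a path under a non-existent parent object fails when the patch is applied.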
3.2 Annotation Injection Logic
func (m *PodMutator) injectMemoryOverheadAnnotation(pod *corev1.Pod) error {
	// Check if pod uses Kata runtime class
	if !m.usesKataRuntime(pod) {
		return nil
	}

	// Get memory overhead from KataConfig
	memoryOverhead, err := m.getMemoryOverheadFromKataConfig()
	if err != nil {
		return err
	}

	// Set annotation
	if pod.Annotations == nil {
		pod.Annotations = make(map[string]string)
	}
	pod.Annotations["io.katacontainers.config.hypervisor.memory_overhead"] = fmt.Sprintf("%d", memoryOverhead)
	return nil
}
3.3 Runtime Class Detection
func (m *PodMutator) usesKataRuntime(pod *corev1.Pod) bool {
	runtimeClassName := pod.Spec.RuntimeClassName
	if runtimeClassName == nil {
		return false
	}

	// Check against known Kata runtime classes
	kataRuntimeClasses := []string{"kata", "kata-remote"}
	for _, rtc := range kataRuntimeClasses {
		if *runtimeClassName == rtc {
			return true
		}
	}
	return false
}
4. Configuration Management
4.1 Default Values
- Default memory overhead: 350MB (mirroring the current 350Mi RuntimeClass overhead for kata; this is likely to change soon)
- Configurable per KataConfig instance
- Fallback to hardcoded values if not specified
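The fallback rule above can be sketched as a small resolver. This is only an illustration under stated assumptions: resolveOverheadMB and fallbackOverheadMB are hypothetical names, the fallback is assumed to be per runtime class, and the numeric values reuse the current RuntimeClass overheads (350Mi and 120Mi) as MB figures without unit conversion.

```go
package main

import "fmt"

// fallbackOverheadMB holds the assumed hardcoded values per runtime class,
// taken from the operator's current RuntimeClass overheads.
var fallbackOverheadMB = map[string]int32{
	"kata":        350,
	"kata-remote": 120,
}

// resolveOverheadMB prefers the value configured in KataConfig and falls back
// to the hardcoded value for the runtime class when none is configured.
func resolveOverheadMB(runtimeClass string, configured *int32) (int32, error) {
	if configured != nil {
		return *configured, nil
	}
	if v, ok := fallbackOverheadMB[runtimeClass]; ok {
		return v, nil
	}
	return 0, fmt.Errorf("no memory overhead known for runtime class %q", runtimeClass)
}

func main() {
	custom := int32(512)
	cases := []struct {
		class      string
		configured *int32
	}{
		{"kata", nil},
		{"kata-remote", nil},
		{"kata", &custom},
	}
	for _, tc := range cases {
		v, err := resolveOverheadMB(tc.class, tc.configured)
		if err != nil {
			panic(err)
		}
		fmt.Println(tc.class, v)
	}
}
```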
4.2 Validation
- Memory overhead must be positive
- Maximum reasonable limit (e.g., 4GB) to prevent abuse
- Validation webhook to ensure consistency
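A minimal sketch of the range checks listed above, as they might be called from a validation webhook. The function name validateOverheadMB is hypothetical, and the upper bound of 4096 is one assumed reading of the "4GB" example; the actual limit is still open.

```go
package main

import "fmt"

// maxOverheadMB is an assumed upper bound derived from the "4GB" example.
const maxOverheadMB int32 = 4096

// validateOverheadMB rejects non-positive values and values above the limit.
func validateOverheadMB(v int32) error {
	if v <= 0 {
		return fmt.Errorf("memory overhead must be positive, got %d", v)
	}
	if v > maxOverheadMB {
		return fmt.Errorf("memory overhead %d MB exceeds maximum of %d MB", v, maxOverheadMB)
	}
	return nil
}

func main() {
	for _, v := range []int32{-1, 0, 350, 4096, 4097} {
		fmt.Printf("%d -> %v\n", v, validateOverheadMB(v))
	}
}
```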