Merged
Changes from 7 commits
4 changes: 2 additions & 2 deletions build/container/Dockerfile.metadata
@@ -45,6 +45,6 @@ RUN apt-get update \
# Expose metadata service port
EXPOSE 8090

# Run metadata server
CMD ["python", "-m", "aibrix.metadata.app", "--host", "0.0.0.0", "--port", "8090"]
# Set entrypoint for Metadata service
ENTRYPOINT ["aibrix_metadata", "--enable-k8s-job", "--host", "0.0.0.0"]

10 changes: 8 additions & 2 deletions config/gateway/gateway-plugin/gateway-plugin.yaml
@@ -134,7 +134,7 @@ spec:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: reserved-router-models-endpoint
name: reserved-router-metadata-endpoint
namespace: system
spec:
parentRefs:
@@ -144,6 +144,12 @@ spec:
- path:
type: PathPrefix
value: /v1/models
- path:
type: PathPrefix
value: /v1/files
- path:
type: PathPrefix
value: /v1/batches
backendRefs:
- name: aibrix-metadata-service
port: 8090
@@ -157,7 +163,7 @@ spec:
targetRef:
group: gateway.networking.k8s.io
kind: HTTPRoute
name: aibrix-reserved-router-models-endpoint
name: aibrix-reserved-router-metadata-endpoint
---
# this is a dummy route for incoming request and,
# then request is routed to httproute using model name OR
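For reference, the effect of the two added PathPrefix rules can be sketched in Python. This is a minimal sketch of the Gateway API PathPrefix semantics (the path is split on `/` and compared element by element, case sensitively), not the gateway's actual implementation. The function names `path_prefix_matches` and `routed_to_metadata` are made up for illustration.

```python
from typing import List

# Prefixes taken from the HTTPRoute rules in this diff.
METADATA_PREFIXES: List[str] = ["/v1/models", "/v1/files", "/v1/batches"]


def path_prefix_matches(prefix: str, path: str) -> bool:
    """Return True if `path` starts with `prefix`, compared per path element."""
    prefix_parts = [part for part in prefix.split("/") if part]
    path_parts = [part for part in path.split("/") if part]
    return path_parts[: len(prefix_parts)] == prefix_parts


def routed_to_metadata(path: str) -> bool:
    """Return True if any of the metadata prefixes matches the request path."""
    return any(path_prefix_matches(prefix, path) for prefix in METADATA_PREFIXES)


if __name__ == "__main__":
    for candidate in ["/v1/files", "/v1/batches/b1", "/v1/filesystem"]:
        print(candidate, routed_to_metadata(candidate))
```

Under these semantics, requests under `/v1/files` and `/v1/batches` reach aibrix-metadata-service alongside `/v1/models`, while a path such as `/v1/filesystem` does not match, because the comparison is per path element rather than per character.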
16 changes: 16 additions & 0 deletions config/metadata/job_template_patch.yaml
@@ -0,0 +1,16 @@
apiVersion: batch/v1
kind: Job
metadata:
name: batch-job-template
namespace: default
spec:
parallelism: 1 # Customizable. The number of parallel workers.
completions: 1 # Customizable. Must equal to the parallelism.
backoffLimit: 2 # Customizable, but usually no need to change.
template:
spec:
containers:
- name: batch-worker
image: aibrix/runtime:nightly # Customizable, runtime image
- name: llm-engine
image: aibrix/vllm-mock:nightly # Customizable, LLM engine image
6 changes: 6 additions & 0 deletions config/metadata/kustomization.yaml
@@ -2,6 +2,12 @@ resources:
- metadata.yaml
- redis.yaml

configMapGenerator:
Collaborator

Do we need a ConfigMap here? I see the python folder has many skeleton templates.

Collaborator Author

The job template patch has to be mapped to the container folder to take effect. This configmap can achieve this without rebuilding the image.

Collaborator

Then should we remove the file-based templates?

Collaborator Author

I am glad to hear other options. Can you elaborate on how users might customize the job template?
The bottom line is that, without the ConfigMap, users have to change k8s_job_template.yaml and rebuild the image.

Collaborator

The template should be managed by operations folks. We can have one template for now and leave flexibility support for later. Once we get more feedback, we can start working on it.

- name: metadata-config
namespace: aibrix-system
files:
- job_template_patch.yaml

labels:
- pairs:
app.kubernetes.io/component: aibrix-metadata-service
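The thread above turns on how job_template_patch.yaml customizes the job template without an image rebuild. Below is a minimal sketch of one way such a patch could be layered over a base template, assuming a recursive merge in which list entries such as containers are matched by `name`. The merge logic the metadata service actually uses is not shown in this diff. `apply_patch`, `merge_by_name`, the base template contents, and the image `example.com/my-engine:v1` are all hypothetical.

```python
import copy
from typing import Any, Dict, List


def is_named_list(value: Any) -> bool:
    """True for a list whose items are all dicts carrying a 'name' key."""
    return isinstance(value, list) and all(
        isinstance(item, dict) and "name" in item for item in value
    )


def merge_by_name(base: List[Dict[str, Any]], patch: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge patch items into base items with the same name; append new names."""
    merged = [copy.deepcopy(item) for item in base]
    positions = {item["name"]: i for i, item in enumerate(merged)}
    for item in patch:
        if item["name"] in positions:
            i = positions[item["name"]]
            merged[i] = apply_patch(merged[i], item)
        else:
            merged.append(copy.deepcopy(item))
    return merged


def apply_patch(base: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict: base with patch applied recursively (base is not mutated)."""
    result = copy.deepcopy(base)
    for key, value in patch.items():
        current = result.get(key)
        if isinstance(value, dict) and isinstance(current, dict):
            result[key] = apply_patch(current, value)
        elif is_named_list(value) and is_named_list(current):
            result[key] = merge_by_name(current, value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Hypothetical base template baked into the image.
base_template = {
    "spec": {
        "parallelism": 1,
        "completions": 1,
        "backoffLimit": 2,
        "template": {"spec": {"containers": [
            {"name": "batch-worker", "image": "aibrix/runtime:nightly"},
            {"name": "llm-engine", "image": "aibrix/vllm-mock:nightly",
             "imagePullPolicy": "IfNotPresent"},
        ]}},
    }
}

# Hypothetical patch, shaped like job_template_patch.yaml.
patch = {
    "spec": {
        "parallelism": 4,
        "completions": 4,
        "template": {"spec": {"containers": [
            {"name": "llm-engine", "image": "example.com/my-engine:v1"},
        ]}},
    }
}

merged = apply_patch(base_template, patch)
```

In this sketch, only the fields named in the patch change; everything else in the base template, including the unpatched `batch-worker` container, is carried over.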
89 changes: 87 additions & 2 deletions config/metadata/metadata.yaml
@@ -30,6 +30,30 @@ rules:
- apiGroups: ["model.aibrix.ai"]
resources: ["modeladapters"]
verbs: ["get", "list"]
# For batch job watching
- apiGroups: ["batch"]
resources: ["jobs"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
# For batch job ServiceAccount management
- apiGroups: [""]
Collaborator

Can we use a fixed service account, created along with the aibrix installation, instead of relying on permissions here?

Collaborator Author

It was originally fixed. I later found that the service account must be in the default namespace, while kustomize overrides the namespace and puts everything in aibrix-system.

Collaborator

Hmm, we can put the file under a separate config/manifest folder, without the aibrix-system override.

In Helm, I think we can support it as well, with something like:

  namespace: {{ .Values.job.namespace }}

Collaborator Author

I am not sure that works. How is kustomize going to load config/manifest without overriding the namespace? I can give it a try.

Collaborator

In the short term, we can ask users to follow a guide to create the RBAC resources themselves; we do not want the controller to have such broad permissions.

resources: ["serviceaccounts"]
verbs: ["get", "create", "update", "patch", "delete"]
- apiGroups: ["rbac.authorization.k8s.io"] # for Role management
resources: ["roles"]
verbs: ["get", "create", "update", "patch", "delete"]
- apiGroups: ["rbac.authorization.k8s.io"] # for RoleBinding management
resources: ["rolebindings"]
verbs: ["get", "create", "update", "patch", "delete"]
# For kopf high availability
- apiGroups: ["coordination.k8s.io"]
resources: ["leases"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["apiextensions.k8s.io"] # required by kopf
resources: ["customresourcedefinitions"]
verbs: ["get", "list", "watch"]
- apiGroups: [""] # required by kopf
resources: ["namespaces"]
verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
@@ -64,14 +88,27 @@ spec:
- name: init-redis
image: busybox
command: ['sh', '-c', 'until echo "ping" | nc aibrix-redis-master 6379 -w 1 | grep -c PONG; do echo waiting for redis; sleep 2; done']
volumes:
- name: config-volume
configMap:
name: metadata-config
containers:
- name: metadata-service
image: metadata-service:latest
imagePullPolicy: IfNotPresent
command: ["python", "-m", "aibrix.metadata.app"]
args: ["--host=0.0.0.0", "--port=8090"]
command:
- aibrix_metadata
- --host
- "0.0.0.0"
- --enable-k8s-job
- --k8s-job-patch
- /app/config/job_template_patch.yaml
ports:
- containerPort: 8090
volumeMounts:
- name: config-volume
mountPath: /app/config
readOnly: true
resources:
limits:
cpu: 500m
@@ -92,6 +129,54 @@ spec:
valueFrom:
fieldRef:
fieldPath: metadata.namespace
# Object store configuration
# Comment the following lines to disable S3 as the object store
- name: STORAGE_AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
name: aibrix-s3-credentials
key: access-key-id
- name: STORAGE_AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
name: aibrix-s3-credentials
key: secret-access-key
- name: STORAGE_AWS_REGION
valueFrom:
secretKeyRef:
name: aibrix-s3-credentials
key: region
- name: STORAGE_AWS_BUCKET
valueFrom:
secretKeyRef:
name: aibrix-s3-credentials
key: bucket-name
# Uncomment the following lines to enable TOS as the object store
# - name: STORAGE_TOS_ACCESS_KEY
# valueFrom:
# secretKeyRef:
# name: aibrix-tos-credentials
# key: access-key
# - name: STORAGE_TOS_SECRET_KEY
# valueFrom:
# secretKeyRef:
# name: aibrix-tos-credentials
# key: secret-key
# - name: STORAGE_TOS_ENDPOINT
# valueFrom:
# secretKeyRef:
# name: aibrix-tos-credentials
# key: endpoint
# - name: STORAGE_TOS_REGION
# valueFrom:
# secretKeyRef:
# name: aibrix-tos-credentials
# key: region
# - name: STORAGE_TOS_BUCKET
# valueFrom:
# secretKeyRef:
# name: aibrix-tos-credentials
# key: bucket-name
livenessProbe:
httpGet:
path: /healthz
@@ -35,4 +35,4 @@ spec:
targetRef:
group: gateway.networking.k8s.io
kind: HTTPRoute
name: aibrix-reserved-router-models-endpoint
name: aibrix-reserved-router-metadata-endpoint
8 changes: 7 additions & 1 deletion dist/chart/templates/gateway-plugin/httproute.yaml
@@ -33,7 +33,7 @@ spec:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: aibrix-reserved-router-models-endpoint
name: aibrix-reserved-router-metadata-endpoint
namespace: {{ .Release.Namespace }}
labels:
{{- include "chart.labels" . | nindent 4 }}
@@ -46,6 +46,12 @@ spec:
- path:
type: PathPrefix
value: /v1/models
- path:
type: PathPrefix
value: /v1/files
- path:
type: PathPrefix
value: /v1/batches
backendRefs:
- name: aibrix-metadata-service
port: 8090
2 changes: 0 additions & 2 deletions dist/chart/templates/metadata-service/deployment.yaml
@@ -31,8 +31,6 @@ spec:
- name: metadata-service
image: {{ .Values.metadata.service.container.image.repository }}:{{ .Values.metadata.service.container.image.tag }}
imagePullPolicy: {{ .Values.metadata.service.container.image.imagePullPolicy | default "IfNotPresent" }}
command: ["python", "-m", "aibrix.metadata.app"]
Collaborator

Let's keep the explicit command here?

Collaborator Author

Oh, I can add an explicit command, but it would be aibrix_metadata

Collaborator

That's OK. Users sometimes need to adjust parameters. If there is no command here, the container's entrypoint is used automatically, and users will not know what it runs unless they know the commands in the Dockerfile.

Collaborator Author

@zhangjyr zhangjyr Oct 16, 2025

I found that aibrix_metadata does not work in the Helm chart. I had to change it to
["python", "-m", "aibrix.metadata.app"].
One more thing: in the config, I disabled k8s-job by default, so users can run the other metadata services (v1/models) if they do not configure external object storage (in that case, the k8s job will not work anyway). Do I need to apply the same logic in the chart, too? This means users have to adjust values to enable object storage and k8s-job.

args: ["--host=0.0.0.0", "--port=8090"]
ports:
- containerPort: 8090
resources: {{ toYaml .Values.metadata.service.container.resources | nindent 12 }}
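The point raised in this thread, that omitting `command` silently falls back to the image's entrypoint, follows from how Kubernetes combines a container's `command` and `args` with the image's ENTRYPOINT and CMD. A minimal sketch of those documented rules follows. `effective_argv` is a made-up helper name, and the empty image CMD is an assumption based on the Dockerfile.metadata hunk in this diff, which replaces CMD with ENTRYPOINT.

```python
from typing import List, Optional

# ENTRYPOINT as set in build/container/Dockerfile.metadata in this diff.
IMAGE_ENTRYPOINT = ["aibrix_metadata", "--enable-k8s-job", "--host", "0.0.0.0"]
# Assumption: the image defines no CMD after this change.
IMAGE_CMD: List[str] = []


def effective_argv(
    image_entrypoint: List[str],
    image_cmd: List[str],
    command: Optional[List[str]] = None,
    args: Optional[List[str]] = None,
) -> List[str]:
    """Return the argv a container runs, per Kubernetes command/args rules."""
    if command:
        # `command` replaces ENTRYPOINT; the image CMD is ignored.
        return list(command) + list(args or [])
    if args:
        # Only `args` given: ENTRYPOINT is kept, `args` replaces the image CMD.
        return list(image_entrypoint) + list(args)
    # Neither given: the image defaults apply.
    return list(image_entrypoint) + list(image_cmd)


if __name__ == "__main__":
    # No command/args in the pod spec: the image ENTRYPOINT runs as-is.
    print(effective_argv(IMAGE_ENTRYPOINT, IMAGE_CMD))
    # Explicit command and args in the pod spec override the image.
    print(effective_argv(
        IMAGE_ENTRYPOINT, IMAGE_CMD,
        command=["python", "-m", "aibrix.metadata.app"],
        args=["--host=0.0.0.0", "--port=8090"],
    ))
```

With no `command` or `args` in the chart, the flags baked into the image (including `--enable-k8s-job`) take effect, which is why an explicit command in the manifest makes the behavior visible to users.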
24 changes: 24 additions & 0 deletions dist/chart/templates/metadata-service/rbac.yaml
@@ -24,6 +24,30 @@ rules:
- apiGroups: ["model.aibrix.ai"]
resources: ["modeladapters"]
verbs: ["get", "list"]
# For batch job watching
- apiGroups: ["batch"]
resources: ["jobs"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
# For batch job ServiceAccount management
- apiGroups: [""]
resources: ["serviceaccounts"]
verbs: ["get", "create", "update", "patch", "delete"]
- apiGroups: ["rbac.authorization.k8s.io"] # for Role management
resources: ["roles"]
verbs: ["get", "create", "update", "patch", "delete"]
- apiGroups: ["rbac.authorization.k8s.io"] # for RoleBinding management
resources: ["rolebindings"]
verbs: ["get", "create", "update", "patch", "delete"]
# For kopf high availability
- apiGroups: ["coordination.k8s.io"]
resources: ["leases"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["apiextensions.k8s.io"] # required by kopf
resources: ["customresourcedefinitions"]
verbs: ["get", "list", "watch"]
- apiGroups: [""] # required by kopf
resources: ["namespaces"]
verbs: ["list", "watch"]
---

apiVersion: rbac.authorization.k8s.io/v1
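To make the permission concern from the review concrete, here is a minimal sketch of how the added rules could be checked against a requested action. It models only the additive apiGroups/resources/verbs matching of RBAC rules, with `*` as a wildcard, and ignores resourceNames, subresources, and non-resource URLs. `is_allowed` and `ADDED_RULES` are illustrative names, and the list is a subset of the rules in this diff.

```python
from typing import Dict, List

Rule = Dict[str, List[str]]

# Subset of the rules added to the metadata service ClusterRole in this diff.
ADDED_RULES: List[Rule] = [
    {"apiGroups": ["batch"], "resources": ["jobs"],
     "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"]},
    {"apiGroups": [""], "resources": ["serviceaccounts"],
     "verbs": ["get", "create", "update", "patch", "delete"]},
    {"apiGroups": ["rbac.authorization.k8s.io"], "resources": ["roles"],
     "verbs": ["get", "create", "update", "patch", "delete"]},
    {"apiGroups": ["rbac.authorization.k8s.io"], "resources": ["rolebindings"],
     "verbs": ["get", "create", "update", "patch", "delete"]},
    {"apiGroups": [""], "resources": ["namespaces"], "verbs": ["list", "watch"]},
]


def _matches(allowed: List[str], value: str) -> bool:
    """A rule field matches if it lists the value or the '*' wildcard."""
    return "*" in allowed or value in allowed


def is_allowed(rules: List[Rule], api_group: str, resource: str, verb: str) -> bool:
    """Return True if any rule grants `verb` on `resource` in `api_group`."""
    return any(
        _matches(rule["apiGroups"], api_group)
        and _matches(rule["resources"], resource)
        and _matches(rule["verbs"], verb)
        for rule in rules
    )


if __name__ == "__main__":
    print(is_allowed(ADDED_RULES, "rbac.authorization.k8s.io", "rolebindings", "create"))
    print(is_allowed(ADDED_RULES, "", "namespaces", "delete"))
```

Because these rules sit in a ClusterRole bound by a ClusterRoleBinding, the create permissions on ServiceAccounts, Roles, and RoleBindings apply in every namespace, which is the breadth the reviewer asks to avoid.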