Merged
4 changes: 2 additions & 2 deletions build/container/Dockerfile.metadata
Original file line number Diff line number Diff line change
@@ -45,6 +45,6 @@ RUN apt-get update \
# Expose metadata service port
EXPOSE 8090

# Run metadata server
CMD ["python", "-m", "aibrix.metadata.app", "--host", "0.0.0.0", "--port", "8090"]
# Set entrypoint for Metadata service
ENTRYPOINT ["aibrix_metadata", "--host", "0.0.0.0"]
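
The new ENTRYPOINT calls an `aibrix_metadata` command and no longer passes `--port`. A minimal sketch of how such a command-line interface might parse the flags that appear in this PR (`--host`, `--port`, `--enable-k8s-job`, `--k8s-job-patch`) is shown below. The function name `build_parser`, the help texts, and the defaults (including port 8090 and host 127.0.0.1) are assumptions for illustration and are not taken from `aibrix.metadata.app`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in this PR."""
    parser = argparse.ArgumentParser(prog="aibrix_metadata")
    # Assumed default; the Dockerfile always passes --host 0.0.0.0 explicitly.
    parser.add_argument("--host", default="127.0.0.1", help="address to bind")
    # Assumed default, chosen to match the port exposed by the Dockerfile.
    parser.add_argument("--port", type=int, default=8090, help="port to bind")
    # The PR discussion states that the k8s job support is disabled by default.
    parser.add_argument("--enable-k8s-job", action="store_true",
                        help="enable the Kubernetes batch job integration")
    parser.add_argument("--k8s-job-patch", default=None,
                        help="path to a job template patch file")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--host", "0.0.0.0"])
    print(args.host, args.port, args.enable_k8s_job, args.k8s_job_patch)
```

With only `--host 0.0.0.0` supplied, as in the ENTRYPOINT, this sketch leaves the job integration off and the patch path unset.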

10 changes: 8 additions & 2 deletions config/gateway/gateway-plugin/gateway-plugin.yaml
@@ -134,7 +134,7 @@ spec:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: reserved-router-models-endpoint
name: reserved-router-metadata-endpoint
namespace: system
spec:
parentRefs:
@@ -144,6 +144,12 @@ spec:
- path:
type: PathPrefix
value: /v1/models
- path:
type: PathPrefix
value: /v1/files
- path:
type: PathPrefix
value: /v1/batches
backendRefs:
- name: aibrix-metadata-service
port: 8090
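
The route above now sends three path prefixes to the metadata service. In the Gateway API, `PathPrefix` matching is done element by element on the `/`-separated path, so `/v1/files` matches `/v1/files/abc` but not `/v1/filesystem`. A small sketch of that rule follows; the helper names are hypothetical and the sketch ignores query strings and other match types.

```python
from typing import List

# Prefixes routed to aibrix-metadata-service in this PR.
METADATA_PREFIXES: List[str] = ["/v1/models", "/v1/files", "/v1/batches"]


def path_prefix_matches(prefix: str, path: str) -> bool:
    """Return True if `path` starts with `prefix`, compared segment by segment."""
    prefix_parts = [part for part in prefix.split("/") if part]
    path_parts = [part for part in path.split("/") if part]
    return path_parts[: len(prefix_parts)] == prefix_parts


def routed_to_metadata(path: str) -> bool:
    """Return True if any metadata prefix matches the request path."""
    return any(path_prefix_matches(prefix, path) for prefix in METADATA_PREFIXES)


if __name__ == "__main__":
    for candidate in ["/v1/files/file-123", "/v1/batches", "/v1/filesystem",
                      "/v1/chat/completions"]:
        print(candidate, routed_to_metadata(candidate))
```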
@@ -157,7 +163,7 @@ spec:
targetRef:
group: gateway.networking.k8s.io
kind: HTTPRoute
name: aibrix-reserved-router-models-endpoint
name: aibrix-reserved-router-metadata-endpoint
---
# this is a dummy route for incoming request and,
# then request is routed to httproute using model name OR
4 changes: 4 additions & 0 deletions config/job/kustomization.yaml
@@ -0,0 +1,4 @@
kind: Kustomization

resources:
- k8s_job_rbac.yaml
24 changes: 24 additions & 0 deletions config/metadata/job_template_patch.yaml
@@ -0,0 +1,24 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-job-template
  namespace: default
spec:
  parallelism: 1 # Customizable. The number of parallel workers.
  completions: 1 # Customizable. Must equal the parallelism.
  backoffLimit: 2 # Customizable, but usually no need to change.
  template:
    spec:
      containers:
        - name: batch-worker
          image: aibrix/runtime:nightly # Customizable, batch job worker image
        - name: llm-engine
          image: aibrix/vllm-mock:nightly # Customizable, customize your LLM engine image
          # command: ["/bin/sh", "-c"] # Customization is not recommended. Know what you are doing.
          args: # Customizable in the format of "WORKER_VICTIM=1 [your command] || true"
            - |
              # Run llm engine.
              # 'WORKER_VICTIM=1' helps the batch-worker to identify llm-engine process.
              # '|| true' at the end ensures the container llm-engine never fails.
              WORKER_VICTIM=1 python app.py || true
          readinessProbe: # Customizable, customize your readinessProbe
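
The `args` comment in the template asks for the format "WORKER_VICTIM=1 [your command] || true". As a rough illustration, a helper that wraps an engine command in that format could look like the sketch below. The function name `wrap_engine_command` is hypothetical and not part of AIBrix. Note that the environment prefix only applies to the first simple command, so a chained command (`a && b`) would need extra care.

```python
def wrap_engine_command(command: str) -> str:
    """Wrap an engine command in the format the job template expects.

    'WORKER_VICTIM=1' marks the process for the batch-worker, and
    '|| true' keeps the llm-engine container from reporting a failure.
    """
    command = command.strip()
    if not command:
        raise ValueError("engine command must not be empty")
    return f"WORKER_VICTIM=1 {command} || true"


if __name__ == "__main__":
    print(wrap_engine_command("python app.py"))
```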
12 changes: 12 additions & 0 deletions config/metadata/kustomization.yaml
@@ -2,6 +2,18 @@ resources:
- metadata.yaml
- redis.yaml

configMapGenerator:
Collaborator

Do we need a ConfigMap here? I see the python folder has many skeleton templates.

Collaborator Author

The job template patch has to be mounted into a folder in the container to take effect. This ConfigMap achieves that without rebuilding the image.

Collaborator

Then should we remove the file-based templates?

Collaborator Author

I am glad to hear other options. Can you elaborate on how users might customize the job template?
The bottom line is that users have to change k8s_job_template.yaml and rebuild the image.

Collaborator

The template should be managed by operations folks. We can have one template for now and leave flexibility support for later. Once we get more feedback, we can start working on it.

- name: metadata-config
namespace: aibrix-system
files:
- job_template_patch.yaml

patches:
# Uncomment the following lines to enable S3 as the object store
# - path: s3-env-patch.yaml
# Uncomment the following lines to enable TOS as the object store
# - path: tos-env-patch.yaml

labels:
- pairs:
app.kubernetes.io/component: aibrix-metadata-service
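
The review thread above describes `job_template_patch.yaml` as a patch over the built-in `k8s_job_template.yaml`, delivered through the generated ConfigMap. How the metadata service applies the patch is not shown in this diff. The sketch below is one plausible approach under stated assumptions: dictionaries are merged recursively, lists of named items (such as `containers`) are merged by `name`, and any other value is replaced. The function `apply_patch` and the sample `BASE` data are hypothetical.

```python
import copy
from typing import Any


def _is_named_list(value: Any) -> bool:
    """True for lists whose items are all dicts carrying a 'name' key."""
    return isinstance(value, list) and all(
        isinstance(item, dict) and "name" in item for item in value
    )


def apply_patch(base: Any, patch: Any) -> Any:
    """Return a new object with `patch` merged over `base` (inputs untouched)."""
    if isinstance(base, dict) and isinstance(patch, dict):
        merged = copy.deepcopy(base)
        for key, value in patch.items():
            merged[key] = apply_patch(base[key], value) if key in base else copy.deepcopy(value)
        return merged
    if _is_named_list(base) and _is_named_list(patch):
        merged = copy.deepcopy(base)
        positions = {item["name"]: i for i, item in enumerate(merged)}
        for item in patch:
            if item["name"] in positions:
                i = positions[item["name"]]
                merged[i] = apply_patch(merged[i], item)
            else:
                merged.append(copy.deepcopy(item))
        return merged
    # Scalars and plain lists (for example `args`) are replaced outright.
    return copy.deepcopy(patch)


# Hypothetical base template; the real k8s_job_template.yaml is not in this diff.
BASE = {"spec": {"parallelism": 1, "template": {"spec": {"containers": [
    {"name": "batch-worker", "image": "example/worker:v1"},
    {"name": "llm-engine", "image": "example/engine:v1", "args": ["old"]},
]}}}}
PATCH = {"spec": {"parallelism": 4, "completions": 4, "template": {"spec": {"containers": [
    {"name": "llm-engine", "image": "aibrix/vllm-mock:nightly",
     "args": ["WORKER_VICTIM=1 python app.py || true"]},
]}}}}

if __name__ == "__main__":
    print(apply_patch(BASE, PATCH))
```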
25 changes: 23 additions & 2 deletions config/metadata/metadata.yaml
@@ -30,6 +30,20 @@ rules:
- apiGroups: ["model.aibrix.ai"]
resources: ["modeladapters"]
verbs: ["get", "list"]
# For batch job watching
- apiGroups: ["batch"]
resources: ["jobs"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
# For kopf high availability
- apiGroups: ["coordination.k8s.io"]
resources: ["leases"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["apiextensions.k8s.io"] # required by kopf
resources: ["customresourcedefinitions"]
verbs: ["get", "list", "watch"]
- apiGroups: [""] # required by kopf
resources: ["namespaces"]
verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
@@ -64,14 +78,21 @@ spec:
- name: init-redis
image: busybox
command: ['sh', '-c', 'until echo "ping" | nc aibrix-redis-master 6379 -w 1 | grep -c PONG; do echo waiting for redis; sleep 2; done']
volumes:
- name: config-volume
configMap:
name: metadata-config
containers:
- name: metadata-service
image: metadata-service:latest
imagePullPolicy: IfNotPresent
command: ["python", "-m", "aibrix.metadata.app"]
args: ["--host=0.0.0.0", "--port=8090"]
# Enable S3 or TOS to enable-k8s-job, default disabled.
ports:
- containerPort: 8090
volumeMounts:
- name: config-volume
mountPath: /app/config
readOnly: true
resources:
limits:
cpu: 500m
40 changes: 40 additions & 0 deletions config/metadata/s3-env-patch.yaml
@@ -0,0 +1,40 @@
# This patch contains the S3 object store configuration for the metadata service
# !!Important: Please make sure aibrix-s3-credentials secret exists in both aibrix-system and default namespaces.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metadata-service
  namespace: aibrix-system
spec:
  template:
    spec:
      containers:
        - name: metadata-service
          command:
            - aibrix_metadata
            - --host
            - "0.0.0.0"
            - --enable-k8s-job
            - --k8s-job-patch
            - /app/config/job_template_patch.yaml
          env:
            - name: STORAGE_AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: aibrix-s3-credentials
                  key: access-key-id
            - name: STORAGE_AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: aibrix-s3-credentials
                  key: secret-access-key
            - name: STORAGE_AWS_REGION
              valueFrom:
                secretKeyRef:
                  name: aibrix-s3-credentials
                  key: region
            - name: STORAGE_AWS_BUCKET
              valueFrom:
                secretKeyRef:
                  name: aibrix-s3-credentials
                  key: bucket-name
45 changes: 45 additions & 0 deletions config/metadata/tos-env-patch.yaml
@@ -0,0 +1,45 @@
# This patch contains the TOS object store configuration for the metadata service
# !!Important: Please make sure aibrix-tos-credentials secret exists in both aibrix-system and default namespaces.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metadata-service
  namespace: aibrix-system
spec:
  template:
    spec:
      containers:
        - name: metadata-service
          command:
            - aibrix_metadata
            - --host
            - "0.0.0.0"
            - --enable-k8s-job
            - --k8s-job-patch
            - /app/config/job_template_patch.yaml
          env:
            - name: STORAGE_TOS_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: aibrix-tos-credentials
                  key: access-key
            - name: STORAGE_TOS_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: aibrix-tos-credentials
                  key: secret-key
            - name: STORAGE_TOS_ENDPOINT
              valueFrom:
                secretKeyRef:
                  name: aibrix-tos-credentials
                  key: endpoint
            - name: STORAGE_TOS_REGION
              valueFrom:
                secretKeyRef:
                  name: aibrix-tos-credentials
                  key: region
            - name: STORAGE_TOS_BUCKET
              valueFrom:
                secretKeyRef:
                  name: aibrix-tos-credentials
                  key: bucket-name
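
Both patches enable `--enable-k8s-job` only together with a full set of object store variables, which matches the remark in this PR that the k8s job cannot work without external object storage. A hedged sketch of a startup check built on that idea is shown below. The variable names come from the two patches; the functions `detect_object_store` and `missing_variables` are hypothetical and do not describe how the metadata service actually reads its configuration.

```python
from typing import Dict, List, Mapping, Optional

# Environment variable names as they appear in s3-env-patch.yaml and tos-env-patch.yaml.
OBJECT_STORE_ENV: Dict[str, List[str]] = {
    "s3": [
        "STORAGE_AWS_ACCESS_KEY_ID",
        "STORAGE_AWS_SECRET_ACCESS_KEY",
        "STORAGE_AWS_REGION",
        "STORAGE_AWS_BUCKET",
    ],
    "tos": [
        "STORAGE_TOS_ACCESS_KEY",
        "STORAGE_TOS_SECRET_KEY",
        "STORAGE_TOS_ENDPOINT",
        "STORAGE_TOS_REGION",
        "STORAGE_TOS_BUCKET",
    ],
}


def missing_variables(env: Mapping[str, str], store: str) -> List[str]:
    """List the variables of `store` that are unset or empty in `env`."""
    return [name for name in OBJECT_STORE_ENV[store] if not env.get(name)]


def detect_object_store(env: Mapping[str, str]) -> Optional[str]:
    """Return the first store whose variables are all set, else None."""
    for store in OBJECT_STORE_ENV:
        if not missing_variables(env, store):
            return store
    return None


if __name__ == "__main__":
    import os
    store = detect_object_store(os.environ)
    print(store if store else "no object store configured; keep k8s job disabled")
```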
Original file line number Diff line number Diff line change
@@ -35,4 +35,4 @@ spec:
targetRef:
group: gateway.networking.k8s.io
kind: HTTPRoute
name: aibrix-reserved-router-models-endpoint
name: aibrix-reserved-router-metadata-endpoint
8 changes: 7 additions & 1 deletion dist/chart/templates/gateway-plugin/httproute.yaml
@@ -33,7 +33,7 @@ spec:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: aibrix-reserved-router-models-endpoint
name: aibrix-reserved-router-metadata-endpoint
namespace: {{ .Release.Namespace }}
labels:
{{- include "chart.labels" . | nindent 4 }}
@@ -46,6 +46,12 @@ spec:
- path:
type: PathPrefix
value: /v1/models
- path:
type: PathPrefix
value: /v1/files
- path:
type: PathPrefix
value: /v1/batches
backendRefs:
- name: aibrix-metadata-service
port: 8090
8 changes: 6 additions & 2 deletions dist/chart/templates/metadata-service/deployment.yaml
@@ -31,8 +31,12 @@ spec:
- name: metadata-service
image: {{ .Values.metadata.service.container.image.repository }}:{{ .Values.metadata.service.container.image.tag }}
imagePullPolicy: {{ .Values.metadata.service.container.image.imagePullPolicy | default "IfNotPresent" }}
command: ["python", "-m", "aibrix.metadata.app"]
Collaborator

Let's keep the explicit command here?

Collaborator Author

Oh, I can add an explicit command, but it would be aibrix_metadata.

Collaborator

That's OK. Users sometimes need to adjust parameters. If there is no command configured here, the container's entrypoint is used automatically, and users won't know what it is unless they know the commands in the Dockerfile.

Collaborator Author

@zhangjyr zhangjyr Oct 16, 2025

I found that aibrix_metadata does not work in the Helm chart. I had to change it to
["python", "-m", "aibrix.metadata.app"].
One more thing: in the config, I disabled k8s-job by default, so users can still run the other metadata services (v1/models) if they have not configured external object storage (in which case the k8s job would not work anyway). Do I need to apply the same logic in the chart, too? That would mean users have to adjust values to enable object storage and k8s-job.

args: ["--host=0.0.0.0", "--port=8090"]
command:
- python
- -m
- aibrix.metadata.app
- --host
- "0.0.0.0"
ports:
- containerPort: 8090
resources: {{ toYaml .Values.metadata.service.container.resources | nindent 12 }}
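
The review discussion in this file turns on how Kubernetes decides what process to start. A container's `command` replaces the image ENTRYPOINT, `args` replaces the image CMD, and when neither is set the image defaults are used. The sketch below models that rule so the effect of the chart's explicit `command` can be compared with the Dockerfile's ENTRYPOINT. The function `effective_argv` is a hypothetical illustration, not a Kubernetes API.

```python
from typing import List, Optional, Sequence


def effective_argv(
    image_entrypoint: Sequence[str],
    image_cmd: Sequence[str],
    command: Optional[Sequence[str]] = None,
    args: Optional[Sequence[str]] = None,
) -> List[str]:
    """Model which argv a container runs for a given pod spec."""
    if command is not None:
        # `command` overrides ENTRYPOINT; the image CMD is ignored as well.
        return list(command) + list(args or [])
    if args is not None:
        # Only `args` set: keep ENTRYPOINT, replace CMD.
        return list(image_entrypoint) + list(args)
    # Nothing set: fall back to the image defaults.
    return list(image_entrypoint) + list(image_cmd)


# ENTRYPOINT from build/container/Dockerfile.metadata in this PR; no CMD remains.
IMAGE_ENTRYPOINT = ["aibrix_metadata", "--host", "0.0.0.0"]
IMAGE_CMD: List[str] = []

if __name__ == "__main__":
    # No command in the pod spec: the image entrypoint is used implicitly.
    print(effective_argv(IMAGE_ENTRYPOINT, IMAGE_CMD))
    # Explicit command as in the Helm chart template above.
    chart_command = ["python", "-m", "aibrix.metadata.app", "--host", "0.0.0.0"]
    print(effective_argv(IMAGE_ENTRYPOINT, IMAGE_CMD, command=chart_command))
```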
14 changes: 14 additions & 0 deletions dist/chart/templates/metadata-service/rbac.yaml
@@ -24,6 +24,20 @@ rules:
- apiGroups: ["model.aibrix.ai"]
resources: ["modeladapters"]
verbs: ["get", "list"]
# For batch job watching
- apiGroups: ["batch"]
resources: ["jobs"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
# For kopf high availability
- apiGroups: ["coordination.k8s.io"]
resources: ["leases"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["apiextensions.k8s.io"] # required by kopf
resources: ["customresourcedefinitions"]
verbs: ["get", "list", "watch"]
- apiGroups: [""] # required by kopf
resources: ["namespaces"]
verbs: ["list", "watch"]
---

apiVersion: rbac.authorization.k8s.io/v1
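
Kubernetes RBAC rules are purely additive: a request is allowed if at least one rule lists its API group, resource, and verb (with `*` as a wildcard). To make the effect of the rules added in this PR concrete, the sketch below evaluates a few requests against the rules visible in this hunk. The helper `is_allowed` is a simplified illustration and ignores `resourceNames`, subresources, and non-resource URLs.

```python
from typing import Dict, List

FULL_ACCESS = ["get", "list", "watch", "create", "update", "patch", "delete"]

# Rules copied from the hunk above.
RULES: List[Dict[str, List[str]]] = [
    {"apiGroups": ["model.aibrix.ai"], "resources": ["modeladapters"], "verbs": ["get", "list"]},
    {"apiGroups": ["batch"], "resources": ["jobs"], "verbs": FULL_ACCESS},
    {"apiGroups": ["coordination.k8s.io"], "resources": ["leases"], "verbs": FULL_ACCESS},
    {"apiGroups": ["apiextensions.k8s.io"], "resources": ["customresourcedefinitions"],
     "verbs": ["get", "list", "watch"]},
    {"apiGroups": [""], "resources": ["namespaces"], "verbs": ["list", "watch"]},
]


def _matches(allowed: List[str], value: str) -> bool:
    """A rule field matches on an exact entry or the '*' wildcard."""
    return "*" in allowed or value in allowed


def is_allowed(rules: List[Dict[str, List[str]]], api_group: str, resource: str, verb: str) -> bool:
    """Return True if any single rule covers the group, resource, and verb."""
    return any(
        _matches(rule["apiGroups"], api_group)
        and _matches(rule["resources"], resource)
        and _matches(rule["verbs"], verb)
        for rule in rules
    )


if __name__ == "__main__":
    print(is_allowed(RULES, "batch", "jobs", "create"))
    print(is_allowed(RULES, "", "namespaces", "get"))
```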