Add blog post on Integrating Kubeflow Model Registry into Your Kubeflow Pipelines #176

Open

hpurdom wants to merge 27 commits into kubeflow:master from hpurdom:new-blog-post

hpurdom commented Jul 31, 2025

Resolves #175
This PR proposes adding a new blog post to the Kubeflow community blog. The post is titled "Integrating Kubeflow Model Registry into Your Kubeflow Pipelines".

It serves as a practical guide for data scientists and ML engineers looking to centralize their model management by integrating the Kubeflow Model Registry directly into their Kubeflow Pipelines.

Content Overview

The blog post covers:

Local Setup: Detailed instructions for setting up a local Kubeflow environment on kind, including common troubleshooting steps and their solutions.
Best Practices for Model Registration: How to connect KFP components to the Model Registry API, use Kubernetes Secrets for authentication, and implement model_source metadata for robust lineage tracking.
Sample Pipeline: A complete, runnable Python pipeline example demonstrating fake model registration with best practices.
Verification: Instructions for viewing registered models via the Model Registry API and UI.
Future Outlook: Mentions the Model Registry Integration KEP for future, more automated integration.

Contributor

google-oss-prow bot commented Jul 31, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign franciscojavierarceo for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

google-oss-prow bot requested review from juliusvonkohout and terrytangyuan

July 31, 2025 20:16

google-oss-prow bot added the size/XL label

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md


		# Why Leverage Model Registry in Your Pipelines?

		Integrating model registration as a native step within your Kubeflow Pipelines offers significant advantages for robust MLOps:

mprahl Jul 31, 2025

I wouldn't use the term "native" because that is custom user code until this KEP is implemented:
https://github.com/kubeflow/pipelines/blob/master/proposals/12020-model-registry-integration/README.md

alyssacgoins commented Jul 31, 2025

The DCO failure can be fixed quickly - just amend the commits to include a signature

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md

+              * [Python](https://www.python.org/downloads/) (3.11 or newer).
+              * [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/): Kubernetes command-line tool.
+              * [kind](https://kind.sigs.k8s.io/docs/user/quick-start/): Kubernetes in Docker.
+              * [helm](https://helm.sh/docs/intro/install/): Kubernetes package manager.

mprahl Jul 31, 2025

I don't think helm is used in this blog.

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md

+              * [kind](https://kind.sigs.k8s.io/docs/user/quick-start/): Kubernetes in Docker.
+              * [helm](https://helm.sh/docs/intro/install/): Kubernetes package manager.
+              * [git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git): For cloning repositories.
+              - **Install Kustomize:**

mprahl Jul 31, 2025

It'd be better to link to the Kustomize documentation for installation since these instructions will get out of date.

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md

+              # Perform a full system logout and login after installation to refresh PATH.
+              ```
+               # Common Troubleshooting during Setup:

mprahl Jul 31, 2025

I'd recommend putting this section towards the end or last so the reader can see the interesting parts earlier.

hpurdom force-pushed the new-blog-post branch from edf244e to 48597a7 Compare

July 31, 2025 20:53

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md

+              ```
+               # Common Troubleshooting during Setup:
+              Setting up complex MLOps tools locally can be challenging. Here, we provide the robust deployment steps, incorporating solutions for common issues encountered during this process:

mprahl Jul 31, 2025

I'd recommend having the troubleshooting be more about the pipeline itself rather than the environment setup. It's difficult to capture all the issues they may encounter, so unless there's something very common, I'd suggest leaving that out and letting the reader use other resources to solve their specific problem.

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md

Comment on lines +113 to +123

+              Install `kind` (if not already installed via dnf):
+              ```
+              sudo dnf install kind # Recommended for Fedora
+              # OR manual install if dnf fails (e.g., if 'No match for argument: kind'):
+              # cd ~
+              # curl -Lo kind https://kind.sigs.k8s.io/dl/v0.22.0/kind-linux-amd64
+              # chmod +x kind
+              # sudo mv kind /usr/local/bin/kind
+              # Perform a full system logout and login after any kind installation/update.
+              ```

mprahl Jul 31, 2025

This can be skipped since it's already covered as a prerequisite.

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md

+              Clone Kubeflow Pipelines Repository:
+              ```
+              cd ~ git clone https://github.com/kubeflow/pipelines.git kfp-local-manifests

mprahl Jul 31, 2025

This is an invalid command. Did you mean?

Suggested change

-              cd ~ git clone https://github.com/kubeflow/pipelines.git kfp-local-manifests
+              cd ~ && git clone https://github.com/kubeflow/pipelines.git kfp-local-manifests

Hailey Purdom and others added 17 commits

July 31, 2025 16:57


          feat: Add blog post on Model Registry integration with KFP

1124a1f

Signed-off-by: Hailey Purdom <[email protected]>


          feat: Add YAML front matter to blog post

165f8f1

Signed-off-by: Hailey Purdom <[email protected]>


          Fix alignment

254878f

Signed-off-by: Hailey Purdom <[email protected]>


          Fix alignment

f08340b

Signed-off-by: Hailey Purdom <[email protected]>


          Editing texts

4719b74

Signed-off-by: Hailey Purdom <[email protected]>


          Formatting comands as code blocks

f1454c7

Signed-off-by: Hailey Purdom <[email protected]>


          New edita need to check

31796f3

Signed-off-by: Hailey Purdom <[email protected]>


          Update deployement steps

d5a4ccd

Signed-off-by: Hailey Purdom <[email protected]>


          Update boldness

dddbeb2

Signed-off-by: Hailey Purdom <[email protected]>


          Updated layout

bd25b0c

Signed-off-by: Hailey Purdom <[email protected]>


          Checking update

e49af18

Signed-off-by: Hailey Purdom <[email protected]>


          check change

1485fd3

Signed-off-by: Hailey Purdom <[email protected]>


          Finishing touches

b4fbd7f

Signed-off-by: Hailey Purdom <[email protected]>


          update

f8907b5

Signed-off-by: Hailey Purdom <[email protected]>


          Updates with script

e9a9e6b

Signed-off-by: Hailey Purdom <[email protected]>


          update

8f91dc3

Signed-off-by: Hailey Purdom <[email protected]>


          update format

a634508

Signed-off-by: Hailey Purdom <[email protected]>

Hailey Purdom added 3 commits

July 31, 2025 16:57


          update

dd78783

Signed-off-by: Hailey Purdom <[email protected]>


          feat: Add blog post on MR integration

9c798f4

Signed-off-by: Hailey Purdom <[email protected]>


          docs: Add a minor clarification for PR

d116cee

Signed-off-by: Hailey Purdom <[email protected]>

hpurdom force-pushed the new-blog-post branch from 48597a7 to d116cee Compare

July 31, 2025 20:57

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md

+              # Perform a full system logout and login after any kind installation/update.
+              ```
+              Clone Kubeflow Pipelines Repository:

mprahl Jul 31, 2025

A lot of this section seems to duplicate what the install script does. We should be able to just use the install script and also just use the kubeflow/manifests repo for installation.

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md

+              kubectl get pods -n kubeflow -l app=model-registry-ui
+              ```
+              - **Option B:** Multi-User Kubeflow Deployment (More comprehensive, but complex): This option deploys a full multi-user Kubeflow environment, which includes Istio, Dex, and other components necessary for authentication and traffic management. This path is significantly more time-consuming and complex to set up, but it enables the Model Registry UI to integrate with the Central Dashboard in a production-like, authenticated manner.

mprahl Jul 31, 2025

I personally would just go with this option in this section to keep things simple.

Author

hpurdom commented Jul 31, 2025

@alyssacgoins Thank you so much, the DCO check now passes thanks to your help!

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md

+              - Configure Profiles + KFAM (Kubeflow Access Management).
+              - Configure User Namespaces.
+              **The following `install.sh script`, developed by Matt Prahl, automates the complex, multi-component Kubeflow deployment stated above on Kind.**

mprahl Jul 31, 2025

It'd be good to say which commit or kubeflow/manifests tag you used for this blog in case a future change breaks the install script.

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md



		Accessing the cluster:
		After this complex setup, you would typically access the Kubeflow Central Dashboard via an Ingress or LoadBalancer, as described in the [Connect to your Kubeflow cluster](https://github.com/kubeflow/manifests?tab=readme-ov-file#connect-to-your-kubeflow-cluster) section of the `kubeflow/manifests` README.

mprahl Jul 31, 2025

"After this complex setup" 😆

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md

Comment on lines +376 to +392

+              ***Once the full multi-user Kubeflow is deployed, you would then apply the Model Registry UI's Istio-dependent overlay:***
+              ```
+              cd ~/kubeflow-manifests-repo/applications/model-registry/upstream/options/ui/overlays/istio
+              kubectl apply -k . -n kubeflow
+              ```
+              This would then work because Istio would be present.
+              - Integrate Model Registry UI with Kubeflow Central Dashboard (Optional but Recommended): This step adds a direct link to the Model Registry UI in the Kubeflow Central Dashboard's navigation menu.
+              ```
+              kubectl get configmap centraldashboard-config -n kubeflow -o json | \
+              jq '.data.links |= (fromjson | .menuLinks += [{"icon": "assignment", "link": "/model-registry/", "text": "Model Registry", "type": "item"}] | tojson)' | \
+              kubectl apply -f - -n kubeflow
+              ```
+              ***Note: This command requires jq to be installed (sudo dnf install jq). If you prefer, you can edit the ConfigMap manually: kubectl edit configmap -n kubeflow centraldashboard-config and add the menu item under data.links.menuLinks.***

mprahl Jul 31, 2025

This is already handled in the install script.

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md

Comment on lines +394 to +401

+              **Set Up Port-Forwards:**
+              Open three separate terminal windows and run:
+              - KFP UI: `kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80`
+              - Model Registry API: `kubectl port-forward -n kubeflow svc/model-registry-service 8082:8080`
+              - Model Registry UI: `kubectl port-forward -n kubeflow svc/model-registry-ui-service 8084:80`
+              - Minio (for KFP logs): `kubectl; port-forward -n kubeflow svc/minio-service 9000:9000`

mprahl Jul 31, 2025

This doesn't match the Kubeflow platform installation instructions. It should be kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80.

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md


		# Connecting KFP to Model Registry: Best Practices

		To enable your KFP component to register models, you need to provide it with the Model Registry's API endpoint and, ideally, an authentication token. While the Model Registry API can be accessed directly, using a Kubernetes Secret for sensitive information like tokens is a best practice.

mprahl Jul 31, 2025

When using the install.sh script, it seemed the Model Registry API didn't require any authentication or authorization. Did you add this after the fact?
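
For illustration, here is a minimal sketch of how a component could read its connection settings while treating the token as optional, which would cover a deployment where the API does not enforce authentication. `MR_AUTH_TOKEN` and the service address come from the post; the `MR_SERVER_ADDRESS` variable and the `load_registry_settings` helper are hypothetical names introduced only for this example.

```python
import os
from typing import Mapping, Optional


def load_registry_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect Model Registry connection settings from environment variables."""
    env = os.environ if env is None else env
    # Hypothetical override variable; the default is the in-cluster address
    # quoted in the post.
    address = env.get(
        "MR_SERVER_ADDRESS",
        "http://model-registry-service.kubeflow.svc.cluster.local",
    )
    # Set when the model-registry-auth Secret is exposed to the pod.
    # A missing or empty value is treated as "no token".
    token = env.get("MR_AUTH_TOKEN") or None
    return {"server_address": address, "port": 8080, "user_token": token}


# Example with an explicit mapping instead of the real process environment.
settings = load_registry_settings({"MR_AUTH_TOKEN": "example-token"})
print(settings["user_token"])
```

The resulting values could then be handed to the `ModelRegistry` client, assuming its constructor accepts `server_address`, `port`, and `user_token` keyword arguments as in recent releases of the `model-registry` package.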

mprahl reviewed

_posts/2025-07-25-model-registry-integration.md

+              * **Base Image & Dependencies:** The component uses a `python:3.11-slim-buster` base image, and the `model-registry==0.2.19` client library is automatically installed into its container environment via the `packages_to_install` argument in the decorator.
+              * **Model Registry Connection:** Inside the component, the `ModelRegistry` client is initialized. It connects to the Model Registry API server using its internal Kubernetes DNS name (`http://model-registry-service.kubeflow.svc.cluster.local:8080`).
+              * **Authentication:** For authentication, the component retrieves a token from the `MR_AUTH_TOKEN` environment variable using `os.environ.get()`. This environment variable is populated by mounting a Kubernetes Secret (`model-registry-auth`) into the component's pod, ensuring sensitive credentials are not hardcoded.
+              * **Dynamic Metadata for Lineage:** To achieve robust cross-referencing and lineage tracking, the component dynamically captures details about its own pipeline run. It retrieves values like `KFP_RUN_ID`, `KFP_PIPELINE_NAME`, and `KFP_POD_NAMESPACE` from environment variables that Kubeflow Pipelines automatically inject into every component pod. These values are then passed as `model_source_id`, `model_source_name`, `model_source_class`, `model_source_kind`, and `model_source_group` when registering the model.

mprahl Jul 31, 2025

I don't think these environment variables exist. You need to use placeholders such as this:
https://kubeflow-pipelines.readthedocs.io/en/stable/source/dsl.html#kfp.dsl.PIPELINE_JOB_ID_PLACEHOLDER
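
A rough sketch of the placeholder approach: the component receives the run details as ordinary string parameters, and the pipeline supplies placeholder constants that KFP resolves at run time. The helper name and the `kind`/`class` values below are illustrative assumptions, not taken from the PR, and `dsl.PIPELINE_JOB_NAME_PLACEHOLDER` is assumed to be available alongside the linked `PIPELINE_JOB_ID_PLACEHOLDER`.

```python
def build_model_source_metadata(run_id: str, run_name: str, namespace: str) -> dict:
    """Map pipeline run details onto the model_source_* fields named in the post."""
    return {
        "model_source_kind": "kfp",           # illustrative value
        "model_source_class": "pipelinerun",  # illustrative value
        "model_source_group": namespace,
        "model_source_id": run_id,
        "model_source_name": run_name,
    }


# In the pipeline definition, the run details would be passed as arguments
# whose values are placeholders rather than read from environment variables:
#   register_model(run_id=dsl.PIPELINE_JOB_ID_PLACEHOLDER,
#                  run_name=dsl.PIPELINE_JOB_NAME_PLACEHOLDER, ...)
metadata = build_model_source_metadata("run-123", "demo-run", "example-namespace")
print(metadata)
```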

mprahl reviewed

code_examples/model_registration_pipeline_example.py Outdated

+                  # Model Registry service address within the Kubernetes cluster
+                  # This uses the internal Kubernetes DNS name.
+                  MODEL_REGISTRY_SERVER_ADDRESS = "http://model-registry-service.kubeflow.svc.cluster.local"

mprahl Jul 31, 2025

Is this correct in multi-user mode? I thought it was in the user profile namespace.
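
One way to keep the example correct in either case is to build the address from a namespace parameter instead of hardcoding `kubeflow`. The sketch below assumes the service keeps the name `model-registry-service`; which namespace actually hosts it in multi-user mode is the open question here, so the helper (a hypothetical name) leaves that to the caller.

```python
def model_registry_address(namespace: str, service: str = "model-registry-service") -> str:
    """Build the in-cluster DNS address for the Model Registry service."""
    return f"http://{service}.{namespace}.svc.cluster.local"


# The namespace becomes a pipeline or component parameter rather than a constant.
print(model_registry_address("kubeflow"))
```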

mprahl reviewed

code_examples/model_registration_pipeline_example.py Outdated

+                          # In a real scenario, you'd get the actual modelVersionId from the API response
+                          # For now, we'll use a placeholder or query it if needed.
+                          "versionID": "placeholder-version-id", # Model Registry client doesn't return version ID directly from register_model
+                          "modelRegistryURL": f"http://localhost:{MODEL_REGISTRY_PORT}/models/{registered_model_and_version.id}/versions/{model_version_name}",

mprahl Jul 31, 2025

localhost is not reachable outside of the cluster.
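
A possible fix is to take the externally reachable base URL as an input and build the link from it. This is only a sketch: the path layout mirrors the f-string in the diff above, while `model_version_url` and the example base URL are hypothetical.

```python
from urllib.parse import quote


def model_version_url(base_url: str, model_id: str, version_name: str) -> str:
    """Build a link to a model version from a caller-supplied base URL."""
    base = base_url.rstrip("/")
    # Percent-encode the path segments so unusual names cannot break the URL.
    model_segment = quote(str(model_id), safe="")
    version_segment = quote(version_name, safe="")
    return f"{base}/models/{model_segment}/versions/{version_segment}"


# The base URL would come from a pipeline parameter or configuration,
# not from a hardcoded localhost port-forward.
print(model_version_url("https://kubeflow.example.com/model-registry/", "1", "v1.0.0-blog-post"))
```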

mprahl reviewed

code_examples/model_registration_pipeline_example.py Outdated

+                          "modelID": registered_model_and_version.id,
+                          # In a real scenario, you'd get the actual modelVersionId from the API response
+                          # For now, we'll use a placeholder or query it if needed.
+                          "versionID": "placeholder-version-id", # Model Registry client doesn't return version ID directly from register_model

mprahl Jul 31, 2025

You can just call model_version_id = registry.get_model_version(model_name, model_version_name).id to get it.
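
In context, that lookup could replace the placeholder as sketched below. The function takes the registry client as an argument, so the example uses a small stand-in class (`FakeRegistry`, hypothetical) in order to run without a cluster; with the real client, `registry` would be the `ModelRegistry` instance created in the component.

```python
from types import SimpleNamespace


def collect_registration_ids(registry, registered_model, model_name: str, model_version_name: str) -> dict:
    """Return the real model and version IDs instead of a placeholder."""
    model_version = registry.get_model_version(model_name, model_version_name)
    return {"modelID": registered_model.id, "versionID": model_version.id}


class FakeRegistry:
    """Stand-in for the Model Registry client, used only to exercise the sketch."""

    def get_model_version(self, name: str, version: str):
        return SimpleNamespace(id="2", name=version)


ids = collect_registration_ids(
    FakeRegistry(), SimpleNamespace(id="1"), "MyFakeModel", "v1.0.0-blog-post"
)
print(ids)
```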

mprahl reviewed

code_examples/model_registration_pipeline_example.py Outdated

Comment on lines 20 to 27

+                  # These parameters are included in the component signature for completeness,
+                  # but their values are not directly used in this simplified registration logic.
+                  model_data_type: str = "tabular",
+                  model_storage_format: str = "pickle",
+                  model_container_image: str = "my-fake-model-server:latest",
+                  model_serving_path: str = "/mnt/models/fake_model.pkl",
+                  model_serving_input_schema: str = '{"type": "object", "properties": {"feature1": {"type": "number"}}}',
+                  model_serving_output_schema: str = '{"type": "object", "properties": {"prediction": {"type": "number"}}}',

mprahl Jul 31, 2025

I'd suggest removing these and keep it simple.

mprahl reviewed

code_examples/model_registration_pipeline_example.py Outdated

+              def model_registration_pipeline(
+                  model_name: str = "MyFakeModel",
+                  model_version_name: str = "v1.0.0-blog-post", # Unique version for blog post example
+                  model_artifact_uri: str = "s3://my-model-bucket/fake/v1.0.0/model.pkl",

mprahl Jul 31, 2025

Doesn't this come from output_model.uri? We don't need a pipeline input parameter.

mprahl reviewed

code_examples/model_registration_pipeline_example.py Outdated

+                      model_version_name=model_version_name,
+                      model_artifact_uri=model_artifact_uri,
+                      model_author=model_author,
+                      output_model=dsl.Output[dsl.Model](), # Pass a KFP Output Model artifact

mprahl Jul 31, 2025

You're not intended to pass this. An output model is a special type. For more info, check out the KFP SDK docs.

Hailey Purdom added 3 commits

August 1, 2025 14:12


          Updated code

17894af


          update code

1b17e4d


          Apply final blog post formatting and content updates")

a94a23c

review-notebook-app bot commented Aug 5, 2025

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

google-oss-prow bot added size/XXL and removed size/XL labels

Hailey Purdom added 4 commits

August 5, 2025 11:00


          edited pip install

bc3a682


          edited code again

3859d28


          took out some lines

e132b0c


          updated some names

bfc33ff
