
Commit ef66cc0

Authored by safoinme, AlexejPenner, actions-user, github-actions[bot], and htahir1
Databricks integrations (#2823)
* Add Databricks integration to ZenML
* Add entrypoint and wheeled orchestrator classes
* Add environment variable support to entrypoint.py and base_step.py
* Update Databricks integration requirements: the Databricks integration now requires the "databricks-sdk" package instead of "workflows_authoring_toolkit"
* refactor: Remove unnecessary code in DatabricksIntegration class
* Update Databricks integration
* Update DatabricksOrchestrator to use empty strings for host, client_id, and client_secret
* Auto-update of LLM Finetuning template
* Refactor DatabricksOrchestratorConfig to use empty strings for host, client_id, and client_secret
* chore: Remove hardcoding of ZenML library in Databricks orchestrator utils
* Update Databricks orchestrator utils to remove hardcoding of ZenML library
* databricks deployer
* Update Databricks orchestrator utils to use correct import path
* Update Databricks integration
* Update Databricks integration
* Update Databricks integration
* format
* update host
* update pipeline_name
* Update Databricks orchestrator to use orchestrator_run_name instead of pipeline_name
* Update Databricks orchestrator to use orchestrator_run_name instead of pipeline_name
* Refactor DatabricksDeploymentService to use Client from zenml.client module
* add mlflow example as databricks example to delete later
* format
* update demo
* update demo
* update demo
* Auto-update of E2E template
* update demo
* Auto-update of E2E template
* update demo
* update demo and orchestrator
* remove logs
* rename demo
* rename demo
* rename demo
* Fixed info log
* Fixed smaller log messages
* Update DatabricksDeploymentService to handle missing endpoint secret name and token
* Simplify DatabricksOrchestrator environment variable handling
* Update wheeled_orchestrator.py with logger for wheel creation output
* update how wheels are run
* update load source
* Refactor DatabricksOrchestrator environment variable handling
* Refactor DatabricksOrchestrator environment variable handling
* Refactor DatabricksOrchestrator environment variable handling
* test env variable
* Refactor DatabricksOrchestrator environment variable handling
* Remove unused job parameter definition in DatabricksOrchestrator
* Refactor DatabricksOrchestrator environment variable handling
* fix mypy and add docs
* Optimised images with calibre/image-actions
* Refactor DatabricksOrchestrator error message for job id retrieval
* Refactor DatabricksOrchestrator error message for job id retrieval
* update based on reviews
* Optimised images with calibre/image-actions
* add diagram
* Optimised images with calibre/image-actions
* Update docs/book/component-guide/model-deployers/databricks.md (Co-authored-by: Hamza Tahir <[email protected]>)
* remove demo and fix mypy
* add databricks to docs
* add alpha tag to databricks
* update logos

---------

Co-authored-by: AlexejPenner <[email protected]>
Co-authored-by: GitHub Actions <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Hamza Tahir <[email protected]>
1 parent 9be9fcd commit ef66cc0

27 files changed (+2258 −2 lines)
(Four binary image assets added: 165 KB, 709 KB, 335 KB, 131 KB — previews not rendered.)
docs/book/component-guide/model-deployers/databricks.md

Lines changed: 148 additions & 0 deletions

---
description: >-
  Deploying models to Databricks Inference Endpoints with Databricks Model Serving.
---

# Databricks

Databricks Model Serving (also called Mosaic AI Model Serving) provides a unified interface to deploy, govern, and query AI models. Each model you serve is available as a REST API that you can integrate into your web or client application.

This service provides dedicated and autoscaling infrastructure managed by Databricks, allowing you to deploy models without dealing with containers and GPUs.

{% hint style="info" %}
The Databricks Model Deployer can be considered a managed service for deploying models with MLflow. This means you can switch between the MLflow and Databricks Model Deployers without changing your pipeline code, even for custom, complex models.
{% endhint %}

## When to use it?

You should use the Databricks Model Deployer if:

* you are already using Databricks for your data and ML workloads.
* you want to deploy AI models without dealing with containers and GPUs; the Databricks Model Deployer provides a unified interface to deploy, govern, and query models.
* you want dedicated and autoscaling infrastructure managed by Databricks, making it easier to deploy models at scale.
* enterprise security is a priority and you need to deploy models into secure offline endpoints accessible only via a direct connection to your Virtual Private Cloud (VPC).
* your goal is to turn your models into production-ready APIs with minimal infrastructure or MLOps involvement.

If you are looking for an easier way to deploy your models locally, you can use the [MLflow Model Deployer](mlflow.md) flavor.

## How to deploy it?

The Databricks Model Deployer flavor is provided by the Databricks ZenML integration, so you need to install it on your local machine to be able to deploy your models. You can do this by running the following command:

```bash
zenml integration install databricks -y
```

To register the Databricks model deployer with ZenML, you need to run the following command:

```bash
zenml model-deployer register <MODEL_DEPLOYER_NAME> --flavor=databricks --host=<HOST> --client_id={{databricks.client_id}} --client_secret={{databricks.client_secret}}
```

{% hint style="info" %}
We recommend creating a Databricks service account with the necessary permissions to create and run jobs. You can find more information on how to create a service account [here](https://docs.databricks.com/dev-tools/api/latest/authentication.html). You can generate a client_id and client_secret for the service account and use them to authenticate with Databricks.
{% endhint %}
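
The `{{databricks.client_id}}` and `{{databricks.client_secret}}` placeholders above reference keys of a ZenML secret. As a sketch (assuming you name the secret `databricks` to match the placeholders), you could create it like this:

```bash
# Store the service account credentials in a ZenML secret named "databricks"
zenml secret create databricks \
    --client_id=<YOUR_CLIENT_ID> \
    --client_secret=<YOUR_CLIENT_SECRET>
```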

We can now use the model deployer in our stack:

```bash
zenml stack update <CUSTOM_STACK_NAME> --model-deployer=<MODEL_DEPLOYER_NAME>
```

See the [databricks\_model\_deployer\_step](https://sdkdocs.zenml.io/latest/integration\_code\_docs/integrations-databricks/#zenml.integrations.databricks.steps.databricks\_deployer.databricks\_model\_deployer\_step) for an example of using the Databricks Model Deployer to deploy a model inside a ZenML pipeline step.

## Configuration

Within the `DatabricksServiceConfig` you can configure:

* `model_name`: The name of the model that will be served. This is used to identify the model in the Databricks Model Registry.
* `model_version`: The version of the model that will be served. This is used to identify the model in the Databricks Model Registry.
* `workload_size`: The size of the workload that the model will be serving. This can be `ServedModelInputWorkloadSize.SMALL`, `ServedModelInputWorkloadSize.MEDIUM`, or `ServedModelInputWorkloadSize.LARGE`; you can import this enum with `from databricks.sdk.service.serving import ServedModelInputWorkloadSize`.
* `scale_to_zero_enabled`: A boolean flag to enable or disable the scale-to-zero feature.
* `env_vars`: A dictionary of environment variables to be passed to the model serving container.
* `workload_type`: The type of workload that the model will be serving. This can be `ServedModelInputWorkloadType.CPU`, `ServedModelInputWorkloadType.GPU_LARGE`, `ServedModelInputWorkloadType.GPU_MEDIUM`, `ServedModelInputWorkloadType.GPU_SMALL`, or `ServedModelInputWorkloadType.MULTIGPU_MEDIUM`; you can import this enum with `from databricks.sdk.service.serving import ServedModelInputWorkloadType`.
* `endpoint_secret_name`: The name of the secret that will be used to secure the endpoint and authenticate requests.

For more information and a full list of configurable attributes of the Databricks Model Deployer, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration\_code\_docs/integrations-databricks/#zenml.integrations.databricks.model\_deployers).

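As a hedged illustration of these options — the import location of `DatabricksServiceConfig` below is an assumption based on the integration layout, so check the SDK docs above — a configuration might look like this:

```python
from databricks.sdk.service.serving import (
    ServedModelInputWorkloadSize,
    ServedModelInputWorkloadType,
)

# Assumed import location, alongside DatabricksDeploymentService (see SDK docs):
from zenml.integrations.databricks.services import DatabricksServiceConfig

service_config = DatabricksServiceConfig(
    model_name="my-classifier",          # name in the Databricks Model Registry
    model_version="1",                   # registry version to serve
    workload_size=ServedModelInputWorkloadSize.SMALL,
    workload_type=ServedModelInputWorkloadType.CPU,
    scale_to_zero_enabled=True,          # scale the endpoint to zero when idle
    env_vars={"LOG_LEVEL": "INFO"},      # passed to the serving container
    endpoint_secret_name="databricks_endpoint_secret",  # hypothetical secret name
)
```
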
### Run inference on a provisioned inference endpoint

The following code example shows how to run inference against a provisioned inference endpoint:

```python
from typing import Annotated

from zenml import step, pipeline
from zenml.integrations.databricks.model_deployers import DatabricksModelDeployer
from zenml.integrations.databricks.services import DatabricksDeploymentService


# Load a prediction service deployed in another pipeline
@step(enable_cache=False)
def prediction_service_loader(
    pipeline_name: str,
    pipeline_step_name: str,
    running: bool = True,
    model_name: str = "default",
) -> DatabricksDeploymentService:
    """Get the prediction service started by the deployment pipeline.

    Args:
        pipeline_name: name of the pipeline that deployed the prediction server
        pipeline_step_name: the name of the step that deployed the prediction server
        running: when this flag is set, the step only returns a running service
        model_name: the name of the model that is deployed
    """
    # get the Databricks model deployer stack component
    model_deployer = DatabricksModelDeployer.get_active_model_deployer()

    # fetch existing services with the same pipeline name, step name and model name
    existing_services = model_deployer.find_model_server(
        pipeline_name=pipeline_name,
        pipeline_step_name=pipeline_step_name,
        model_name=model_name,
        running=running,
    )

    if not existing_services:
        raise RuntimeError(
            f"No Databricks inference endpoint deployed by step "
            f"'{pipeline_step_name}' in pipeline '{pipeline_name}' with name "
            f"'{model_name}' is currently running."
        )

    return existing_services[0]


# Use the service for inference
@step
def predictor(
    service: DatabricksDeploymentService,
    data: str,
) -> Annotated[str, "predictions"]:
    """Run an inference request against a prediction service."""
    prediction = service.predict(data)
    return prediction


@pipeline
def databricks_deployment_inference_pipeline(
    pipeline_name: str,
    pipeline_step_name: str = "databricks_model_deployer_step",
):
    inference_data = ...
    model_deployment_service = prediction_service_loader(
        pipeline_name=pipeline_name,
        pipeline_step_name=pipeline_step_name,
    )
    predictions = predictor(model_deployment_service, inference_data)
```
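
A minimal way to run this inference pipeline might then look as follows (the deployment pipeline name below is a placeholder you would replace with your own):

```python
if __name__ == "__main__":
    # "databricks_deployment_pipeline" is a hypothetical name for the pipeline
    # that previously deployed the model with the Databricks model deployer step.
    databricks_deployment_inference_pipeline(
        pipeline_name="databricks_deployment_pipeline",
    )
```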
<figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>
docs/book/component-guide/orchestrators/databricks.md

Lines changed: 192 additions & 0 deletions

---
description: Orchestrating your pipelines to run on Databricks.
---

# Databricks Orchestrator

[Databricks](https://www.databricks.com/) is a unified data analytics platform that combines the best of data warehouses and data lakes to offer an integrated solution for big data processing and machine learning. It provides a collaborative environment for data scientists, data engineers, and business analysts to work together on data projects. Databricks offers optimized performance and scalability for big data workloads.

The Databricks orchestrator is an orchestrator flavor provided by the ZenML Databricks integration that allows you to run your pipelines on Databricks. This integration lets you leverage Databricks' powerful distributed computing capabilities and optimized environment for your ML pipelines within the ZenML framework.

{% hint style="warning" %}
The following features are currently in Alpha and may be subject to change. We recommend using them in a controlled environment and providing feedback to the ZenML team.
{% endhint %}

### When to use it

You should use the Databricks orchestrator if:

* you're already using Databricks for your data and ML workloads.
* you want to leverage Databricks' powerful distributed computing capabilities for your ML pipelines.
* you're looking for a managed solution that integrates well with other Databricks services.
* you want to take advantage of Databricks' optimization for big data processing and machine learning.

### Prerequisites

You will need the following to start using the Databricks orchestrator:

* An active Databricks workspace. Depending on the cloud provider you are using, you can find more information on how to create one here:
  * [AWS](https://docs.databricks.com/en/getting-started/onboarding-account.html)
  * [Azure](https://learn.microsoft.com/en-us/azure/databricks/getting-started/#--create-an-azure-databricks-workspace)
  * [GCP](https://docs.gcp.databricks.com/en/getting-started/index.html)
* An active Databricks account or service account with sufficient permissions to create and run jobs.

## How it works

![Databricks How It Works Diagram](../../.gitbook/assets/Databricks_How_It_works.png)

The Databricks orchestrator in ZenML leverages the concept of wheel packages. When you run a pipeline with the Databricks orchestrator, ZenML creates a Python wheel package from your project. This wheel package contains all the necessary code and dependencies for your pipeline.

Once the wheel package is created, ZenML uploads it to Databricks and uses the Databricks SDK to create a job definition. This job definition includes information about the pipeline steps and ensures that each step is executed only after its upstream steps have successfully completed.

The Databricks job is also configured with the necessary cluster settings to run. This includes specifying the version of Spark to use, the number of workers, the node type, and other configuration options.

When the Databricks job is executed, it retrieves the wheel package from Databricks and runs the pipeline using the specified cluster configuration. The job ensures that the steps are executed in the correct order based on their dependencies.

Once the job is completed, ZenML retrieves the logs and status of the job and updates the pipeline run accordingly. This allows you to monitor the progress of your pipeline and view the logs of each step.
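
To make the job-definition step more concrete, the sketch below shows roughly what creating a multi-task wheel job looks like with the `databricks-sdk` directly. This is an illustration of the mechanism, not ZenML's actual internal code; the wheel name, entry point, and wheel path are hypothetical:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute, jobs

# Picks up host/client_id/client_secret from the environment or config profile.
w = WorkspaceClient()

cluster = compute.ClusterSpec(
    spark_version="15.3.x-scala2.12",
    node_type_id="Standard_D4s_v5",
    num_workers=3,
)

# Hypothetical path of the uploaded pipeline wheel.
wheel = compute.Library(whl="dbfs:/wheels/my_zenml_pipeline-0.1-py3-none-any.whl")

# Two tasks chained via depends_on, mirroring step dependencies in a pipeline.
job = w.jobs.create(
    name="zenml-pipeline-job",
    tasks=[
        jobs.Task(
            task_key="loader",
            new_cluster=cluster,
            libraries=[wheel],
            python_wheel_task=jobs.PythonWheelTask(
                package_name="my_zenml_pipeline",   # hypothetical wheel name
                entry_point="entrypoint",           # hypothetical entry point
                parameters=["--step_name", "loader"],
            ),
        ),
        jobs.Task(
            task_key="trainer",
            depends_on=[jobs.TaskDependency(task_key="loader")],
            new_cluster=cluster,
            libraries=[wheel],
            python_wheel_task=jobs.PythonWheelTask(
                package_name="my_zenml_pipeline",
                entry_point="entrypoint",
                parameters=["--step_name", "trainer"],
            ),
        ),
    ],
)
print(f"Created Databricks job {job.job_id}")
```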

### How to use it

To use the Databricks orchestrator, you first need to register it and add it to your stack. Before registering the orchestrator, install the Databricks integration by running the following command:

```shell
zenml integration install databricks
```

This command installs the necessary dependencies, including the `databricks-sdk` package, which is required for authentication with Databricks. Once the integration is installed, you can register the orchestrator and configure the necessary authentication details, then use it in our active stack:

```shell
zenml orchestrator register databricks_orchestrator --flavor=databricks --host="https://xxxxx.x.azuredatabricks.net" --client_id={{databricks.client_id}} --client_secret={{databricks.client_secret}}
```

{% hint style="info" %}
We recommend creating a Databricks service account with the necessary permissions to create and run jobs. You can find more information on how to create a service account [here](https://docs.databricks.com/dev-tools/api/latest/authentication.html). You can generate a client_id and client_secret for the service account and use them to authenticate with Databricks.

![Databricks Service Account Permissions](../../.gitbook/assets/DatabricksPermessions.png)
{% endhint %}

```shell
# Add the orchestrator to your stack
zenml stack register databricks_stack -o databricks_orchestrator ... --set
```

You can now run any ZenML pipeline using the Databricks orchestrator:

```shell
python run.py
```

### Databricks UI

Databricks comes with its own UI that you can use to find further details about your pipeline runs, such as the logs of your steps.

![Databricks UI](../../.gitbook/assets/DatabricksUI.png)

For any runs executed on Databricks, you can get the URL to the Databricks UI in Python using the following code snippet:

```python
from zenml.client import Client

pipeline_run = Client().get_pipeline_run("<PIPELINE_RUN_NAME>")
orchestrator_url = pipeline_run.run_metadata["orchestrator_url"].value
```

![Databricks Run UI](../../.gitbook/assets/DatabricksRunUI.png)

### Run pipelines on a schedule

The Databricks orchestrator supports running pipelines on a schedule using its [native scheduling capability](https://docs.databricks.com/en/workflows/jobs/schedule-jobs.html).

**How to schedule a pipeline**

```python
from zenml.config.schedule import Schedule

# Run a pipeline every 5th minute
pipeline_instance.run(
    schedule=Schedule(
        cron_expression="*/5 * * * *"
    )
)
```

{% hint style="warning" %}
The Databricks orchestrator only supports the `cron_expression` parameter in the `Schedule` object and will ignore all other parameters supplied to define the schedule.
{% endhint %}

{% hint style="warning" %}
The Databricks orchestrator requires Java timezone IDs to be used with the `cron_expression`. You can find a list of supported timezones [here](https://docs.oracle.com/middleware/1221/wcs/tag-ref/MISC/TimeZones.html). The timezone ID must be set in the settings of the orchestrator (see below for more information on how to set settings for the orchestrator).
{% endhint %}

**How to delete a scheduled pipeline**

Note that ZenML only gets involved to schedule a run; maintaining the lifecycle of the schedule is the responsibility of the user.

In order to cancel a scheduled Databricks pipeline, you need to manually delete the schedule in Databricks (via the UI or the CLI).
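
For example, with the Databricks CLI this could look roughly like the following (flags vary between CLI versions, so treat this as a sketch):

```bash
# Find the job that backs the scheduled pipeline, then delete it
# along with its schedule.
databricks jobs list
databricks jobs delete <JOB_ID>
```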

### Additional configuration

For additional configuration of the Databricks orchestrator, you can pass `DatabricksOrchestratorSettings`, which allows you to change the Spark version, number of workers, node type, autoscale settings, Spark configuration, Spark environment variables, and schedule timezone.

```python
from zenml.integrations.databricks.flavors.databricks_orchestrator_flavor import DatabricksOrchestratorSettings

databricks_settings = DatabricksOrchestratorSettings(
    spark_version="15.3.x-scala2.12",
    num_workers=3,
    node_type_id="Standard_D4s_v5",
    policy_id=POLICY_ID,  # assumes POLICY_ID is defined elsewhere
    autoscale=(2, 3),
    spark_conf={},
    spark_env_vars={},
    # Use a Java timezone ID such as "America/Los_Angeles" or "PST"; see
    # https://docs.oracle.com/middleware/1221/wcs/tag-ref/MISC/TimeZones.html
    schedule_timezone="America/Los_Angeles",
)
```

These settings can then be specified at either the pipeline level or the step level:

```python
# Either specify on pipeline-level
@pipeline(
    settings={
        "orchestrator.databricks": databricks_settings,
    }
)
def my_pipeline():
    ...
```
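
And the step-level equivalent:

```python
# Or specify on step-level
from zenml import step

@step(
    settings={
        "orchestrator.databricks": databricks_settings,
    }
)
def my_step():
    ...
```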

We can also enable GPU support for the Databricks orchestrator by changing the `spark_version` and `node_type_id` to a GPU-enabled version and node type:

```python
from zenml.integrations.databricks.flavors.databricks_orchestrator_flavor import DatabricksOrchestratorSettings

databricks_settings = DatabricksOrchestratorSettings(
    spark_version="15.3.x-gpu-ml-scala2.12",
    node_type_id="Standard_NC24ads_A100_v4",
    policy_id=POLICY_ID,  # assumes POLICY_ID is defined elsewhere
    autoscale=(1, 2),
)
```

With these settings, the orchestrator will use a GPU-enabled Spark version and a GPU-enabled node type to run the pipeline on Databricks. The next section shows how to enable CUDA so the GPU delivers its full acceleration for your pipeline.

#### Enabling CUDA for GPU-backed hardware

Note that if you wish to use this orchestrator to run steps on a GPU, you will need to follow [the instructions on this page](../../how-to/training-with-gpus/training-with-gpus.md) to ensure that it works. It requires some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration.

<figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>

Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration\_code\_docs/integrations-databricks/#zenml.integrations.databricks.flavors.databricks\_orchestrator\_flavor.DatabricksOrchestratorSettings) for a full list of available attributes and [this docs page](../../how-to/use-configuration-files/runtime-configuration.md) for more information on how to specify settings.

For more information and a full list of configurable attributes of the Databricks orchestrator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration\_code\_docs/integrations-databricks/#zenml.integrations.databricks.orchestrators.databricks\_orchestrator.DatabricksOrchestrator).

docs/book/toc.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -187,6 +187,7 @@
 * [Kubernetes Orchestrator](component-guide/orchestrators/kubernetes.md)
 * [Google Cloud VertexAI Orchestrator](component-guide/orchestrators/vertex.md)
 * [AWS Sagemaker Orchestrator](component-guide/orchestrators/sagemaker.md)
+* [Databricks Orchestrator](component-guide/orchestrators/databricks.md)
 * [Tekton Orchestrator](component-guide/orchestrators/tekton.md)
 * [Airflow Orchestrator](component-guide/orchestrators/airflow.md)
 * [Skypilot VM Orchestrator](component-guide/orchestrators/skypilot-vm.md)
@@ -223,6 +224,7 @@
 * [Seldon](component-guide/model-deployers/seldon.md)
 * [BentoML](component-guide/model-deployers/bentoml.md)
 * [Hugging Face](component-guide/model-deployers/huggingface.md)
+* [Databricks](component-guide/model-deployers/databricks.md)
 * [Develop a Custom Model Deployer](component-guide/model-deployers/custom.md)
 * [👣 Step Operators](component-guide/step-operators/step-operators.md)
 * [Amazon SageMaker](component-guide/step-operators/sagemaker.md)
```

src/zenml/__init__.py

Lines changed: 2 additions & 0 deletions

```diff
@@ -56,6 +56,7 @@
 from zenml.new.steps.step_decorator import step
 from zenml.new.steps.step_context import get_step_context
 from zenml.steps.utils import log_step_metadata
+from zenml.entrypoints import entrypoint

 __all__ = [
     "ArtifactConfig",
@@ -74,4 +75,5 @@
     "save_artifact",
     "show",
     "step",
+    "entrypoint",
 ]
```
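
With this export in place, the symbol is importable from the top-level package; a minimal sanity check (assuming nothing about what `entrypoint` is beyond the re-export):

```python
# The new re-export makes this top-level import possible.
from zenml import entrypoint

print(entrypoint)
```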

src/zenml/integrations/__init__.py

Lines changed: 1 addition & 0 deletions

```diff
@@ -23,6 +23,7 @@
 from zenml.integrations.azure import AzureIntegration  # noqa
 from zenml.integrations.bentoml import BentoMLIntegration  # noqa
 from zenml.integrations.bitbucket import BitbucketIntegration  # noqa
+from zenml.integrations.databricks import DatabricksIntegration  # noqa
 from zenml.integrations.comet import CometIntegration  # noqa
 from zenml.integrations.deepchecks import DeepchecksIntegration  # noqa
 from zenml.integrations.discord import DiscordIntegration  # noqa
```

src/zenml/integrations/constants.py

Lines changed: 1 addition & 0 deletions

```diff
@@ -22,6 +22,7 @@
 BITBUCKET = "bitbucket"
 COMET = "comet"
 DASH = "dash"
+DATABRICKS = "databricks"
 DEEPCHECKS = "deepchecks"
 DISCORD = "discord"
 EVIDENTLY = "evidently"
```
