@@ -131,11 +131,11 @@ On the Create workbench page, complete the following information.

Once you have entered the information for your workbench, click **Create**.

![Fill Workbench Information](images/tensorflow-workbench.png)

For our example project, let's name it "Tensorflow Workbench". We'll select the
**Jupyter | TensorFlow | CUDA | Python 3.12** image, choose a **Deployment size**
of **Small**, **Accelerator** of **NVIDIA A100 GPU**, **Number of accelerators**
as **1** and allocate a **Cluster storage** space of **20GB** (Selected By Default).

After creating the workbench, you will return to your project page. It shows the
8 changes: 4 additions & 4 deletions docs/openshift-ai/other-projects/RAG-talk-with-your-pdf.md
@@ -71,10 +71,10 @@ be populated with documents.
![RAG Workbench Information](images/RAG-Jupyter-Notebook-Workbench.png)

For our example project, let's name it "RAG Workbench". We'll select the
**Jupyter | TensorFlow | CUDA | Python 3.12** image with Recommended Version
(selected by default), choose a **Deployment size** of **Medium**, **Accelerator**
as **None** (no GPU is needed for this setup) and allocate a **Cluster storage**
space of **20GB** (Selected By Default).

!!! tip "Tip"

@@ -21,9 +21,10 @@ and Number of accelerators (GPUs).
![PyTorch Workbench Information](images/pytorch-workbench.png)

For our example project, let's name it "PyTorch Workbench". We'll select the
**Jupyter | PyTorch | CUDA | Python 3.12** image, choose a **Deployment size**
of **Small**, choose **Accelerator** of **NVIDIA V100 GPU**,
**Number of accelerators** as **1**, and allocate a **Cluster storage** space of
**20GB** (Selected By Default).

!!! tip "Hardware Acceleration using GPU"

@@ -73,7 +73,7 @@ Use the following steps to connect your local VS Code to RHOAI Workbench kernel:
You can find this port from the service in your project with name same
as your workbench.

- Open the Jupyter notebook in your local VS Code:
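As a sketch of the port-forwarding step described above (the service name, namespace, and port here are placeholders, not values from this guide; substitute the ones you find in your own project):

```sh
# Forward the workbench's Jupyter port to localhost so VS Code can reach it.
# "my-workbench", "my-project", and port 8888 are illustrative -- look up the
# service with the same name as your workbench and use its actual port.
oc port-forward service/my-workbench 8888:8888 -n my-project
```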

![Jupyter Notebook](images/jupyter-nb.png)

@@ -381,64 +381,78 @@ and use `Llama 3.2 3B Modelcar` as the connection name, as shown below:

![Create Connection](images/create-connection-using-uri.png)

##### Using Publicly Available ModelCar Catalog registry

You have several options for deploying models to your OpenShift AI cluster.
We recommend using **[ModelCar](https://kserve.github.io/website/docs/model-serving/storage/providers/oci#using-modelcars)**
because it removes the need to manually download models from Hugging Face Hub,
upload them to S3, or manage access permissions. With ModelCar, you can package
models as OCI images and pull them at runtime or precache them. This simplifies
versioning, improves traceability, and integrates cleanly into CI/CD workflows.
ModelCar images also ensure reproducibility and maintain versioned model releases.

You can deploy your own model using a ModelCar container, which packages all
model files into an OCI container image. To learn more about ModelCar containers,
read this article **[Build and deploy a ModelCar container in OpenShift AI](https://developers.redhat.com/articles/2025/01/30/build-and-deploy-modelcar-container-openshift-ai)**.
It explains the benefits of ModelCar containers, how to build a ModelCar image,
and how to deploy it with OpenShift AI.

For additional patterns and prebuilt ModelCar images, explore the Red Hat AI
Services **[ModelCar Catalog repository](https://github.com/redhat-ai-services/modelcar-catalog)**
on GitHub. Prebuilt images from this catalog are also available in the
**[ModelCar Catalog registry](https://quay.io/repository/redhat-ai-services/modelcar-catalog)**
on *Quay.io*.
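As a minimal sketch of what a ModelCar image can look like (the base image, model directory, and paths below are illustrative assumptions, not taken from this guide), a Containerfile simply copies the model files into the image:

```dockerfile
# Illustrative ModelCar Containerfile: package model files as an OCI image.
# Base image and paths are examples only -- adapt them to your own model.
FROM registry.access.redhat.com/ubi9/ubi-micro:latest

# Copy previously downloaded model files into the image; KServe ModelCar
# images conventionally serve the model from the /models directory.
COPY ./granite-3.3-8b-instruct /models
```

After building and pushing this image to a registry such as Quay.io, you can reference it with an `oci://` URI when creating the connection.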

!!! tip "Use Any Other Available Model from the ModelCar Catalog Registry"

    **You can use any model from the ModelCar Catalog registry in a similar way.**
    For example, for the `Granite-3.3-8B-Instruct` model, you can use the publicly
    available container image from the **Quay.io** registry: **[quay.io/redhat-ai-services/modelcar-catalog:granite-3.3-8b-instruct](https://quay.io/repository/redhat-ai-services/modelcar-catalog?tag=granite-3.3-8b-instruct)**.

    The **[Granite-3.3-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct)**
    model is an 8-billion-parameter, 128K context-length language model fine-tuned
    for improved reasoning and instruction-following capabilities. It is built
    on top of the `Granite-3.3-8B-Base` model.

    To create a connection for the `Granite-3.3-8B-Instruct` model, use the
    following URI:

    ```sh
    oci://quay.io/redhat-ai-services/modelcar-catalog:granite-3.3-8b-instruct
    ```

In the **Additional serving runtime arguments** field under the **Configuration
parameters** section, specify the following recommended arguments:

```yaml
--dtype=half
--max-model-len=20000
--gpu-memory-utilization=0.95
--enable-chunked-prefill
--enable-auto-tool-choice
--tool-call-parser=granite
--chat-template=/app/data/template/tool_chat_template_granite.jinja
```
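Once the model is deployed, the vLLM-based serving runtime exposes an OpenAI-compatible API. As an illustrative smoke test (the route hostname and model name below are placeholders for your own deployment's values), you can send a chat completion request:

```sh
# Replace the route hostname and model name with your deployment's values.
curl -sk https://<your-model-route>/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "granite-3-3-8b-instruct",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```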

Additionally, you may find it helpful to read **[Optimize and deploy LLMs for
production with OpenShift AI](https://developers.redhat.com/articles/2025/10/06/optimize-and-deploy-llms-production-openshift-ai)**.

##### Using Model Catalog

Recent versions of RHOAI include support for the **Model Catalog**, enabling users
to easily discover, evaluate, and deploy generative AI models from a centralized
interface. This feature provides access to models from multiple providers such
as Red Hat, IBM, Meta, NVIDIA, Mistral AI, and Google, with built-in benchmarking
based on open-source evaluation datasets to compare performance and quality. It
simplifies the workflow by allowing data scientists and AI engineers to select
suitable models, register them in a model registry, and deploy them directly to
a serving runtime.

However, note that all these images are compiled for the **x86 architecture**.
If you're targeting ARM, you'll need to rebuild these images on an ARM machine,
as demonstrated in **[this guide](https://pandeybk.medium.com/serving-vllm-and-granite-models-on-arm-with-red-hat-openshift-ai-0178adba550e)**.
![Model Catalog](images/model-catalog.png)

Models can be deployed directly from the model catalog to streamline the deployment
process. For more details, refer to the [**official documentation**](https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/3.0/html-single/working_with_model_registries/index#deploying-a-model-from-the-model-catalog_model-registry).

## Setting up Single-model Server and Deploy the model

@@ -647,10 +647,10 @@ look like this:
![Fraud detection Workbench Information](images/fraud-detection-workbench.png)

For our example project, let's name it "Fraud detection". We'll select the
**Jupyter | TensorFlow | CUDA | Python 3.12** image with Recommended Version
(selected by default), choose a **Deployment size** of **Small**, choose
**Accelerator** of **NVIDIA V100 GPU**, **Number of accelerators** as **1**, and
allocate a **Cluster storage** space of **20GB** (Selected By Default).

!!! info "Running Workbench without GPU"

@@ -16,8 +16,8 @@ and Number of accelerators (GPUs).
![Standard Data Science Workbench Information](images/standard-data-science-workbench.png)

For our example project, let's name it "Standard Data Science Workbench". We'll
select the **Jupyter | Data Science | CPU | Python 3.12** image with Recommended
Version (selected by default), choose a **Deployment size** of **Small**,
**Accelerator** as **None** (no GPU is needed for this setup), and allocate a
**Cluster storage** space of **1GB**.

Binary file modified docs/openshift-ai/other-projects/images/open-RAG-workbench.png
Binary file modified docs/openshift-ai/other-projects/images/pytorch-workbench.png
Binary file modified docs/openshift-ai/other-projects/images/tgis-workbench.png
@@ -565,10 +565,10 @@ look like this:
![Object detection Workbench Information](images/object-detection-workbench.png)

For our example project, let's name it "Object detection". We'll select the
**Jupyter | PyTorch | CUDA | Python 3.12** image with Recommended Version
(selected by default), choose a **Deployment size** of **Medium**, choose
**Accelerator** of **NVIDIA V100 GPU**, **Number of accelerators** as **1**, and
allocate a **Cluster storage** space of **20GB** (Selected By Default).

!!! info "Running Workbench without GPU"

@@ -82,8 +82,8 @@ look like this:
![TGIS Workbench Information](images/tgis-workbench.png)

For our example project, let's name it "TGIS Workbench". We'll
select the **Jupyter | Data Science | CPU | Python 3.12** image with Recommended
Version (selected by default), choose a **Deployment size** of **Small**,
**Accelerator** as **None** (no GPU is needed for this setup), and allocate
a **Cluster storage** space of **20GB** (Selected By Default).

2 changes: 1 addition & 1 deletion docs/openstack/create-and-connect-to-the-VM/launch-a-VM.md
@@ -101,7 +101,7 @@ you may choose any available images.
If you don't specify a custom value for the "**Volume Size (GB)**", it will
be set to the root disk size of your selected Flavor. For more about the
default root disk size, you can refer to [this documentation](flavors.md).
You can override this value by entering your own custom value (in GiB), which
will then be available as a Volume attached to the instance to enable persistent
storage.
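If you prefer the OpenStack CLI over the Horizon dashboard, the same override can be sketched as follows (the image, flavor, key pair, and server names are placeholders; the `--boot-from-volume` option assumes a reasonably recent python-openstackclient):

```sh
# Boot from a new 100 GiB volume instead of the flavor's default root disk.
# All resource names here are illustrative -- substitute your own.
openstack server create --image ubuntu-22.04 --flavor cpu-su.2 \
  --boot-from-volume 100 --key-name my-key my-instance
```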

8 changes: 4 additions & 4 deletions nerc-theme/main.html
@@ -1,17 +1,17 @@
{% extends "base.html" %} {% block announce %}
<div class="parent">

<div class="maintain">
<div>
Upcoming NERC system maintenance and upgrade
</div>
<div>
(Tuesday, April 14, 2026 8:00 AM - 5:00 PM)
<a style="color: #00efff"
href="https://nerc.mghpcc.org/event/nerc-system-maintenance-20260414/"
target="_blank">[Event Details]</a>
</div>
</div>

<!-- <div class="upgrade">
</div> -->