OCI AI Blueprints v1.0.6
Pre-release
Release Change Audit Report
New Version: v1.0.6
Previous Version: v1.0.5
Generated: 2025-08-08 11:19:51
Repositories Analyzed:
- Quickstarts Repo (oci-ai-blueprints)
- Control Plane Repo (corrino)
- Portal Repo (oci_ai_blueprints_portal)
Version v1.0.6 builds on v1.0.5 by extending hardware support, polishing developer experiences, and enriching user workflows across our Quickstarts, Control Plane (corrino), and Portal components. The centerpiece of this release is full AMD MI300x support—from plugin and metrics exporter through blueprint examples and orchestration—plus new benchmarking blueprints, secrets integration patterns, enhanced validation, and UI refinements.
Executive Summary
• AMD MI300x support end-to-end: new plugin, metrics exporter, example blueprints, Terraform/Helm deployment, and Control Plane manifest recognition.
• Inference benchmarking: offline & online GPU benchmarking blueprints (vLLM, LLMPerf), shareable configs, MLflow visibility.
• Secrets workflows: comprehensive, end-to-end docs and JSON recipes for opaque, container-registry, Hugging Face, and NVIDIA NIM secrets.
• Control Plane enhancements: subnet-ID & deployment-name validation, local-filesystem volumes, DRF-Spectacular OpenAPI generation, JSON-schema expansion.
• Portal UI improvements: new blueprint categories, deployments page redesign, richer error messages and parsing logic.
• Documentation uplift: updated READMEs, API docs, schema definitions, and streamlined content across all repos.
Repository-Specific Changes
Quickstarts (oci-ai-blueprints v1.0.6)
• New Features
– AMD MI300x support: AMD plugin & metrics-exporter code, example Llama4 Maverick and shared-pool blueprints, updated RDMA table, “bring-your-own” exporter pattern.
– Offline & Online Inference Benchmarking: vLLM and LLMPerf back-ends, JSON/YAML sample configs, shareable node-pool definitions, MLflow metric logging.
– Secrets Integration: docs/secrets section with workflows and JSON recipes for generic, container-registry, HF token, and NVIDIA NIM secrets.
• Improvements
– Blueprint schema & organization: new “partner” and “gpu_benchmarking” categories; reorganized sample_blueprints folders.
– API docs: local_filesystem and input_file_system options in docs/api_documentation.md.
– Naming & clarity: local_directory_path → node_directory_path; refined offline-inference README; polished IAM policy docs.
• Technical Changes
– Terraform & Helm: added AMD metrics exporter values.yaml, updated helm.tf, locals.tf, postgres_db.tf, variables.tf; MI300x/shared-pool support in cluster_creation_terraform.
– IAM policies: refined and reorganized examples.
• Documentation
– Updated READMEs across blueprint categories.
– Added detailed docs/secrets/*.md and JSON schema.
– Removed outdated inference and Whisper-transcription docs.
– Enhanced QuickStartVersions.md, about.md, and top-level README.md.
Control Plane (corrino v1.0.6)
• New Features
– AMD MI300x & local-filesystem volumes: control-plane logic and manifest templates now detect MI300x GPU deployments and accept a local filesystem as an input/output volume.
– Subnet-ID validation: new subnet_validator verifies user VCN subnet IDs.
• Improvements
– Deployment-name validation: extended for service/job modes; unit tests for shared-pool, team, and update scenarios.
– JSON schema enhancements: added “Partner Blueprints” and more blueprint categories for richer user configurations.
– Swagger/OpenAPI simplification: cleaned up api/urls.py endpoints; minor DRF endpoint header and route tweaks.
• Bug Fixes
– Resolved subnet-validator logic issue (#111).
• Technical Changes
– Model refactoring (#104): split Deployment and Manifest models; moved admin registrations into api/models/admin.py; removed old api/admin.py.
– OpenAPI with DRF-Spectacular (#92): added dependency, schema.yml, and generation README.
• Documentation
– Deployment-logs API docs (#98): examples and descriptions in api/drf/deployment.py, api/urls.py, and schema.yml.
– OpenAPI generation guide: README.openapi.schema.generation.md.
– Swagger grammar and example fixes.
Portal (oci_ai_blueprints_portal v1.0.6)
• New Features
– Library page: three new blueprint categories—Partner Blueprints, Workload Blueprints, Platform Feature Blueprints.
• Improvements
– DeploymentsPage UI: flat table for individual deployments; grouped deployments in an accordion for easier scanning.
– Deployment-content component: auto-detect Content-Type (JSON vs. text), provide scenario-specific error messages.
– Library page layout: more filterable categories; moved category description panel to right side.
• No bug fixes or documentation updates in this release.
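The Content-Type auto-detection noted for the Portal's deployment-content component can be illustrated with a short sketch. This is not the Portal's code (its implementation is not shown in this report); the function name classify_body and the fallback rules are assumptions chosen for illustration. The idea is to trust a declared JSON type only if the body actually parses, and to fall back to plain text otherwise.

```python
import json
from typing import Any, Optional, Tuple


def classify_body(body: str, content_type: Optional[str] = None) -> Tuple[str, Any]:
    """Classify a response body as JSON or plain text.

    Hypothetical helper: returns ("json", parsed_value) when the body is
    declared as JSON (or has no declared type) and parses cleanly,
    otherwise ("text", body).
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    declared = (content_type or "").split(";")[0].strip().lower()
    may_be_json = (
        not declared
        or declared == "application/json"
        or declared.endswith("+json")
    )
    if may_be_json:
        try:
            return "json", json.loads(body)
        except json.JSONDecodeError:
            pass  # Declared or guessed JSON that does not parse: treat as text.
    return "text", body
```

A caller could use the first element of the result to pick a renderer (structured view versus raw log text) and to choose an error message that matches the failure, for example "response was declared as JSON but could not be parsed".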
Cross-Repository Impact
• Unified AMD MI300x support: Quickstarts deliver examples and exporter, Control Plane handles manifest detection and volume types, Terraform/Helm modules orchestrate deployment.
• End-to-end benchmarking: Quickstarts’ GPU benchmarking blueprints can be deployed via corrino with improved validation; metrics flow into MLflow for centralized monitoring.
• Shared JSON schema and categories: Quickstarts and corrino share expanded blueprint categories (“partner”, “gpu_benchmarking”), reflected in the Portal library for consistent user navigation.
• Secrets integration synergy: Quickstarts’ end-to-end examples align with Control Plane’s validation logic, ensuring secrets are correctly created, updated, and consumed in deployments.
• Documentation coherence: the newly generated OpenAPI spec and schema guide support both developer workflows in corrino and UI-driven API exploration in the Portal.
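For readers unfamiliar with DRF-Spectacular, the fragment below sketches the Django settings the library typically needs. It is a generic example based on the library's documented setup, not a copy of corrino's settings; the TITLE value and the app list are placeholders.

```python
# settings.py (fragment) -- illustrative only, not corrino's actual settings.

INSTALLED_APPS = [
    # ... existing Django and project apps would be listed here ...
    "rest_framework",
    "drf_spectacular",
]

REST_FRAMEWORK = {
    # Tell Django REST Framework to build schemas with drf-spectacular.
    "DEFAULT_SCHEMA_CLASS": "drf_spectacular.openapi.AutoSchema",
}

SPECTACULAR_SETTINGS = {
    "TITLE": "Control Plane API",  # placeholder title (assumption)
    "VERSION": "1.0.6",
}
```

With settings like these in place, drf-spectacular's management command `python manage.py spectacular --file schema.yml` writes the OpenAPI document to schema.yml. The repository's README.openapi.schema.generation.md remains the authoritative guide for the exact steps.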
Upgrade Considerations
• Terraform/Helm modules: review and apply updated values.yaml, helm.tf, and variables.tf to deploy the AMD device metrics exporter.
• Configuration changes
– Rename any local_directory_path settings to node_directory_path in your custom blueprints.
– Validate blueprint JSON against the new schema (added “partner” and “gpu_benchmarking” categories).
• Control Plane
– Ensure drf-spectacular and schema.yml are included in your environment to enable updated OpenAPI docs.
– Verify subnet IDs in existing deployments conform to the new subnet_validator rules to avoid validation errors.
• Portal
– No breaking changes, but note new library categories and layout adjustments.
• Secrets management
– Follow the new docs/secrets recipes when migrating existing secret workflows to the new patterns.
• IAM policies
– Update or merge in the refined examples to maintain least-privilege access for the AMD exporter and other new resources.
• Backward compatibility
– Existing v1.0.5 blueprints and deployments should continue to function, but it’s recommended to test in a staging environment before rolling out to production.
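The local_directory_path to node_directory_path rename listed under Configuration changes can be applied mechanically to custom blueprints. The sketch below is one way to do it, assuming the blueprint is plain JSON and the key may appear at any nesting depth; the sample blueprint in the demo is hypothetical and does not reflect the real schema.

```python
import json
from typing import Any

OLD_KEY = "local_directory_path"
NEW_KEY = "node_directory_path"


def rename_key(node: Any) -> Any:
    """Return a copy of `node` with every OLD_KEY renamed to NEW_KEY.

    Walks dicts and lists recursively; all other values are returned as-is.
    """
    if isinstance(node, dict):
        return {
            (NEW_KEY if key == OLD_KEY else key): rename_key(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [rename_key(item) for item in node]
    return node


if __name__ == "__main__":
    # Hypothetical blueprint fragment, for demonstration only.
    blueprint = {
        "example_section": [
            {"local_directory_path": "/mnt/data", "other_setting": True}
        ]
    }
    print(json.dumps(rename_key(blueprint), indent=2))
```

Because the function returns a new structure instead of editing in place, the original file contents can be kept for comparison before the migrated blueprint is written back.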
We recommend reviewing the detailed documentation in each repository’s v1.0.6 release notes for full migration steps, code samples, and best practices. Enjoy the expanded hardware support and richer developer experience!