Open
Labels: feature request
Proposal: Consolidate Docker images to a single tag per CUDA major version
- Goal: Stop producing per-minor CUDA 12 images (`cuda12.0`, `cuda12.8`, `cuda12.9`, …) and publish a single, rolling `cuda12` image that always tracks the latest supported 12.x release.
- Rationale: Minor releases in CUDA 12 are intended to be compatible within the major line. Our main constraint for customers is the NVIDIA driver version, not runtime/library compatibility. Maintaining per-minor images adds build/CI complexity, documentation churn, and support overhead without clear customer benefit.
Customer impact
- Benefits
  - Simpler choices: One image per major version reduces confusion and “which minor do I pick?” questions. RAPIDS supports minor version compatibility, so the choice of CUDA minor version is less important.
  - Less documentation churn: Eliminates repeated RSNs and doc updates for “drop 12.x, add 12.y”. Examples:
  - Faster adoption: Customers automatically get bug fixes and feature updates in the latest minor of the desired CUDA major line.
- Potential downsides and mitigations
  - Minimum driver version bumps: When a new CUDA minor version is available, the minimum supported driver version could change based on the requirements of the latest `nvidia/cuda` images. However, this only reflects the support lifecycle of the CUDA drivers themselves. Users who need long-term support should already be adopting LTS driver branches.
  - Reproducibility concerns: Changing CUDA minor versions could introduce defects from the CUDA Toolkit itself.
    - Mitigation: If such a bug were identified, we could revert to an older CUDA Toolkit version without needing to rename the container tag.
Is the minor version significant?
- Summary: Within a major version, CUDA is designed for application binary compatibility; the main operational constraint is the GPU driver meeting or exceeding the minimum required for that CUDA minor. Because of Minor Version Compatibility, this is a non-issue for RAPIDS. This supports the reliability of a rolling `cuda12` tag.
- References:
Implementation plan
- Tagging and image strategy
  - Publish a single rolling tag per CUDA major and Python version: `<RAPIDS_VERSION>-cuda<CUDA_MAJOR_VERSION>-py<PYTHON_VERSION>`, e.g. `25.10-cuda12-py3.13`
- CI/build matrix changes
  - Remove older CUDA 12.x releases from the build/test matrix; keep only the latest 12.x.
  - Update the CUDA minor version in use as part of each build-planning migration to a new minor version.
    - Should we consider similar changes in `ci-imgs`?
- Documentation and comms
  - Update README and contributing docs to refer to CUDA major versions only.
  - Replace per-minor RSNs with a single note in the regular release notes when `cuda12` advances to a new minor. Clearly state the minimum driver version (typically no change is expected except when LTS branches end their support cycle) and/or any notable changes in RAPIDS related to the CUDA minor version upgrade.
- Testing
  - Confirm all existing test suites pass on the consolidated `cuda12` image. Based on preliminary tests in “Test earliest driver with CUDA 12.9” (#780), this shouldn't be an issue.
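The tagging and matrix changes in the plan above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`parse_version`, `latest_minor_per_major`, `rolling_tag`) and the example version list are hypothetical and are not taken from the actual build scripts.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '12.9' into (12, 9) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_minor_per_major(cuda_versions: List[str]) -> List[str]:
    """Keep only the newest release within each CUDA major line."""
    latest: Dict[int, str] = {}
    for version in cuda_versions:
        major = parse_version(version)[0]
        current = latest.get(major)
        if current is None or parse_version(version) > parse_version(current):
            latest[major] = version
    return [latest[major] for major in sorted(latest)]


def rolling_tag(rapids_version: str, cuda_version: str, python_version: str) -> str:
    """Build <RAPIDS_VERSION>-cuda<CUDA_MAJOR_VERSION>-py<PYTHON_VERSION>."""
    cuda_major = parse_version(cuda_version)[0]
    return f"{rapids_version}-cuda{cuda_major}-py{python_version}"


# Hypothetical input matrix; the real matrix lives in the CI configuration.
matrix = latest_minor_per_major(["12.0", "12.8", "12.9", "13.0"])
tags = [rolling_tag("25.10", cuda, "3.13") for cuda in matrix]
print(tags)
```

Comparing parsed integer tuples rather than raw strings avoids mis-ordering two-digit minors such as 12.10 against 12.9.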
Benefits
- Ops: Fewer CI jobs/build minutes; fewer cache entries; reduced image storage and tag count.
- Docs/Support: Fewer RSNs and docs PRs for CUDA minor versions; fewer customer questions of the “which minor should I use?” kind.
- Adoption: Majority of pulls switch from `cuda12.x` to `cuda12` within one release cycle.
Rollout
- 25.10: Publish `cuda12` alongside existing `cuda12.x` for one release; publish only `cuda13` (no `cuda13.0`); announce deprecation of CUDA minor tags.
- 25.12: Stop building `cuda12.0` and `cuda12.9`.
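One way to keep this schedule checkable is to write it down as data. The sketch below is hypothetical: the `ROLLOUT` table and `dropped_tags` helper are illustrative names, and the tag lists are an assumption read off the two bullets above (in particular, that 25.12 keeps publishing `cuda12` and `cuda13`).

```python
from typing import Dict, List

# Assumption: CUDA tag suffixes per RAPIDS release, inferred from the
# rollout bullets; not an authoritative list of published images.
ROLLOUT: Dict[str, List[str]] = {
    "25.10": ["cuda12", "cuda12.0", "cuda12.9", "cuda13"],
    "25.12": ["cuda12", "cuda13"],
}


def dropped_tags(previous: str, current: str) -> List[str]:
    """List the tag suffixes published in `previous` but not in `current`."""
    kept = set(ROLLOUT[current])
    return [tag for tag in ROLLOUT[previous] if tag not in kept]


print(dropped_tags("25.10", "25.12"))
```

A table like this could also feed the deprecation announcement, since the dropped tags are exactly the ones users need to migrate away from.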
Tasks
- Update build scripts to retag `cuda12.9` builds as `cuda12`
- Update docs: README, compatibility matrix, release notes template, migration guide.
- When adding CUDA 13, only create `cuda13` images
- Announce deprecation of `cuda12.x` tags with an RSN
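The retagging task amounts to dropping the minor component from the CUDA part of a tag. A minimal sketch, assuming tags follow the `<RAPIDS_VERSION>-cuda<CUDA_VERSION>-py<PYTHON_VERSION>` layout; the function name `to_rolling_tag` is hypothetical and the real build scripts may do this differently:

```python
import re

# Matches "cuda<major>.<minor>" and captures only the major component.
_CUDA_MINOR = re.compile(r"cuda(\d+)\.\d+")


def to_rolling_tag(tag: str) -> str:
    """Rewrite a per-minor tag such as '...-cuda12.9-...' to '...-cuda12-...'."""
    return _CUDA_MINOR.sub(r"cuda\1", tag)


print(to_rolling_tag("25.10-cuda12.9-py3.13"))
```

Tags that are already in rolling form contain no `cuda<major>.<minor>` substring, so they pass through unchanged and the rewrite is safe to apply more than once.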
jameslamb