
[FEA] Consolidate Docker images to a single tag per CUDA major version #781

@bdice

Proposal: Consolidate Docker images to a single tag per CUDA major version

  • Goal: Stop producing per-minor CUDA 12 images (cuda12.0, cuda12.8, cuda12.9, …) and publish a single, rolling cuda12 image that always tracks the latest supported 12.x release.
  • Rationale: Minor releases in CUDA 12 are intended to be compatible within the major line. Our main constraint for customers is the NVIDIA driver version, not runtime/library compatibility. Maintaining per-minor images adds build/CI complexity, documentation churn, and support overhead without clear customer benefit.

Customer impact

  • Benefits

  • Potential downsides and mitigations

    • Minimum driver version bumps: When a new CUDA minor version becomes available, the minimum supported driver version could change, following the requirements of the latest nvidia/cuda images. However, this only reflects the support lifecycle of the drivers themselves. Users who need long-term support should already be adopting LTS driver branches.
    • Reproducibility concerns: Changing CUDA minor versions could introduce defects from the CUDA Toolkit itself.
      • Mitigation: If such a bug were identified, we would be able to revert to an older CUDA Toolkit version without needing to rename the container tag.

Is the minor version significant?

Implementation plan

  • Tagging and image strategy

    • Publish a single rolling tag per CUDA major and Python version:
      • <RAPIDS_VERSION>-cuda<CUDA_MAJOR_VERSION>-py<PYTHON_VERSION>: e.g. 25.10-cuda12-py3.13
  • CI/build matrix changes

    • Remove older CUDA 12.x releases from the build/test matrix; keep only the latest 12.x.
    • Update the CUDA minor version in use as part of each build-planning migration to a new minor version.
      • Should we consider similar changes in ci-imgs?
  • Documentation and comms

    • Update README and contributing docs to use major versions
    • Replace per-minor RSNs with a single note in the regular release notes whenever cuda12 advances to a new minor version. The note should clearly state the minimum driver version (typically unchanged, except when LTS driver branches reach the end of their support cycle) and any notable RAPIDS changes related to the CUDA minor version upgrade.
  • Testing
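
To make the tagging scheme above concrete, here is a minimal sketch of how a rolling tag could be derived from a full CUDA version. The function name `rolling_tag` is hypothetical and not part of any existing build script; only the tag format itself comes from this proposal.

```python
def rolling_tag(rapids_version: str, cuda_version: str, python_version: str) -> str:
    """Build a tag of the form <RAPIDS_VERSION>-cuda<CUDA_MAJOR_VERSION>-py<PYTHON_VERSION>.

    `cuda_version` may be a full version such as "12.9"; only the major
    component is kept, so the tag stays stable when the minor advances.
    """
    cuda_major = cuda_version.split(".")[0]
    return f"{rapids_version}-cuda{cuda_major}-py{python_version}"


if __name__ == "__main__":
    print(rolling_tag("25.10", "12.9", "3.13"))
```

Because the minor component is dropped, moving the underlying build from one 12.x release to another (or reverting it) would not change the tag that users pull.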

Benefits

  • Ops: Fewer CI jobs/build minutes; fewer cache entries; reduced image storage and tag count.
  • Docs/Support: Fewer RSNs and docs PRs for CUDA minor versions; reduced customer questions related to “which minor should I use?”
  • Adoption: We expect the majority of pulls to switch from cuda12.x to cuda12 within one release cycle.

Rollout

  1. 25.10: Publish cuda12 alongside existing cuda12.x for one release; publish only cuda13 (no cuda13.0); announce deprecation of CUDA minor tags.
  2. 25.12: Stop building the cuda12.0 and cuda12.9 images; publish only cuda12 and cuda13.
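
The rollout could be expressed as a small lookup, sketched below. This is only an illustration: the names `ROLLING`, `PER_MINOR`, and `cuda_variants` are hypothetical, and the assumption that cuda12.0 and cuda12.9 are the per-minor images still published in 25.10 is inferred from the 25.12 step rather than stated outright.

```python
# Hypothetical rollout table for the CUDA portion of published image tags.
ROLLING = ["cuda12", "cuda13"]
# Assumption: these are the per-minor images that remain during the 25.10 overlap.
PER_MINOR = ["cuda12.0", "cuda12.9"]


def cuda_variants(release: str) -> list[str]:
    """Return the CUDA tag variants published for a RAPIDS release such as "25.10"."""
    year, month = (int(part) for part in release.split("."))
    if (year, month) < (25, 10):
        # Releases before the rollout are outside the scope of this proposal.
        raise ValueError(f"release {release} predates the rolling-tag rollout")
    if (year, month) == (25, 10):
        # Overlap release: rolling tags alongside the deprecated per-minor tags.
        return ROLLING + PER_MINOR
    # 25.12 and later: rolling tags only.
    return list(ROLLING)
```

Comparing `(year, month)` tuples rather than raw strings keeps the ordering correct across year boundaries.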

Tasks

  • Update build scripts to retag cuda12.9 builds as cuda12
  • Update docs: README, compatibility matrix, release notes template, migration guide.
  • When adding CUDA 13, only create cuda13 images
  • Announce deprecation of cuda12.x tags with an RSN
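
As a rough sketch of the retagging task, the snippet below picks the highest CUDA minor for each rolling tag and emits the corresponding `docker tag` commands. It assumes the existing per-minor tags follow the pattern `<RAPIDS_VERSION>-cuda<MAJOR>.<MINOR>-py<PYTHON_VERSION>`; the helper names and the `example/image` repository are placeholders, not the actual build scripts or image names.

```python
import re

# Assumed shape of the existing per-minor tags, e.g. "25.10-cuda12.9-py3.13".
TAG_RE = re.compile(
    r"^(?P<rapids>\d+\.\d+)-cuda(?P<major>\d+)\.(?P<minor>\d+)-py(?P<py>\d+\.\d+)$"
)


def rolling_aliases(tags):
    """Map each rolling tag to the per-minor tag with the highest CUDA minor."""
    best = {}
    for tag in tags:
        m = TAG_RE.match(tag)
        if m is None:
            continue  # Skip tags that are not per-minor CUDA tags.
        alias = f"{m['rapids']}-cuda{m['major']}-py{m['py']}"
        minor = int(m["minor"])  # Compare numerically so that 12.10 > 12.9.
        if alias not in best or minor > best[alias][0]:
            best[alias] = (minor, tag)
    return {alias: tag for alias, (_, tag) in best.items()}


def docker_tag_commands(image, tags):
    """Return `docker tag SOURCE TARGET` commands that create the rolling aliases."""
    return [
        f"docker tag {image}:{src} {image}:{alias}"
        for alias, src in sorted(rolling_aliases(tags).items())
    ]


if __name__ == "__main__":
    existing = ["25.10-cuda12.0-py3.13", "25.10-cuda12.9-py3.13"]
    for command in docker_tag_commands("example/image", existing):
        print(command)
```

Keeping the alias a pure retag of an existing build also supports the mitigation described earlier: reverting to an older CUDA Toolkit would only require pointing the rolling tag at a different source build.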
