This repository was archived by the owner on Feb 27, 2026. It is now read-only.
Draft
8 changes: 8 additions & 0 deletions .github/workflows/ci.yaml
@@ -37,3 +37,11 @@ jobs:
  golang:
    secrets: inherit
    uses: ./.github/workflows/golang.yaml

  image:
    uses: ./.github/workflows/image.yaml
    needs: [variables, golang, code-scanning]
    secrets: inherit
    with:
      version: ${{ needs.variables.outputs.version }}
      build_multi_arch_images: ${{ github.ref_name == 'main' || startsWith(github.ref_name, 'release-') }}
1 change: 0 additions & 1 deletion .github/workflows/image.yaml
@@ -36,7 +36,6 @@ jobs:
      matrix:
        target:
          - application
          - packaging
    steps:
      - uses: actions/checkout@v6
        name: Check out code
3 changes: 2 additions & 1 deletion .gitignore
@@ -2,4 +2,5 @@
*.swo

/coverage.out*
/shared-*
/shared-*
nvidia-ctk-installer
2 changes: 1 addition & 1 deletion Makefile
@@ -107,7 +107,7 @@ mod-vendor: mod-tidy
vendor: mod-vendor

check-modules: | mod-tidy mod-verify mod-vendor
git diff --quiet HEAD -- $$(find . -name go.mod -o -name go.sum -o -name vendor)
git diff --exit-code HEAD -- $$(find . -name go.mod -o -name go.sum -o -name vendor)

COVERAGE_FILE := coverage.out
test: build cmds
2 changes: 1 addition & 1 deletion README.md
@@ -1,6 +1,6 @@
# NVIDIA Container Toolkit Container

[![GitHub license](https://img.shields.io/github/license/NVIDIA/nvidia-container-toolkit?style=flat-square)](https://raw.githubusercontent.com/NVIDIA/container-config/main/LICENSE)
[![GitHub license](https://img.shields.io/github/license/NVIDIA/container-config?style=flat-square)](https://raw.githubusercontent.com/NVIDIA/container-config/main/LICENSE)
[![Documentation](https://img.shields.io/badge/documentation-wiki-blue.svg?style=flat-square)](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/index.html)

## Introduction
68 changes: 68 additions & 0 deletions cmd/nvidia-ctk-installer/container/README.md
@@ -0,0 +1,68 @@
## Introduction

This repository contains tools that allow docker, containerd, or cri-o to be configured to use the NVIDIA Container Toolkit.

*Note*: These were copied from the [`container-config` repository](https://gitlab.com/nvidia/container-toolkit/container-config/-/tree/383587f766a55177ede0e39e3810a974043e503e) and are being migrated to commands installed with the NVIDIA Container Toolkit.

These will be migrated into an upcoming `nvidia-ctk` CLI as required.

### Docker

After building the `docker` binary, run:
```bash
docker setup \
--runtime-name NAME \
/run/nvidia/toolkit
```

This configures the `nvidia-container-runtime` as a docker runtime named `NAME`. If the `--runtime-name` flag is not specified, the runtime is named `nvidia`.

Since `--set-as-default` is enabled by default, the specified runtime name is also set as the default docker runtime. This can be disabled by explicitly specifying `--set-as-default=false`.

The following table describes the behaviour for different `--runtime-name` and `--set-as-default` flag combinations.

| Flags | Installed Runtimes | Default Runtime |
|-------------------------------------------------------------|:--------------------------------|:----------------------|
| **NONE SPECIFIED** | `nvidia` | `nvidia` |
| `--runtime-name nvidia` | `nvidia` | `nvidia` |
| `--runtime-name NAME` | `NAME` | `NAME` |
| `--set-as-default` | `nvidia` | `nvidia` |
| `--set-as-default --runtime-name nvidia` | `nvidia` | `nvidia` |
| `--set-as-default --runtime-name NAME` | `NAME` | `NAME` |
| `--set-as-default=false` | `nvidia` | **NOT SET** |
| `--set-as-default=false --runtime-name NAME` | `NAME` | **NOT SET** |
| `--set-as-default=false --runtime-name nvidia` | `nvidia` | **NOT SET** |

These combinations also hold for the environment variables that map to the command line flags: `DOCKER_RUNTIME_NAME`, `DOCKER_SET_AS_DEFAULT`.
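
As a rough illustration of the table above, the following Go sketch resolves the installed runtime and the default runtime from the two environment variables. The `dockerOptions` type and the `resolveDockerOptions` and `resolveFromEnv` functions are hypothetical names introduced here, and parsing the boolean with `strconv.ParseBool` is an assumption; the actual installer may read these values differently.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// dockerOptions is a hypothetical holder for the two settings described above.
type dockerOptions struct {
	RuntimeName  string
	SetAsDefault bool
}

// resolveDockerOptions starts from the documented defaults (runtime name
// "nvidia", set-as-default enabled) and overrides them from the environment.
// lookup has the same shape as os.LookupEnv so that it can be stubbed.
func resolveDockerOptions(lookup func(string) (string, bool)) (dockerOptions, error) {
	opts := dockerOptions{RuntimeName: "nvidia", SetAsDefault: true}
	if v, ok := lookup("DOCKER_RUNTIME_NAME"); ok && v != "" {
		opts.RuntimeName = v
	}
	if v, ok := lookup("DOCKER_SET_AS_DEFAULT"); ok {
		// Assumption: the value uses a format accepted by strconv.ParseBool.
		b, err := strconv.ParseBool(v)
		if err != nil {
			return opts, fmt.Errorf("invalid DOCKER_SET_AS_DEFAULT value %q: %w", v, err)
		}
		opts.SetAsDefault = b
	}
	return opts, nil
}

// resolveFromEnv resolves the options from the process environment.
func resolveFromEnv() (dockerOptions, error) {
	return resolveDockerOptions(os.LookupEnv)
}

// defaultRuntime returns the runtime that would become the docker default,
// or an empty string for the "NOT SET" rows of the table.
func (o dockerOptions) defaultRuntime() string {
	if o.SetAsDefault {
		return o.RuntimeName
	}
	return ""
}
```

With `DOCKER_RUNTIME_NAME=NAME` and `DOCKER_SET_AS_DEFAULT=false`, `resolveFromEnv` yields a runtime named `NAME` and an empty default, matching the `--set-as-default=false --runtime-name NAME` row. The sketch does not model precedence between flags and environment variables, since the text above does not specify it.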

### Containerd
After building the `containerd` binary, run:
```bash
containerd setup \
--runtime-class NAME \
/run/nvidia/toolkit
```

This configures the `nvidia-container-runtime` as a runtime class named `NAME`. If the `--runtime-class` flag is not specified, the runtime class is named `nvidia`.

Adding the `--set-as-default` flag as follows:
```bash
containerd setup \
--runtime-class NAME \
--set-as-default \
/run/nvidia/toolkit
```
will set the runtime class `NAME` (or `nvidia` if not specified) as the default runtime class.

The following table describes the behaviour for different `--runtime-class` and `--set-as-default` flag combinations.

| Flags | Installed Runtime Classes | Default Runtime Class |
|--------------------------------------------------------|:--------------------------------|:----------------------|
| **NONE SPECIFIED** | `nvidia` | **NOT SET** |
| `--runtime-class NAME` | `NAME` | **NOT SET** |
| `--runtime-class nvidia` | `nvidia` | **NOT SET** |
| `--set-as-default` | `nvidia` | `nvidia` |
| `--set-as-default --runtime-class NAME` | `NAME` | `NAME` |
| `--set-as-default --runtime-class nvidia` | `nvidia` | `nvidia` |

These combinations also hold for the environment variables that map to the command line flags.
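
Unlike the docker case, `--set-as-default` is off unless it is given. The Go sketch below reproduces the table rows using only the standard library `flag` package. The `parseContainerdFlags` function is a hypothetical helper written for this example; the real command may be built on a different CLI library and may accept additional flags.

```go
package main

import (
	"flag"
	"io"
)

// parseContainerdFlags parses the two flags from the table and returns the
// installed runtime class and the default runtime class. An empty default
// corresponds to the "NOT SET" rows.
func parseContainerdFlags(args []string) (string, string, error) {
	fs := flag.NewFlagSet("containerd setup", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // suppress usage output in this sketch

	runtimeClass := fs.String("runtime-class", "nvidia", "name of the runtime class to install")
	setAsDefault := fs.Bool("set-as-default", false, "set the runtime class as the default")

	// Parsing stops at the first positional argument (the toolkit directory).
	if err := fs.Parse(args); err != nil {
		return "", "", err
	}
	if *setAsDefault {
		return *runtimeClass, *runtimeClass, nil
	}
	return *runtimeClass, "", nil
}
```

For example, `parseContainerdFlags([]string{"--runtime-class", "NAME", "--set-as-default", "/run/nvidia/toolkit"})` returns `NAME` for both values, while omitting `--set-as-default` leaves the default empty.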
224 changes: 224 additions & 0 deletions cmd/nvidia-ctk-installer/container/container.go
@@ -0,0 +1,224 @@
/**
# Copyright (c) NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/

package container

import (
"errors"
"fmt"
"os"
"os/exec"
"strings"

"github.com/sirupsen/logrus"

"github.com/NVIDIA/nvidia-container-toolkit/pkg/config/engine"
"github.com/NVIDIA/nvidia-container-toolkit/pkg/config/toml"

"github.com/NVIDIA/container-config/cmd/nvidia-ctk-installer/container/operator"
)

const (
restartModeNone = "none"
restartModeSignal = "signal"
restartModeSystemd = "systemd"
)

// Options defines the shared options for the CLIs to configure container runtimes.
type Options struct {
DropInConfig string
DropInConfigHostPath string
// TopLevelConfigPath stores the path to the top-level config for the runtime.
TopLevelConfigPath string
Socket string
// ExecutablePath specifies the path to the container runtime executable.
// This is used to extract the current config, for example.
// If a HostRootMount is specified, this path is relative to the host root
// mount.
ExecutablePath string
// EnableCDI indicates whether CDI should be enabled.
EnableCDI bool
RuntimeName string
RuntimeDir string
SetAsDefault bool
RestartMode string
HostRootMount string

ConfigSources []string
}

// Configure applies the options to the specified config
func (o Options) Configure(cfg engine.Interface) error {
err := o.UpdateConfig(cfg)
if err != nil {
return fmt.Errorf("unable to update config: %v", err)
}
return o.Flush(cfg)
}

// Unconfigure removes the options from the specified config
func (o Options) Unconfigure(cfg engine.Interface) error {
err := o.RevertConfig(cfg)
if err != nil {
return fmt.Errorf("unable to revert config: %v", err)
}

if err := o.Flush(cfg); err != nil {
return err
}

if o.DropInConfig == "" {
return nil
}
// When a drop-in config is used, we remove the drop-in file explicitly.
// This is required for cases where we may have to include other contents
// in the drop-in file and as such it may not be empty when we flush it.
err = os.Remove(o.DropInConfig)
if err != nil && !errors.Is(err, os.ErrNotExist) {
return fmt.Errorf("failed to remove drop-in config file: %w", err)
}

return nil
}

// Flush flushes the specified config to disk
func (o Options) Flush(cfg engine.Interface) error {
filepath := o.DropInConfig
if filepath == "" {
filepath = o.TopLevelConfigPath
}
logrus.Infof("Flushing config to %v", filepath)
n, err := cfg.Save(filepath)
if err != nil {
return fmt.Errorf("unable to flush config: %v", err)
}
if n == 0 {
logrus.Infof("Config file is empty, removed")
}
return nil
}

// UpdateConfig updates the specified config to include the nvidia runtimes
func (o Options) UpdateConfig(cfg engine.Interface) error {
runtimes := operator.GetRuntimes(
operator.WithNvidiaRuntimeName(o.RuntimeName),
operator.WithSetAsDefault(o.SetAsDefault),
operator.WithRoot(o.RuntimeDir),
)
for name, runtime := range runtimes {
err := cfg.AddRuntime(name, runtime.Path, runtime.SetAsDefault)
if err != nil {
return fmt.Errorf("failed to update runtime %q: %v", name, err)
}
}

if o.EnableCDI {
cfg.EnableCDI()
}

return nil
}

// RevertConfig reverts the specified config to remove the nvidia runtimes
func (o Options) RevertConfig(cfg engine.Interface) error {
runtimes := operator.GetRuntimes(
operator.WithNvidiaRuntimeName(o.RuntimeName),
operator.WithSetAsDefault(o.SetAsDefault),
operator.WithRoot(o.RuntimeDir),
)
for name := range runtimes {
err := cfg.RemoveRuntime(name)
if err != nil {
return fmt.Errorf("failed to remove runtime %q: %v", name, err)
}
}

return nil
}

// Restart restarts the specified service
func (o Options) Restart(service string, withSignal func(string) error) error {
switch o.RestartMode {
case restartModeNone:
logrus.Warningf("Skipping restart of %v due to --restart-mode=%v", service, o.RestartMode)
return nil
case restartModeSignal:
return withSignal(o.Socket)
case restartModeSystemd:
return o.SystemdRestart(service)
}

return fmt.Errorf("invalid restart mode specified: %v", o.RestartMode)
}

// SystemdRestart restarts the specified service using systemd
func (o Options) SystemdRestart(service string) error {
var args []string
var msg string
if o.HostRootMount != "" {
msg = " on host"
args = append(args, "chroot", o.HostRootMount)
}
args = append(args, "systemctl", "restart", service)

logrus.Infof("Restarting %v%v using systemd: %v", service, msg, args)

//nolint:gosec // TODO: Can we harden this so that there is less risk of command injection
cmd := exec.Command(args[0], args[1:]...)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
err := cmd.Run()
if err != nil {
return fmt.Errorf("error restarting %v using systemd: %v", service, err)
}

return nil
}

// GetConfigLoaders returns the loaders for the requested config sources.
// Supported config sources can be specified as:
//
// * 'file[=path/to/file]': The specified file or the top-level config path is used.
// * command: The runtime-specific function supplied as an argument is used.
func (o Options) GetConfigLoaders(commandSourceFunc func(string, string) toml.Loader) ([]toml.Loader, error) {
if len(o.ConfigSources) == 0 {
return []toml.Loader{toml.Empty}, nil
}
var loaders []toml.Loader
for _, configSource := range o.ConfigSources {
parts := strings.SplitN(configSource, "=", 2)
source := strings.TrimSpace(parts[0])
switch source {
case "file":
fileSourcePath := o.TopLevelConfigPath
if len(parts) > 1 {
fileSourcePath = strings.TrimSpace(parts[1])
}
loaders = append(loaders, toml.FromFile(fileSourcePath))
case "command":
if commandSourceFunc == nil {
// Skip this source; calling a nil function below would panic.
logrus.Warnf("Ignoring command config source")
continue
}
if len(parts) > 1 {
logrus.Warnf("Ignoring additional command argument %q", parts[1])
}
loaders = append(loaders, commandSourceFunc(o.HostRootMount, o.ExecutablePath))
default:
return nil, fmt.Errorf("unsupported config source %q", configSource)
}
}
return loaders, nil
}
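
The config source format documented on `GetConfigLoaders` can be exercised in isolation. The following is a minimal sketch using only the standard library: the `configSource` type and `parseConfigSources` function are hypothetical names for illustration and are not part of this change, and the sketch returns plain values rather than `toml.Loader` instances.

```go
package main

import (
	"fmt"
	"strings"
)

// configSource is a hypothetical, simplified stand-in for a loader.
type configSource struct {
	Kind string // "file" or "command"
	Path string // only set for "file" sources
}

// parseConfigSources interprets entries of the form 'file[=path/to/file]' or
// 'command'. A bare 'file' falls back to topLevelConfigPath.
func parseConfigSources(sources []string, topLevelConfigPath string) ([]configSource, error) {
	var parsed []configSource
	for _, s := range sources {
		kind, arg, hasArg := strings.Cut(s, "=")
		switch strings.TrimSpace(kind) {
		case "file":
			path := topLevelConfigPath
			if hasArg {
				path = strings.TrimSpace(arg)
			}
			parsed = append(parsed, configSource{Kind: "file", Path: path})
		case "command":
			// Any argument after '=' is ignored for command sources.
			parsed = append(parsed, configSource{Kind: "command"})
		default:
			return nil, fmt.Errorf("unsupported config source %q", s)
		}
	}
	return parsed, nil
}
```

Called with the sources `file=/custom/config.toml`, `command`, and `file`, and a top-level path of `/etc/example/config.toml` (both paths are made up for the example), this returns three entries in order, with the last one using the top-level path.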