Merged
23 changes: 13 additions & 10 deletions spark-rapids/spark-rapids.sh
@@ -501,17 +501,20 @@ function install_nvidia_gpu_driver() {

elif is_rocky ; then

# Ensure the Correct Kernel Development Packages are Installed
execute_with_retries "dnf -y -q update --exclude=systemd*,kernel*"
execute_with_retries "dnf -y -q install pciutils kernel-devel gcc"
# Install kernel development packages
execute_with_retries "dnf install -y kernel-devel-$(uname -r) kernel-headers-$(uname -r)"

readonly NVIDIA_ROCKY_REPO_URL="${NVIDIA_REPO_URL}/cuda-${shortname}.repo"
execute_with_retries "dnf config-manager --add-repo ${NVIDIA_ROCKY_REPO_URL}"
execute_with_retries "dnf clean all"
configure_dkms_certs
execute_with_retries "dnf -y -q module install nvidia-driver:latest-dkms"
clear_dkms_key
execute_with_retries "dnf -y -q install cuda-toolkit"
# Download the CUDA installer run file
curl -o driver.run \
"https://developer.download.nvidia.com/compute/cuda/${CUDA_VERSION}/local_installers/cuda_${CUDA_VERSION}_${NVIDIA_DRIVER_VERSION}_linux.run"

# Run the installer in silent mode
bash driver.run --silent

# Remove the installer file after installation to clean up
rm driver.run

medium

Using `rm driver.run` could cause the script to fail if `driver.run` was not downloaded successfully. It's safer to use `rm -f driver.run` to prevent an error if the file does not exist.

Suggested change
rm driver.run
rm -f driver.run
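
To illustrate the difference, here is a minimal sketch that can be run outside the init script. It assumes nothing about the real installer: the temporary directory and the never-created `driver.run` path are hypothetical stand-ins for a download that did not produce a file.

```shell
#!/usr/bin/env bash
# Sketch only: compares `rm` and `rm -f` when the target file is missing.
# The temp directory and file path are illustrative assumptions, not part of this PR.

workdir="$(mktemp -d)"
installer="${workdir}/driver.run"   # never created, simulating a failed download

# Plain rm exits non-zero for a missing file. Under `set -e`, an unguarded
# call like this would abort the script, so the status is captured with `||`.
plain_status=0
rm "${installer}" 2>/dev/null || plain_status=$?

# rm -f treats a missing file as success and exits zero.
force_status=0
rm -f "${installer}" || force_status=$?

echo "rm exit status: ${plain_status}, rm -f exit status: ${force_status}"

# Clean up the scratch directory.
rmdir "${workdir}"
```

Whether the plain `rm` failure actually stops the init script depends on whether `errexit` is active at that point, which this excerpt does not show.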


# Load the NVIDIA kernel module
modprobe nvidia

else