Remote server install failure #23

@wboykinm

Following the remote-launch outline laid out in @albarji's blog post . . .

  1. Booting a remote p2.xlarge server with Ubuntu 16.04.2 LTS (GNU/Linux 4.4.0-1020-aws x86_64)
  2. Cloning the repo
  3. Running the install script

. . . I get this:

./scripts/install-nvidia.sh
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Note, selecting 'libc6-dev' instead of 'libc-dev'
gcc is already the newest version (4:5.3.1-1ubuntu1).
make is already the newest version (4.1-6).
libc6-dev is already the newest version (2.23-0ubuntu9).
0 upgraded, 0 newly installed, 0 to remove and 128 not upgraded.
--2018-01-11 15:31:19--  http://us.download.nvidia.com/XFree86/Linux-x86_64/361.42/NVIDIA-Linux-x86_64-361.42.run
Resolving us.download.nvidia.com (us.download.nvidia.com)... 192.229.211.70, 2606:2800:21f:3aa:dcf:37b:1ed6:1fb
Connecting to us.download.nvidia.com (us.download.nvidia.com)|192.229.211.70|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 86760004 (83M) [application/octet-stream]
Saving to: ‘/tmp/NVIDIA-Linux-x86_64-361.42.run.1’

NVIDIA-Linux-x86_64-361.42.run.1             100%[=============================================================================================>]  82.74M   140MB/s    in 0.6s    

2018-01-11 15:31:19 (140 MB/s) - ‘/tmp/NVIDIA-Linux-x86_64-361.42.run.1’ saved [86760004/86760004]

Verifying archive integrity... OK
Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 361.42...................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

ERROR: Unable to load the kernel module 'nvidia.ko'.  This happens most frequently when this kernel module was built against the wrong or improperly configured kernel sources,
       with a version of gcc that differs from the one used to build the target kernel, or if a driver such as rivafb, nvidiafb, or nouveau is present and prevents the NVIDIA
       kernel module from obtaining ownership of the NVIDIA graphics device(s), or no NVIDIA GPU installed in this system is supported by this NVIDIA Linux graphics driver
       release.
       
       Please see the log entries 'Kernel module load error' and 'Kernel messages' at the end of the file '/var/log/nvidia-installer.log' for more information.


ERROR: Installation has failed.  Please see the file '/var/log/nvidia-installer.log' for details.  You may find suggestions on fixing installation problems in the README
       available on the Linux driver download page at www.nvidia.com.

--2018-01-11 15:31:53--  https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
Resolving github.com (github.com)... 192.30.253.113, 192.30.253.112
Connecting to github.com (github.com)|192.30.253.113|:443... connected.
HTTP request sent, awaiting response... 502 Bad Gateway
2018-01-11 15:31:53 ERROR 502: Bad Gateway.

dpkg: error processing archive /tmp/nvidia-docker*.deb (--install):
 cannot access archive: No such file or directory
Errors were encountered while processing:
 /tmp/nvidia-docker*.deb
sudo: nvidia-docker: command not found

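This looks like a driver mismatch: the script installs NVIDIA driver 361.42, and its kernel module fails to load on the 4.4.0-1020-aws kernel. Unfortunately I can't test this locally (wrong GPU), so I'm left guessing whether the image needs rebuilding or whether I need to change my EC2 config somehow. My best guess is that the driver version pinned in the install script needs a bump. The nvidia-docker download also failed with a 502 from GitHub, which may be a separate and possibly transient problem; the script carried on regardless, hence the `dpkg` and `command not found` errors at the end.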
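The installer error points to the 'Kernel module load error' and 'Kernel messages' entries in `/var/log/nvidia-installer.log`. A minimal sketch for pulling those entries out is below. The function name `show_installer_errors` is hypothetical (it is not part of this repo's scripts), and the 20-line context window is an arbitrary choice, since I don't know how long those log sections are.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repo's scripts.
# Prints the log entries that the NVIDIA installer error message points to.
show_installer_errors() {
    # Default to the path named in the installer's error message.
    local log="${1:-/var/log/nvidia-installer.log}"

    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi

    # -n: prefix each line with its line number in the log.
    # -A 20: also print up to 20 lines following each match (arbitrary window).
    grep -n -A 20 \
        -e 'Kernel module load error' \
        -e 'Kernel messages' \
        "$log"
}
```

After sourcing it on the server, `show_installer_errors` with no argument reads the default log path (this may need `sudo` depending on the file's permissions). Running `uname -r` and `lsmod | grep nouveau` alongside it would show the running kernel version and whether the nouveau driver is loaded, both of which the error text lists as possible causes.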
