If the server has no NVIDIA driver installed, the driver version does not match the graphics card, or the kernel was updated (for example by a system update or by installing certain system software), nvidia-smi may be unable to communicate with the NVIDIA driver after a reboot. The error message is as follows:
$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
If nvidia-smi worked before and this error appears only after a reboot, the most likely cause is an Ubuntu kernel upgrade. Start by installing the headers for the running kernel:
$ sudo apt install linux-headers-`uname -r`
# or
$ sudo apt install linux-headers-$(uname -r)
Then run nvidia-smi again; the output may now be normal. If the error persists, continue with the method below.
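Before reinstalling anything, it can help to check whether headers for the running kernel are already present. The sketch below is only an illustration: the function name kernel_headers_present is hypothetical, and it assumes the headers are reachable through the /lib/modules/<release>/build link, as on a stock Ubuntu install. The optional second argument (a filesystem root) exists only so the check can be tried against a scratch directory.

```shell
# Hypothetical helper: succeed if headers for the given kernel release look installed.
# Assumption: the headers are exposed via /lib/modules/<release>/build.
kernel_headers_present() {
    release=$1
    root=${2:-}   # optional alternative filesystem root, for testing only
    # -e follows symlinks, so a dangling "build" link counts as missing.
    [ -e "$root/lib/modules/$release/build" ]
}

# Example use on a real machine:
#   kernel_headers_present "$(uname -r)" || sudo apt install "linux-headers-$(uname -r)"
```

If the check succeeds and nvidia-smi still fails, the headers are probably not the problem and the DKMS method is the next thing to try.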
Sometimes, after a reboot, nvidia-smi reports that the NVIDIA driver is missing. This happens when the Linux kernel was upgraded and the previously built NVIDIA kernel module no longer matches the running kernel.
DKMS (Dynamic Kernel Module Support) can automatically rebuild out-of-tree kernel modules when the kernel changes, so that they fit the new kernel. It lets individual kernel modules be updated without modifying the whole kernel. Use dkms to rebuild and install the driver module for the current kernel:
$ sudo apt install dkms
$ sudo dkms install -m nvidia -v 470.182.03
$ dkms status nvidia
nvidia/470.182.03, 5.15.0-88-generic, x86_64: installed
Note: 470.182.03 in the command above is the NVIDIA driver version. If you do not know it, look in the /usr/src directory, where the name of the nvidia folder carries the version as a suffix. Alternatively, use the following command to query it:
$ ls /usr/src | grep nvidia
nvidia-470.182.03
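The same lookup can be scripted so the version does not have to be copied by hand. This is a minimal sketch, assuming the driver sources live in directories named nvidia-<version> as in the listing above; the function name nvidia_src_versions is hypothetical.

```shell
# Hypothetical helper: print the version suffix of every nvidia-<version>
# directory under a source directory (default /usr/src), one per line.
nvidia_src_versions() {
    src_dir=${1:-/usr/src}
    for path in "$src_dir"/nvidia-*; do
        [ -d "$path" ] || continue        # skip an unmatched glob and plain files
        name=${path##*/}                  # e.g. nvidia-470.182.03
        printf '%s\n' "${name#nvidia-}"   # e.g. 470.182.03
    done
}

# Example use, only sensible when exactly one version is printed:
#   sudo dkms install -m nvidia -v "$(nvidia_src_versions)"
```

If several versions are listed, pick the one that matches the driver package you actually have installed rather than relying on the order of the output.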
When you run nvidia-smi again, the familiar output should be back.
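To confirm that the module was built for the kernel that is actually running, the dkms status output can be checked mechanically. The sketch below assumes status lines in the form shown earlier (nvidia/<version>, <kernel>, <arch>: installed); other dkms releases may format the line differently, and the function name nvidia_dkms_installed_for is hypothetical.

```shell
# Hypothetical helper: read `dkms status` output on stdin and succeed only if
# an nvidia line names the given kernel release and reports "installed".
nvidia_dkms_installed_for() {
    kernel=$1
    grep -E '^nvidia[/,]' | grep -F -- ", $kernel," | grep -q ': installed'
}

# Example use:
#   dkms status | nvidia_dkms_installed_for "$(uname -r)" && echo "module matches running kernel"
```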
Step 1: Before installing the driver, make sure to update the package repository. Run the following commands:
$ sudo apt update
$ sudo apt upgrade
Step 2: Search for NVIDIA drivers by running the following command. The output lists the available driver packages.
$ apt search nvidia-driver
Step 3: Choose a driver to install from the list of available GPU drivers. The best fit is the latest tested proprietary version.
$ sudo apt install nvidia-driver-470
For this tutorial, we installed nvidia-driver-470, the latest tested proprietary driver for this GPU.
Step 4: Reboot your machine with the following command:
$ sudo reboot
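After the reboot, one quick way to see whether the driver came up is to look for the nvidia kernel module. The sketch below reads /proc/modules, where the first field of each line is the module name; the function name nvidia_module_loaded is hypothetical, and the optional file argument exists only so the check can be tried on sample data.

```shell
# Hypothetical helper: succeed if a module named exactly "nvidia" is listed.
# Reads /proc/modules by default; another file can be passed for testing.
nvidia_module_loaded() {
    modules_file=${1:-/proc/modules}
    # Exit status 0 when a matching line was seen, 1 otherwise.
    awk '$1 == "nvidia" { found = 1 } END { exit !found }' "$modules_file"
}

# Example use:
#   nvidia_module_loaded && nvidia-smi
```

A loaded module does not guarantee that nvidia-smi succeeds, but a missing one explains the communication error immediately.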
A PPA (Personal Package Archive) repository lets developers distribute software that is not available in the official Ubuntu repositories. This means that you can install the latest beta drivers, at the risk of an unstable system.
To install the latest Nvidia drivers via the PPA repository, follow these steps:
Step 1: Add the graphics drivers repository to the system with the following command:
$ sudo add-apt-repository ppa:graphics-drivers/ppa
Step 2: To verify which GPU model you are using and to see a list of available drivers, run the following command:
$ ubuntu-drivers devices
Step 3: The output shows your GPU model as well as any available drivers for that specific GPU. To install a specific driver, use the following syntax:
$ sudo apt install nvidia-driver-470
Alternatively, install the recommended driver automatically by running:
$ sudo ubuntu-drivers autoinstall
Step 4: Reboot the machine for the changes to take effect.
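To see which package ubuntu-drivers would pick before installing anything in Step 3, the device listing can be filtered for the entry marked as recommended. This is a sketch under an assumption about the output format: it expects lines shaped like "driver : nvidia-driver-470 - distro non-free recommended", which may differ between releases. The function name recommended_driver is hypothetical.

```shell
# Hypothetical helper: read `ubuntu-drivers devices` output on stdin and print
# the package name from the first driver line that ends in "recommended".
# Assumed line shape: driver : <package> - <source> <license> recommended
recommended_driver() {
    awk '$1 == "driver" && $NF == "recommended" { print $3; exit }'
}

# Example use:
#   ubuntu-drivers devices | recommended_driver
```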
Step 1. NVIDIA drivers are available as .run installer packages for Linux distributions from the NVIDIA driver downloads site. Select the .run package for your GPU product.
Step 2. The .run file can be downloaded using wget, as shown in the example below:
$ wget https://us.download.nvidia.com/XFree86/Linux-x86_64/470.223.02/NVIDIA-Linux-x86_64-470.223.02.run
Step 3. Once the .run installer has been downloaded, the NVIDIA driver can be installed. Here $DRIVER_VERSION stands for the version you downloaded (470.223.02 in the example above):
$ sudo sh NVIDIA-Linux-x86_64-$DRIVER_VERSION.run
Follow the prompts on the screen during the installation. For more advanced options on using the .run installer, see the --help option.
Step 4. Reboot the machine for the changes to take effect.
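Steps 2 and 3 can be tied together with a single version variable, so the download and the install always refer to the same file. The sketch below assumes that other driver versions follow the same URL layout as the 470.223.02 example above, which is not guaranteed; the function name nvidia_run_url is hypothetical.

```shell
# Hypothetical helper: build the download URL for a given driver version.
# Assumption: every version uses the same layout as the example URL above.
nvidia_run_url() {
    version=$1
    printf 'https://us.download.nvidia.com/XFree86/Linux-x86_64/%s/NVIDIA-Linux-x86_64-%s.run\n' \
        "$version" "$version"
}

# Example use:
#   DRIVER_VERSION=470.223.02
#   wget "$(nvidia_run_url "$DRIVER_VERSION")"
#   sudo sh "NVIDIA-Linux-x86_64-$DRIVER_VERSION.run"
```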