Ubuntu 20.04 Server Nvidia A6000 GPU Driver install
OS | OS_version | GPU | Drvier |
---|---|---|---|
Ubuntu Server | 20.04 | RTX A6000 | 515.48.07 |
Ubuntu Server install to Server
Ubuntu Server iso file download to Kakao-mirror Server.
Flash iso file on Usb
Flash ubuntu-server iso file to USB using Rufus software.
Install Ubuntu-Server OS on Server-Computer
- GNU GRUB
- Install Ubuntu Server
- Select Language
- English(US)
- Select Keyboard configuration
- English(US)
- Network connections
- Edit IPv4
- IPv4 Method:
- Manual
- Subnet : ..*.0/24
- Address : ...
- Gateway : ..*.1
- Name servers : ..., ...
- search domains : empty
- save
- Proxy address
- Pass(Done)
- Mirror address
- Pass(Done)
- Select Disk and Disk Size
- Use an entire disk
- Done
- USED DEVICES
- ubuntu-lv … size -> Entire size
- Done
- Use an entire disk
- Profile setup
- Install OpenSSH server
- Check
- Done
- Featured Server Snaps
- Pass(Done)
DeActivate Nouveau
Ubuntu 20.04 LTS에서 Nouveau 비활성화 및 CUDA 설치하기
sudo vim /etc/modprobe.d/blacklist.conf
# 아래 추가
blacklist nouveau
options nouveau modeset=0
sudo update-initramfs -u
sudo reboot
lsmod |grep nouveau
Language 설정
sudo su
apt-get update && apt-get install -y dialog language-pack-en
sudo update-locale && sudo vi /etc/default/locale
# 아래 추가
LANG="en_US.UTF-8"
LANGUAGE="en_US:en"
LC_ALL="en_US.UTF-8"
Install Nvidia-A6000 Driver
서버에 NVIDIA 드라이버 설치하기 - Ubuntu 16.04 LTS Server(Titan XP)
# Check Nvidia A6000 Drvier for Server
apt-cache search nvidia
sudo apt-get install nvidia-driver-515-server
sudo reboot
nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.48.07 Driver Version: 515.48.07 CUDA Version: 11.7 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA RTX A6000 Off | 00000000:41:00.0 Off | Off |
| 30% 57C P0 86W / 300W | 0MiB / 49140MiB | 5% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Install Docker.io
https://docs.docker.com/engine/install/ubuntu/
# Install Docker.io
sudo apt-get remove docker docker-engine docker.io containerd runc
sudo apt-get update && sudo apt-get install -y ca-certificates && sudo apt-get install -y curl && sudo apt-get install -y gnupg && sudo apt-get install -y lsb-release
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update && sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin
# Install nvidia-container-toolkit
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
Install Nvidia-Docker
# Setting up NVIDIA Container Toolkit
## Setup the package repository and the GPG key
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Install the nvidia-docker2
sudo apt-get update && sudo apt-get install -y nvidia-docker2
# Restart the docker
sudo systemctl restart docker
# Verify installed correctly
sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06 Driver Version: 450.51.06 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 On | 00000000:00:1E.0 Off | 0 |
| N/A 34C P8 9W / 70W | 0MiB / 15109MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Add to Docker Group
sudo usermod -aG docker $USER && sudo service docker restart
Pull Docker Image
docker pull pytorchlightning/pytorch_lightning