[AI] Installing Docker and the NVIDIA Container Toolkit on openEuler 22.03 LTS SP4

news/2025/2/22 5:23:19

NVIDIA Container Toolkit

Open the following page:

Unsupported distribution or misconfigured repository settings | NVIDIA Container Toolkit

To make offline installation easier, download the packages first:

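# Save the repo definition where yum can find it (assumes root; yumdownloader is usually provided by yum-utils or dnf-utils)
wget -O /etc/yum.repos.d/nvidia-container-toolkit.repo https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo
mkdir rpms
# Download the toolkit and its dependencies into ./rpms without installing them
yumdownloader --resolve --destdir=./rpms/ nvidia-container-toolkit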
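
The downloaded packages still have to reach the offline machine. The following is a minimal sketch of one way to bundle them, assuming GNU tar and sha256sum are available; the function name bundle_rpms, the archive name, and the SHA256SUMS file name are illustrative and not part of the original procedure.

```shell
# Sketch: bundle a directory of downloaded RPMs for transfer to an offline host.
# bundle_rpms and the file names used here are illustrative assumptions.
bundle_rpms() {
    src_dir=$1      # directory that holds the *.rpm files, e.g. ./rpms
    archive=$2      # output tarball, ideally located outside src_dir

    # Refuse to continue when the directory contains no RPMs.
    ls "$src_dir"/*.rpm >/dev/null 2>&1 || {
        echo "no RPM files found in $src_dir" >&2
        return 1
    }

    # Record checksums next to the packages so they can be verified after copying.
    (cd "$src_dir" && sha256sum -- *.rpm > SHA256SUMS) || return 1

    # Pack the directory contents (packages plus the checksum file).
    tar -czf "$archive" -C "$src_dir" .
}
```

A possible use is `bundle_rpms ./rpms /tmp/nvidia-container-toolkit-rpms.tar.gz` on the connected machine. On the offline machine, extract the archive into an empty directory and run `sha256sum -c SHA256SUMS` there before installing.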

Offline installation (run from the directory that contains the downloaded RPMs)

# yum install ./*.rpm
Last metadata expiration check: 0:12:41 ago on Fri 21 Feb 2025 05:15:45 PM CST.
Dependencies resolved.
=================================================================================================================================================================
 Package                                              Architecture                  Version                            Repository                           Size
=================================================================================================================================================================
Installing:
 libnvidia-container-tools                            x86_64                        1.17.4-1                           @commandline                         40 k
 libnvidia-container1                                 x86_64                        1.17.4-1                           @commandline                        1.0 M
 nvidia-container-toolkit                             x86_64                        1.17.4-1                           @commandline                        1.2 M
 nvidia-container-toolkit-base                        x86_64                        1.17.4-1                           @commandline                        5.6 M

Transaction Summary
=================================================================================================================================================================
Install  4 Packages

Total size: 7.9 M
Installed size: 26 M
Is this ok [y/N]: y
Downloading Packages:
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                                                         1/1
  Installing       : nvidia-container-toolkit-base-1.17.4-1.x86_64                                                                                           1/4
  Installing       : libnvidia-container1-1.17.4-1.x86_64                                                                                                    2/4
  Running scriptlet: libnvidia-container1-1.17.4-1.x86_64                                                                                                    2/4
  Installing       : libnvidia-container-tools-1.17.4-1.x86_64                                                                                               3/4
  Installing       : nvidia-container-toolkit-1.17.4-1.x86_64                                                                                                4/4
  Running scriptlet: nvidia-container-toolkit-1.17.4-1.x86_64                                                                                                4/4
  Verifying        : libnvidia-container1-1.17.4-1.x86_64                                                                                                    1/4
  Verifying        : libnvidia-container-tools-1.17.4-1.x86_64                                                                                               2/4
  Verifying        : nvidia-container-toolkit-1.17.4-1.x86_64                                                                                                3/4
  Verifying        : nvidia-container-toolkit-base-1.17.4-1.x86_64                                                                                           4/4

Installed:
  libnvidia-container-tools-1.17.4-1.x86_64                 libnvidia-container1-1.17.4-1.x86_64             nvidia-container-toolkit-1.17.4-1.x86_64
  nvidia-container-toolkit-base-1.17.4-1.x86_64

Complete!

Docker

Manually download the latest version (28.0.0 at the time of writing):

https://download.docker.com/linux/static/stable/x86_64/docker-28.0.0.tgz

wget https://download.docker.com/linux/static/stable/x86_64/docker-28.0.0.tgz
[root@localhost media]# tar -xvf docker-28.0.0.tgz
docker/
docker/containerd-shim-runc-v2
docker/containerd
docker/docker
docker/runc
docker/ctr
docker/dockerd
docker/docker-init
docker/docker-proxy
[root@localhost media]# mv -v docker/* /usr/local/bin/
renamed 'docker/containerd' -> '/usr/local/bin/containerd'
renamed 'docker/containerd-shim-runc-v2' -> '/usr/local/bin/containerd-shim-runc-v2'
renamed 'docker/ctr' -> '/usr/local/bin/ctr'
renamed 'docker/docker' -> '/usr/local/bin/docker'
renamed 'docker/dockerd' -> '/usr/local/bin/dockerd'
renamed 'docker/docker-init' -> '/usr/local/bin/docker-init'
renamed 'docker/docker-proxy' -> '/usr/local/bin/docker-proxy'
renamed 'docker/runc' -> '/usr/local/bin/runc'
[root@localhost media]# ll docker
total 0
[root@localhost media]# ll /usr/local/bin/
total 206856
-rwxr-xr-x. 1 1000 1000 40415384 Feb 20 06:11 containerd
-rwxr-xr-x. 1 1000 1000 13299864 Feb 20 06:11 containerd-shim-runc-v2
-rwxr-xr-x. 1 1000 1000 20394136 Feb 20 06:11 ctr
-rwxr-xr-x. 1 1000 1000 41532216 Feb 20 06:11 docker
-rwxr-xr-x. 1 1000 1000 76647872 Feb 20 06:11 dockerd
-rwxr-xr-x. 1 1000 1000   708448 Feb 20 06:11 docker-init
-rwxr-xr-x. 1 1000 1000  2377328 Feb 20 06:11 docker-proxy
-rwxr-xr-x. 1 1000 1000 16426200 Feb 20 06:11 runc
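
Before writing the service unit, it can help to confirm that every binary from the static bundle actually landed in the target directory. This is a rough sketch: the function name missing_docker_bins is hypothetical, and the list of names is taken from the tar listing above.

```shell
# Sketch: print the Docker static-bundle binaries that are missing or not
# executable in a directory. missing_docker_bins is an illustrative name.
missing_docker_bins() {
    bin_dir=${1:-/usr/local/bin}   # directory to inspect
    status=0
    for name in containerd containerd-shim-runc-v2 ctr docker dockerd \
                docker-init docker-proxy runc; do
        if [ ! -x "$bin_dir/$name" ]; then
            echo "$name"           # report each missing binary on its own line
            status=1
        fi
    done
    return "$status"               # 0 when all eight binaries are present
}
```

Calling `missing_docker_bins /usr/local/bin` prints nothing and returns 0 when the move succeeded. Any name it prints is a binary that dockerd may fail to find later.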

Create /usr/lib/systemd/system/docker.service:

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network
Environment=GOTRACEBACK=crash

ExecStart=/usr/local/bin/dockerd $OPTIONS \
                           $DOCKER_STORAGE_OPTIONS \
                           $DOCKER_NETWORK_OPTIONS \
                           $INSECURE_REGISTRY
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process

[Install]
WantedBy=multi-user.target

Configure the runtime with nvidia-ctk

[root@localhost media]# nvidia-ctk runtime configure --runtime=docker
INFO[0000] Config file does not exist; using empty config
INFO[0000] Wrote updated config to /etc/docker/daemon.json
INFO[0000] It is recommended that docker daemon be restarted.
[root@localhost media]# cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}

Start the Docker service

[root@localhost media]# systemctl enable docker --now
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /usr/lib/systemd/system/docker.service.
[root@localhost ~]# docker info
Client:
 Version:    28.0.0
 Context:    default
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 28.0.0
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: nvidia runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: bcc810d6b9066471b0b6fa75f557a15a1cbf31bb
 runc version: v1.2.5-0-g59923ef
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 5.10.0-216.0.0.115.oe2203sp4.x86_64
 Operating System: openEuler 22.03 (LTS-SP4)
 OSType: linux
 Architecture: x86_64
 CPUs: 128
 Total Memory: 30.46GiB
 Name: localhost.localdomain
 ID: e146eb60-c3e3-41d9-bf61-71e7cd5707f9
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  ::1/128
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

Verify nvidia-smi in Docker

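Pick any image and run nvidia-smi with the --gpus=all flag. Without the --gpus flag, the nvidia-smi command is not injected into the container, as the first attempt below shows.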

[root@localhost ollama]# docker run --rm -it ubuntu:22.04 nvidia-smi -l 1
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: exec: "nvidia-smi": executable file not found in $PATH: unknown

Run 'docker run --help' for more information
[root@localhost ollama]# docker run --rm -it --gpus=all ubuntu:22.04 nvidia-smi -l 1
Fri Feb 21 10:08:47 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.10              Driver Version: 570.86.10      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        Off |   00000000:0C:00.0 Off |                  Off |
| 30%   27C    P8             18W /  450W |    8173MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 4090        Off |   00000000:25:00.0 Off |                  Off |
| 30%   28C    P8             28W /  450W |    7821MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA GeForce RTX 4090        Off |   00000000:32:00.0 Off |                  Off |
| 30%   27C    P8              5W /  450W |    7821MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA GeForce RTX 4090        Off |   00000000:45:00.0 Off |                  Off |
| 30%   27C    P8             30W /  450W |    7821MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA GeForce RTX 4090        Off |   00000000:58:00.0 Off |                  Off |
| 30%   28C    P8             18W /  450W |    7327MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  NVIDIA GeForce RTX 4090        Off |   00000000:84:00.0 Off |                  Off |
| 30%   28C    P8             21W /  450W |    7327MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  NVIDIA GeForce RTX 4090        Off |   00000000:D4:00.0 Off |                  Off |
| 30%   28C    P8             22W /  450W |    8009MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
Fri Feb 21 10:08:49 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.10              Driver Version: 570.86.10      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
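
For a scripted check, the table above is awkward to parse. `nvidia-smi -L` prints one line per GPU instead, which is easier to count. The sketch below assumes the usual "GPU N: name (UUID: ...)" line format; the function name count_gpus is hypothetical.

```shell
# Sketch: count the GPUs listed in `nvidia-smi -L` output read from stdin.
# count_gpus is an illustrative name; the line format is assumed to be
# "GPU <index>: <name> (UUID: ...)".
count_gpus() {
    grep -c '^GPU [0-9][0-9]*:'
}
```

It could be used as `docker run --rm --gpus=all ubuntu:22.04 nvidia-smi -L | count_gpus`. On the host shown above, which exposes GPUs 0 through 6, the result should be 7.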

References

Installing the NVIDIA Container Toolkit (NVIDIA Container Toolkit documentation)

