[root@test ~]# nvidia-smi
Fri Jun 13 17:35:05 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.216.01 Driver Version: 535.216.01 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla T4 On | 00000000:00:08.0 Off | 0 |
| N/A 39C P8 11W / 70W | 2MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
[root@test ~]#
[root@test ~]# yum install podman -y
[root@test ~]# systemctl start podman
[root@test ~]# podman run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
✔ mirror.ccs.tencentyun.com/nvidia/cuda:11.0.3-base-ubuntu20.04
Trying to pull mirror.ccs.tencentyun.com/nvidia/cuda:11.0.3-base-ubuntu20.04...
Getting image source signatures
Copying blob e43c2058e496 done |
Copying blob 96d54c3075c9 done |
Copying blob 59f6381879f6 done |
Copying blob 655ed0df26cf done |
Copying blob 848b95ad96b5 done |
Copying config 97dfa1ef5e done |
Writing manifest to image destination
Error: runc: runc create failed: unable to start container process: exec: "nvidia-smi": executable file not found in $PATH: OCI runtime attempted to invoke a command that was not found
[root@test ~]#
I then followed https://cloud.tencent.com/document/product/560/118463,
but the same error came back:
[root@test ~]# podman run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
Error: runc: runc create failed: unable to start container process: exec: "nvidia-smi": executable file not found in $PATH: OCI runtime attempted to invoke a command that was not found
[root@test ~]#
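The error above says that nvidia-smi is not on the container's $PATH. The nvidia/cuda base image typically does not ship that binary itself; it is normally injected at container start by the NVIDIA container hook, so the message suggests no such hook ran. A minimal sketch for checking whether Podman can see an NVIDIA hook at all, assuming the two default root hook directories below (the function name list_nvidia_hooks is made up for this illustration):

```shell
#!/bin/sh
# list_nvidia_hooks DIR...: print every *.json hook file in the given
# directories that mentions an NVIDIA container hook binary.
list_nvidia_hooks() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        for f in "$dir"/*.json; do
            [ -f "$f" ] || continue
            if grep -q 'nvidia-container' "$f"; then
                printf '%s\n' "$f"
            fi
        done
    done
}

# Assumption: these are the hook directories Podman scans for root.
found=$(list_nvidia_hooks /usr/share/containers/oci/hooks.d /etc/containers/oci/hooks.d)
if [ -n "$found" ]; then
    echo "NVIDIA hook file(s): $found"
else
    echo "no NVIDIA hook file found"
fi
```

If this prints "no NVIDIA hook file found", the failure above is the expected outcome, because nothing mounts the driver utilities into the container.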
Following https://blog.csdn.net/jiqiren_dasheng/article/details/124857320,
I added this OCI hook configuration:
[root@test ~]# Content=`cat << 'EOF'
> {
> "version": "1.0.0",
> "hook": {
> "path": "/usr/bin/nvidia-container-toolkit",
> "args": ["nvidia-container-toolkit", "prestart"],
> "env": [
> "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
> ]
> },
> "when": {
> "always": true,
> "commands": [".*"]
> },
> "stages": ["prestart"]
> }
> EOF`
[root@test ~]#
[root@test ~]# HookFile=/usr/share/containers/oci/hooks.d/oci-nvidia-hook.json
[root@test ~]# sudo mkdir -p "$(dirname "$HookFile")"
[root@test ~]# echo "$Content" | sudo tee "$HookFile" > /dev/null
[root@test ~]#
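The hook file only helps if the binary named in its "path" field really exists on the host; depending on the installed NVIDIA Container Toolkit version, the hook binary may live under a different name. A small sketch for checking this, assuming the JSON is laid out one key per line as above (this is a line-based sed extraction, not a real JSON parser, and the helper names hook_binary and check_hook are hypothetical):

```shell
#!/bin/sh
# hook_binary FILE: print the first "path" value found in the hook file.
hook_binary() {
    sed -n 's/.*"path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# check_hook FILE: report whether the hook's binary exists and is executable.
check_hook() {
    bin=$(hook_binary "$1")
    if [ -n "$bin" ] && [ -x "$bin" ]; then
        echo "ok: $bin"
    else
        echo "missing or not executable: ${bin:-<no path found>}"
        return 1
    fi
}

# Usage on the file written above:
#   check_hook /usr/share/containers/oci/hooks.d/oci-nvidia-hook.json
```

If the check reports a missing binary, adjusting "path" (and the first element of "args") to the name actually installed on the host is the likely fix.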
After that, it worked:
[root@test ~]# podman run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
Fri Jun 13 09:57:59 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.216.01 Driver Version: 535.216.01 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla T4 On | 00000000:00:08.0 Off | 0 |
| N/A 39C P8 11W / 70W | 2MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
[root@test ~]#
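Both outputs report driver 535.216.01, which is what one would expect when the container uses the host's driver through the hook. As a rough sketch, the comparison can be scripted by pulling the driver version out of the nvidia-smi banner; the function name driver_version is made up here, and the parsing assumes the banner format shown above:

```shell
#!/bin/sh
# driver_version: read nvidia-smi output on stdin and print the driver version.
driver_version() {
    sed -n 's/.*Driver Version:[[:space:]]*\([0-9.]*\).*/\1/p' | head -n 1
}

# Usage (needs a GPU host, so left commented out):
#   host=$(nvidia-smi | driver_version)
#   ctr=$(podman run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi | driver_version)
#   [ "$host" = "$ctr" ] && echo "driver versions match: $host"
```

The container clock in the second output reads 09:57:59 while the host showed 17:35:05 earlier; the container most likely runs in UTC, so the timestamps are not directly comparable.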
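Original-content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, contact cloudcommunity@tencent.com for removal.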