vGPU Selection

NVIDIA virtual GPU software products include GRID Virtual PC (GRID vPC), GRID Virtual Applications (GRID vApps), and Quadro Virtual Data Center Workstation (Quadro vDWS).

vGPU Support

vGPU Driver Installation

About the vGPU driver:

A physical GPU that is passed through to a VM is bound to the vfio-pci kernel module. A physical GPU that is bound to the vfio-pci kernel module can be used only for pass-through. To enable the GPU to be used for vGPU, the GPU must be unbound from the vfio-pci kernel module and bound to the nvidia kernel module.
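
The rebinding described above can be sketched with the kernel's sysfs PCI driver interface. This is a minimal sketch rather than anything the NVIDIA installer does for you: the helper names pci_driver and rebind_to_nvidia are hypothetical, and the optional sysfs-root argument exists only so the logic can be tried against a fake directory tree. Writing to unbind and bind requires root.

```shell
#!/bin/sh
# Print the kernel driver a PCI device is bound to, or "none" if unbound.
#   $1 = PCI address, e.g. 0000:86:00.0
#   $2 = sysfs root (optional, defaults to /sys; hypothetical test hook)
pci_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Move a GPU from vfio-pci to the nvidia driver so it can host vGPUs.
# Does nothing unless the device is currently bound to vfio-pci.
rebind_to_nvidia() {
    root="${2:-/sys}"
    if [ "$(pci_driver "$1" "$root")" = "vfio-pci" ]; then
        echo "$1" > "$root/bus/pci/drivers/vfio-pci/unbind" &&
        echo "$1" > "$root/bus/pci/drivers/nvidia/bind"
    fi
}

# Example (as root on the host):
#   pci_driver 0000:86:00.0
#   rebind_to_nvidia 0000:86:00.0
```
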

Environment

  • OS: CentOS Linux release 7.8.2003 (Core)
  • Kernel: 3.10.0-1127.el7.x86_64

Driver Installation

# Show GPU information
[root@k205 ~]# lspci -nnn -D | grep NVIDIA
0000:86:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)

# Install build dependencies
[root@k205 ~]# yum install -y gcc

# Install the NVIDIA vGPU host driver
[root@k205 ~]# ./NVIDIA-Linux-x86_64-460.32.04-vgpu-kvm.run
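
After the installer finishes, it is worth confirming that the vGPU kernel modules are actually loaded. A minimal sketch, assuming the module names nvidia and nvidia_vgpu_vfio that lspci reports for this card; module_loaded and check_vgpu_modules are hypothetical helpers, and the file argument exists only so they can be pointed at a sample file instead of /proc/modules.

```shell
#!/bin/sh
# Succeed if the named kernel module is listed in a /proc/modules-style file.
#   $1 = module name
#   $2 = file to read (optional, defaults to /proc/modules)
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Report the state of the modules the vGPU host driver is expected to load.
#   $1 = file to read (optional, defaults to /proc/modules)
check_vgpu_modules() {
    modules_file="${1:-/proc/modules}"
    for m in nvidia nvidia_vgpu_vfio; do
        if module_loaded "$m" "$modules_file"; then
            echo "$m: loaded"
        else
            echo "$m: NOT loaded"
        fi
    done
}

# Example (on the host):
#   check_vgpu_modules
```

Running nvidia-smi on the host is another quick check that the driver can talk to the GPU.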

Troubleshooting

Issue 1: The installer reports "X library path '/usr/lib64' ... were not queryable from the system".

(Screenshot: X library not queryable)

Solution: stop the VNC server, then run the installer again.

Using vGPU

On the Host

# Show the GPU's PCI information
[root@k205 ~]# lspci -nnn -D | grep NVIDIA
0000:86:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)

# Show PCI details, including the kernel driver in use
[root@k205 ~]# lspci -d 10de: -k
86:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
Subsystem: NVIDIA Corporation Device 12a2
Kernel driver in use: nvidia
Kernel modules: nouveau, nvidia_vgpu_vfio, nvidia

# List the supported vGPU types
[root@k205 ~]# cd /sys/class/mdev_bus/0000:86:00.0/mdev_supported_types && ls
nvidia-222 nvidia-224 nvidia-226 nvidia-228 nvidia-230 nvidia-232 nvidia-234 nvidia-319 nvidia-321
nvidia-223 nvidia-225 nvidia-227 nvidia-229 nvidia-231 nvidia-233 nvidia-252 nvidia-320

# Show details for a specific vGPU type
[root@k205 mdev_supported_types]# cat nvidia-222/name
GRID T4-1B
[root@k205 mdev_supported_types]# cat nvidia-222/available_instances
16
[root@k205 mdev_supported_types]# cat nvidia-222/description
num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=16
# Notes on the description fields:
# num_heads: The maximum number of virtual display heads that the vGPU type supports
# frl_config: The frame rate limiter (FRL) configuration in frames per second
# framebuffer: The frame buffer size in Mbytes
# max_resolution: The maximum resolution per display head
# max_instance: The maximum number of vGPU instances per physical GPU

# Create a vGPU
[root@k205 mdev_supported_types]# uuidgen
8da209ce-7865-48f0-9b04-fd6ef55dca63
[root@k205 mdev_supported_types]# echo 8da209ce-7865-48f0-9b04-fd6ef55dca63 > nvidia-222/create

# List existing vGPU devices
[root@k205 mdev_supported_types]# ls /sys/bus/mdev/devices/
8da209ce-7865-48f0-9b04-fd6ef55dca63

# Remove a specific vGPU device
[root@k205 mdev_supported_types]# echo 1 > nvidia-222/devices/8da209ce-7865-48f0-9b04-fd6ef55dca63/remove


## Script
# Print the name and description of every supported vGPU type.
# Adjust the PCI address to match your GPU.
vGPU_DIR='/sys/class/mdev_bus/0000:86:00.0/mdev_supported_types'
for type_dir in "$vGPU_DIR"/*
do
    echo "-----"
    echo "$(basename "$type_dir")"
    echo "name: $(cat "$type_dir/name")"
    echo "description: $(cat "$type_dir/description")"
done
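
The create and remove steps above can be wrapped in small functions that first check available_instances. This is a sketch under assumptions: create_vgpu and remove_vgpu are hypothetical names, and the directory arguments are parameters so the logic can be tried against a scratch directory before touching sysfs.

```shell
#!/bin/sh
# Create a vGPU (mediated device) of the given type and print its UUID.
#   $1 = mdev_supported_types directory of the physical GPU
#   $2 = vGPU type, e.g. nvidia-222
#   $3 = UUID to use (optional; generated with uuidgen when omitted)
create_vgpu() {
    type_dir="$1/$2"
    avail=$(cat "$type_dir/available_instances") || return 1
    if [ "$avail" -lt 1 ]; then
        echo "no free instances of $2" >&2
        return 1
    fi
    uuid="${3:-$(uuidgen)}" || return 1
    echo "$uuid" > "$type_dir/create" || return 1
    echo "$uuid"
}

# Remove a vGPU by UUID.
#   $1 = UUID
#   $2 = mdev devices directory (optional, defaults to /sys/bus/mdev/devices)
remove_vgpu() {
    echo 1 > "${2:-/sys/bus/mdev/devices}/$1/remove"
}

# Example (as root on the host):
#   types=/sys/class/mdev_bus/0000:86:00.0/mdev_supported_types
#   uuid=$(create_vgpu "$types" nvidia-222) && echo "created $uuid"
#   remove_vgpu "$uuid"
```

Devices created through sysfs this way do not survive a host reboot, so they have to be recreated before the VM starts.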

In the VM

When creating the VM, add the following options to the QEMU command line:

-device vfio-pci,sysfsdev=/sys/bus/mdev/devices/8da209ce-7865-48f0-9b04-fd6ef55dca63 -uuid xxxxxxxxxxxxxxxxxxxxx

The corresponding configuration in the VM's libvirt XML is:

<hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci' display='off'>
  <source>
    <address uuid='8da209ce-7865-48f0-9b04-fd6ef55dca63'/>
  </source>
  <alias name='hostdev0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</hostdev>
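
For VMs managed through libvirt, the hostdev element can be generated from the vGPU's UUID instead of being written by hand. A minimal sketch: mdev_hostdev_xml is a hypothetical helper, myvm is a placeholder domain name, and the alias and guest PCI address are left out on the assumption that libvirt assigns them itself.

```shell
#!/bin/sh
# Print a libvirt <hostdev> element for the mdev device with the given UUID.
#   $1 = UUID of an existing vGPU under /sys/bus/mdev/devices
mdev_hostdev_xml() {
    cat <<EOF
<hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci' display='off'>
  <source>
    <address uuid='$1'/>
  </source>
</hostdev>
EOF
}

# Example (myvm is a hypothetical domain name):
#   mdev_hostdev_xml 8da209ce-7865-48f0-9b04-fd6ef55dca63 > vgpu.xml
#   virsh attach-device myvm vgpu.xml --config
```
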

After that, install the matching driver inside the VM. The installer is typically named xxxx_grid_win10_server2016_server2019_64bit_international.exe.

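vGPU Licensing

Using an NVIDIA vGPU inside a VM requires a purchased license. Deployment needs a dedicated license server: after the graphics driver is installed in the VM, configure the license server's address and port in the guest. The VM must have network connectivity to the license server, and it contacts the server to obtain a license every time it boots.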
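
The guest in this post is Windows, where the address and port are entered through the NVIDIA Control Panel. For a Linux guest, the same two values go into /etc/nvidia/gridd.conf. The sketch below only renders the file contents and rests on several assumptions: render_gridd_conf is a hypothetical helper, license.example.com is a placeholder host, 7070 is assumed as the license server port, and FeatureType=1 is assumed to be the value for vGPU guests. Check all of these against the licensing guide for your driver release.

```shell
#!/bin/sh
# Render the licensing lines for a Linux guest's /etc/nvidia/gridd.conf.
#   $1 = license server address
#   $2 = license server port (optional, defaults to 7070)
#   $3 = FeatureType value (optional, defaults to 1; verify for your release)
render_gridd_conf() {
    printf 'ServerAddress=%s\n' "$1"
    printf 'ServerPort=%s\n' "${2:-7070}"
    printf 'FeatureType=%s\n' "${3:-1}"
}

# Example (inside a Linux guest, as root; this overwrites the existing file):
#   render_gridd_conf license.example.com > /etc/nvidia/gridd.conf
#   systemctl restart nvidia-gridd
```
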

Table 1. NVIDIA vGPU Software Licensed Products

GRID Virtual Applications
  • Target users: Users of PC-level applications and server-based desktops that use Citrix Virtual Apps and Desktops, VMware Horizon, RDSH, or other app streaming or session-based solutions
  • Supported NVIDIA vGPU software deployments:
      - A-series NVIDIA vGPUs
      - GPU pass through
      - Microsoft DDA
      - VMware vDGA
      - Bare metal

GRID Virtual PC
  • Target users: Users of business virtual desktops who require a great user experience with PC applications for Windows, web browsers, and high-definition video
  • Supported NVIDIA vGPU software deployments:
      - B-series NVIDIA vGPUs
      - Microsoft RemoteFX vGPU
      - VMware vSGA

Quadro vDWS
  • Target users: Users of mid-range and high-end workstations who require access to remote professional graphics applications with full performance on any device anywhere
  • Supported NVIDIA vGPU software deployments:
      - Q-series NVIDIA vGPUs
      - B-series NVIDIA vGPUs
      - GPU pass through
      - Microsoft DDA
      - VMware vDGA
      - Bare metal
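
Table 1 can be read as a lookup from the series letter at the end of a vGPU type name (for example the B in GRID T4-1B) to the license it needs. A minimal sketch of that lookup, covering only the three series the table lists; license_for_vgpu is a hypothetical helper and any other series is reported as unknown.

```shell
#!/bin/sh
# Map a vGPU type name (e.g. "GRID T4-1B") to the licensed product(s) in Table 1.
#   $1 = vGPU type name, as printed by the type's "name" file in sysfs
license_for_vgpu() {
    # Strip any trailing digits, then keep the last character as the series letter.
    series=$(printf '%s\n' "$1" | sed -e 's/[0-9]*$//' -e 's/.*\(.\)$/\1/')
    case "$series" in
        A) echo "GRID Virtual Applications" ;;
        B) echo "GRID Virtual PC or Quadro vDWS" ;;
        Q) echo "Quadro vDWS" ;;
        *) echo "unknown" ;;
    esac
}

# Example (on the host, from the mdev_supported_types directory):
#   license_for_vgpu "$(cat nvidia-222/name)"
```
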

References