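NVIDIA Usage: CUDA Installation
Installation
Downloading the packages
Based on your NVIDIA GPU driver version, find a suitable CUDA version on the official site: Release Notes :: CUDA Toolkit Documentation
CUDA Toolkit download: CUDA Toolkit 11.2 Downloads | NVIDIA Developer
cuDNN download: official cuDNN download page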
| CUDA Toolkit | Toolkit Driver Version: Linux x86_64 | Toolkit Driver Version: Windows x86_64 |
| --- | --- | --- |
| CUDA 11.8 GA | >=520.61.05 | >=522.06 |
| CUDA 11.7 Update 1 | >=515.48.07 | >=516.31 |
| CUDA 11.7 GA | >=515.43.04 | >=516.01 |
| CUDA 11.6 Update 2 | >=510.47.03 | >=511.65 |
| CUDA 11.6 Update 1 | >=510.47.03 | > ... |
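As a rough illustration of how the table above can be used, the sketch below compares an installed driver version against the listed minimums. The function names are hypothetical, the data is copied from the complete rows of the table only (the truncated CUDA 11.6 Update 1 row is left out), and the comparison assumes that driver versions are dot-separated integers.

```python
# Minimum driver versions copied from the complete rows of the table above.
# Keys are CUDA releases; values are (Linux x86_64, Windows x86_64) minimums.
MIN_DRIVER = {
    "CUDA 11.8 GA": ("520.61.05", "522.06"),
    "CUDA 11.7 Update 1": ("515.48.07", "516.31"),
    "CUDA 11.7 GA": ("515.43.04", "516.01"),
    "CUDA 11.6 Update 2": ("510.47.03", "511.65"),
}


def parse_version(text):
    """Turn '520.61.05' into (520, 61, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def supported_releases(driver_version, platform="linux"):
    """Return the listed CUDA releases whose minimum driver is satisfied."""
    column = 0 if platform == "linux" else 1
    installed = parse_version(driver_version)
    return [
        release
        for release, minimums in MIN_DRIVER.items()
        if installed >= parse_version(minimums[column])
    ]


if __name__ == "__main__":
    # Hypothetical installed driver version, used only as an example input.
    print(supported_releases("515.65.01"))
```

The installed driver version itself can be read from the header of the `nvidia-smi` output.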
Libvirt Development: 9p virtio
Introduction
The main requirement is to pass a host directory through to a VM so that it can be used inside the guest. 9p virtio provides this capability. The official description of 9p is as follows:
With QEMU’s 9pfs you can create virtual filesystem devices (virtio-9p-device) and expose them to guests, which essentially means that a certain directory on host machine is made directly accessible by a guest OS as a pass-through file system by using the 9P network protocol for communication between host and guest, if desired even accessible, shared by several guests simultaneously.
This section details the steps involved in setting up VirtFS (Pla ...
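To make the host-directory passthrough more concrete, here is a minimal sketch that generates a libvirt `<filesystem>` device element of the kind typically used for 9p sharing. The helper name, the host path `/srv/share`, and the mount tag `hostshare` are assumptions for illustration and do not come from the original post.

```python
import xml.etree.ElementTree as ET


def filesystem_xml(host_dir, mount_tag):
    """Build a libvirt <filesystem> element that shares host_dir with a guest.

    mount_tag is not a path inside the guest; it is the tag the guest
    uses when it mounts the share.
    """
    fs = ET.Element("filesystem", {"type": "mount", "accessmode": "passthrough"})
    ET.SubElement(fs, "source", {"dir": host_dir})
    ET.SubElement(fs, "target", {"dir": mount_tag})
    return ET.tostring(fs, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical host directory and mount tag.
    print(filesystem_xml("/srv/share", "hostshare"))
```

Inside a Linux guest, a share defined this way is usually mounted with something like `mount -t 9p -o trans=virtio hostshare /mnt`, where the tag matches the `target dir` value.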
Libvirt Development: Building QEMU
Introduction
QEMU is a generic and open source machine & userspace emulator and virtualizer.
QEMU is capable of emulating a complete machine in software without any need for hardware virtualization support. By using dynamic translation, it achieves very good performance. QEMU can also integrate with the Xen and KVM hypervisors to provide emulated hardware while allowing the hypervisor to manage the CPU. With hypervisor support, QEMU can achieve near native performance for CPUs. When QEMU emulates CPUs ...
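NVIDIA Usage: GPU Driver Installation
Installation
Installing the dependencies
Two dependencies are required: gcc and kernel-devel. Note that the kernel-devel version must match the running kernel version; otherwise later steps fail with missing-file errors.
1) Check the kernel version:
[root@k104 vGPU]# uname -r
3.10.0-1127.el7.x86_64
2) List the versions available for installation and pick the one that matches the kernel:
[root@k104 vGPU]# yum list | grep kernel-devel
kernel-devel.x86_64 3.10.0-1127.el7 @/kernel-devel-3.10.0-1127.el7.x86_64
3) Install the dependencies:
yum install kernel-devel-$(uname -r) gcc dkms -y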
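The version match described above can also be checked in a script. The sketch below rebuilds a `uname -r` style string from one line of `yum list` output and compares it with the running kernel. The function names are hypothetical, and the parsing assumes the three-column layout shown in step 2, with no epoch prefix on the version.

```python
import platform


def release_from_yum_line(line):
    """Rebuild a `uname -r` style string from one `yum list` line.

    'kernel-devel.x86_64 3.10.0-1127.el7 @/...' -> '3.10.0-1127.el7.x86_64'
    """
    fields = line.split()
    arch = fields[0].rsplit(".", 1)[1]  # 'kernel-devel.x86_64' -> 'x86_64'
    return fields[1] + "." + arch


def matches_kernel(line, kernel_release):
    """True when the listed kernel-devel package matches the kernel release."""
    return release_from_yum_line(line) == kernel_release


if __name__ == "__main__":
    sample = ("kernel-devel.x86_64 3.10.0-1127.el7 "
              "@/kernel-devel-3.10.0-1127.el7.x86_64")
    # platform.release() returns the same string as `uname -r` on Linux.
    print(matches_kernel(sample, platform.release()))
```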
Blacklisting the built-in nouveau driver (takes effect after a reboot)
echo "blacklist nouveau" >> /lib/modprobe.d/dist-blackl ...
NVIDIA Development: GPU Manager Deployment
Introduction
This post walks through the deployment of GPU Manager. It does not cover how GPU Manager works; for that, see the paper notes on "GaiaGPU: Sharing GPUs in Container Clouds".
Source code
# Github
gpu-quota-admission https://github.com/tkestack/gpu-admission
gpu-manager https://github.com/tkestack/gpu-manager
vcuda-controller https://github.com/tkestack/vcuda-controller
Environment
[root@k104 vGPU]# uname -a
Linux k104 3.10.0-1127.el7.x86_64 #1 SMP Tue Mar 31 23:36:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
[root@k104 vGPU]# cat /etc/centos-release
CentOS Linux release ...
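Chrome Usage: General Handbook
Common issues
NET::ERR_CERT_INVALID
On the Chrome error page, type these 12 characters directly on the keyboard: thisisunsafe
Note: click anywhere on the page first so that it has focus, then type.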
OpenStack Development: cinderclient
Development environment
python: 2.7.5
python-cinderclient: 5.0.2
Source code
Calling cinderclient
from keystoneauth1 import identity
from keystoneauth1 import session
from cinderclient import client
username='admin'
password='Inspur@123'
project_name='admin'
project_domain_id='default'
user_domain_id='default'
auth_url='http://111.111.9.207:35357/v3'
CINDER_API_VERSION = "3.0"
auth = identity.Password(auth_url=auth_url, ...
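Libvirt Development: Detecting Virtualization Features
Introduction
There are currently two gold standards for VM environment detection: Al-khaser and Pafish. Between them, these two open-source projects cover nearly every publicly known, common VM detection technique.
Researchers at the SANS security organization group current VM detection methods into four categories:
● Searching for processes, file system entries, and registry keys specific to the virtual environment
● Searching the memory of the virtual environment
● Searching for specific virtual hardware in the virtual environment
● Searching for specific processor instructions and features in the virtual environment
Al-khaser
GitHub: https://github.com/LordNoteworthy/al-khaser
Pafish
GitHub: https://github.com/a0rtega/pafish
* Pafish (Paranoid Fi ...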
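Libvirt Development: Hiding Virtualization Features
Introduction
Notes on resolving the error raised when software or games are started inside a VM: Sorry, This Application Cannot Run Under A Virtual Machine
Key points
For the points at which virtualization can be detected, see: Can an operating system know that it is running inside a virtual machine?
CPU model and topology
Disable the CPU's hypervisor feature.
Use host-passthrough mode to expose the host CPU's capabilities to the guest.
Reference XML configuration:
...
<cpu mode='host-passthrough' check='none'>
  <topology sockets='2' cores='2' threads='2'/>
  <feature policy='disable' name='hypervisor'/>
  <numa>
    <cell id='0' cpus='0-7 ...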
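One way to check whether disabling the hypervisor feature took effect is to look for the `hypervisor` CPU flag from inside the guest. The sketch below assumes an x86 Linux guest, where `/proc/cpuinfo` lists this flag when the CPUID hypervisor bit is set. The function name is hypothetical, and a Windows guest would need a different check.

```python
def has_hypervisor_flag(cpuinfo_text):
    """Return True if the first 'flags' line lists the 'hypervisor' flag."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return "hypervisor" in value.split()
    return False


if __name__ == "__main__":
    # Run inside the guest; /proc/cpuinfo exists only on Linux.
    with open("/proc/cpuinfo") as handle:
        print(has_hypervisor_flag(handle.read()))
```

With the `<feature policy='disable' name='hypervisor'/>` setting applied, the expectation is that this check reports False in the guest. Detection tools use many other signals as well, so a missing flag alone does not guarantee that the VM goes undetected.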
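OpenStack Usage: Using GPU Cards
GPUs that contain multiple devices
Common models: Quadro RTX 6000/8000
Overview
Inspecting the graphics card with lspci on the system shows that the same PCI slot (b1) contains four devices: VGA compatible, Audio device, USB, and Serial bus. Each of the four devices uses a different driver.
cyborg does not currently support managing this kind of accelerator. The attach handle info collected by cyborg contains only function 0, so the generated XML passes only one of the GPU's devices through to the VM. Because the device is incomplete, libvirt fails to start the VM.
Using this kind of GPU requires the PCI passthrough feature of nova compute. The overall approach is:
Switch the driver of every device on the GPU card to vfio_pci.
Pass all of the GPU card's devices through to the VM.
If either of these two conditions is not met, libvirt reports the following error when starting the VM: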
...
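The first condition above can be checked before starting the VM. The sketch below reads the driver bound to each PCI function from sysfs and lists the functions of one slot that are not yet bound to vfio. The function names and the example address `0000:b1:00` are assumptions for illustration. In sysfs the driver appears as `vfio-pci`, while the kernel module is named `vfio_pci`.

```python
import os


def slot_of(address):
    """'0000:b1:00.2' -> '0000:b1:00' (drop the function number)."""
    return address.rsplit(".", 1)[0]


def unbound_functions(drivers, slot, wanted="vfio-pci"):
    """Return the addresses in `slot` whose driver is not `wanted`."""
    return sorted(
        address
        for address, driver in drivers.items()
        if slot_of(address) == slot and driver != wanted
    )


def read_drivers(sysfs_root="/sys/bus/pci/devices"):
    """Map each PCI address to its bound driver name, or None if unbound."""
    drivers = {}
    for address in os.listdir(sysfs_root):
        link = os.path.join(sysfs_root, address, "driver")
        if os.path.islink(link):
            drivers[address] = os.path.basename(os.path.realpath(link))
        else:
            drivers[address] = None
    return drivers


if __name__ == "__main__":
    # Hypothetical slot address; replace it with the one lspci reports.
    print(unbound_functions(read_drivers(), "0000:b1:00"))
```

An empty list means every function in the slot is bound to vfio-pci. Any address that is printed still needs its driver switched before all of the card's devices can be passed through.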