GPU virtualization
{{short description|Technology that allows a GPU to be used by multiple virtual machines}}
GPU virtualization refers to technologies that allow the use of a GPU to accelerate graphics or GPGPU applications running on a virtual machine. GPU virtualization is used in various applications such as desktop virtualization, cloud gaming{{cite conference |last1=Hong |first1=Hua-Jun |last2=Fan-Chiang |first2=Tao-Ya |last3=Lee |first3=Che-Rung |last4=Chen |first4=Kuan-Ta |last5=Huang |first5=Chun-Ying |last6=Hsu |first6=Cheng-Hsin |title=GPU Consolidation for Cloud Games: Are We There Yet? |url=https://www.iis.sinica.edu.tw/~swc/pub/gpu_virtualization_for_cloud_games.html |conference=13th Annual Workshop on Network and Systems Support for Games |location=Nagoya |pages=1–6 |doi=10.1109/NetGames.2014.7008969 |isbn=978-1-4799-6882-4 |issn=2156-8138 |publisher=Institute of Electrical and Electronics Engineers |date=2014 |s2cid=664129 |access-date=14 September 2020|url-access=subscription }} and computational science (e.g. hydrodynamics simulations).
GPU virtualization implementations generally involve one or more of the following techniques: device emulation, API remoting, fixed pass-through and mediated pass-through. Each technique presents different trade-offs regarding virtual machine to GPU consolidation ratio, graphics acceleration, rendering fidelity and feature support, portability to different hardware, isolation between virtual machines, and support for suspending/resuming and live migration.{{cite journal |last1=Dowty |first1=Micah |last2=Sugerman |first2=Jeremy |title=GPU Virtualization on VMware's Hosted I/O Architecture |url=https://www.usenix.org/legacy/event/wiov08/tech/full_papers/dowty/dowty.pdf |journal=ACM SIGOPS Operating Systems Review |location=San Diego |volume=43 |issue=3 |pages=73–82 |publisher=Association for Computing Machinery |publication-place=New York City |doi=10.1145/1618525.1618534 |issn=0163-5980 |date=July 2009 |s2cid=228328 |access-date=10 September 2020}}{{cite conference |last1=Yu |first1=Hangchen |last2=Rossbach |first2=Christopher |title=Full Virtualization for GPUs Reconsidered |url=https://www.cs.utexas.edu/~hyu/publication/wddd17-gpuvm.pdf |conference=ISCA-44 14th Annual Workshop on Duplicating, Deconstructing and Debunking |location=Toronto |date=25 June 2017 |access-date=12 September 2020}}{{cite conference |last1=Tian |first1=Kun |last2=Dong |first2=Yaozu |last3=Cowperthwaite |first3=David |title=A Full GPU Virtualization Solution with Mediated Pass-Through |url=https://www.usenix.org/system/files/conference/atc14/atc14-paper-tian.pdf |conference=USENIX Annual Technical Conference |location=Philadelphia |book-title=Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference (USENIX ATC'14) |pages=121–132 |isbn=978-1-931971-10-2 |publisher=USENIX |date=June 2014 }}{{cite conference |last1=Gottschlag |first1=Mathias |last2=Hillenbrand |first2=Marius |last3=Kehne |first3=Jens |last4=Stoess |first4=Jan |last5=Bellosa |first5=Frank |title=LoGV: 
Low-Overhead GPGPU Virtualization |url=http://os.itec.kit.edu/downloads/logv_low_overhead%20gpgpu_virtualization.pdf |conference=10th International Conference on High Performance Computing |location=Zhangjiajie |pages=1721–1726 |doi=10.1109/HPCC.and.EUC.2013.245 |isbn=978-0-7695-5088-6 |publisher=IEEE Computer Society |date=November 2013 |access-date=16 September 2020}}
API remoting
In API remoting or API forwarding, calls to graphical APIs from guest applications are forwarded to the host by remote procedure call, and the host then executes graphical commands from multiple guests using the host's GPU as a single user. It may be considered a form of paravirtualization when combined with device emulation.{{cite conference |last1=Suzuki |first1=Yusuke |last2=Kato |first2=Shinpei |last3=Yamada |first3=Hiroshi |last4=Kono |first4=Kenji |title=GPUvm: Why Not Virtualizing GPUs at the Hypervisor? |url=https://www.usenix.org/system/files/conference/atc14/atc14-paper-suzuki.pdf |conference=USENIX Annual Technical Conference |location=Philadelphia |book-title=Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference (USENIX ATC'14) |pages=109–120 |isbn=978-1-931971-10-2 |publisher=USENIX |date=June 2014 |access-date=14 September 2020}} This technique allows sharing GPU resources between multiple guests and the host when the GPU does not support hardware-assisted virtualization. It is conceptually simple to implement, but it has several disadvantages:
- In pure API remoting, there is little isolation between virtual machines when accessing graphical APIs; isolation can be improved using paravirtualization
- Performance ranges from 86% of native performance to as low as 12% in applications that issue a large number of draw calls per frame
- A large number of API entry points must be forwarded, and partial implementation of entry points may decrease fidelity
- Applications on guest machines may be limited to a few available APIs
Hypervisors usually use shared memory between guest and host to maximize performance and minimize latency. Third-party software can instead use a network interface (a common approach in distributed rendering) to add support for specific APIs (e.g. rCUDA{{cite conference |last1=Duato |first1=José |last2=Peña |first2=Antonio |last3=Silla |first3=Federico |last4=Fernández |first4=Juan |last5=Mayo |first5=Rafael |last6=Quintana-Ortí |first6=Enrique |title=Enabling CUDA acceleration within virtual machines using rCUDA |journal=International Conference on High Performance Computing |url=https://core.ac.uk/download/pdf/231705177.pdf |conference=18th International Conference on High Performance Computing |location=Bangalore |pages=1–10 |doi=10.1109/HiPC.2011.6152718 |isbn=978-1-4577-1951-6 |issn=1094-7256 |publisher=IEEE Computer Society |date=December 2011 |access-date=13 September 2020|hdl=2117/168226 |hdl-access=free }} for CUDA) or for typical APIs (e.g. VMGL{{cite conference |last1=Lagar-Cavilla |first1=Horacio |last2=Tolia |first2=Niraj |last3=Satyanarayanan |first3=Mahadev |last4=Lara |first4=Eyal |title=VMM-Independent Graphics Acceleration |url=http://www.cs.cmu.edu/~satya/docdir/lagar-cavilla-vee-vmgl-2007.pdf |conference=VEE '07 |location=San Antonio |book-title=Proceedings of the 3rd International Conference on Virtual Execution Environments |pages=33–43 |doi=10.1145/1254810.1254816 |isbn=978-1-59593-630-1 |publisher=Association for Computing Machinery |publication-place=New York City |date=June 2007 |access-date=12 September 2020}} for OpenGL) when these are not supported by the hypervisor's software package, although network delay and serialization overhead may outweigh the benefits.
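The forwarding mechanism described in this section can be illustrated with a minimal sketch. It is not taken from any real hypervisor: the class names (<code>HostGpu</code>, <code>HostDispatcher</code>, <code>GuestStub</code>), the two entry points, and the JSON message format are all hypothetical, and a plain function call stands in for the shared-memory or network transport.

```python
import json


class HostGpu:
    """Hypothetical stand-in for the host's graphics API; it only records commands."""

    def __init__(self):
        self.log = []

    def clear(self, guest, r, g, b):
        self.log.append((guest, "clear", (r, g, b)))
        return None

    def draw_triangles(self, guest, count):
        self.log.append((guest, "draw_triangles", (count,)))
        return count


class HostDispatcher:
    """Host side: decodes forwarded calls and runs them on the single real GPU."""

    # Only these entry points are forwarded; anything else is unsupported.
    ALLOWED = {"clear", "draw_triangles"}

    def __init__(self, gpu):
        self.gpu = gpu

    def handle(self, message: bytes) -> bytes:
        call = json.loads(message)
        name = call["name"]
        if name not in self.ALLOWED:
            return json.dumps({"error": f"unsupported call: {name}"}).encode()
        result = getattr(self.gpu, name)(call["guest"], *call["args"])
        return json.dumps({"result": result}).encode()


class GuestStub:
    """Guest side: looks like a graphics API but serializes and forwards every call."""

    def __init__(self, guest_id, transport):
        self.guest_id = guest_id
        self.transport = transport  # callable taking bytes and returning bytes

    def __getattr__(self, name):
        # Called only for attributes that are not defined on the stub itself.
        def forward(*args):
            message = json.dumps(
                {"guest": self.guest_id, "name": name, "args": list(args)}
            ).encode()
            reply = json.loads(self.transport(message))
            if "error" in reply:
                raise NotImplementedError(reply["error"])
            return reply["result"]

        return forward
```

Two stubs created with <code>GuestStub("vm1", host.handle)</code> and <code>GuestStub("vm2", host.handle)</code> share one <code>HostGpu</code>, which mirrors how the host executes commands from several guests as a single user. A call outside <code>ALLOWED</code> fails in the guest, which illustrates the cost of forwarding only part of an API.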
{{notelist-ua}}
Fixed pass-through
In fixed pass-through or GPU pass-through (a special case of PCI pass-through), a GPU is accessed directly by a single virtual machine exclusively and permanently. This technique achieves 96{{ndash}}100% of native performance{{cite conference |last1=Walters |first1=John |last2=Younge |first2=Andrew |last3=Kang |first3=Dong-In |last4=Yao |first4=Ke-Thia |last5=Kang |first5=Mikyung |last6=Crago |first6=Stephen |last7=Fox |first7=Geoffrey |title=GPU Passthrough Performance: A Comparison of KVM, Xen, VMware ESXi, and LXC for CUDA and OpenCL Applications |url=https://ieeexplore.ieee.org/document/6973796 |conference=IEEE 7th International Conference on Cloud Computing |location=Anchorage |book-title=IEEE 7th International Conference on Cloud Computing |pages=636–643 |doi=10.1109/CLOUD.2014.90 |isbn=978-1-4799-5063-8 |issn=2159-6190 |publisher=IEEE Computer Society |date=2014 |access-date=13 September 2020|url-access=subscription }} and high fidelity, but the acceleration provided by the GPU cannot be shared between multiple virtual machines. As such, it has the lowest consolidation ratio and the highest cost, as each graphics-accelerated virtual machine requires an additional physical GPU.
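The cost implication of the consolidation ratio can be made concrete with a small calculation. The function name and the example ratio of four virtual machines per GPU are illustrative assumptions, not figures from the cited sources.

```python
import math


def gpus_required(vm_count: int, vms_per_gpu: int = 1) -> int:
    """Physical GPUs needed when each GPU serves `vms_per_gpu` virtual machines.

    Fixed pass-through corresponds to vms_per_gpu == 1, because each GPU is
    bound exclusively to one virtual machine.
    """
    if vm_count < 0 or vms_per_gpu < 1:
        raise ValueError("vm_count must be >= 0 and vms_per_gpu must be >= 1")
    return math.ceil(vm_count / vms_per_gpu)
```

Under these assumptions, ten graphics-accelerated virtual machines need <code>gpus_required(10)</code> = 10 GPUs with fixed pass-through, whereas a sharing technique with a hypothetical ratio of four would need <code>gpus_required(10, 4)</code> = 3.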
The following software technologies implement fixed pass-through:
- VMware Virtual Dedicated Graphics Acceleration (vDGA){{efn|name=vmware-workstation-vsga-only|Not available on VMware Workstation.}}
- Parallels Workstation Extreme{{cite tech report |type=White paper |title=GPU Development with Parallels Workstation Extreme |url=http://download.parallels.com/doc/pwe/en/GPU_development_solution_brief.pdf |format=PDF |publisher=Parallels |date=2010 |access-date=13 September 2020}}
- Hyper-V Discrete Device Assignment (DDA){{cite tech report |type=Manual |title=Hyper-V on Windows Server |url=https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/hyper-v-on-windows-server |chapter=Deploy graphics devices using Discrete Device Assignment |chapter-url=https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/deploy/deploying-graphics-devices-using-dda |publisher=Microsoft |access-date=13 September 2020}}
- Citrix XenServer GPU pass-through{{cite tech report |type=Manual |title=XenApp and XenDesktop 7.15 LTSR |url=https://docs.citrix.com/en-us/xenapp-and-xendesktop/7-15-ltsr |chapter=HDX 3D Pro |chapter-url=https://docs.citrix.com/en-us/xenapp-and-xendesktop/7-15-ltsr/graphics/hdx-3d-pro.html |publisher=Citrix Systems |access-date=15 September 2020}}{{cite tech report |type=Manual |title=Citrix Hypervisor 8.2 |url=https://docs.citrix.com/en-us/citrix-hypervisor |chapter=Graphics overview |chapter-url=https://docs.citrix.com/en-us/citrix-hypervisor/graphics.html |publisher=Citrix Systems |access-date=15 September 2020}}
- Xen{{cite tech report |type=Guide |title=GVT-d Setup Guide |url=https://github.com/intel/gvt-linux/wiki/GVTd_Setup_Guide |website=GitHub |access-date=13 September 2020}} and QEMU/KVM{{cite news |last=Larabel |first=Michael |title=Intel Pushes Their Graphics Virtualization Capabilities |url=https://www.phoronix.com/scan.php?page=news_item&px=MTY4MTc |work=Phoronix |date=4 May 2014 |access-date=13 September 2020}} with Intel GVT-d{{cite web |type=Flyer |title=Bringing New Use Cases and Workloads to the Cloud with Intel Graphics Virtualization Technology (Intel GVT-g) |url=https://01.org/sites/default/files/documentation/gvt_flyer_final.pdf |publisher=Intel |website=Intel Open Source Technology Center |date=2016 |access-date=14 August 2020}}{{cite web |type=Article |last1=Jain |first1=Sunil |title=Intel Graphics Virtualization Update |url=https://01.org/blogs/2014/intel%C2%AE-graphics-virtualization-update |publisher=Intel |date=4 May 2014 |access-date=13 September 2020}}
VirtualBox removed support for PCI pass-through in version 6.1.0.{{cite web |title=Changelog for VirtualBox 6.1 |url=https://www.virtualbox.org/wiki/Changelog-6.1 |publisher=Oracle Corporation |website=VirtualBox |date=10 December 2019 |access-date=12 September 2020}}
= QEMU/KVM =
For certain GPU models, Nvidia and AMD video card drivers attempt to detect whether the GPU is being accessed by a virtual machine and, if so, disable some or all GPU features.{{cite web|title=PCI passthrough via OVMF - Video card driver virtualization detection|url=https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#Video_card_driver_virtualisation_detection|access-date=13 September 2020|website=Arch Linux Wiki|type=Wiki}} In March 2021, NVIDIA relaxed this restriction for consumer GPUs by disabling the check in GeForce Game Ready driver 465.xx and later.{{Cite web|date=2021-03-30|title=GeForce GPU Passthrough for Windows Virtual Machine (Beta)|url=https://nvidia.custhelp.com/app/answers/detail/a_id/5173/~/geforce-gpu-passthrough-for-windows-virtual-machine-%28beta%29|website=NVIDIA Support}}
NVIDIA consumer GPUs for desktops and laptops can be passed through in different ways depending on their architecture. Desktop graphics cards can be passed through with KVM using either legacy BIOS or UEFI firmware, provided by SeaBIOS and OVMF respectively.
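On Linux hosts, a common first step before passing a GPU through with KVM is to check which other devices share its IOMMU group, since devices in one group are normally assigned together. The following sketch assumes a Linux host that exposes groups under <code>/sys/kernel/iommu_groups</code>. The function names are hypothetical, and any PCI address passed to them (such as <code>0000:01:00.0</code>) is only an example.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # The IOMMU is disabled, or this is not a Linux host.
        return groups
    for group in base.iterdir():
        devices = group / "devices"
        if group.name.isdigit() and devices.is_dir():
            groups[int(group.name)] = sorted(d.name for d in devices.iterdir())
    return groups


def shares_group(groups, pci_address):
    """Return the other devices in the same IOMMU group as `pci_address`."""
    for members in groups.values():
        if pci_address in members:
            return [m for m in members if m != pci_address]
    raise KeyError(pci_address)
```

A graphics card often appears together with its own audio function (for example <code>0000:01:00.1</code>) in one group. An empty result from <code>iommu_groups()</code> suggests that the IOMMU is not enabled in firmware or on the kernel command line.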
= NVIDIA =
== Desktops ==
Most desktop graphics cards can be passed through. For cards based on the Pascal architecture or older, the card's VBIOS must also be supplied to the virtual machine if the GPU is used to boot the host.{{Cite web|title=PCI passthrough via OVMF - ArchWiki|url=https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF#Video_card_driver_virtualisation_detection|access-date=2021-05-20|website=wiki.archlinux.org}}
== Laptops ==
=== Pascal and earlier ===
For laptop graphics cards based on Pascal or older architectures, the passthrough procedure varies widely with the configuration of the graphics card. On laptops without NVIDIA Optimus, such as those with MXM variants, passthrough can be achieved through traditional methods. On laptops where NVIDIA Optimus is enabled and the GPU renders through the framebuffer of the CPU's integrated graphics rather than its own, passthrough is more complicated: it requires a remote rendering display or service, the use of Intel GVT-g, and integration of the VBIOS into the boot configuration, because the VBIOS resides in the laptop's system BIOS rather than on the GPU itself. On laptops whose GPU has NVIDIA Optimus and a dedicated framebuffer, the configuration may vary. If NVIDIA Optimus can be switched off, passthrough is possible through traditional means. However, if Optimus is the only available configuration, the VBIOS is most likely stored in the laptop's system BIOS, which requires the same steps as for laptops that render only through the integrated graphics framebuffer, although an external monitor can also be used.{{Cite web|last=Tian|first=Lan|date=2020-06-25|title=Intel and NVIDIA GPU Passthrough on a Optimus MUXless Laptop|url=https://lantian.pub/en/article/modify-computer/laptop-intel-nvidia-optimus-passthrough.lantian/#Stop-Host-OS-from-Tampering-with-NVIDIA-GPU}}
Mediated pass-through
In mediated device pass-through or full GPU virtualization, the GPU hardware provides contexts with virtual memory ranges for each guest through IOMMU and the hypervisor sends graphical commands from guests directly to the GPU. This technique is a form of hardware-assisted virtualization and achieves near-native{{efn|name=mediated-performance|Intel GVT-g achieves 80{{ndash}}90% of native performance.{{cite conference |type=Presentation slide |last=Zheng |first=Xiao |title=Media Cloud Based on Intel Graphics Virtualization Technology (Intel GVT-g) and OpenStack |url=https://01.org/sites/default/files/documentation/sz15_sfts002_100_engf.pdf |conference=Intel Developer Forum |location=San Francisco |publisher=Intel |date=August 2015 |access-date=14 September 2020}}{{cite conference |type=Presentation slide |last=Wang |first=Zhenyu |title=Full GPU virtualization in mediated pass-through way |url=https://www.x.org/wiki/Events/XDC2017/wang_gvt.pdf |conference=XDC2017 |conference-url=https://www.x.org/wiki/Events/XDC2017/ |location=Mountain View, California |publisher=X.Org Foundation |date=September 2017 |access-date=14 September 2020}} Nvidia vGPU achieves 88{{ndash}}96% of native performance considering the overhead on a VMware hypervisor.{{cite tech report |type=Article |last=Kurkure |first=Uday |title=Performance Comparison of Native GPU to Virtualized GPU and Scalability of Virtualized GPUs for Machine Learning |url=https://blogs.vmware.com/performance/2017/10/episode-3-performance-comparison-native-gpu-virtualized-gpu-scalability-virtualized-gpus-machine-learning.html |website=VMware VROOM! Performance Blog |number=Episode 3 |publisher=VMware |date=12 October 2017 |access-date=14 September 2020}}}} performance and high fidelity. If the hardware exposes contexts as full logical devices, then guests can use any API. Otherwise, APIs and drivers must manage the additional complexity of GPU contexts. 
As a disadvantage, there may be little isolation between virtual machines when accessing GPU resources.
The following software and hardware technologies implement mediated pass-through:
- VMware Virtual Shared Pass-Through Graphics Acceleration{{efn|name=vmware-workstation-vsga-only}} with Nvidia vGPU{{cite tech report |type=Guide |title=Virtual GPU Software User Guide |url=https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/ |publisher=Nvidia |access-date=13 September 2020}} or AMD MxGPU{{cite tech report |type=White paper |last=Wong |first=Tonny |title=AMD multiuser GPU: hardware-enabled GPU virtualization for a true workstation experience |url=https://www.amd.com/system/files/documents/amd-mxgpu-white-paper.pdf |format=PDF |publisher=AMD |date=28 January 2016 |access-date=12 September 2020}}
- Citrix XenServer shared GPU with Nvidia vGPU, AMD MxGPU or Intel GVT-g
- Xen{{cite press release |last=Wang |first=Hongbo |date=18 October 2018 |title=2018-Q3 release of XenGT (Intel GVT-g for Xen) |url=https://01.org/igvt-g/blogs/wangbo85/2018/2018-q3-release-xengt-intel-gvt-g-xen |publisher=Intel Open Source Technology Center |access-date=14 August 2020}}{{cite tech report |type=Guide |title=GVT-g Setup Guide |url=https://github.com/intel/gvt-linux/wiki/GVTg_Setup_Guide |website=GitHub |access-date=13 September 2020}} and KVM{{cite press release |last=Wang |first=Hongbo |date=18 October 2018 |title=2018-Q3 release of KVMGT (Intel GVT-g for KVM) |url=https://01.org/igvt-g/blogs/wangbo85/2018/2018-q3-release-kvmgt-intel-gvt-g-kvm |publisher=Intel Open Source Technology Center |access-date=14 August 2020}} with Intel GVT-g
- Thincast Workstation - Virtual 3D feature (DirectX 12 & Vulkan 3D API)
While API remoting is generally available for current and older GPUs, mediated pass-through requires hardware support available only on specific devices.
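On Linux hosts, drivers that support mediated pass-through typically advertise it through the kernel's mediated device (mdev) sysfs interface, so hardware support can be checked by looking for an <code>mdev_supported_types</code> directory under the device. The sketch below assumes that interface and its <code>name</code> and <code>available_instances</code> attributes. The function name is hypothetical, and the PCI address and type name mentioned in the usage note are examples only.

```python
from pathlib import Path


def mdev_types(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """List the mediated device types a physical device offers, if any."""
    types_dir = Path(sysfs_root) / pci_address / "mdev_supported_types"
    if not types_dir.is_dir():
        # The driver exposes no mediated pass-through support for this device.
        return []
    result = []
    for entry in sorted(types_dir.iterdir()):

        def read(attr):
            # Attributes are small text files; `name` is optional in the interface.
            f = entry / attr
            return f.read_text().strip() if f.is_file() else None

        available = read("available_instances")
        result.append(
            {
                "type": entry.name,
                "name": read("name"),
                "available_instances": int(available) if available is not None else None,
            }
        )
    return result
```

For an Intel integrated GPU at <code>0000:00:02.0</code> with GVT-g enabled, the returned list might contain a type such as <code>i915-GVTg_V5_4</code>. An empty list indicates that the device or its driver does not expose mediated pass-through on that host.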
{| class="wikitable"
|+Hardware support for mediated pass-through virtualization
|-
!rowspan="2"|Vendor
!rowspan="2"|Technology
!colspan="3"|Dedicated graphics card families
!rowspan="2"|Integrated GPU families
|-
!Server
!Professional
!Consumer
|-
!Nvidia
|vGPU{{cite web |url=https://docs.nvidia.com/grid/gpus-supported-by-vgpu.html |title=NVIDIA Virtual GPU Software Supported GPUs |publisher=Nvidia |access-date=9 September 2020}}
|
|
|{{no}}
|{{sdash}}
|-
!AMD
|MxGPU{{cite tech report |type=Datasheet |title=AMD FirePro S-Series for Virtualization |url=https://www.amd.com/system/files/documents/firepro-s-series-datasheet.pdf |format=PDF |publisher=AMD |date=2016 |access-date=13 September 2020}}
|FirePro Server, Radeon Instinct
|
|{{no}}
|{{no}}
|-
!Intel
|GVT-g
|{{sdash}}
|{{sdash}}
|{{sdash}}
|Broadwell and newer
|}
Device emulation
GPU architectures are very complex and change quickly, and their internal details are often kept secret. As a result, it is generally not feasible to fully virtualize new generations of GPUs; only older and simpler generations can be handled this way. For example, PCem, a specialized emulator of the IBM PC architecture, can emulate an S3 ViRGE/DX graphics device, which supports Direct3D 3, and a 3dfx Voodoo2, which supports Glide, among others.{{cite web |type=Project |title=Systems/motherboards emulated |url=https://pcem-emulator.co.uk/status.html |website=PCem |access-date=26 October 2020}}
When using a VGA or an SVGA virtual display adapter,{{cite tech report |type=Manual |title=VMware Tools Documentation |url=https://docs.vmware.com/en/VMware-Tools/index.html |chapter=VMware Tools Device Drivers |chapter-url=https://docs.vmware.com/en/VMware-Tools/10.1.0/com.vmware.vsphere.vmwaretools.doc/GUID-6994A5F9-B62B-4BF1-99D8-E325874A4C7A.html?hWord=N4IghgNiBcIGoFkDuYBOBTABAFQPa4gGdMBlOAcQEEQBfIA |publisher=VMware |access-date=12 September 2020}}{{cite tech report |type=Manual |title=Oracle VM VirtualBox User Manual |url=https://www.virtualbox.org/manual/ |chapter=Configuring Virtual Machines |chapter-url=https://www.virtualbox.org/manual/ch03.html |publisher=Oracle Corporation |access-date=12 September 2020}}{{cite tech report |type=Manual |title=QEMU User Documentation |url=https://www.qemu.org/docs/master/qemu-doc.html |section=Display options |website=QEMU |access-date=12 September 2020}} the guest may not have 3D graphics acceleration; the adapter provides only minimal functionality to allow access to the machine via a graphics terminal. The emulated device may expose only basic 2D graphics modes to guests. The virtual machine manager may also provide common API implementations using software rendering to enable 3D graphics applications on the guest, albeit at speeds that may be as low as 3% of hardware-accelerated native performance. The following software technologies implement graphics APIs using software rendering:
- VMware SVGA 3D software renderer{{cite tech report |type=White paper |last=Long |first=Simon |title=Virtual Machine Graphics Acceleration Deployment Guide |url=https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/whitepaper/vmware-horizon-view-graphics-acceleration-deployment-white-paper.pdf |format=PDF |publisher=VMware |date=2013 |access-date=14 September 2020}}
- VirtualBox VMSVGA graphics controller
- Citrix XenServer OpenGL Software Accelerator{{cite tech report |type=Manual |title=XenApp and XenDesktop 7.15 LTSR |url=https://docs.citrix.com/en-us/xenapp-and-xendesktop/7-15-ltsr |chapter=OpenGL Software Accelerator |chapter-url=https://docs.citrix.com/en-us/xenapp-and-xendesktop/7-15-ltsr/graphics/opengl-software-accelerator.html |publisher=Citrix Systems |access-date=15 September 2020}}
- Windows Advanced Rasterization Platform
- Core OpenGL software renderer
- Mesa software renderer
- SwiftShader (implements WebGPU) software renderer
Notes
{{notelist}}
References
{{reflist}}