It can be a struggle to provide adequate performance on virtual desktops, especially when you're trying to deliver demanding applications. But there are three ways you can use GPUs to solve that problem.
Graphics processing unit (GPU) sharing, pass-through and virtual GPUs (vGPUs) each help in different use cases. Deciding which option is right for you depends on the applications you need to deliver, how many users need those apps and how much latency you can tolerate, as well as how much hardware you can afford.
The sharing and pass-through options work well only in limited cases because they do not scale well; the GPU pass-through method is especially limiting. The ability to virtualize GPUs is a welcome advance, however, and engineers, analysts and other users with demanding graphics applications may get the performance they expect with vGPUs on their VDI desktops.
Under the GPU sharing model, a hypervisor runs a translation manager that abstracts the GPU, and each virtual machine (VM) functions like it has its own GPU. The translation manager is responsible for ensuring that application programming interface (API) calls and application-specific data are directed to and from the appropriate VM. This is a reasonable approach for low-demand applications and modest numbers of users.
If you are considering using GPU sharing, verify that your translation manager implements the APIs you expect to use (such as DirectX or OpenGL), and make sure it runs on your hypervisor. Also, test your applications in the virtualized environment to ensure that all of the API functions you need are adequately supported.
The abstraction layer does increase the latency between the calling application and the GPU, so if users can't tolerate that latency, you might consider one of the other options.
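To make the sharing model concrete, here is a deliberately simplified sketch, not any real hypervisor's implementation, of what a translation manager does: each VM's API calls are tagged with the VM's identity, serialized through the one physical GPU, and the results are routed back to the caller. The class and method names are invented for illustration.

```python
from collections import deque

class PhysicalGPU:
    """Stand-in for the single shared GPU: executes one call at a time."""
    def execute(self, api_call, data):
        return f"result({api_call}:{data})"

class TranslationManager:
    """Abstracts the GPU so each VM behaves as if it had its own device.

    Calls from all VMs funnel through one queue and one physical GPU;
    this extra layer is where the added latency comes from.
    """
    def __init__(self, gpu):
        self.gpu = gpu
        self.queue = deque()
        self.results = {}  # vm_id -> list of results routed back to that VM

    def submit(self, vm_id, api_call, data):
        # Tag each call with its VM so the result can be directed back.
        self.queue.append((vm_id, api_call, data))

    def run(self):
        # Drain the shared queue in order; every VM's call pays the cost
        # of passing through this abstraction layer.
        while self.queue:
            vm_id, api_call, data = self.queue.popleft()
            result = self.gpu.execute(api_call, data)
            self.results.setdefault(vm_id, []).append(result)

mgr = TranslationManager(PhysicalGPU())
mgr.submit("vm1", "DrawIndexed", "mesh-a")
mgr.submit("vm2", "DrawIndexed", "mesh-b")
mgr.run()
```

Note that the two VMs never see each other's calls or results; the manager alone knows there is only one GPU underneath.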
With GPU pass-through, a physical GPU is dedicated to each virtual desktop user. This approach avoids the abstraction layer overhead that comes with the GPU sharing approach, and it delivers the kind of performance that power users get on dedicated desktops or workstations.
There are a couple of obvious drawbacks to this model, however. A one-user to one-GPU implementation is more expensive than sharing resources. And with a fixed number of GPUs, you run the risk of not having resources to scale up to service peak demand.
The pass-through approach can work well when there are a small number of demanding users who need predictable access to GPUs. For example, a small team of engineers using CAD software or analysts working with big data visualizations might be good candidates for the GPU pass-through model.
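As one concrete illustration of pass-through, on a KVM host managed by libvirt the physical GPU can be handed to a single VM with a PCI host-device entry in the domain definition. The PCI address below (0000:65:00.0) is a placeholder; substitute your GPU's actual address.

```xml
<!-- libvirt domain fragment: dedicate the PCI GPU at 0000:65:00.0 to this VM.
     The address shown is an example; find yours with a tool such as lspci. -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x65' slot='0x00' function='0x0'/>
  </source>
</hostdev>
```

Once assigned this way, the device is unavailable to the host and to every other VM, which is exactly the one-user-to-one-GPU trade-off described above.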
A third approach has become feasible with advances in GPU design: You can virtualize GPUs the same way as other server components. Older GPUs did not support this kind of virtualization, so shops used GPU sharing as an alternative.
Newer GPUs incorporate memory management units to handle address translation between a VM's address space and the physical address space. These more advanced GPUs may also include a sufficient number of separate input buffers to receive input streams from different VMs. This allows each VM to have its own vGPU.
Virtualizing GPUs the same way as other system components comes with the benefits of GPU sharing and GPU pass-through. Unlike GPU sharing, there is no additional abstraction layer or API translation, so latency is lower. And unlike GPU pass-through, it is possible to share a single GPU across multiple VMs at the same time.
Virtualizing GPUs is an appropriate option when you have a substantial number of users, less predictable demand, or a need to scale to meet a growing user base. Be sure to evaluate GPU hardware together with your hypervisor to ensure compatibility.
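For comparison with the pass-through fragment, a vGPU on a KVM/libvirt host is typically exposed as a mediated device (mdev): the physical GPU is carved into vGPU instances, and each VM is given one by UUID. The UUID below is a placeholder you would generate yourself when creating the mdev instance.

```xml
<!-- libvirt domain fragment: attach one vGPU (mediated device) to this VM.
     The UUID is a placeholder for the mdev instance created on the host;
     other VMs attach their own instances backed by the same physical GPU. -->
<hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci'>
  <source>
    <address uuid='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'/>
  </source>
</hostdev>
```

The structural difference from pass-through is visible in the fragment itself: the VM references a vGPU instance rather than the physical PCI device, so several VMs can share one card at the same time.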
VMware and Citrix virtual GPU support
RDS and RemoteFX updates in Windows Server 2012
GPU support in Citrix, Microsoft, and VMware VDI
Installing and configuring Citrix XenDesktop's GRID vGPU feature