Live migration for virtual GPUs has arrived, and the technology will help organizations more easily distribute resources and improve performance for virtual desktop users.
As more applications become graphics-rich, VDI shops need better ways to support and manage virtualized GPUs (vGPUs). Citrix and Nvidia said this month they will support live migration of GPU-accelerated VMs, allowing administrators to move VMs between physical servers with no downtime. VMware has demonstrated similar capabilities but has not yet brought them to market.
"The first time we all saw vMotion of a normal VM, we were all amazed," said Rob Beekmans, an end-user computing consultant in the Netherlands. "So it's the same thing. It's amazing that this is possible."
How vGPU VM live migration works
Live migration, the ability to move a VM from one host to another while the VM is still running, has been around for years. But until now, it was not possible to live migrate a VM that used GPU acceleration technology such as Nvidia's Grid. VMware's VM live migration tool, vMotion, and Citrix's counterpart, XenMotion, did not allow migration of VMs that had direct access to a physical hardware component. Complicating matters, live migration must replicate the state of the GPU on one server to a GPU on another server and essentially map its processes one to one. That's difficult because a GPU is such a dense processor, said Anne Hecht, senior director of product marketing for vGPU at Nvidia.
XenMotion is now capable of live migrating a GPU-enabled VM on XenServer. Using the Citrix Director management console, administrators can monitor and migrate these VMs. They simply select the VM and from a drop-down menu choose the host they want to move it to. This migration process takes a few seconds, according to a demo Nvidia showed at Citrix Synergy 2017. XenMotion with vGPUs is available now as a tech preview for select customers, and Nvidia did not disclose a planned date for general availability.
This ability to redistribute VMs without having to shut them down brings several benefits. It could be useful for a single project, such as when a designer works on a task that needs a lot of GPU resources for a few months, or when IT adds more virtual desktop users overall. If a user suddenly needs more GPU power, IT can migrate his or her desktop VM to a different server that has more GPU resources available. IT may also use live migration on a regular basis to rebalance the processing load across servers as users go through peaks and valleys of GPU needs.
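To make the rebalancing idea concrete, here is a minimal Python sketch of how an administrator's script might choose a destination server for a VM that suddenly needs more GPU resources. It is an illustration only, not part of any Citrix or Nvidia tool: the `Host` class, its fields and the `pick_target_host` function are all hypothetical names, and the sketch assumes GPU capacity can be summarized as free GPU memory in gigabytes.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical record describing one physical server's GPU capacity."""
    name: str
    gpu_memory_total_gb: int
    gpu_memory_used_gb: int

    @property
    def gpu_memory_free_gb(self) -> int:
        return self.gpu_memory_total_gb - self.gpu_memory_used_gb


def pick_target_host(hosts, current_host_name, needed_gb):
    """Return the other host with the most free GPU memory that can fit
    the VM, or None if no host has enough room."""
    candidates = [
        h for h in hosts
        if h.name != current_host_name and h.gpu_memory_free_gb >= needed_gb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda h: h.gpu_memory_free_gb)
```

In this sketch, the chosen host is simply the one with the most headroom; a real environment would likely weigh CPU, memory and licensing as well.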
Most important to users themselves, VM live migration means that there is no downtime on their virtual desktop during maintenance or when IT has to move a machine.
"The amount of time needed to save and close down a project can number in the tens of minutes in complex cases, and that makes for a lot of lost production time," said Tobias Kreidl, desktop computing team lead at Northern Arizona University, who manages around 500 Citrix virtual desktops and applications. "Having this option is in bigger operations a huge plus. Even in a smaller shop, not having to deal with downtime is always a good thing as many maintenance processes require reboots."
VMware vs. Citrix
The new Citrix capability only supports VM live migration between servers that have Nvidia GPU cards of the same type. Nvidia offers a variety of Grid options, which differ in the amounts of memory they include, how many GPUs they support and other aspects. So, XenMotion live migration can only happen from one Tesla M10 to another Tesla M10 card, for example, Hecht said.
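The same-card restriction can be pictured as a simple filter over candidate servers. The Python sketch below is an assumption-laden illustration rather than how XenMotion actually performs the check: `GpuHost`, `can_live_migrate` and `compatible_targets` are hypothetical names, the card type is assumed to be comparable as a plain model string, and the host names and the non-M10 model are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuHost:
    """Hypothetical record for a server and the GPU card model it carries."""
    name: str
    card_model: str  # for example, "Tesla M10"


def can_live_migrate(source, target):
    """Allow migration only to a different host with the same card model."""
    return source.name != target.name and source.card_model == target.card_model


def compatible_targets(source, hosts):
    """List the names of hosts the VM on `source` could be moved to."""
    return [h.name for h in hosts if can_live_migrate(source, h)]


# Illustrative inventory: only host2 matches host1's card type.
source = GpuHost("host1", "Tesla M10")
inventory = [
    source,
    GpuHost("host2", "Tesla M10"),
    GpuHost("host3", "Tesla M60"),
]
print(compatible_targets(source, inventory))
```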
At VMworld 2017, VMware demoed a similar process for Nvidia vGPUs with vMotion. This capability was not in beta or tech preview at the time, however, and still isn't. Plus, the VMware capability works a little differently from Citrix's. With VMware Horizon, IT cannot migrate without downtime; instead, a process called Suspend and Resume allows a GPU-enabled VM to hibernate, move to another host, then restart from its last running state. Users experience desktop downtime, but when the VM resumes, it automatically logs the user back in and runs with all of the last existing data saved.
Nvidia is working with VMware to develop and release an official tech preview of this Suspend and Resume capability for vGPU migration and hopes to develop a fully live scenario for Horizon in the future as well, Hecht said.
"VMware will catch up, but I think it gives Citrix an early mover advantage," said Zeus Kerravala, founder and principal analyst of ZK Research. "This might wake VMware up a little bit and be more aggressive with a lot of these emerging technologies."
Who needs GPU acceleration?
Virtualized GPUs are becoming more necessary for VDI shops as more applications require intensive graphics and multimedia processing. Applications that use video, augmented or virtual reality and even many Windows 10 apps use more CPU than ever before -- and vGPUs can help offload that.
"The GPU isn't just for gamers anymore," Kerravala said. "It is becoming more mainstream, and the more you have Microsoft and IBM and the big mainstream IT vendors doing this, it will help accelerate [GPU acceleration] adoption. It becomes a really important part of any data center strategy."
At the same time, not every user needs GPU acceleration. Beekmans' clients sometimes think they need vGPUs when the CPU would actually provide good enough processing for the application in question, he said. And vGPU technology isn't cheap, so organizations must weigh the costs against the benefits of adopting it.
"I don't think everybody needs a GPU," Beekmans said. "It's hype. You have to look at the cost."
More competition in the GPU acceleration market -- which Nvidia currently dominates in terms of virtual desktop GPU cards -- would help bring costs down and increase innovation, Beekmans said.
Still, the market is here to stay as more apps begin to require more GPU power, he added.
"If you work with anything that uses compute resources, you need to keep an eye on the world of GPUs, because it's coming and it's coming fast," Kerravala agreed.