Recovering virtual desktops isn't the same as recovering traditional PCs. You need different backup methods to prepare for a disaster, and when that day comes, there are some specific DR tactics to keep in mind.
Although the disaster recovery methods are different, virtual desktop infrastructure (VDI) does bring some advantages over standard desktops. Virtual desktops are simpler to deploy in a disaster situation as long as you have them backed up. Plus, with a virtual desktop, you don't have to think about the hardware -- users can start running their desktop anywhere and get back to work quickly.
In this Q&A, expert Alastair Cooke spoke with SearchVirtualDesktop.com about the virtual desktop recovery process to help you understand best practices for backing up and recovering your VDI environment.
What does IT have to consider when recovering virtual desktops?
Alastair Cooke: You've got to think about where the failure is going to occur. If we've got a failure in the location where the staff sits -- maybe there's been a fire in the building where these people work -- you're possibly going to be denied access to that building for a period of time. With a conventional build-out where you have PCs in the office, you're losing access to all the PCs, all the shared files that are stored locally and access to all the users' profile data.
In a VDI use case, all that's sitting at the user's desk is a thin client. As long as we can provide them access to their virtual desktop from something else that can be a thin client, then they have the same compute environment. If they just lose the location they're working in, so long as the staff can find a PC in a computer shop and connect it to the Internet, they can resume working …
[When using VDI to recover desktops] and you shift from having four or five -- or even 10 or 100 users -- working from home intermittently, to having maybe 100 users rather than 5 working from home, you need to have provisioned enough Internet bandwidth out of your data center that it doesn't get saturated the minute you get above 10 users. In a disaster environment … you're going to be stuck with [the Internet bandwidth] you've always had. That's an important part of planning for failure: working out if you have network contingency plans.
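The bandwidth question above comes down to simple arithmetic, which can be sketched as follows. This is only an illustration: the function names, the per-user bandwidth figure and the 25% headroom are hypothetical assumptions, not numbers from the interview, and real per-session bandwidth depends on the display protocol and workload.

```python
def max_concurrent_users(link_mbps: float, per_user_kbps: float,
                         headroom: float = 0.25) -> int:
    """Estimate how many remote sessions a data center Internet link can carry.

    link_mbps     -- capacity of the existing link, in Mbit/s
    per_user_kbps -- assumed average bandwidth per remote session, in kbit/s
    headroom      -- fraction of the link kept free for other traffic (assumed)
    """
    if per_user_kbps <= 0:
        raise ValueError("per_user_kbps must be positive")
    usable_kbps = link_mbps * 1000 * (1 - headroom)
    return int(usable_kbps // per_user_kbps)


def link_is_sufficient(link_mbps: float, per_user_kbps: float,
                       remote_users: int, headroom: float = 0.25) -> bool:
    """True if the link can carry the expected number of remote users."""
    return remote_users <= max_concurrent_users(link_mbps, per_user_kbps, headroom)


if __name__ == "__main__":
    # Hypothetical scenario: 100 staff working from home at 500 kbit/s each.
    for link in (20, 100):
        ok = link_is_sufficient(link, 500, 100)
        print(f"{link} Mbit/s link, 100 remote users: {'OK' if ok else 'saturated'}")
```

Running the check with the head count you expect during a disaster, rather than the handful of users who work from home on a normal day, shows whether the link you already have is enough.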
Is recovering virtual desktops easier or more difficult than recovering traditional PCs?
Cooke: It really depends on how you've implemented VDI: whether you've given yourself an easy failover process to an alternative data center or only a slightly better failover than traditional desktops.
If you've gotten to that nirvana state of VDI where the virtual machine (VM) itself is completely disposable and you're persisting everything that's important to the user outside of the desktop (to a file server for profiles or using some kind of layering to pull all the user uniqueness out of the desktop) … it's relatively quick to bring up an environment that can support those users. That assumes you have enough capacity and bandwidth in the recovery data center to run all the desktops or the skeleton staff you need to have working during a disaster.
If you have dedicated desktops and user data inside those desktops, it's just as hard to recover those desktops as it would be on traditional PCs. [In a disaster situation], you're typically shifting from having lots of little locations with staff at them to a single, central well-protected location … VDI does allow a lot more flexibility in how you plan for a failure than in a conventional desktop environment where you have to put the compute next to the user.
What are some ways to recover virtual desktops and their data?
Cooke: A lot of customers find it hard to get to stateless desktops, so they have user data inside the VM, and this can be quite problematic because often virtualization-aware backup software is not able to read these disks. For VMware View, you can't apply a snapshot to the user data, the persistent disk. So this multiplies your problem of having data inside the user's VM.
You have to work out how you're going to back it up. You can use roaming profiles, where the profile is inside the VM but it uploads back into a file share … Look at how you're backing up that data, so that if a single VM, entire cluster or even a storage array has a very bad firmware upgrade and you lose all your VMs, you have a method for how you're going to recover them. Stateless VMs are so much easier to recover because you only have to recover the master from which they're built.
If you lose the entire data center and need to run out of the alternative data center for a period of time, that's a bigger issue. Usually we work with replication of the user data. Hopefully you can get the user data out of the VMs to something that's much easier to replicate, like a big file server or a NAS device. Then that device itself can be replicated to a recovery site. With Windows File Server, the easiest way to do that is with Microsoft's Distributed File System Replication, but you might also use storage array replication or some kind of DR automation product like VMware Site Recovery Manager.
You also need to provision desktops at the alternate site. Again, if you're using stateless desktops, you just need to replicate the master VM or virtual disk that's the source for your virtual desktops. You need to create the new pool or have a pre-created pool of desktops that doesn't have a very large population.
I've seen one customer that has fully stateful desktops in their normal operation but skeleton stateless desktops for disaster recovery. It has two data centers, and half the staff is hosted at each one … In normal operation, the users get a unique VM, but if one data center fails, the staff can access a stateless desktop at the other data center that provides them a minimum amount of applications to get the fundamentals of their job done … In an extreme situation, zero in on how many applications and how much data the users really need access to.
The technical stuff is complex but it's driven by the business analysis: What's the impact on the business? What do we need to get out of the hole and minimize that impact? If you've done that analysis, you'll find that you don't necessarily need the fully stateful desktop [in a disaster recovery situation].
You've mentioned that backup is a big part of disaster recovery. What are some of the best ways to back up virtual desktops so the DR process goes smoothly?
Cooke: Backup becomes important for the unique data, or the unique VMs, that the users get. You may end up using an agent inside the virtual machine to do the backups if you can't use roaming profiles or some kind of replication tool. There are a lot of applications that will do simple file replication for you. Or, you can use offline folders to get the uniqueness inside the VMs.
Does it complicate backup and recovery if the user stores personal data directly on his or her virtual desktop?
Cooke: Yes, this is again breaking that whole "the virtual machine is disposable" thing. It is worth looking at why you have a unique VM: Sometimes it's simply that the unique VM is required to make a particular application work. Maybe there's a licensing requirement. In that case, there's no additional data inside the VM that you'd really need … but building a culture where you have the ability to save things on the VM is just as dangerous for VDI as it is with standard desktops … It's definitely one of those enterprise culture things: Don't store data anywhere that's not on a server.
How is disaster recovery for VDI different from disaster recovery for traditional PCs?
Cooke: One of the key things is that in VMware shops, there's no integration between VMware's disaster recovery product, Site Recovery Manager, and its VDI product, View. So you can recover all your virtual servers with SRM but there's no supported methodology for recovering your desktops … That means that typical disaster recovery for a View environment is to have two VDI environments like my example of the company with two separate data centers …
You need to clear your mind of what the normal working state is like and think about how the environment is going to operate under the failure. If you have all these users working from home accessing your DR center, does that mean I'm relying on generator power at the disaster recovery site? Because potentially, the same failure that has affected my primary site has had a tangential effect on my DR site. Or, do I have enough diesel to run all my servers and storage at full load between refills of that diesel within the time it takes for my utility company to get the power back on?
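The diesel question above is another piece of arithmetic worth doing in advance. The following is a minimal sketch under stated assumptions: the function names, tank size, burn rate and refill intervals are all hypothetical, and a real generator's consumption at full load should come from its data sheet.

```python
def generator_runtime_hours(tank_litres: float, burn_litres_per_hour: float) -> float:
    """Hours the generator can run at full load on one tank of fuel."""
    if burn_litres_per_hour <= 0:
        raise ValueError("burn_litres_per_hour must be positive")
    return tank_litres / burn_litres_per_hour


def fuel_covers_refill_interval(tank_litres: float, burn_litres_per_hour: float,
                                refill_interval_hours: float) -> bool:
    """True if one tank lasts at least until the next fuel delivery."""
    runtime = generator_runtime_hours(tank_litres, burn_litres_per_hour)
    return runtime >= refill_interval_hours


if __name__ == "__main__":
    # Hypothetical figures: 2,000-litre tank, 50 litres/hour at full load.
    runtime = generator_runtime_hours(2000, 50)
    print(f"Runtime on one tank: {runtime:.0f} hours")
    for interval in (24, 48):
        ok = fuel_covers_refill_interval(2000, 50, interval)
        print(f"Refill every {interval} h: {'covered' if ok else 'NOT covered'}")
```

The point of the exercise is the comparison: if deliveries could be delayed beyond the runtime of a single tank, the DR site needs more fuel storage or a reduced load.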
With disaster recovery, you need to build it from the ground up. But do it as a thought-and-design exercise before the unthinkable occurs, because you will not have time when a hurricane comes through to save the business.