VDI storage is getting a face-lift with inline deduplication. So how does this kind of dedupe differ from the traditional...
A few weeks ago, I wrote about how two of the biggest obstacles to VDI adoption have been overcome this year. The first is graphics, which has been solved by offload cards and virtualized graphics processing units (GPUs). The second is storage, and the predominant technology that has led to a new take on VDI storage is called inline deduplication -- or block-level single-instance storage.
In the storage world, the term deduplication has been around for many years. Deduplication in its traditional sense is a process that executes after the data has been committed to the storage device. This is called post-process deduplication, sometimes loosely labeled process-level or process-based deduplication. In general, it works as follows.
After the blocks have been written to the storage device, the storage controller examines the data at the block level, comparing each block against the others. It uses this information to identify common blocks and keeps a single copy of each one, which the files that need it then reference. It's an intensive process that runs in the background when the storage system has free cycles.
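To make that sequence concrete, here is a minimal sketch of the write-first, dedupe-later approach. It is a toy, not any vendor's implementation: the PostProcessStore class, the 4 KB block size and the use of SHA-256 fingerprints to compare blocks are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, purely illustrative


class PostProcessStore:
    """Hypothetical toy store: writes land as-is, a later pass dedupes them."""

    def __init__(self):
        self.blocks = {}     # physical block id -> bytes
        self.block_map = {}  # (file name, block index) -> physical block id
        self._next_id = 0

    def write(self, name, data):
        # Every block is committed first, duplicates included.
        for index, offset in enumerate(range(0, len(data), BLOCK_SIZE)):
            self.blocks[self._next_id] = data[offset:offset + BLOCK_SIZE]
            self.block_map[(name, index)] = self._next_id
            self._next_id += 1

    def deduplicate(self):
        # Background pass: fingerprint each stored block and collapse matches.
        seen = {}   # fingerprint -> surviving physical block id
        remap = {}  # discarded block id -> surviving block id
        for block_id, payload in list(self.blocks.items()):
            digest = hashlib.sha256(payload).hexdigest()
            if digest in seen:
                remap[block_id] = seen[digest]
                del self.blocks[block_id]
            else:
                seen[digest] = block_id
        # Point every file at the surviving copy of each block.
        for key in list(self.block_map):
            old_id = self.block_map[key]
            self.block_map[key] = remap.get(old_id, old_id)

    def read(self, name):
        indices = sorted(i for (n, i) in self.block_map if n == name)
        return b"".join(self.blocks[self.block_map[(name, i)]] for i in indices)


store = PostProcessStore()
image = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
store.write("desktop1", image)
store.write("desktop2", image)
blocks_before = len(store.blocks)  # 4 physical blocks held before the pass
store.deduplicate()
blocks_after = len(store.blocks)   # 2 physical blocks left after the pass
```

Notice that the store has to hold all four blocks until deduplicate() runs, even though only two survive.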
The challenge with this method for VDI storage is that with virtual desktop environments, we care about performance much more than capacity. So we put super-expensive, high-performance storage into systems in an effort to get as many IOPS as possible. Most SANs, on the other hand, focus on capacity -- with performance being secondary -- and have workloads that are more static than desktop-based workloads.
When you run post-process deduplication, you need additional super-expensive, high-performance storage to hold the data as it's written, before it's deduplicated, which drives up the cost of the storage solution. Yes, the data is rendered down to its common blocks later, but you have to have the capacity to hold it in the meantime.
That's why inline deduplication is getting a lot of buzz lately. It does the same thing as post-process deduplication, but it does it before the data is committed to storage, not after.
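Here is a rough sketch of that idea, with the duplicate check moved into the write path. As before, it is a toy under stated assumptions: the InlineDedupStore class, the 4 KB block size and the SHA-256 fingerprints are mine, and real products differ in how they fingerprint and index blocks.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, purely illustrative


class InlineDedupStore:
    """Hypothetical toy store: duplicates are caught before they are stored."""

    def __init__(self):
        self.blocks = {}     # fingerprint -> bytes (one physical copy each)
        self.block_map = {}  # (file name, block index) -> fingerprint

    def write(self, name, data):
        for index, offset in enumerate(range(0, len(data), BLOCK_SIZE)):
            payload = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(payload).hexdigest()
            # Only a block that has never been seen is committed to storage.
            if digest not in self.blocks:
                self.blocks[digest] = payload
            self.block_map[(name, index)] = digest

    def read(self, name):
        indices = sorted(i for (n, i) in self.block_map if n == name)
        return b"".join(self.blocks[self.block_map[(name, i)]] for i in indices)


store_inline = InlineDedupStore()
golden = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE  # shared OS image blocks
desk1 = golden + b"1" * BLOCK_SIZE              # plus one unique block each
desk2 = golden + b"2" * BLOCK_SIZE
store_inline.write("desktop1", desk1)
store_inline.write("desktop2", desk2)
logical_blocks = len(store_inline.block_map)  # 6 blocks written by the desktops
physical_blocks = len(store_inline.blocks)    # 4 blocks actually stored
```

The store never holds more than the unique blocks, so there is no temporary spike in capacity to pay for.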
Tools from Atlantis, Tegile, VeloBit, SimpliVity, GreenBytes, Nimble Storage, Nexenta, DataCore, Nutanix and Pure Storage (along with others I'm surely missing) all do some sort of inline deduplication as part of their optimizations. Some use memory, some use solid-state drives, and all of them use some tiered architecture that places the high-priority, dynamic data into faster storage while passing the rest of the data off to slower, cheaper storage.
The key takeaway is that the optimization is done before the data is committed to storage. That means you need less storage, and you can direct your storage funds at faster, better storage.
Across desktops, a lot of the information is the same. Some say as much as 80% to 90% of the information in a typical desktop can be deduplicated, even in persistent VDI environments where each person has their own dedicated VM. With post-process deduplication, that means you need room on your storage for all of these desktops in full; after the process runs, you'll use only a fraction of that space.
With inline deduplication, the only thing you need is that fraction of the space, which means you can provide a faster experience for less money.
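A quick back-of-the-envelope calculation shows the gap. The numbers below (100 desktops at 40 GB each, with 90% of the data deduplicated) are hypothetical, and the sketch assumes the worst case for the traditional approach, where every desktop lands on storage before the background pass runs. It also ignores metadata overhead.

```python
def capacity_needed(desktops, gb_per_desktop, dedupe_percent):
    """Rough capacity estimate; all inputs are illustrative assumptions."""
    logical_gb = desktops * gb_per_desktop
    # Space left once the duplicate blocks have been collapsed.
    deduped_gb = logical_gb * (100 - dedupe_percent) / 100
    return {
        # Traditional dedupe must hold everything until the pass runs.
        "post_process_peak_gb": logical_gb,
        # Inline dedupe only ever stores the unique blocks.
        "inline_gb": deduped_gb,
    }


estimate = capacity_needed(desktops=100, gb_per_desktop=40, dedupe_percent=90)
print(estimate)
```

Under those assumptions, the traditional approach has to be sized for 4,000 GB of fast storage, while the inline approach needs about 400 GB.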