My last post began with my three-year-old daughter trying to wear only a t-shirt and underwear to church, as I weighed the risk of excommunication vs. post-mass jelly donuts. Obviously, the story was a thinly veiled parable for Virtual Machine (VM) backup and recovery.
In that post, I covered the traditional VM backup mechanisms – guest OS and VCB. I explained how deduplication can improve more than storage efficiency and network bandwidth utilization – specifically, how it can reduce the backup performance impact on the host environment to free up servers to do more business-critical processing.
In this post and the next in the series, I will talk about versioned replication – both storage and VM-based – and how customers are increasingly turning to these strategies to help scale VM backup. I have seen many customers’ VM strategies stall or fail because their backup environments don’t scale. I’ve heard countless stories of customers unable to fully load their ESX servers with IT-owned tier 3 and 4 applications because their environments lacked the resources to complete the backups. I’ve heard nearly as many stories of customers unable to load mission-critical tier 1 and 2 applications because their backups were creating too much application performance variance. Finally, customers also tell me that rigid backup processes, requiring backup teams to install agents, configure backup schedules, manage media servers and set up restores, are compromising their visions of dynamic, delegated management in VM environments.
Given these challenges, customers have adopted innovative techniques like versioned replication faster for VM backup and recovery than for any other part of the IT environment.
Let’s start with the characteristics of a versioned replica. A versioned replica:
- Backs up only changed data from the source (incremental forever)
- Stores each incremental backup as a full backup on the target
- Stores the full backups efficiently on the target
In other words, it is a replica that efficiently stores multiple point-in-time copies on a distinct set of disks. Among the leading versioned replication solutions are EMC Avamar, EMC RecoverPoint, EMC Isilon SyncIQ, NetApp SnapVault, NetApp SnapMirror, Symantec PureDisk, and Microsoft Data Protection Manager. Some of those solutions have been optimized for space and network efficiency, others for rapid detection of changed data, and still others for fine-grained recovery point objectives (RPOs) and recovery time objectives (RTOs).
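For readers who think better in code, here is a minimal Python sketch of those three characteristics. It is a toy model, not how any of the products above are implemented: the `VersionedReplica` class, its method names, and the use of SHA-256 content hashes to share unchanged blocks are all my own assumptions for illustration.

```python
import hashlib


class VersionedReplica:
    """Toy backup target: every backup is recorded as a full point-in-time copy."""

    def __init__(self):
        self.blocks = {}    # content hash -> block bytes, each unique block stored once
        self.versions = []  # one complete map (block number -> content hash) per backup

    def apply_incremental(self, changed_blocks):
        """Ingest only the changed blocks, then record a complete new version."""
        # Start from the previous version's map so unchanged blocks are shared.
        block_map = dict(self.versions[-1]) if self.versions else {}
        for block_no, data in changed_blocks.items():
            digest = hashlib.sha256(data).hexdigest()
            self.blocks.setdefault(digest, data)
            block_map[block_no] = digest
        self.versions.append(block_map)
        return len(self.versions) - 1

    def restore(self, version):
        """Rebuild the full image for one version without replaying increments."""
        block_map = self.versions[version]
        return b"".join(self.blocks[block_map[n]] for n in sorted(block_map))


replica = VersionedReplica()
# The first backup sends every block; later backups send only what changed.
v0 = replica.apply_incremental({0: b"AAAA", 1: b"BBBB", 2: b"CCCC"})
v1 = replica.apply_incremental({1: b"XXXX"})
```

Both `v0` and `v1` restore as complete images, yet the target holds only four unique blocks rather than six. That is the "incremental forever in, full backups out, stored efficiently" idea in miniature.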
Over the past decade, backup performance has driven the adoption of versioned replication. Storage-based versioned replication has been the fastest approach, since the primary storage system can track or compute data changes quickly; it owns the data, after all. A storage system that monitors its own activity can vastly outperform a backup client that wakes up each night, with no context, and is told “go figure out what happened while you were asleep.” With solutions like RecoverPoint, SnapVault, or SyncIQ, the storage system replicates its data container (volume, LUN, qtree) to a remote system that stores multiple point-in-time copies. The benefits are significant:
- No extra load on the application server
- Minimal load on the storage system CPU, disk and cache
- Light network load
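To make the "storage knows what changed" point concrete, here is a toy Python comparison. It is a sketch under my own assumptions, not a description of how RecoverPoint, SnapVault, or SyncIQ work internally: `TrackedVolume` and `scan_for_changes` are hypothetical names, and real systems track changes in far more sophisticated ways.

```python
import hashlib

BLOCK_SIZE = 4  # tiny blocks keep the example readable


class TrackedVolume:
    """Toy storage volume that notes which blocks change as writes arrive."""

    def __init__(self, size_blocks):
        self.data = [b"\0" * BLOCK_SIZE for _ in range(size_blocks)]
        self.dirty = set()

    def write(self, block_no, payload):
        self.data[block_no] = payload
        self.dirty.add(block_no)  # bookkeeping happens at write time

    def changes_since_last_replication(self):
        """Return the changed blocks without reading the rest of the volume."""
        changed = {n: self.data[n] for n in sorted(self.dirty)}
        self.dirty.clear()
        return changed


def scan_for_changes(volume_data, previous_hashes):
    """The nightly backup client: read and hash every block to find changes."""
    changed = {}
    blocks_read = 0
    for n, block in enumerate(volume_data):
        blocks_read += 1
        digest = hashlib.sha256(block).hexdigest()
        if previous_hashes.get(n) != digest:
            changed[n] = block
            previous_hashes[n] = digest
    return changed, blocks_read


volume = TrackedVolume(1000)
hashes = {}
scan_for_changes(volume.data, hashes)    # client records its baseline
volume.changes_since_last_replication()  # storage resets its tracking

volume.write(7, b"new!")
volume.write(42, b"data")

tracked = volume.changes_since_last_replication()
scanned, blocks_read = scan_for_changes(volume.data, hashes)
```

Both approaches find the same two changed blocks, but the scanning client had to read all 1,000 blocks to do it, while the tracked volume only touched the two it already knew about.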
These solutions, however, also come with limitations. Among the challenges:
- Limited functionality: Primary storage arrays do not focus on the backup storage design center – cost optimization, long-term backup retention, etc. Nor do they focus on the backup software workflows – granular recovery catalog, application consistency, rich scheduling interface, etc.
- Granularity of backup management: VMware administrators want to provision and protect virtual machines, grouped into a small number of service categories – e.g. gold, silver, bronze – with the flexibility to seamlessly change the service category of any VM at any time. Unfortunately, storage-based versioned replication schemes work at a container granularity larger than an individual VM. They cannot seamlessly move the VM and its backups to a container with a different service level, which means you can’t easily manage the backup process and retention at the user-desired granularity. (Speaking of service levels: Why don’t providers offer more creative service level names? ‘For your protection level, would you like Eastwood, Affleck, or Sheen?’ You know Eastwood would keep your data safe. Affleck would try really hard, but occasionally make a goofy mistake. I would probably pick the Sheen just to see how many viruses my data could contract.)
- Storage vendor lock-in: Each array type has a dedicated, isolated environment.
In the past, customers with intractable backup window challenges simply had to accept the drawbacks of storage-based versioned replication; they had no alternative. But all that’s changed.
What exactly has changed? What does it mean for your backups? Did I ever get my jelly donut? Has child protective services taken my children from me? Who is still reading this?
In the next post, I will cover VMware’s Changed Block Tracking and what it means for versioned replication and the future of backup applications, and I will make more lame jokes. In the final post in this series, I will walk through some examples of how customers have evolved their backup environments to incorporate versioned replication, the challenges they overcame and their current direction.