When it comes to the world of backup, all that glitters isn’t… truly backup.
To best protect your data, it’s essential to distinguish between approaches that are truly backup and those that are better grouped under a heading such as “backup-like”.
One procedure which is commonly mistaken for backup is replication. As the name suggests, replication involves making copies of data. Consider redundant array of independent disks (RAID) approaches, for instance, particularly RAID 1 (disk mirroring).
If you’re reading this, then the expression “RAID isn’t backup” probably isn’t new to you (find out more in our article about RAID). But equally, other forms of replication aren’t quite backup either, even though they’re useful and important. Here’s why, and what the differences are.
Main Differences Between Backup and Replication
What are the main differences between backup and replication? We’d like to draw your attention to some distinctions. But first, let’s get clear about the definitions:
What Is Replication?
As we mentioned, replication is concerned with making copies of data. Exactly what data is being replicated will depend upon what users wish to protect. Replication might cover:
- Copying, in real time, the files and folders on a computer
- Mirroring an entire operating system at the OS or disk level
- Copying only aspects of the OS, such as applications and settings
Replication can be used to instantaneously and automatically mirror system resources to one or more secondary locations:
- A replication program could instantaneously replicate changes to a mission-critical database, such as a CRM MySQL table, to a second onsite copy for quick failover, and to an offsite copy for disaster recovery (DR) in the event that all onsite resources are inaccessible.
- Replication could be used to replicate all on-premises servers to backup servers in the cloud. Failover to cloud resources could automatically occur if onsite resources were unavailable, such as during a power outage (if the company did not have backup power).
Replicas are maintained so that a secondary system mirrors the primary system, ensuring seamless failover and keeping recovery within the recovery time objective (RTO), the maximum acceptable length of time to restore operations.
For replication systems, the failover RTO would typically be measured in minutes. Unlike some backup use cases (for instance, hardware storage failures), these interruptions to routine operations are also expected to be temporary, and the business hopes that operations can quickly be resumed on the primary system.
All of these data protection strategies are important for business continuity: they keep applications and systems available in real time in the event of a disaster. But, as we will soon explain, none of these approaches is considered backup.
What Is Backup?
Backup creates a point-in-time copy of a file system or another digital asset (such as a database). Unlike snapshots — another “backup-like” technology — backup contains the files themselves, rather than a note of changes since the last run.
A backup can be used for many disaster recovery (DR) use cases. Most basically, if a change degrades stability or otherwise compromises data, restoring from a backup can bring a system back to its previous state. Some backup approaches (like disk imaging) can even be used to restore a system onto new hardware in the event of mechanical or physical destruction.
Finally, backup can be used for compliance and archival purposes by businesses that need to take a picture of their system at a point in time for regulatory reasons.
How Do They Work?
Backup approaches typically involve either making full copies of an origin device onto a target medium (full backups) or writing notes of additions, changes, and deletions (incremental or differential backups).
These approaches are used to ensure either that a system can be restored to a specific point in time with one operation (restoring from a full backup), or that the same thing can be achieved through aggregating a series of files which note changes to the file system.
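The difference between the two restore paths can be sketched in a few lines of Python. This is a minimal illustration, not a real backup tool: the file system is modelled as a plain dictionary of path to contents, and the function names (`take_full`, `take_incremental`, `restore`) are hypothetical.

```python
# Hypothetical model: a file system is a dict of path -> contents.

def take_full(fs):
    """Full backup: a complete copy of every file."""
    return dict(fs)

def take_incremental(previous_fs, current_fs):
    """Incremental backup: only additions, changes, and deletions
    since the previous backup."""
    changed = {p: c for p, c in current_fs.items() if previous_fs.get(p) != c}
    deleted = [p for p in previous_fs if p not in current_fs]
    return {"changed": changed, "deleted": deleted}

def restore(full, incrementals):
    """Rebuild a point in time by replaying incrementals over the full copy."""
    fs = dict(full)
    for inc in incrementals:
        fs.update(inc["changed"])
        for path in inc["deleted"]:
            fs.pop(path, None)
    return fs

monday = {"a.txt": "v1", "b.txt": "v1"}
tuesday = {"a.txt": "v2", "b.txt": "v1", "c.txt": "v1"}
wednesday = {"a.txt": "v2", "c.txt": "v1"}

full = take_full(monday)                         # one full backup
inc_tue = take_incremental(monday, tuesday)      # changes since Monday
inc_wed = take_incremental(tuesday, wednesday)   # changes since Tuesday

restored_monday = restore(full, [])              # one operation
restored_tuesday = restore(full, [inc_tue])      # full + one incremental
restored_wednesday = restore(full, [inc_tue, inc_wed])
```

Restoring Monday needs only the full copy, whereas restoring Wednesday requires the full copy plus every incremental in order, which is why longer incremental chains tend to lengthen restores.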
Replication, on the other hand, doesn’t involve taking historical copies — instead it simply replicates every change live and in real time. With replication, the moment the primary file system is changed, those changes are reflected onto the target medium.
The write can be mirrored instantaneously (synchronous replication) or with some latency (asynchronous or near-synchronous replication). In synchronous replication, the primary storage confirms a write only after the replica storage has acknowledged it. In asynchronous replication, the primary confirms the write straight away and the replica catches up shortly afterwards; this approach is popular on storage devices such as network attached storage (NAS) devices.
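As a rough sketch of the two write paths, the toy model below keeps both stores in memory. The `Primary` and `Replica` classes and their methods are illustrative assumptions, not the API of any real replication product; a real system would send writes over a network and handle failures.

```python
from collections import deque

class Replica:
    """Hypothetical secondary store."""
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value
        return True  # acknowledgement back to the primary

class Primary:
    """Hypothetical primary store that mirrors writes to one replica."""
    def __init__(self, replica, synchronous):
        self.data = {}
        self.replica = replica
        self.synchronous = synchronous
        self.pending = deque()  # writes not yet shipped (asynchronous mode)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Confirm the write only once the replica has acknowledged it.
            return self.replica.apply(key, value)
        # Asynchronous: confirm now, ship the change to the replica later.
        self.pending.append((key, value))
        return True

    def flush(self):
        """Ship queued changes to the replica."""
        while self.pending:
            self.replica.apply(*self.pending.popleft())

sync_replica = Replica()
sync_primary = Primary(sync_replica, synchronous=True)
sync_primary.write("order-1", "paid")
sync_state = dict(sync_replica.data)        # replica already has the write

async_replica = Replica()
async_primary = Primary(async_replica, synchronous=False)
async_primary.write("order-1", "paid")
lagging_state = dict(async_replica.data)    # replica has not caught up yet
async_primary.flush()
caught_up_state = dict(async_replica.data)  # replica now matches the primary
```

The window between `write` and `flush` in asynchronous mode is the data that could be lost if the primary failed at that moment; synchronous mode closes that window at the cost of waiting for the replica on every write.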
Backups are used to make sure that there is a restore point for everything from user endpoints, like desktops, through to servers. If there is a machine which businesses need to ensure can be restored in the event of degradation, accidental deletion, or other error, then backups are the tool of choice.
Replication is used to protect mission-critical applications such as CRM servers, payment processing machines, and anything else which the business needs to have operational all of the time. If even five minutes of downtime is too much, then a replica will be provisioned for failover.
Which Costs More?
Backup is usually the cheaper of the two approaches. That’s because, in order to take a backup, users typically just need to provision some storage and a backup program to run the operations.
To set up a replication system, by contrast, users would need to provision parallel resources.
In order to provide replication for a storage center, for instance, users would need to set up computer storage with the same capacity as the primary storage system at a remote site. This could entail substantial hardware costs.
Which Needs More Resources?
As a general principle, replication approaches also need more resources.
While a backup administrator, software, and some storage media might be enough to ensure a backup strategy within a small business, in order to set up proper replication of a mission-critical system, businesses might need to assign significant resources to the rollout. They might also need to implement new business processes, teach staff how to manage failover, and invest in new infrastructure to support the continued operation of replication systems.
Benefits Head to Head
Backups:
- Are simple to implement: Users simply need to determine which backup approach is appropriate to support their target RTO and recovery point objective (RPO).
- Provide high isolation: Many backup systems, such as tape libraries, are stored cold and are not typically connected live to primary systems. Thus, if the primary system is attacked by a virus or ransomware, the backup copy will remain clean.
- Are inexpensive: Comparatively speaking, backup approaches don’t cost a lot to set up and maintain.
Replication systems:
- Focus on disaster recovery: If quick RTO is what’s required, then replication approaches have the advantage over backups.
- Provide high availability: For resources that need to be highly available, such as components of a cloud, replication makes more sense than backup. Of course, in many instances both should be used!
- Facilitate quick RTO: RTO for replication systems is typically shorter than for backup.
Each approach has its disadvantages, too.
For backup approaches, there can be a relatively long time between successive backups. The recovery point objective (RPO) is the maximum amount of data, measured as a length of time, that a business can tolerate losing, so the interval between two backups should not exceed it. The longer the interval, the more data a company stands to lose in the event that recovery from that backup is required.
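The relationship between backup interval and potential data loss is simple arithmetic, sketched below. The schedule, the failure time, and the four-hour RPO are made-up figures for illustration, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta

def data_loss_window(last_backup: datetime, failure: datetime) -> timedelta:
    """Anything written after the last backup is lost on restore."""
    return failure - last_backup

def interval_meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case, a failure strikes just before the next backup runs,
    so the whole interval's worth of data is lost."""
    return backup_interval <= rpo

# Assumed schedule: a nightly backup at 01:00 and a failure at 17:30.
last_backup = datetime(2024, 1, 1, 1, 0)
failure = datetime(2024, 1, 1, 17, 30)
lost = data_loss_window(last_backup, failure)
print(lost)  # 16:30:00

# Assumed business requirement: lose no more than four hours of data.
rpo = timedelta(hours=4)
print(interval_meets_rpo(timedelta(hours=24), rpo))  # False
print(interval_meets_rpo(timedelta(hours=1), rpo))   # True
```

Under these assumptions a nightly backup cannot satisfy a four-hour RPO, whereas an hourly one can.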
The RTO for backup is also typically longer. This is because an entire operating system, potentially across a whole network of servers, needs to be restored. The RTO for replication is often close to zero, because these are typically parallel systems that are continuously running and being updated.
Replication’s main disadvantage is that it is more expensive to maintain than backups. Backups can store petabytes of data at low cost on highly cost-effective LTO tape. For replication, because the systems are expected to be ready for near-instant failover, this storage class is sometimes not viable.
Replicas have another key disadvantage relative to backups: they can’t be used to restore historical system states. If a primary device suffers a ransomware attack, for instance, then the damage will quickly propagate onto the replica. If the primary system needed to be restored to a point in time before it became infected, then a backup, rather than a replica, would be required. System administrators who are taking backups could simply roll systems back to a restore point from before they became compromised.
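A toy simulation makes the contrast concrete. Everything here is an illustrative assumption: the stores are dictionaries, "ransomware" is just an overwrite, and `replicate` and `take_backup` are hypothetical helpers rather than real tooling.

```python
import copy

primary = {"report.docx": "quarterly figures"}
replica = {}
backups = []  # list of (label, point-in-time copy) pairs

def replicate():
    """Mirror the primary's current state onto the replica."""
    replica.clear()
    replica.update(primary)

def take_backup(label):
    """Keep an independent point-in-time copy of the primary."""
    backups.append((label, copy.deepcopy(primary)))

take_backup("day-1")
replicate()

# Simulated ransomware: the file is overwritten on the primary...
primary["report.docx"] = "<encrypted>"
# ...and replication faithfully mirrors the damage.
replicate()

# The replica holds only the current (compromised) state,
# while the day-1 backup still holds the clean copy.
label, clean_copy = backups[0]
restored = dict(clean_copy)
```

After the simulated attack, `replica` contains the encrypted file, while `restored` contains the original contents from the day-1 restore point.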
Which Do You Need?
Businesses can commonly be found operating both backup systems and replication ones. A CRM server, for instance, could be regularly backed up to ensure restorability in the event of hardware failure. Meanwhile, the business could provision a parallel replication server which could be used for high availability failover in the event that the primary system were unavailable due to a temporary interruption in service.
Comprehensive Protection Needs Both
In today’s business environment, data protection commonly requires a multifaceted approach.
Any system with a high availability requirement — that needs near-zero downtime — should be replicated onto a parallel resource.
Meanwhile, backups remain essential. They can be used to roll back changes that degrade system performance (a use case for which replicas would be useless). Additionally, for cybersecurity, administrators need to retain the capability of restoring a system to a point in time before it became infected.