World Backup Day – how well do you know backup?
March 2021 by Databarracks
The 31st of March is World Backup Day. Backup is something every IT professional does at some point in their career, but how well do you know the terminology? Here’s our quick backup terminology glossary to test you:
Full Backup
A full backup is a complete copy of your data. Its disadvantage is the time it takes: a full backup often cannot finish within the backup window, which means it is very rarely practical to run full backups over a WAN.
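A rough back-of-envelope calculation shows why. The sketch below estimates transfer time; the figures (2 TB of data, a 100 Mbit/s link, 80% usable throughput) are illustrative assumptions, not measurements.

```python
def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to send data_gb gigabytes over a link_mbps link."""
    bits = data_gb * 8 * 1000**3                     # decimal gigabytes -> bits
    usable_bps = link_mbps * 1000**2 * efficiency    # assumed usable share of the link
    return bits / usable_bps / 3600

# Assumed example: 2 TB full backup over a 100 Mbit/s WAN link.
hours = transfer_hours(2000, 100)
print(f"{hours:.1f} hours")  # prints "55.6 hours"
```

Under these assumptions the full backup takes more than two days, far longer than a typical overnight window.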
Differential Backup
Differential backups record the changes made since the last full backup. The first captures the data that has changed since the initial full backup. After that, they’re cumulative, so the second also contains the changes in the first, and so on. This means full restorations are quick because the recovery needs a maximum of two backups: the full backup and the most recent differential. However, differential backups require more storage than incremental backups.
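As a minimal sketch of the idea, the example below models a file system as a dictionary of path to content. The helper name changed_since and the sample data are hypothetical, and deleted files are ignored for brevity.

```python
def changed_since(baseline: dict[str, str], current: dict[str, str]) -> dict[str, str]:
    """Return the files in `current` that are new or differ from `baseline`."""
    return {path: data for path, data in current.items() if baseline.get(path) != data}

full = {"a.txt": "v1", "b.txt": "v1"}                  # initial full backup
day1 = {"a.txt": "v2", "b.txt": "v1"}                  # live data after day 1
day2 = {"a.txt": "v2", "b.txt": "v2", "c.txt": "v1"}   # live data after day 2

# Every differential is compared with the full backup, so it is cumulative.
diff1 = changed_since(full, day1)
diff2 = changed_since(full, day2)

# A restore needs only two backups: the full and the latest differential.
restored = {**full, **diff2}
```

Note that diff2 repeats the change to a.txt already held in diff1, which is where the extra storage goes.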
Incremental Backup
Incremental backups are similar to differential backups except they’re not cumulative, so only the changes since the previous backup are captured. Incremental backups are therefore smaller and faster to complete than differential backups, although restores can take longer because the individual archives must be merged. As individual backups are smaller, they are the most efficient to run over the WAN.
N.B. Online backup is sometimes referred to as ‘incremental-forever backup’ because after one initial backup only incremental backups are ever taken.
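The same dictionary model illustrates the difference. Again, changed_since and the sample data are hypothetical, and deletions are ignored to keep the sketch short.

```python
def changed_since(previous: dict[str, str], current: dict[str, str]) -> dict[str, str]:
    """Return the files in `current` that are new or differ from `previous`."""
    return {path: data for path, data in current.items() if previous.get(path) != data}

full = {"a.txt": "v1", "b.txt": "v1"}                  # initial full backup
day1 = {"a.txt": "v2", "b.txt": "v1"}                  # live data after day 1
day2 = {"a.txt": "v2", "b.txt": "v2", "c.txt": "v1"}   # live data after day 2

# Each incremental is compared with the previous backup, not with the full.
inc1 = changed_since(full, day1)
inc2 = changed_since(day1, day2)

# A restore has to replay every incremental, oldest first.
restored = dict(full)
for inc in (inc1, inc2):
    restored.update(inc)
```

Here inc2 holds only the day 2 changes, so it is smaller than the equivalent differential, but the restore has to touch every archive in the chain.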
Synthetic Full Backup
Synthetic full backups are not strictly a methodology but rather a technology that sits on top of the methods described above. Simply put, a synthetic full backup is the server-side construction of a ‘full backup’, comprised of smaller individual backups. This means when a full restore is needed, the reconstruction of files or file parts into a usable whole is already done.
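In outline, the server-side construction could look like the sketch below. The function name synthesize_full and the dictionary model of a backup are assumptions for illustration; real products work on blocks or file parts rather than whole files.

```python
def synthesize_full(full: dict[str, str], incrementals: list[dict[str, str]]) -> dict[str, str]:
    """Build a new 'full backup' by applying incrementals (oldest first) to the last full."""
    synthetic = dict(full)        # copy, so the original full backup is left untouched
    for inc in incrementals:
        synthetic.update(inc)
    return synthetic

full = {"a.txt": "v1", "b.txt": "v1"}
incrementals = [{"a.txt": "v2"}, {"b.txt": "v2", "c.txt": "v1"}]

# Done ahead of time on the backup server, not at restore time.
synthetic_full = synthesize_full(full, incrementals)
```

Because the merge happens in advance, a later restore reads one ready-made backup instead of replaying the chain.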
Hot Backup
Hot backup, also known as dynamic or active backup, is performed while a database is online and accessible to users.
Hierarchical storage management
Hierarchical storage management is a policy-based management layer placed over backup and archive operations. It invisibly moves files between backup and archive storage depending on their age and user demand. This ensures the most economical use of expensive higher performance storage, whilst automating the migration of old and unused data to the archive. From the user perspective, there is no administration required to restore from the archive – it’s simply one connected environment in which everything is available.
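A policy of this kind can be sketched as a simple rule. The function choose_tier and the 90-day idle threshold are hypothetical; a real policy would usually weigh more factors than last access alone.

```python
from datetime import date

def choose_tier(last_accessed: date, today: date, max_idle_days: int = 90) -> str:
    """Pick a storage tier from how long a file has gone unused (assumed policy)."""
    idle_days = (today - last_accessed).days
    return "archive" if idle_days > max_idle_days else "backup"

today = date(2021, 3, 31)
recent_file_tier = choose_tier(date(2021, 3, 1), today)   # used 30 days ago
stale_file_tier = choose_tier(date(2020, 6, 1), today)    # unused for about ten months
```

In this example the recently used file stays on the faster backup tier and the stale file moves to the archive.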
Agent vs. agentless
Backup solutions capture and transmit source data using either an agent-based or an agentless architecture. Agent-based backup systems install a software instance on every protected component in the network.
The agent is normally loaded in the OS stack, which gives more control over and visibility of the host system. Agent-based backup also uses the host’s local resources for processing, which limits the impact on network bandwidth.
Agentless backups use one centralised installation, usually onsite, that captures all the target infrastructure in one place.
Continuous data protection
CDP is a form of constant backup that continuously scans the environment for changes and sends them to the backup environment in near real-time.
Snapshots
Snapshotting is best likened to taking a photograph of the target environment. It’s a static image that represents how the live environment looked at a given point in time. The environment typically needs to pause or quiesce write operations briefly whilst the snapshot is taken in order to ensure consistency.
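The point-in-time behaviour can be shown with a toy example. This uses an in-memory deep copy purely for illustration; it is an assumption of the sketch and not how storage or hypervisor snapshots are implemented.

```python
import copy

live = {"orders": ["order-1", "order-2"]}

# Take the 'photograph': a point-in-time copy of the live data.
snapshot = copy.deepcopy(live)

# The live environment carries on changing afterwards...
live["orders"].append("order-3")

# ...but the snapshot still shows the state at the moment it was taken.
print(snapshot["orders"])  # prints ['order-1', 'order-2']
```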
Deduplication
Deduplication removes redundant duplicate data from backup sets at file or block level. By storing just one copy of each file or block, storage and transmission become faster and cheaper.
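Block-level deduplication can be sketched with content hashes. The fixed 4-byte block size and the function names are assumptions chosen to keep the example small; production systems use much larger and often variable-sized blocks.

```python
import hashlib

def dedupe_blocks(data: bytes, block_size: int = 4) -> tuple[dict[str, bytes], list[str]]:
    """Split data into fixed-size blocks and store each unique block only once."""
    store: dict[str, bytes] = {}   # hash -> block content, one entry per unique block
    recipe: list[str] = []         # ordered hashes needed to rebuild the original
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe

def rebuild(store: dict[str, bytes], recipe: list[str]) -> bytes:
    """Reassemble the original data from the block store and the recipe."""
    return b"".join(store[digest] for digest in recipe)

data = b"ABCDABCDABCDWXYZ"           # four blocks, three of them identical
store, recipe = dedupe_blocks(data)
```

Here only two unique blocks are stored for four blocks of input, and rebuild(store, recipe) returns the original bytes.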
Backup window
A backup window is the period set aside for backups to run. To protect the integrity of backups, it’s advisable to perform them during non-peak hours. Activity on the source systems and the network can lead to inconsistencies between the source data and the backup while the operation is taking place. As such, system administrators tend to schedule backup windows overnight, outside of regular office hours.
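A scheduler needs to check whether the current time falls inside the window, including windows that cross midnight. The sketch below assumes a 22:00 to 06:00 window; both the times and the function name are hypothetical.

```python
from datetime import time

def in_backup_window(now: time, start: time = time(22, 0), end: time = time(6, 0)) -> bool:
    """Return True if `now` falls inside the window, which may span midnight."""
    if start <= end:
        return start <= now < end       # window within a single day
    return now >= start or now < end    # window wraps past midnight

late_evening = in_backup_window(time(23, 30))   # True: inside the overnight window
midday = in_backup_window(time(12, 0))          # False: office hours
```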
Copy data management
“Copy data” is the collective set of all data not currently being used in production (e.g., a snapshot, backup, vault, or replica of a version made for various IT or business functions—data recovery, Dev-Test, analytics or other business or operational functions).
Copy Data Management (CDM) is designed to manage the creation, use, distribution, retention, and clean-up of copies of production data - or “copy data.”
A range of IT functions depend on copies of data beyond data protection; a CDM solution manages the creation and use of these copies in an efficient and automated way.