Compression and Backup in Linux
- Selecting a Backup Strategy
There are several approaches to backing up your data. You need to ask yourself a few questions to decide which approach is best for you. Some things that you should consider are:
- In the event of a crash, how much downtime can I tolerate?
- Will I need to recover older versions of my files, or is the most recent revision sufficient?
- Do I need to back up files for just one computer or for many computers on a network?
Your answers to these questions will help you decide how often to do full backups and how often to do incremental backups. If the data is particularly critical, you may even decide that you need to have your data duplicated constantly, using a technique called disk mirroring.
The following sections describe different backup methods.
Full backup
A full backup is one that stores every file on a particular disk or partition. If that disk should ever crash, you can rebuild your system by restoring the entire backup to a new disk. Whatever backup strategy you decide on, some sort of full backup should be part of it. You may perform full backups every night or perhaps only once every week; it depends on how often you add or modify files on your system, as well as the capacity of your backup equipment.
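As a rough illustration of the idea, the sketch below archives every file under a directory into one gzip-compressed tar file and restores it elsewhere, using Python's standard tarfile module. The function names full_backup and restore and all paths are hypothetical, chosen only for this example; a real full backup of a disk or partition would also have to handle permissions, special files, and empty directories, which this sketch does not.

```python
import tarfile
from pathlib import Path


def full_backup(source_dir: str, archive_path: str) -> int:
    """Archive every regular file under source_dir into a .tar.gz file.

    Returns the number of files stored. Empty directories are skipped
    in this simplified sketch.
    """
    source = Path(source_dir)
    count = 0
    with tarfile.open(archive_path, "w:gz") as archive:
        # sorted() lists the tree before anything is written.
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to the source so the archive can be
                # restored into any target directory.
                archive.add(path, arcname=str(path.relative_to(source)))
                count += 1
    return count


def restore(archive_path: str, target_dir: str) -> None:
    """Unpack a backup archive into target_dir (for example, a new disk)."""
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(target_dir)
```

The archive should be written somewhere outside source_dir (ideally on separate media), so that the backup does not try to include itself.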
Incremental backup
An incremental backup is one that contains only those files that have been added or modified since the last time a more complete backup was made. You may choose to do incremental backups to conserve your backup media. Incremental backups also take less time to complete. This can be important when systems are in heavy use during the work week and running a full backup would degrade system performance. Full backups can be reserved for the weekend, when the system is not in use.
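One simple way to decide which files belong in an incremental backup is to compare each file's modification time with the time of the previous backup. The sketch below does only that selection step. The function name files_changed_since is hypothetical, and the approach assumes that modification times are reliable: a file copied in with its old timestamp preserved would be missed.

```python
from pathlib import Path


def files_changed_since(source_dir: str, last_backup_time: float) -> list[str]:
    """Return relative paths of files modified after last_backup_time.

    last_backup_time is a POSIX timestamp (seconds since the epoch), such as
    the value time.time() returned when the previous backup started.
    """
    source = Path(source_dir)
    changed = []
    for path in source.rglob("*"):
        # A newly created file also has a recent modification time, so one
        # comparison covers both "added" and "modified".
        if path.is_file() and path.stat().st_mtime > last_backup_time:
            changed.append(str(path.relative_to(source)))
    return sorted(changed)
```

Only the returned files need to be written to the backup media, which is why an incremental run is usually smaller and faster than a full one.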
Disk mirroring
Full and incremental backups can take time to restore, and sometimes you just can't afford that downtime. By duplicating your operating system and data on an additional hard drive, you can greatly increase the speed with which you can recover from a server crash.
With disk mirroring, the system usually updates the duplicate drive continuously with the most current information. In fact, with a type of mirroring called RAID 1, the duplicate drive is written to at the same time as the original, and if the main drive fails, the duplicate can immediately take over. This is called fault-tolerant behavior, which is a must if you are running a mission-critical server of some kind.
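The write-to-both, read-from-whichever-survives behavior can be illustrated with a toy model. Real RAID 1 works on disk blocks inside the kernel or a hardware controller, not on files in a script; the class below only mimics the principle with two directories standing in for two drives. The name MirroredStore and its methods are invented for this sketch.

```python
from pathlib import Path


class MirroredStore:
    """Toy mirror: every write goes to a primary and a duplicate directory."""

    def __init__(self, primary: str, mirror: str) -> None:
        self.replicas = [Path(primary), Path(mirror)]
        for replica in self.replicas:
            replica.mkdir(parents=True, exist_ok=True)

    def write(self, name: str, data: bytes) -> None:
        # Write the same bytes to each replica before returning, so the
        # duplicate is never older than the original.
        for replica in self.replicas:
            (replica / name).write_bytes(data)

    def read(self, name: str) -> bytes:
        # Try the primary first; if it is unavailable, fall back to the
        # duplicate, which holds identical data.
        for replica in self.replicas:
            try:
                return (replica / name).read_bytes()
            except OSError:
                continue
        raise FileNotFoundError(name)
```

If the primary directory disappears, read() still succeeds from the mirror, which is the file-level analogue of the duplicate drive taking over.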
Network backup
All of the preceding backup strategies can be performed over a network. This is good because you can share a single backup device with many computers on a network. This is much cheaper and more convenient than installing a tape drive or other backup device in every system on your network. If you have many computers, however, your backup device will require a lot of capacity. In such a case, you may consider a mechanical tape loader, writable DVD drive, or CD jukebox.
It is even possible to do a form of disk mirroring over the network. For example, a Web server may store a duplicate copy of its data on another server. If the first server crashes, a simple TCP/IP host name change can redirect the Web traffic to the second server. When the original server is rebuilt, it can recover all of its data from the backup server and be back in business.
Table 1: Comparison of Common Backup Media

| Backup Medium | Advantage | Disadvantage |
| --- | --- | --- |
| Magnetic tape | High capacity, low cost for archiving massive amounts of data. | Sequential access medium, so recovery of individual files can be slow. |
| Writable CDs | Random access medium, so recovery of individual files is easier. Backups can be restored from any CD-ROM drive. | Limited storage space (approximately 650MB per CD). |
| Writable DVD | Random access medium (like CDs). Large capacity (4.7GB, although the actual capacity you can achieve might be less). | DVD-RW drives and DVD-R disks are relatively expensive (though coming down in price). Slower and less common than CD-ROM drives. |
| Additional hard drive | Allows faster and more frequent backups. Fast recovery from crashes. No media to load. Data can be located and recovered more quickly. You can configure the second disk to be a virtual clone of the first disk, so that you can boot off of the second disk if the first disk crashes. | Data cannot be stored offsite, thus there is risk of data loss if the entire server is destroyed. This method is not well suited to keeping historical archives of the many revisions of your files. The hard drive will eventually fill up. |
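The capacity figures in Table 1 translate directly into how many discs a backup will occupy. The short sketch below does that arithmetic. It assumes the nominal capacities from the table, treats 1GB as 1000MB for simplicity, and ignores compression and filesystem overhead; the names CAPACITY_MB and media_needed are hypothetical.

```python
import math

# Nominal capacities in megabytes, taken from Table 1 (1 GB treated as 1000 MB).
CAPACITY_MB = {"CD": 650, "DVD": 4700}


def media_needed(backup_size_mb: float, medium: str) -> int:
    """Return how many discs of the given medium a backup of this size fills."""
    if backup_size_mb <= 0:
        return 0
    # Round up: a partly filled disc still counts as one disc.
    return math.ceil(backup_size_mb / CAPACITY_MB[medium])
```

Under these assumptions, a 20GB backup (20000MB) would need 31 CDs or 5 DVDs, which shows why larger sites tend toward tape or additional hard drives.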
- Data Integrity
Please see the first PowerPoint file.
- Compression
Please see the PDF file.
- Backup
Please see the PDF file.
- Christopher Negus, Red Hat Linux Bible: Fedora and Enterprise Edition, John Wiley & Sons, 2003.