USENIX Feature

 

source code UNIX

What's Your Data Worth?


by Bob Gray
<bob@boulderlabs.com>

Bob Gray is co-founder of Boulder Labs, a software consulting company. Designing architectures for performance has been his focus ever since he built an image processor system on UNIX in the late 1970s. He has a Ph.D. in computer science from the University of Colorado.



Thanks to Michael Durian and Steve Gaede for suggestions and comments.

The title could also be phrased, "What's your time worth if you have to re-create lost data?" Last issue we took steps to protect computers from Internet invaders. Now let's consider the threats of equipment failures, user mistakes, and natural disasters. What can we do to minimize the damage from these events? "Make copies" is the answer and the topic of this article.

Backups are redundant copies of computer information that enable reconstruction of the destroyed primary version. The typical backup medium is magnetic tape; alternatives include CD-ROMs, FLASH cards, "floppies," and separate hard disks. There are many factors to consider when formulating a backup policy, and I have written a section on each of these: tradeoffs of data protection, pros and cons of various media, and backup mechanisms and strategies.

Tradeoffs

The time required to create backups periodically has to be balanced against the risk of not having your data and system available. When you have a loss, you can always re-create the system — but it might take weeks. What is the cost of not having that information for a period? Without your system running, how will you process new transactions and information?

I hate typing a document twice — so I'll take measures to reduce the chance of it happening. Consequently, this article will be backed up many times before it is finally published. (For this small amount of text I'll partly rely on emailing the document to myself on another machine.) As the deadline gets closer, I'll become more cautious because I have more to lose and less time to recreate the document.

In the last few years, I've noticed two disturbing patterns: Too many people naively trust their disks, and huge cheap disks are unmanageable for many. It's no surprise that the proletarian computer owners have no idea of the intricacies of a modern disk. They've never seen a disk platter that's sustained a head crash. Their technology models are mostly things like automotive component failures that cause inconvenience. But a more suitable model is an airplane with a structural failure or a boat with a sizable hole — it's much more than just an inconvenience.

Over 25 years, I've seen no dramatic increase in the reliability of disks. Yes, they've become much faster, their capacity has grown by orders of magnitude, and the price per byte has fallen by orders of magnitude. But the invariant is that a given disk still has a significant probability of failure during its lifetime. I'm delighted that $150 buys me 9GB of storage, but I'll trust my ;login: article on it about as much as I'd trust the article on a 1970s DEC RK05 disk that held 1MB for thousands of dollars.

So change your mindset — disks do fail. I've seen several PC IDE disks fail in the last few months. One of my past projects involved storing movies on disks. We saw plenty of dead-on-arrival disks from the factory and large numbers of "infant mortality" disks (ones that fail early in their lifetime). Disks that were put into production stabilized to failure rates of just several per month in a thousand-drive video server. (Of course, RAID was used to keep the video server running while the bad drive was hot-swapped for a good one.) I suggest you think of today's disks as people thought of car tires in the early 1900s — you know you are going to have a flat, so carry a spare and know how to change it. All disks wear out and fail.

The other scary trend is the proliferation of multi-gigabyte disks in personal computers. It's one thing to have tens of gigabytes on a shared server maintained by a staff of administrators, but when every Tom, Dick, and Harry gets 10GB on a personal computer, there is going to be disaster sooner or later. It's too easy to collect software and data. Never before has humanity had the ability to "pack rat" stuff to this degree. At least stamp collectors have to physically exchange albums and provide shelf space for the collection. But PC owners have their disks connected to an infinite supply of bits — the Internet — where a mouse click turns on the faucet.

Trusting huge cheap disks by itself isn't a problem, but without backups it is. Most PC owners have no way to segregate their data. What's highly valuable and hard to re-create? What's reproducible by taking the time to find the distribution CD-ROMs and reload? What material came from the Internet? Are all of the locations recorded? Do those locations still serve the data? How much transport time will be required? What is plain junk? It isn't practical to handle gigabytes of materials with "point-and-click" techniques.

Of the people you know with PCs, how many of them have tape drives? When was the last time you saw a nationally advertised PC system that included a tape drive? Our industry is doing a disservice to the masses — encouraging them to accumulate programs, games, and data without providing a mechanism for backup. When the disk fails the user gets totally screwed. They buy another disk, spend weeks patching their environment back together, and use floppies more often. Some will buy a 100MB Zip, or even a 1GB Jaz drive, but these are rarely used properly (definition of properly: can a failed hard drive be quickly re-created?). As mainstream PC users migrate to Source Code UNIX, we need to help them adopt better practices.

Media

"Best" tape media discussions are endless. I offer some ideas in <http://www.boulderlabs.com/hardware.html>. Look at drive cost, media cost, capacity, and transfer rate. Be sure the transfer rate is acceptable to you — many drives do less than 1GB/hour, so we're talking about more than overnight for 10GB. You'll want a drive with capacity significantly larger than that of your disk(s). Who will be around at 4:00 am to change the 8GB tape when backing up a 10GB drive? Stay with brand-name drives and mainstream formats such as 4mm DDS, 8mm, or DLT and you'll be fine.

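To put rough numbers on the transfer-rate and capacity questions, here is a minimal shell sketch of the arithmetic. The function names are hypothetical, and sizes are given in megabytes so that plain integer arithmetic is enough.

```shell
# ceil_div A B: integer division of A by B, rounded up.
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# backup_hours DATA_MB RATE_MB_PER_HOUR: whole hours a full backup will run.
backup_hours() {
    ceil_div "$1" "$2"
}

# tapes_needed DATA_MB TAPE_MB: tapes (and tape changes) a full backup needs.
tapes_needed() {
    ceil_div "$1" "$2"
}

backup_hours 10240 1024    # 10GB at 1GB/hour  -> prints 10
tapes_needed 10240 8192    # 10GB on 8GB tapes -> prints 2
```

Any answer above 1 from tapes_needed means somebody has to be there to change the tape.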
After you create the backups there are two more extremely important steps:

  • Verify that the tape can be read on a different drive.

  • Store the tape off-site.

The verification step doesn't have to be performed for every tape, but do it periodically to ensure that your tape-writing drive isn't misaligned and producing garbage. In a business environment, it's best to give the responsibility of creating backups to one person and the responsibility of verifying the backups to another. The second person can also be responsible for off-site storage. The basement of the same office is not a remote location. A number of firms affected by the World Trade Center bombing learned this lesson the hard way.
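A minimal sketch of the read-back check, assuming the backup was written in tar format (for dump tapes, restore -t plays the same role). Listing an archive forces tar to read all of it, so a failed listing is a cheap early warning. The function name is hypothetical.

```shell
# verify_archive ARCHIVE: report whether tar can read ARCHIVE end to end.
# ARCHIVE may be a file or a tape device.
verify_archive() {
    archive=$1
    # "tar tf" walks the whole archive; a nonzero exit status means tar
    # could not make sense of what it read.
    if tar tf "$archive" >/dev/null 2>&1; then
        echo "readable"
    else
        echo "unreadable"
        return 1
    fi
}

# Usage (the path is an assumption, substitute your own file or device):
#   verify_archive /backup/home.tar
```

Run it on a second drive or a second machine; a listing on the drive that wrote the tape proves much less.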

Tapes aren't the only good media for backups. Writable CD-ROMs have a role. The media is cheap and you end up with a random-access copy. Most CD systems deal only with 650MB of data, but the new, multi-gigabyte DVD equipment is gaining acceptance.

I've used an additional hard disk for backups in many environments during my career. Sometimes it makes sense to create an image copy of your primary disk. This is fast and you end up with something immediately bootable. (It's also handy for backing up foreign operating systems.) SCSI controllers do most of the work of remapping damaged disk sectors, so the old problem of bad blocks for image copies isn't too serious. Under FreeBSD,

      dd if=/dev/rsd0 of=/dev/rsd1 bs=64b

takes just a few minutes to transfer 1GB, and by switching SCSI ID numbers you can make the copy immediately become the new primary disk. This technique requires either identically sized disks or a good understanding of disk labels, partitions, and what is going on. Write me if you want more details.

Another way to use an additional disk is to create a complete dump onto tape and to put the newly changed files on the extra disk. (The dump program makes this easy.) The process can be automated with scripts and you don't have to be tending the tape drive on a daily basis. Further, your incremental changes are immediately accessible online. But make sure that you take some copies off-site. You can extend this practice by dumping over a network, maybe even to an off-site machine.
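The tape-plus-spare-disk arrangement can be sketched as a small helper that only prints the dump command, so you can inspect it before running anything. The tape device, the /backup mount point, and the file-naming scheme are all assumptions; the -u flag (record the dump in the dumpdates file) and -f (output file or device) are standard dump options.

```shell
# dump_command LEVEL FILESYSTEM STAMP: print the dump command to run.
# Level 0 goes to tape; higher levels go to a dated file on the spare disk.
dump_command() {
    level=$1
    fs=$2
    stamp=$3
    tape=/dev/nrsa0     # hypothetical no-rewind tape device
    spare=/backup       # hypothetical mount point of the extra disk
    if [ "$level" -eq 0 ]; then
        target=$tape
    else
        # Turn /usr into _usr so the filesystem name is safe in a filename.
        name=$(echo "$fs" | tr '/' '_')
        target="$spare/dump$name.L$level.$stamp"
    fi
    echo "dump -${level}u -f $target $fs"
}

dump_command 0 /usr 19991118   # prints: dump -0u -f /dev/nrsa0 /usr
dump_command 1 /usr 19991118   # prints: dump -1u -f /backup/dump_usr.L1.19991118 /usr
```

In a nightly script the stamp would come from date +%Y%m%d, and the printed line would be handed to sh once you trust it.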

Incremental dumps stored on disk create a sort of "poor man's RAID." The next improvement might be software disk mirroring. You have the operating system write the data twice to different drives. Then a single drive failure isn't a big problem. Preferably, you would have real RAID controllers with optional hot-swappable drives. These solutions are not expensive, and you get protection and in many cases higher aggregate disk bandwidth. In an NFS environment, a Network Appliance box is the Cadillac of servers. It protects your data with RAID and keeps a number of recent versions of your files online.

For other ideas and opinions see <http://www.freebsd.org/handbook/backups.html> and the Linux HOWTOs at <http://metalab.unc.edu/mdw/linux.html>.

Strategies

I highly recommend that you organize your information in a way that facilitates backups. If possible, separate the relatively static, unchanging material from your core development. UNIX partitions enable you to implement different backup policies for different material. It's common to find these partitions on UNIX systems: root, tmp, var, and users.

Once configured, a root partition is mostly static and won't require frequent backups. Monthly might be appropriate. The /tmp filesystem by its nature doesn't require a backup. The /var filesystem has highly transient files such as mail messages and print spool files; you had better not lose someone's queued mail!

The essence of a company's development is probably in the user filesystem. For software-development shops and content developers, I believe daily backups are appropriate. The equation to balance is the cost of backups versus the cost of loss. The latter is a function of the risk of loss, the amount of work at risk, the cost of the work, and the time required to reproduce the work. As a project approaches a delivery deadline, the cost of loss rises because there is less time to recover and still make the delivery. Therefore the value of frequent backups increases significantly.
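One way to make that equation concrete is a deliberately crude model, offered as an illustration rather than an established formula: expected loss is the chance of losing the disk during the period, times the hours of work at risk, times the hourly cost of redoing that work. The probability is expressed in parts per thousand so that shell integer arithmetic works, and every number below is made up.

```shell
# expected_loss PERMILLE HOURS RATE
#   PERMILLE: chance of loss during the period, in parts per thousand
#   HOURS:    hours of work done since the last good backup
#   RATE:     cost of one hour of that work
expected_loss() {
    echo $(( $1 * $2 * $3 / 1000 ))
}

# worth_backing_up BACKUP_COST PERMILLE HOURS RATE
# Prints "yes" when the backup costs less than the expected loss.
worth_backing_up() {
    loss=$(expected_loss "$2" "$3" "$4")
    if [ "$1" -lt "$loss" ]; then
        echo yes
    else
        echo no
    fi
}

expected_loss 20 40 50          # 2% risk, 40 hours, 50 per hour -> prints 40
worth_backing_up 10 20 40 50    # a backup costing 10            -> prints yes
```

The deadline effect shows up as a rising RATE: an hour of lost work costs more when there is no slack left in which to redo it.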

The UNIX dump program is the mainstay of data protection. It is the most reliable program across UNIX systems for making a perfect copy of your filesystem. (Most UNIX dump programs correctly handle the fringe filesystem features like sparse files, long funny names, and special files.) You start by creating a "level 0" dump; it has everything needed to re-create your environment on a fresh disk. The level 0 will take a while. (I require about four hours to back up a couple of gigabytes onto some DAT tapes.) A dump at any higher level saves only the files that have changed since the most recent dump at a lower level. Here is a simple example:

   Weekend: Level 0
   Monday: Level 1
   Tuesday: Level 2
   Wednesday: Level 3
   Thursday: Level 4
   Friday: Level 5

The weekend dump will take a lot of time and tape. The weekday dumps will capture only what has changed since the previous day. These are called "incremental" dumps, and mine take only a few minutes. The small disadvantage of incremental dumps is that to re-create a given state of the system, you must have that day's backup plus the whole sequence of lower-level backups. The strategy below depends only on the weekend level 0 dump and the incremental dump of interest. The price is that every weekday dump saves everything changed since the weekend, so as the week progresses, the dumps will take longer and require more space.

   Weekend: Level 0
   Monday: Level 1
   Tuesday: Level 1
   Wednesday: Level 1
   Thursday: Level 1
   Friday: Level 1
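The two schedules above can be written down as a small lookup, which also makes the restore cost of each one explicit. This is a sketch; the strategy names "graduated" and "flat" are labels invented here, and days are numbered 0 for the weekend dump through 5 for Friday.

```shell
# dump_level STRATEGY DAY: print the dump level to use on that day.
# DAY is 0 for the weekend dump and 1-5 for Monday through Friday.
dump_level() {
    strategy=$1
    day=$2
    case $strategy in
        graduated) echo "$day" ;;                              # 0 1 2 3 4 5
        flat)      [ "$day" -eq 0 ] && echo 0 || echo 1 ;;     # 0 1 1 1 1 1
        *)         echo "unknown strategy: $strategy" >&2; return 1 ;;
    esac
}

# dumps_to_restore STRATEGY DAY: how many dumps a full restore must read.
# Under these two schedules that is the level 0 plus one dump for every
# level up to and including that day's level.
dumps_to_restore() {
    level=$(dump_level "$1" "$2") || return 1
    echo $(( level + 1 ))
}

dumps_to_restore graduated 3    # Wednesday -> prints 4
dumps_to_restore flat 3         # Wednesday -> prints 2
```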

Here is a suggestion from SGI's IRIX dump manual:

             Sun   Mon  Tue  Wed  Thu  Fri
   Week 1:   Full  5    5    5    5    3
   Week 2:         5    5    5    5    3
   Week 3:         5    5    5    5    3
   Week 4:         5    5    5    5    3

    To guard against data loss as a result of a media failure (a rare but possible occurrence), it is a good idea to capture active files on (at least) two sets of dump volumes. Keep unnecessary duplication of files to a minimum to save both operator time and media storage. A third consideration is the ease with which a particular backed-up version of a file can be located and restored. The following four-week schedule offers a reasonable trade-off among these goals. Although the Tuesday-through-Friday incrementals contain extra copies of files from Monday, this scheme assures that any file modified during the week can be recovered from the previous day's incremental dump.

Many UNIX variants recommend the modified Tower of Hanoi algorithm using the dump-level sequence 3 2 5 4 7 6 9 8 9 9 . . . From the FreeBSD man page:

    Each week, a level 1 dump is taken, and the daily Hanoi sequence repeats beginning with 3. For weekly dumps, another fixed set of tapes per dumped file system is used, also on a cyclical basis. After several months or so, the daily and weekly tapes should get rotated out of the dump cycle and fresh tapes brought in.
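The rule behind all of these schedules is that a dump at a given level saves what has changed since the most recent dump at any lower level. Under that rule, the set of dumps needed for a full restore can be computed from the sequence of levels taken so far. A minimal sketch, with a hypothetical function name:

```shell
# restore_chain LEVEL...: given the dump levels taken so far, oldest first,
# print the levels of the dumps that a full restore has to read, in order.
restore_chain() {
    chain=""
    for level in "$@"; do
        # A new dump at this level supersedes every earlier dump at the
        # same or a higher level, so keep only the strictly lower ones.
        kept=""
        for c in $chain; do
            if [ "$c" -lt "$level" ]; then
                kept="$kept $c"
            fi
        done
        chain="$kept $level"
    done
    # Unquoted on purpose: lets the shell trim the leading blank.
    echo $chain
}

restore_chain 0 3 2 5 4 7 6 9 8 9 9    # prints: 0 2 4 6 8 9
restore_chain 0 1 2 3                  # prints: 0 1 2 3
restore_chain 0 1 1 1                  # prints: 0 1
```

The first call is the Hanoi sequence preceded by a full dump: after eleven dumps, a restore still reads only six of them.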

Tar is the fundamental and universal "tape archiving" program in the UNIX world. It copies hierarchies to and from tape, disk, or pipelines. Tar is better than dump for creating backups that may need to be restored onto a different flavor of UNIX. The dump incompatibilities are due to the fact that it intimately deals with the filesystem, whereas tar just uses the filesystem services. But then tar doesn't save all the filesystem information such as last access time or creation time, which may be important. Frequently, tar is used with a compression program. So common is the command tar cf - . | gzip > XYZ.tz that GNU tar has compression built in as an option: tar czf XYZ.tz . does the same job. All over the Internet, source code and documentation are provided in tar-gzip files. (In the Windows world, pkzip or winzip provide similar capabilities, but the compression is not as good.)
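For completeness, here is the tar-and-gzip pipeline in both directions as a minimal sketch. The function names are hypothetical. The archive is written outside the directory being archived so that tar never tries to swallow its own output.

```shell
# archive_dir SRC OUT: write a gzip-compressed tar of SRC's contents to OUT.
# OUT should live outside SRC.
archive_dir() {
    (cd "$1" && tar cf - .) | gzip > "$2"
}

# restore_dir ARCHIVE DEST: unpack ARCHIVE into the existing directory DEST.
restore_dir() {
    gzip -dc "$1" | (cd "$2" && tar xf -)
}

# Usage (paths are assumptions):
#   archive_dir /home/bob/article /backup/article.tar.gz
#   mkdir /tmp/check && restore_dir /backup/article.tar.gz /tmp/check
```

Because the names stored in the archive are relative (they start with ./), the files can be restored under any directory, not just the original location.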

Pax is the superset archiver. It can generate and handle IEEE Std 1003.2 ("POSIX.2") TAR and CPIO formats plus several of the prestandard versions. Pax has more flexibility than tar. It has a built-in mechanism for selecting files on the basis of a range of timestamps. As with CPIO, you can give pax a list of filenames to be archived. This list can be created any way you want, including with the find command. For example, to generate the list of files owned by bob, modified during the past seven days, and with a character count of less than 500,000:

      find <startingPlace> -user bob -mtime -7 -size -500000c -print

The pax options give you tremendous flexibility, and with it plenty of rope to hang yourself. Be careful.
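A sketch of how a find-generated list feeds pax. The wrapper function is hypothetical, and it adds -type f (not in the command above) so that directories are left out of the list. When pax -w is given no file operands, it reads the names to archive from standard input.

```shell
# recent_small_files DIR OWNER DAYS MAXBYTES: list regular files under DIR
# that belong to OWNER (a name or numeric ID), were modified within the
# last DAYS days, and are smaller than MAXBYTES bytes.
recent_small_files() {
    find "$1" -type f -user "$2" -mtime "-$3" -size "-${4}c" -print
}

# Usage (paths and user name are assumptions):
#   recent_small_files /home/bob bob 7 500000 | pax -w -f /backup/bob-week.pax
```

Run the function on its own and read the list before piping it into pax; that is the cheapest guard against the rope.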

Organizations with large backup needs may want to look at solutions such as AMANDA, the Advanced Maryland Automatic Network Disk Archiver. It is a backup system that enables the administrator of a LAN to set up a single master backup server to back up multiple hosts to a single large-capacity tape drive. AMANDA uses native dump and/or GNU tar facilities and can back up a large number of workstations running multiple versions of UNIX. See <ftp://ftp.cs.umd.edu/pub/amanda>.

Conclusion

One of the reasons we don't hear of many disk-failure catastrophes is that competent system administrators are quietly doing their jobs. When a disk fails — no big deal — they just go to the backups and get everybody up and running again quickly. The memorable stories are about the few who lose precious information and don't have backup protection. Is that how you want to be remembered?

 

Last changed: 18 Nov. 1999 mc