
   
Category Interface

To simplify the task of specifying a persistence guarantee, the MBFS interface includes a user-level library called the category interface. The premise of the category interface is that many files have similar persistence requirements and can therefore be grouped into persistence categories. The category interface lets an application select the category that most closely resembles the file being created. Category names are predefined or user-defined ASCII character strings that are specified in the open() call. As in the kernel interface, the open() call also accepts an optional reconstruction parameter.

The category library maps category names to full volatility specifications and then invokes the raw kernel-level interface. The mapping is stored in the process's environment, which is typically loaded at login from a system or user-specific category configuration file. The environment entry holds a list of (category name, volatility specification) pairs.
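
The lookup path described above can be sketched as follows. This is a minimal illustration rather than the MBFS library itself: the variable name `MBFS_CATEGORIES`, the `NAME=spec;NAME=spec` encoding, the opaque specification strings, and the `kernel_open` stub are all assumptions made for the example, since the text does not specify them.

```python
import os

# Hypothetical variable name and encoding. The text only says that the
# environment holds a list of (category name, volatility specification) pairs.
ENV_VAR = "MBFS_CATEGORIES"


def parse_categories(raw):
    """Parse 'NAME=spec;NAME=spec' into a dict of category name -> spec string."""
    mapping = {}
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty entries such as a trailing ';'
        name, sep, spec = entry.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"malformed category entry: {entry!r}")
        mapping[name.strip()] = spec.strip()
    return mapping


def kernel_open(path, volatility_spec, reconstruct=None):
    """Stand-in for the raw kernel-level interface (not a real system call)."""
    return {"path": path, "spec": volatility_spec, "reconstruct": reconstruct}


def category_open(path, category, reconstruct=None, environ=None):
    """Map a category name to its full specification, then call the kernel stub."""
    environ = os.environ if environ is None else environ
    mapping = parse_categories(environ.get(ENV_VAR, ""))
    if category not in mapping:
        raise KeyError(f"unknown persistence category: {category}")
    return kernel_open(path, mapping[category], reconstruct)
```

Under these assumptions, an application would call something like `category_open("main.o", "GENERATED")` and never handle the full volatility specification directly, which is the simplification the category interface aims for.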

To ensure that applications are portable, the system configuration file must define at least the categories listed below. Programmers and applications may also create any number of additional custom categories. The (minimal) system categories are divided into two sets. The first set defines categories based on the class of applications that use the file. The second set defines categories that span the persistence continuum. The continuum categories are useful for files that do not obviously fall into any of the predefined application classes. The application categories are:

EDITED:
Files that are manually created or manually edited and typically require strong persistence (for example, source code files, email messages, text documents, word processing documents).
GENERATED:
Files generated as output from programs that require very weak persistence because they can be easily recreated (for example, object files, temporary or intermediate file formats such as *.aux, *.dvi, *.log, and executables generated from source code).
MULTIMEDIA:
Video, audio, and image files that are downloaded or copied (as opposed to edited or captured multimedia data), such as gif, jpeg, mpeg, or wav files.
COLLECTION:
A collection of files (archive) created from other files in the system or downloaded (for example, *.Z, *.gz, *.tar, *.zip).
DATABASE:
Database files, often large, that require strong persistence, high availability, and top performance.

The categories that span the persistence spectrum are (from most volatile to least volatile):

DISPOSABLE:
Data that is to be immediately discarded (such as /dev/null).
TEMPORARY:
Temporary files that can be discarded if necessary and that will not reach DA1.
EVENTUAL:
Data that can be easily recreated but should reach DA1 if it lives long enough (several hours or more).
SOMETIME:
Reconstructable data that is likely to be modified or deleted soon. If the data is not modified soon, it should be archived to DA1 relatively soon to minimize the need for reconstruction (for example, repeatedly generated object files).
SOON:
Data that should be sent to DA1 as soon as possible, but it is not a disaster if the data is lost within a few seconds of the write.
SAFE:
Data is guaranteed to be written to DA1 before the write operation completes.
ROBUST:
Data is stored at two or more DA levels.
PARANOID:
Data is replicated at multiple DA levels of the storage hierarchy.
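
Because the continuum categories are ordered from most to least volatile, they can be modeled as an ordered enumeration, and a configuration can be checked against the required minimal set. The sketch below is illustrative only: the numeric values carry no meaning beyond the ordering given above, and `missing_categories` and `more_persistent` are hypothetical helpers, not part of the MBFS interface.

```python
from enum import IntEnum


class Continuum(IntEnum):
    """Continuum categories, ordered from most volatile to least volatile."""
    DISPOSABLE = 0
    TEMPORARY = 1
    EVENTUAL = 2
    SOMETIME = 3
    SOON = 4
    SAFE = 5
    ROBUST = 6
    PARANOID = 7


# Application categories named in the text.
APPLICATION_CATEGORIES = ("EDITED", "GENERATED", "MULTIMEDIA", "COLLECTION", "DATABASE")

# The minimal set a system configuration file must define.
REQUIRED = frozenset(APPLICATION_CATEGORIES) | {c.name for c in Continuum}


def missing_categories(defined):
    """Return the required category names absent from a configuration, sorted."""
    return sorted(REQUIRED - set(defined))


def more_persistent(a, b):
    """Return the name of the less volatile of two continuum categories."""
    return max(Continuum[a], Continuum[b]).name
```

A configuration loader could call `missing_categories` on the names it parsed and refuse, or warn about, a file that omits any of the portable categories.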


Todd Anderson
1999-04-26