
Kernel Interface

The kernel interface gives applications and advanced programmers complete control over the lowest-level details of MBFS's persistence specification. Applications can select the desired storage level and the amount of time data may reside at a level before being archived. Applications must be rewritten to take advantage of this interface. Only the open() routines are modified, both to provide volatility specification and to support the reconstruction extension (see footnote 2). Consequently, a file's volatility specification can be changed only when the file is opened. The open() call takes two new parameters: one specifying the volatility and one specifying the reconstruction extension. The reconstruction parameter is discussed in Section 6.1.

The volatility specification defines a file's persistence requirements at various points in its lifetime. It consists of a count and a variable-size array of mbfs_storage_level structures, described below. Each entry in the array corresponds to a level in the logical storage hierarchy (LM, LCM, DA1, ..., DAn). A NULL volatility specification is replaced by a default volatility specification defined by a system or user configuration file. The default volatility specification is typically loaded into the environment by the login shell or read by a call to a C-level library.

typedef struct {
    struct timeval   time_till_next_level;
    void            *replication_type;
} mbfs_storage_level;

time_till_next_level: Specifies the maximum amount of time that newly written or modified data can reside at this level without being archived to the next level of the hierarchy. Values greater than or equal to 0 mean that write() operations to this level return immediately and the newly written data is archived to the next level within the time_till_next_level interval. Two special values of time_till_next_level can be used to block future write() and close() operations. UNTIL_CLOSE (-1) blocks the close operation until all file blocks reach the next level. BEFORE_WRITE (-2) blocks subsequent write() operations at this level until the data reaches the next level.
replication_type: Specifies the type of replication to use at this level.
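
The field descriptions above can be illustrated with a minimal sketch that fills in a three-level volatility specification (LM, LCM, DA1). Several details here are assumptions, not part of the interface as described: the container type mbfs_volatility_spec and the helpers set_level(), make_example_spec(), and resolve_spec() are hypothetical names; storing the special values in the tv_sec member is a guess, since the text does not say how -1 and -2 are encoded in a struct timeval; and a NULL replication_type is assumed to select the default.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Special values named in the text. Encoding them in tv_sec is an assumption. */
#define UNTIL_CLOSE  (-1)
#define BEFORE_WRITE (-2)

/* Structure as given in the text. */
typedef struct {
    struct timeval  time_till_next_level;
    void           *replication_type;
} mbfs_storage_level;

/* Hypothetical container: the text only says "a count and a
 * variable size array" of mbfs_storage_level structures. */
typedef struct {
    int                 count;
    mbfs_storage_level *levels;
} mbfs_volatility_spec;

/* Fill one level with a delay in seconds, or with a special value. */
static void set_level(mbfs_storage_level *lvl, long seconds, void *replication)
{
    lvl->time_till_next_level.tv_sec  = seconds;
    lvl->time_till_next_level.tv_usec = 0;
    lvl->replication_type = replication;   /* NULL: assumed to mean the default */
}

/* Build a spec for LM, LCM, and DA1 in caller-provided storage. */
static mbfs_volatility_spec make_example_spec(mbfs_storage_level levels[3])
{
    set_level(&levels[0], 30, NULL);           /* LM: write() returns at once; data reaches LCM within 30 s */
    set_level(&levels[1], UNTIL_CLOSE, NULL);  /* LCM: close() blocks until all blocks reach DA1 */
    set_level(&levels[2], 3600, NULL);         /* DA1: archived onward within 1 hour, if a next level exists */

    mbfs_volatility_spec spec = { 3, levels };
    return spec;
}

/* A NULL specification is replaced by the default one, as the text describes. */
static const mbfs_volatility_spec *
resolve_spec(const mbfs_volatility_spec *requested,
             const mbfs_volatility_spec *fallback)
{
    assert(fallback != NULL);
    return requested != NULL ? requested : fallback;
}
```

In this sketch the caller owns the array, so the specification stays valid only as long as the array does; the real interface may manage this memory differently.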

For increased persistence, availability, and performance, MBFS supports data replication and striping as specified by the replication_type field. The replication_type field defines the type and degree of replication used for that level as defined by the following structure:

typedef struct {
  int  Type;
  int  Degree;
} mbfs_replication;

Type: A literal corresponding to the desired form of replication selected from SINGLE_COPY, MIRRORING, and STRIPING [22]. SINGLE_COPY provides no replication. MIRRORING saves multiple copies on different machines. STRIPING distributes a single file and check bits across multiple machines. The default is SINGLE_COPY.
Degree: The number of machines to replicate the data on. If Type is SINGLE_COPY this field is ignored. If MIRRORING, it defines the number of copies. If STRIPING, it defines the size of the stripe group.
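
As a rough illustration of how Type and Degree interact, the sketch below computes how many machines hold data for a level under each replication type. The numeric values of the three literals and the helper machines_used() are assumptions made for this example; the text names the literals but does not give their values or any such function.

```c
#include <assert.h>
#include <stddef.h>

/* The literals come from the text; their numeric values are assumed. */
enum { SINGLE_COPY = 0, MIRRORING = 1, STRIPING = 2 };

/* Structure as given in the text. */
typedef struct {
    int Type;
    int Degree;
} mbfs_replication;

/* Hypothetical helper: number of machines involved at a level.
 * Returns -1 for an unrecognized Type. */
static int machines_used(const mbfs_replication *r)
{
    assert(r != NULL);
    switch (r->Type) {
    case SINGLE_COPY: return 1;          /* Degree is ignored */
    case MIRRORING:   return r->Degree;  /* Degree = number of copies */
    case STRIPING:    return r->Degree;  /* Degree = size of the stripe group */
    default:          return -1;
    }
}
```

For example, a setting of { MIRRORING, 3 } keeps three copies on three machines, while { SINGLE_COPY, 7 } still uses a single machine because Degree is ignored.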

Mirroring and striping increase reliability by ensuring that data persists across single machine or disk failures. Because unexpected machine failures are not uncommon in a distributed system (for example, the OS crashes, a user accidentally or intentionally reboots a machine, or a user accidentally unplugs a machine), replication at the LCM level greatly increases the probability that LCM data will survive these common failures.

The kernel-level interface also requires modifications to the system calls used to obtain a file's status or a file system's configuration information (for example, stat() and statvfs() in Solaris). For applications requiring complete knowledge of the environment, MBFS returns information about a file's persistence requirements, its reconstruction information, or the estimated mean time to failure of each of the DA levels based on manufacturers' specifications. The raw kernel-level interface provides full control over a file's persistence, but this control comes at the price of elegance: the application must provide a substantial amount of detailed information on each open() call.

Todd Anderson