
Write Indirection

Many researchers have noted that another way to minimize write delay is to appropriately control the placement of blocks on disk. This work, which introduces a layer of indirection between the file system and disk, can be divided into two camps: that which assumes the traditional interface to disk (an array of blocks), and that which proposes a new, higher-level interface (usually based on objects or similar abstractions).

This work can also be divided by who decides placement: work in which the disk itself decides where blocks should be placed, and work that assumes the layer above the disk makes this decision. Most often, when investigating how to optimize write requests, researchers have pushed the decision into the disk; when optimizing read requests, researchers have usually pushed the work into the file system. Each approach has its own strengths and weaknesses, which we now discuss.


Traditional Disks: In the first approach, the disk itself controls the layout of logical blocks written by the file system onto the physical blocks in the disk. The basic technique has been to perform eager writing, in which the data is written to the free disk block currently closest to the disk head. There are three basic problems with this approach. First, it assumes that an indirection map exists to map the logical block address used by the file system to its actual physical location on disk [7,10,34]. Unfortunately, updating the indirection map atomically and recovering after crashes can incur a significant performance overhead. Second, these systems need to know which blocks are free and which are allocated. Unfortunately, although the file system readily knows the state of each logical block, it is quite challenging for disks to know whether a block is live or dead [30]. Third, this approach forces the file system to completely relinquish any control over placement. The file system knows which blocks are related to one another and thus are likely to exhibit temporal locality (e.g., the inode and all data blocks of the same file), so it would like to ensure that those blocks are placed somewhat near one another to optimize future reads. Thus, pushing full responsibility for block placement into the disk is not the best division of labor.
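The eager-writing idea and the indirection map it depends on can be illustrated with a small sketch. This is a toy model under stated assumptions, not the design of any system cited above: the disk is treated as a linear array of blocks, "closest to the head" is approximated by distance in block numbers (ignoring rotation), and the class name EagerWritingDisk and its methods are hypothetical.

```python
class EagerWritingDisk:
    """Toy model (assumption): a disk as a linear array of physical blocks."""

    def __init__(self, num_blocks):
        self.free = set(range(num_blocks))  # physical blocks holding no live data
        self.indirection = {}               # logical block -> physical block
        self.data = {}                      # physical block -> contents
        self.head = 0                       # current head position (block number)

    def write(self, logical, payload):
        if not self.free:
            raise RuntimeError("no free physical blocks")
        # Eager writing: choose the free block nearest the head
        # (ties are broken in favor of the lower block number).
        target = min(self.free, key=lambda p: (abs(p - self.head), p))
        self.free.remove(target)
        self.data[target] = payload
        old = self.indirection.get(logical)
        # A real disk would have to make this map update atomic and durable.
        self.indirection[logical] = target
        if old is not None:
            # The old copy is known to be dead only because it was overwritten;
            # blocks freed by file deletion are invisible to this model.
            del self.data[old]
            self.free.add(old)
        self.head = target
        return target

    def read(self, logical):
        physical = self.indirection[logical]
        self.head = physical
        return self.data[physical]
```

Note that the model learns a block is dead only on overwrite, which mirrors the second problem above: without help from the file system, the disk cannot tell when a block has been freed by a deletion.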


New Interfaces: A related set of efforts allows the disk to control placement but requires a new interface; this idea has appeared in different forms in the literature as Logical Disks [8], Network-Attached Storage Devices [13], and Object-based Storage [1]. With this type of new interface, the disk controls exactly where each object is placed, and thus can make intelligent low-level decisions. However, such an approach also has its drawbacks. First, and most importantly, it requires more substantial changes to both disks and the clients that use them, which is likely a major impediment to widespread acceptance. Second, allowing the disk to manage objects (or similar constructs) implies that the disk must now be concerned with consistent update. Consider object-based storage: when adding a new block to an object, both the new block and a pointer to it must be allocated inside the disk and committed in a consistent fashion. Thus, the disk must now also include some kind of logging machinery (perhaps to NVRAM), duplicating effort and increasing the complexity of the drive. Logical disks go a step further, adding a new "atomic recovery unit" interface to allow for arbitrary writes to be grouped and committed together [8]. In either approach, complexity within the disk is increased.
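To make the consistent-update burden concrete, the following sketch shows the kind of logging a disk might need when appending a block to an object. It is only an illustration under assumptions: the class ToyObjectDisk, its methods, and the log record format are hypothetical and do not reproduce the interface of any cited system, and a Python list stands in for a durable log such as NVRAM.

```python
class ToyObjectDisk:
    """Hypothetical object disk that logs a block and its pointer together."""

    def __init__(self):
        self.blocks = {}      # block id -> contents (committed state)
        self.objects = {}     # object id -> list of block ids (committed state)
        self.log = []         # stand-in for a durable log, e.g. in NVRAM
        self.next_block = 0

    def append_block(self, obj_id, payload, crash_before_commit=False):
        block_id = self.next_block
        self.next_block += 1
        # Log both updates before touching the committed state.
        self.log.append(("data", block_id, payload))
        self.log.append(("pointer", obj_id, block_id))
        if crash_before_commit:
            return None       # simulated crash: no commit record was written
        self.log.append(("commit",))
        self.recover()
        return block_id

    def recover(self):
        """Apply logged groups that end in a commit record; discard the rest."""
        pending = []
        for record in self.log:
            if record[0] == "commit":
                for kind, key, value in pending:
                    if kind == "data":
                        self.blocks[key] = value
                    else:
                        self.objects.setdefault(key, []).append(value)
                pending = []
            else:
                pending.append(record)
        self.log = []         # anything still pending was never committed
```

In this model, calling recover() after a simulated crash leaves neither the new block nor the pointer visible, so the object never refers to a block that was not written. This is the machinery the text describes as duplicated effort inside the drive.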

File System Placement: In the second approach, the file system controls where blocks are positioned on disk. This approach has often been applied to improve read performance; the basic idea is for the file system to replicate blocks and to try to read from the closer copy [,16,38]. The basic problem is that the file system does not know the precise location of the disk head at any point in time. Thus, replication at the file system level is better suited to minimizing seek costs than rotational delay [16]; alternatively, the file system can try to predict the current location of the disk head, but this approach is fragile [38].
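A minimal sketch of file-system-level replica selection follows. It assumes the file system tracks only the cylinder of its own last request as an estimate of the head position, so it can reduce seek distance but knows nothing about rotation; the class ReplicatedReader and its methods are hypothetical names used for illustration.

```python
class ReplicatedReader:
    """Hypothetical file-system layer that reads the replica with the shortest seek."""

    def __init__(self):
        self.replicas = {}       # logical block -> cylinders holding a copy
        self.last_cylinder = 0   # estimate of head position: cylinder of the last request

    def add(self, block, cylinders):
        self.replicas[block] = list(cylinders)

    def read(self, block):
        # Choose the copy whose cylinder is nearest the estimated head position.
        # Rotational position is unknown at this level, so it is ignored.
        choice = min(self.replicas[block],
                     key=lambda c: (abs(c - self.last_cylinder), c))
        self.last_cylinder = choice
        return choice
```

For example, with copies of block "a" on cylinders 10 and 900, a read issued while the estimate is near cylinder 500 would go to cylinder 900. The estimate is wrong whenever the head has moved for reasons the file system cannot see, which is one way to understand why head-position prediction is described as fragile.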


Remzi Arpaci-Dusseau 2008-10-08