

Mirroring and Striping

A $D_m$-way mirror, in addition to ensuring a high degree of reliability, can improve small-read performance in terms of both latency and throughput. It can improve latency because the system can satisfy a read request with the disk whose head is closest to a replica [2,5]. It can improve throughput because any request can be satisfied by any disk, and an intelligent scheduler should be able to exploit this freedom when distributing incoming requests to balance load.
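The latency argument can be illustrated with a minimal sketch. It assumes, purely for illustration, that every mirror disk stores the requested block on the same cylinder and that seek distance is the only cost considered; rotational position and queue length are ignored. The function name `pick_replica` and the head positions are hypothetical and do not come from the schedulers cited above.

```python
def pick_replica(head_positions, target_cylinder):
    """Return the index of the mirror disk whose head is nearest the target.

    head_positions: current cylinder of each disk's head (hypothetical input).
    Ties go to the lowest-numbered disk, because min() keeps the first minimum.
    """
    return min(
        range(len(head_positions)),
        key=lambda disk: abs(head_positions[disk] - target_cylinder),
    )


# Hypothetical 3-way mirror with heads at cylinders 120, 900, and 4300.
heads = [120, 900, 4300]
chosen = pick_replica(heads, target_cylinder=1000)
print(f"read served by disk {chosen}")
```

A scheduler that also wants to balance load would need to weigh queue depth against seek distance; this sketch deliberately leaves that out.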

Although cost per byte and capacity per drive remain the predominant concerns of the consumer market, database vendors have long recognized that the large cost and performance gaps between disk and memory justify trading capacity for higher performance when configuring storage systems. A $D_m$-way mirror is just one of the ways to improve performance by exploiting excess capacity. This approach, however, has an obvious limitation: as one increases the degree of replication, the cost of replica propagation becomes prohibitive. One possible way of addressing this high cost is to perform some of the propagations in the background during idle periods. Unfortunately, TPC-C-like workloads are characterized by both a high write ratio and scarce idle time, a combination that makes it difficult to realize the potential benefits of mirroring.

An alternative to mirroring is striping. By partitioning and distributing data across $D_s$ disks, a $D_s$-way striped system reduces the maximum seek distance by a factor of $D_s$, because only a fraction of each disk is used. This is attractive compared to mirroring because there is no replica propagation cost. Unlike mirroring, unfortunately, striping cannot reduce rotational delay: as we raise $D_s$, only the seek time is lowered, and even that at a diminishing rate. Furthermore, because the data is partitioned, the choice of which disk to send a request to is limited, so load balancing is more difficult than with mirroring.
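The diminishing return from raising $D_s$ can be seen with a small back-of-the-envelope sketch. It assumes the maximum seek span is simply the total cylinder count divided by $D_s$, and uses the standard half-revolution estimate for average rotational delay. The cylinder count and spindle speed are hypothetical values chosen for illustration, not figures from the paper.

```python
def max_seek_span(total_cylinders, d_s):
    """Maximum seek span (in cylinders) when only 1/d_s of each disk holds data."""
    return total_cylinders / d_s


def marginal_gain(total_cylinders, d_s):
    """Reduction in maximum seek span from growing the stripe from d_s to d_s + 1."""
    return max_seek_span(total_cylinders, d_s) - max_seek_span(total_cylinders, d_s + 1)


def avg_rotational_delay_ms(rpm):
    """Average rotational delay: half a revolution, independent of d_s."""
    return 0.5 * 60_000 / rpm


CYLINDERS = 10_000  # hypothetical disk geometry
RPM = 10_000        # hypothetical spindle speed

for d_s in (1, 2, 4, 8):
    print(
        f"D_s={d_s}: max seek span={max_seek_span(CYLINDERS, d_s):.0f} cylinders, "
        f"gain from one more disk={marginal_gain(CYLINDERS, d_s):.0f}, "
        f"rotational delay={avg_rotational_delay_ms(RPM):.1f} ms"
    )
```

Under this model the gain from one more disk is $C / (D_s (D_s + 1))$ for $C$ cylinders, so it shrinks roughly quadratically in $D_s$, while the rotational term stays constant.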

In practice, disk array designers have combined mirroring and striping to form a striped mirror [3,11,26]. In a $D_m \times D_s$ striped mirror, data is partitioned into $D_s$ sets, each of which is replicated $D_m$ times. The configuration where $D_m = 2$ is commonly referred to as ``RAID-10''. The replica propagation cost remains an obstacle to achieving good performance on RAID-10, and one seldom chooses a replication factor $D_m$ that is greater than two.
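A $D_m \times D_s$ layout can be sketched as a mapping from a logical block to the disks that hold it. The round-robin partitioning and the disk numbering below are assumptions made for illustration; real arrays may place stripe units differently, and the function names are hypothetical.

```python
def disks_for_block(block, d_m, d_s):
    """Return the disks holding `block` in a d_m x d_s striped mirror.

    Assumed layout: blocks are assigned round-robin to d_s stripe sets,
    and stripe set s occupies disks s*d_m .. s*d_m + d_m - 1.
    """
    stripe_set = block % d_s
    return [stripe_set * d_m + replica for replica in range(d_m)]


def writes_per_update(d_m):
    """Each logical write must eventually reach all d_m replicas."""
    return d_m


# Hypothetical RAID-10 array: D_m = 2 replicas, D_s = 3 stripe sets, 6 disks.
D_M, D_S = 2, 3
for block in range(6):
    print(f"block {block} -> disks {disks_for_block(block, D_M, D_S)}")
print(f"physical writes per logical write: {writes_per_update(D_M)}")
```

A read of a block may be served by any disk in its list, whereas a write must be propagated to every disk in the list, which is the replica propagation cost described above.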


Chi Zhang
2001-11-16