
Recovery-Friendly

In a sufficiently provisioned, non-overloaded system, the failure and recovery of a single brick does not affect correctness, system throughput, availability, or performance.

SSM is recovery-friendly. In this benchmark, $W$ is set to 3, $WQ$ to 2, $R$ to 2, and $timeout$ to 60 ms; the size of state written is 8 KB.
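The parameter choice above can be checked with a short combinatorial sketch. It is not the SSM implementation; it assumes that $W$ is the number of bricks a write is sent to, $WQ$ the number of acknowledgments a write waits for, and $R$ the number of bricks a read is sent to, that a dead brick never answers, and that a read succeeds if at least one contacted brick replies. The function names are hypothetical.

```python
from itertools import combinations

BRICKS = 4  # bricks in the experiment
W = 3       # bricks each write is sent to (assumed meaning of $W$)
WQ = 2      # acknowledgments a write waits for (assumed meaning of $WQ$)
R = 2       # bricks each read is sent to (assumed meaning of $R$)

def write_survives(failed):
    """True if every possible choice of W target bricks still yields WQ acks."""
    return all(
        len(set(targets) - failed) >= WQ
        for targets in combinations(range(BRICKS), W)
    )

def read_survives(failed):
    """True if every possible choice of R contacted bricks has a live member."""
    return all(
        len(set(holders) - failed) >= 1
        for holders in combinations(range(BRICKS), R)
    )

# Try each of the four possible single-brick failures.
single_failures = [{b} for b in range(BRICKS)]
print(all(write_survives(f) for f in single_failures))
print(all(read_survives(f) for f in single_failures))
```

Under these assumptions any single failure leaves at least two of the three write targets and at least one of the two read targets alive, which is consistent with the claim that one brick can fail without requests failing. Two simultaneous failures would not pass the same check.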

We run four bricks in the experiment, each on a different physical machine in the cluster. We use a single machine as the load generator, with ten worker threads generating requests at a rate of approximately 450 requests per second.
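As a rough illustration of the offered load, the sketch below works out what each worker thread contributes. It assumes the load is split evenly across the ten threads, which the text does not state; the variable names are hypothetical.

```python
TOTAL_RATE = 450  # requests per second (approximate, from the text)
WORKERS = 10      # worker threads in the load generator

# Assumption: every worker issues the same share of the load.
per_worker_rate = TOTAL_RATE / WORKERS   # requests per second per thread
interval_ms = 1000.0 / per_worker_rate   # mean gap between one thread's requests

print(f"{per_worker_rate:.0f} req/s per worker, one request every {interval_ms:.1f} ms")
```

Under that assumption each thread issues about 45 requests per second, or one roughly every 22 ms.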

Figure: SSM running with 4 bricks. One brick is killed manually at time 30 and restarted at time 40. Throughput and availability are unaffected. Although not displayed in the graph, all requests are fulfilled correctly within the specified timeout.

We induce a fault at time 30 by killing a brick by hand. As the graph shows, throughput remains unaffected. Furthermore, every generated request returned successfully and completed within the specified timeout; the load generator reported no failures. This microbenchmark is intended to demonstrate the recovery-friendly aspect of SSM: in a non-overloaded system, the failure and recovery of a brick has no negative effect on correctness, system throughput, availability, or performance.


Benjamin Chan-Bin Ling 2004-03-04