
End-to-End Control

We now present an end-to-end test where multiple VMs run a mix of realistic workloads with different shares. We use Filebench [20], a well-known IO modeling tool, to generate an OLTP workload similar to TPC-C. We employ four VMs running Filebench, and two generating 16 KB random reads. A pair of Filebench VMs are placed on each of two hosts, whereas the micro-benchmark VMs occupy one host each. This is exactly the same experiment discussed in Section 2; data for the uncontrolled baseline case is presented in Table 1. Recall that without PARDA, hosts 1 and 2 obtain similar throughput even though the overall sum of their VM shares is different. Table 6 provides setup details and reports data using PARDA control. Results for the OLTP VMs are presented as Filebench operations per second (Ops/s).

Table 6: PARDA end-to-end control for Filebench OLTP and micro-benchmark VMs issuing 16 KB random reads. Configured shares ($s_i$), host weights ($\beta_h$), Ops/s for Filebench VMs, and IOPS ($T_h$ for hosts) are respected across hosts. $\mathcal{L} = 25$ ms, $w_{max} = 64$.

Host  VM Type          $s_1$, $s_2$   $\beta_h$   VM1          VM2         $T_h$
1     $2\times$ OLTP   20, 10         6           1266 Ops/s   591 Ops/s   1857
2     $2\times$ OLTP   10, 10         4           681 Ops/s    673 Ops/s   1316
3     $1\times$ Micro  20             4           740 IOPS     n/a         740
4     $1\times$ Micro  10             2           400 IOPS     n/a         400
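The $\beta_h$ weights in Table 6 follow directly from the per-host sums of VM shares, and the local SFQ scheduler divides each host's throughput among its VMs in share proportion. The sketch below reproduces that arithmetic; the divide-by-5 scaling is purely illustrative (only the ratios of the $\beta_h$ values matter):

```python
# Per-host configured VM shares from Table 6.
host_vm_shares = {1: [20, 10], 2: [10, 10], 3: [20], 4: [10]}

# Host weight beta_h is proportional to the sum of its VM shares;
# the /5 scaling is an arbitrary choice that yields the 6:4:4:2 values.
betas = {h: sum(s) // 5 for h, s in host_vm_shares.items()}
# -> {1: 6, 2: 4, 3: 4, 4: 2}

def split_throughput(total, shares):
    """Divide a host's throughput among co-located VMs in share proportion,
    as the local SFQ scheduler does."""
    return [total * s / sum(shares) for s in shares]

# Ideal 2:1 split of host 1's 1857 total (measured values were 1266 and 591).
ideal = split_throughput(1857, [20, 10])  # -> [1238.0, 619.0]
```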

Figure 12: PARDA End-to-End Control. VM IOPS are proportional to shares. Host window sizes are proportional to overall $\beta$ values. Panels: (a) Window Size, (b) Latency (ms), (c) Throughput (IOPS).

We run PARDA ($\mathcal{L} = 25$ ms) with host weights ($\beta_h$) set according to the shares of their VMs ($\beta_h = 6:4:4:2$ for hosts 1 to 4). The maximum window size $w_{max}$ is 64 for all hosts. The OLTP VMs on host 1 receive 1266 and 591 Ops/s, matching their $2:1$ share ratio. Similarly, the OLTP VMs on host 2 obtain 681 and 673 Ops/s, close to their $1:1$ share ratio. Note that the overall Ops/s for hosts 1 and 2 have a $3:2$ ratio, which is not possible in an uncontrolled scenario. Figure 12 plots the window size, latency, and throughput observed by the hosts. We note two key properties: (1) window sizes are proportional to the overall $\beta$ values, and (2) each VM receives throughput in proportion to its shares. This shows that PARDA provides the strong property of enforcing VM shares independent of their placement on hosts. The local SFQ scheduler divides host-level capacity across VMs in a fair manner and, together with PARDA, provides effective end-to-end isolation among VMs. We also modified one VM workload during the experiment to test our burst-handling mechanism, which we discuss in the next section.
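The proportionality of window sizes to $\beta$ values can be understood from PARDA's window update, $w(t+1) = (1-\gamma)\,w(t) + \gamma\,(\mathcal{L}/L(t)\,w(t) + \beta)$, presented earlier in the paper. The simulation below is a simplified sketch, not a reproduction of the experiment: the common observed latency (50 ms) and $\gamma$ are illustrative assumptions. It shows that whenever all hosts see the same congested latency, the equilibrium windows come out in the configured $\beta_h$ ratio:

```python
# Simplified sketch of a PARDA-style latency-based window update.
# LAT_TARGET (L) and W_MAX are from the experiment; gamma and the
# assumed common observed latency (50 ms) are illustrative choices.
LAT_TARGET = 25.0  # L, in ms
W_MAX = 64

def update_window(w, latency, beta, gamma=0.5):
    """One control step: w <- (1-gamma)*w + gamma*(L/latency * w + beta),
    clamped to [1, W_MAX]."""
    w_new = (1 - gamma) * w + gamma * (LAT_TARGET / latency * w + beta)
    return max(1.0, min(W_MAX, w_new))

# Hosts 1-4 with beta = 6:4:4:2, iterated under a common 50 ms latency.
betas = {1: 6, 2: 4, 3: 4, 4: 2}
windows = {h: 32.0 for h in betas}
for _ in range(200):
    windows = {h: update_window(windows[h], 50.0, b) for h, b in betas.items()}
# Equilibrium: w = beta / (1 - L/latency) = 2*beta here,
# so windows settle in the 6:4:4:2 ratio, i.e. 12, 8, 8, 4.
```

The fixed point $w = \beta/(1 - \mathcal{L}/L)$ depends on $\beta$ only through a common multiplier, which is why Figure 12(a) shows host windows tracking the $6:4:4:2$ weights.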

Ajay Gulati 2009-01-14