

CPU Utilization


Table 3: Percentage of total time spent idle for various configurations transmitting data on PC-733.

Configuration                                                     Idle time while running nettest
PC-733                                                            86%
Optimized VM/PC-733                                               21.7%
Optimized VM/PC-733 without IRQ notification                      17.9%
Optimized VM/PC-733 without send combining and IRQ notification   2.0%
Version 2.0 VM/PC-733                                             0%


Figure 6 shows that VM/PC-733 is able to saturate a 100 Mbit link without becoming CPU bound, but VM/PC-350 is CPU bound, even with optimizations. Natively, PC-733 and PC-350 easily saturate a 100 Mbit link. The final experiments set out to measure how heavily the CPU is utilized in the different configurations.

We instrumented the system to obtain a precise measurement of idle time. Normally, when a guest issues a halt (HLT) instruction, VMware Workstation switches back to the VMApp, which then blocks on a select() on all devices. Instead, we enabled an option whereby a guest HLT instruction spins and halts the CPU in the VMM rather than yielding control back to the host OS. Using the TSC register, we measure idle time from when the guest issues a HLT instruction to when the next hardware interrupt occurs. This idle time represents CPU cycles that are available to the guest OS for running other computation. Note that not all of this idle time would be available for other host OS computation, because switching back to the VMApp on a guest HLT instruction would incur a couple of world switches and some system call overhead (e.g., the select() system call).

For the native idle times, the standard profiler built into Linux kernels was augmented to account for time spent executing user code and in the kernel idle loop, and then the percentage of total ticks spent in the idle loop was taken.

The idle times in Table 3 show that in VM/PC-733, with a transmit size of 4 KB, the guest has transitioned from being CPU bound at 64 Mb/s to being I/O bound with 21.7% idle time. In comparison, PC-733 has 86% idle time. At this point, nearly all of the remaining overheads are either part of CPU virtualization or part of the nature of the hosted architecture. The next section discusses further optimizations both within and outside the scope of a hosted architecture.
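The idle-time accounting described above can be illustrated with a small sketch. It is not the VMM instrumentation itself: it only models the bookkeeping, assuming a hypothetical trace of TSC-stamped events in which "hlt" marks a guest HLT instruction and "irq" marks a hardware interrupt. The names Event, idle_fraction, and idle_percent_from_ticks, and the sample trace, are illustrative assumptions and do not come from the measured system.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical trace record; not a real VMM data structure."""
    tsc: int   # timestamp-counter value when the event was observed
    kind: str  # "hlt" = guest issued HLT, "irq" = hardware interrupt


def idle_fraction(events: Iterable[Event], start_tsc: int, end_tsc: int) -> float:
    """Fraction of the window [start_tsc, end_tsc] spent idle.

    Idle time runs from a guest HLT to the next hardware interrupt.
    Assumes every event lies inside the measurement window.
    """
    if end_tsc <= start_tsc:
        raise ValueError("measurement window must be non-empty")
    idle_cycles = 0
    halted_at: Optional[int] = None
    for ev in sorted(events, key=lambda e: e.tsc):
        if ev.kind == "hlt" and halted_at is None:
            halted_at = ev.tsc                   # idle interval begins
        elif ev.kind == "irq" and halted_at is not None:
            idle_cycles += ev.tsc - halted_at    # idle interval ends
            halted_at = None
        # Interrupts that arrive while the guest is running add no idle time.
    if halted_at is not None:
        idle_cycles += end_tsc - halted_at       # still halted at window end
    return idle_cycles / (end_tsc - start_tsc)


def idle_percent_from_ticks(idle_ticks: int, total_ticks: int) -> float:
    """Native case: percentage of profiler ticks that fell in the idle loop."""
    if total_ticks <= 0:
        raise ValueError("total_ticks must be positive")
    return 100.0 * idle_ticks / total_ticks


# Made-up trace for illustration only; these are not measured values.
trace = [
    Event(100, "hlt"), Event(300, "irq"),   # 200 idle cycles
    Event(450, "irq"),                      # guest running: ignored
    Event(600, "hlt"), Event(700, "irq"),   # 100 idle cycles
]
virtual_idle = idle_fraction(trace, start_tsc=0, end_tsc=1000)
native_idle = idle_percent_from_ticks(idle_ticks=430, total_ticks=500)
```

The two functions mirror the two measurement paths in the text: cycle counting between HLT and the next interrupt for the virtual machine, and a ratio of profiler ticks for the native runs. Because the methods differ, the resulting percentages are comparable only as approximations of the same quantity.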
Beng-Hong Lim 2001-05-01