

Virtual Machine Networking Performance

Figure 5: Microseconds spent along the path in the Host and VMM worlds processing an OUT instruction issued by the guest OS to a virtual AMD Lance NIC that initiates a physical network packet transmission on a 733 MHz CPU machine.

A hosted virtualization strategy for I/O devices offers excellent flexibility and portability, but at a potential cost in performance for high-throughput devices. By its nature, the hosted architecture incurs the following overheads: i) a world switch from the VMM to the host is required whenever the virtual machine needs to access real hardware, ii) I/O interrupt handling potentially involves the VMM, host OS, and guest OS interrupt handlers, iii) a packet transmission by the guest OS involves two device drivers, one in the guest and one on the host, and iv) there is an extra copy from the guest OS's physical memory to the host OS's kernel buffers on a packet transmit. Since these overheads consume CPU cycles, a system that is natively capable of saturating a high-performance Ethernet link might instead become CPU bound when run within a virtual machine.

This section analyzes the overheads of sustained TCP transmits from a virtual machine. Experiments with sustained TCP receives yield similar results and conclusions, and are not presented here due to space constraints. An analysis of these workloads exposes the major sources of virtualization overhead on a hosted architecture. Frequent switches between the host and VMM worlds are the most significant overhead. Optimizations targeted at these overheads improve the virtual networking subsystem substantially: the experimental results show that a set of three optimizations doubles sustained TCP transmit throughput on a slower machine that is CPU bound, and significantly reduces CPU utilization on a faster machine that is I/O bound.
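The reasoning that extra per-packet CPU cost can turn a link-bound system into a CPU-bound one can be illustrated with a back-of-the-envelope model. The following is a minimal sketch, not a measurement: the function name, the 1500-byte packet size, the 100 Mb/s link rate, and the per-packet CPU costs are all hypothetical values chosen for illustration, and the model assumes the CPU does nothing but process packets.

```python
def max_throughput_mbps(cpu_us_per_packet, packet_bytes=1500, link_mbps=100.0):
    """Estimate achievable transmit throughput and name the bottleneck.

    All parameters are illustrative assumptions, not figures from the paper.
    cpu_us_per_packet: total CPU time (microseconds) spent per packet,
                       including any virtualization overhead.
    """
    # Bits per microsecond is numerically equal to megabits per second.
    cpu_limit_mbps = packet_bytes * 8 / cpu_us_per_packet
    if cpu_limit_mbps < link_mbps:
        return cpu_limit_mbps, "cpu"   # CPU saturates before the link does
    return link_mbps, "link"           # the link is the limiting resource


# Hypothetical native cost: the CPU could drive more than the link carries.
native = max_throughput_mbps(60.0)
# Hypothetical virtualized cost: added overhead makes the CPU the bottleneck.
virtualized = max_throughput_mbps(240.0)
```

In this toy model, any optimization that lowers the per-packet CPU cost raises throughput while the system is CPU bound, and only lowers CPU utilization once the link is the limit, which mirrors the two outcomes described above for the slower and faster machines.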

Beng-Hong Lim 2001-05-01