Next: Related Work Up: The New Servers & Previous: Service Inversion Improvements

Latency Scalability

To understand how latencies are affected by processor speed, we use three generations of hardware with varying processor speeds but sharing most other hardware components. Details of our server machines appear in Section 2. We begin our study by measuring the infinite-demand capacity of the two original servers while adjusting the data set size. The results, shown in Table 6, indicate that the in-memory capacity of both Apache and Flash scales well with processor speed, but once the data set size exceeds physical memory, performance degrades. Even though the heavy-tailed 3GB Web workload requires only a modest amount of disk activity, we observe idle CPU time on the two faster processors, suggesting that performance on this workload is tied to disk performance.
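The in-memory scaling claim can be checked with simple arithmetic: scaling efficiency is the ratio of the throughput gain to the clock-speed gain across two hardware generations. A minimal sketch, using hypothetical throughput and clock figures rather than the actual numbers from Table 6:

```python
def scaling_efficiency(tput_slow, tput_fast, mhz_slow, mhz_fast):
    """Fraction of the clock-speed improvement realized as throughput.

    A value near 1.0 means capacity scales with processor speed;
    values well below 1.0 suggest another bottleneck (e.g. disk).
    """
    return (tput_fast / tput_slow) / (mhz_fast / mhz_slow)

# Hypothetical example: throughput doubles when the clock doubles,
# so the server scales perfectly with processor speed.
print(scaling_efficiency(1000, 2000, 300, 600))  # -> 1.0
```

For a disk-bound workload, throughput would rise far more slowly than clock speed, and this ratio would fall well below 1.0.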

A more detailed examination of server latency is shown in Figures 22 and 23. These two graphs represent an in-memory workload and a disk-bound workload, respectively, and show the mean latencies of both server packages across all three processors. Measurements taken at various load levels show remarkable consistency: at the same relative load levels, both Apache and Flash exhibit similar latencies, the in-memory latencies are much lower than the disk-bound latencies, and the latencies improve only slightly with processor speed. Figure 24 shows the scalability of our new servers across processors: even starting from much lower Pentium-II latencies, improvements in processor speed now reduce latency on both servers. This result confirms that once blocking is avoided, the servers can take greater advantage of improvements in hardware performance.
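The measurement methodology above reduces to grouping per-request timings by relative load level and reporting the mean of each group. A minimal sketch of that aggregation, using hypothetical timing samples (the paper's actual data appears in Figures 22-24):

```python
from statistics import mean

def mean_latency_by_load(samples):
    """samples: dict mapping relative load level (e.g. 0.5 for 50% of
    measured capacity) to a list of per-request latencies in ms.
    Returns the mean latency at each load level, the quantity
    plotted on a latency-versus-load curve."""
    return {load: mean(lats) for load, lats in samples.items()}

# Hypothetical per-request timings at three relative load levels.
samples = {0.25: [1.2, 1.4, 1.3], 0.50: [2.0, 2.2], 0.90: [8.5, 9.1, 8.8]}
print(mean_latency_by_load(samples))
```

Comparing such curves at the same relative load, rather than the same absolute request rate, is what makes results from processors of different speeds directly comparable.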

In summary, both new servers demonstrate lower initial latencies, slower latency growth, and a larger latency reduction with processor speed. These servers are no longer dominated by disk access times and should scale with improvements in processors, memory, and other hardware. The fact that these changes eliminate over 80% of the latency answers the question of latency origins: these latencies were dominated by blocking, rather than request queuing.

Yaoping Ruan