Check out the new USENIX Web site.


Model-guided Load-Picking Algorithm

The general shape of the response-time vs. load curve is well known, and the form is similar for different workloads and server configurations. This suggests that a model-guided approach could fit the curve from a few test loads and converge more quickly to the peak rate. Using the insight offered by well-known open-loop queuing theory results [13], we experimented with a simple model to fit the curve: $ R = 1/(a - b*\lambda)$, where $ R$ is the response time, $ \lambda$ is the load, and $ a$ and $ b$ are constants that depend on the settings of factors in $ \langle \vec{W}, \vec{R}, \vec{C}\rangle$. To learn the model, the controller needs tuples of the form $ \langle \lambda,
R_{\lambda} \rangle$.

Algorithm 4 outlines the model-guided algorithm. If there are insufficient tuples for learning the model, it uses a simple heuristic to pick the test loads for generating the tuples. After that, the algorithm uses the model to predict the peak rate $ \lambda = \lambda^*$ for $ R =
R_{sat}$, returns the prediction as the next test load, and relearns the model using the new $ \langle \lambda,
R_{\lambda} \rangle$ tuple at the prediction. The whole process repeats until the search converges to the peak rate. As the controller observes more $ \langle \lambda,
R_{\lambda} \rangle$ tuples, the model-fit should improve progressively, and the model should guide the search to an accurate peak rate. In many cases, this happens in a single iteration of model learning (Section 5).

However, unlike the previous approaches, a model-guided search is not guaranteed to converge. Model-guided search is dependent on the accuracy of the model, which in turn depends on the choice of $ \langle \lambda,
R_{\lambda} \rangle$ tuples that are used for learning. The choice of tuples is generated by previous model predictions. This creates the possibility of learning an incorrect model which in turn yields incorrect choices for test loads. For example, if most of the test loads chosen for learning the model happen to lie significantly outside the peak rate region, then the model-guided choice of test loads may be incorrect or inefficient. Hence, in the worst case, the search may never converge or converge slowly to the peak rate. We have experimented with other models including polynomial models of the form $ R=a +
b\lambda + c\lambda^2$, which show similar limitations.

To avoid the worst case, the algorithm uses a simple heuristic to choose the tuples from the list of available tuples. Each time the controller learns the model, it chooses two tuples such that one of them is the last prediction, and the other is the tuple that yields the response time closest to threshold mean server response time $ R_{sat}$. More robust techniques for choosing the tuples is a topic of ongoing study. Section 5 reports our experience with the model-guided choice of test loads. Preliminary results suggest that the model-guided approaches are often superior but can be unstable depending on the initial samples used to learn the model.


\begin{algorithm}
% latex2html id marker 1514
[t]
\caption{Model-Guided {\bf Inp...
...a,b$ are the fitted coefficients\;
\par
\end{list}\vspace{-3ex}
\end{algorithm}

varun 2008-05-13