The load-picking algorithms in Sections 3.5-3.6 generate a new load given one or more previous test loads. How should the controller generate the first load, or seed? One option is a conservative low load, but this increases the time spent ramping up to a high peak rate. When the benchmarking goal is to plot a response surface, the controller instead seeds the search with the peak rate of the ``nearest'' previous sample.
To illustrate, assume that the factors of interest in Algorithm 1 are the number of disks and the number of nfsds (as shown in Figure 2). Suppose the controller uses Binsearch with a low seed to find the peak rate for one sample. To find the peak rate for a second, nearby sample, it can then use the first sample's peak rate as the seed. Thus, the controller can jump quickly to a load value close to the second sample's peak rate.
In the common case, the peak rates for ``nearby'' samples will be close. If they are not, the load-picking algorithms may incur additional cost to recover from a bad seed. The notion of ``nearness'' is also not always well defined: the distance between samples can be measured when all factors are quantitative, but with categorical factors (e.g., file system type) there may be no meaningful nearest sample. In such cases the controller may use a default seed or an aggregate of peak rates from previous samples to start the search.
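The seeding policy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the controller's actual implementation: the function name pick_seed, the dictionary representation of a sample, the unnormalized Euclidean distance, the rule that samples are comparable only when their categorical factors match, and the use of the median as the aggregate are all hypothetical choices made for the example. The peak rates in the usage lines are made-up numbers.

```python
import math
from statistics import median


def pick_seed(new_sample, history, default_seed, categorical=()):
    """Return a seed load for new_sample (hypothetical helper).

    new_sample   -- dict mapping factor name to value
    history      -- list of (sample_dict, peak_rate) pairs from earlier searches
    default_seed -- conservative load used when no earlier sample exists
    categorical  -- names of factors for which no distance is defined
    """
    if not history:
        return default_seed

    # Assumption: a previous sample is comparable only if it agrees
    # with the new sample on every categorical factor.
    comparable = [(s, peak) for s, peak in history
                  if all(s[f] == new_sample[f] for f in categorical)]
    if not comparable:
        # Nearness is undefined here, so fall back to an aggregate
        # (the median, by assumption) of all earlier peak rates.
        return median(peak for _, peak in history)

    quantitative = [f for f in new_sample if f not in categorical]

    def distance(s):
        # Assumption: plain Euclidean distance over quantitative factors.
        return math.sqrt(sum((s[f] - new_sample[f]) ** 2 for f in quantitative))

    _, peak = min(comparable, key=lambda item: distance(item[0]))
    return peak


# Usage with invented peak rates for two earlier samples.
history = [
    ({"disks": 1, "nfsds": 1, "fs": "ext3"}, 200.0),
    ({"disks": 4, "nfsds": 8, "fs": "ext3"}, 900.0),
]
seed = pick_seed({"disks": 2, "nfsds": 1, "fs": "ext3"},
                 history, default_seed=50.0, categorical=("fs",))
print(seed)
```

In practice the factors would likely need to be scaled to comparable ranges before distances are computed, since a raw Euclidean distance lets the factor with the largest numeric range dominate.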
varun 2008-05-13