Experimental Design


The benchmark database consists of a set of interconnected part objects. The benchmark specifies two database sizes based on the number of parts stored: a small database containing 20,000 parts and a large database containing 200,000 parts. The two sizes let us measure performance when the entire database is small enough to fit in main memory and compare it with the case where the database is larger than the available memory.

The parts are indexed by the unique part number associated with each part.¹⁸ Each part is "connected" via a direct link to exactly three other parts, chosen partially at random to produce some locality of reference. In particular, 90% of the connections are to parts among the "nearest" 1% of parts, where "nearness" is defined in terms of part numbers: a part is "near" another part if their part numbers are numerically close. The remaining 10% of the connections are to parts chosen uniformly at random.
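The connection rule above can be sketched as follows. This is a minimal illustration, not the benchmark's reference generator: the function name `build_connections`, its parameters, the choice to center the "near" window on the part (clamped at the ends of the part-number range), and the decision to allow self-links and duplicate links are all assumptions made for the example.

```python
import random

def build_connections(num_parts, fanout=3, local_fraction=0.9,
                      near_fraction=0.01, seed=0):
    """Return a list where entry p holds the part numbers that part p links to.

    Assumption: "near" means inside a window covering near_fraction of all
    parts, centered on p and clamped to the valid part-number range.
    """
    rng = random.Random(seed)
    window = max(1, int(num_parts * near_fraction))  # size of the "near" range
    half = window // 2
    connections = []
    for part in range(num_parts):
        targets = []
        for _ in range(fanout):
            if rng.random() < local_fraction:
                # Local link: pick a part number close to this one.
                low = max(0, min(part - half, num_parts - window))
                targets.append(rng.randrange(low, low + window))
            else:
                # Non-local link: pick uniformly from the whole database.
                targets.append(rng.randrange(num_parts))
        connections.append(targets)
    return connections
```

Under these assumptions, `build_connections(20_000)` would produce the link structure for the small database and `build_connections(200_000)` for the large one.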

We use the OO1 benchmark traversal operation for our performance measurements. The operation performs a depth-first traversal of all connected parts, starting from a randomly chosen part and going up to seven levels deep, and invokes an empty procedure on each visited part. With three connections per part, each traversal visits 1 + 3 + 9 + ... + 3^7 = 3280 parts, including possible duplicates. Each traversal set contains a total of 45 traversals, split as follows: the first is the cold traversal (no data is cached in memory), the next 34 are warm traversals (progressively more data is cached in memory), and the last 10 are hot traversals (all data is cached in memory).¹⁹ We use a random number generator to ensure that each warm traversal selects a new "root" part as its starting point, so that each traversal visits a mostly different set of parts.
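The traversal and the 1/34/10 split can be sketched as below. This is an illustrative sketch under stated assumptions, not the code used for the measurements: `traverse` and `run_traversal_set` are hypothetical names, `connections` is assumed to be a list mapping each part number to the part numbers it links to, and the hot traversals are assumed to reuse the last warm root, a detail the text does not specify. The sketch only labels the phases; it does not model caching.

```python
import random

def traverse(connections, root, depth=7, visit=lambda part: None):
    """Depth-first traversal; returns the number of visits, duplicates included."""
    visit(root)  # the benchmark invokes an empty procedure on each part
    if depth == 0:
        return 1
    return 1 + sum(traverse(connections, nxt, depth - 1, visit)
                   for nxt in connections[root])

def run_traversal_set(connections, cold=1, warm=34, hot=10, seed=0):
    """Run one traversal set and return (phase, root, visit_count) tuples."""
    rng = random.Random(seed)
    # Distinct roots, so every warm traversal starts from a new part.
    roots = rng.sample(range(len(connections)), cold + warm)
    phases = ["cold"] * cold + ["warm"] * warm
    # Assumption: hot traversals repeat the final warm root.
    schedule = list(zip(phases, roots)) + [("hot", roots[-1])] * hot
    return [(phase, root, traverse(connections, root))
            for phase, root in schedule]
```

With three links per part and a depth of seven, every call to `traverse` reports 3280 visits regardless of the root, which matches the count given above.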

Footnotes

¹⁸ The benchmark specification does not define a data structure that must be used for the index; we used a B+ tree for all our experiments.
¹⁹ This differs from the standard benchmark specification, which contains only 20 traversals (1 cold, 9 warm, and 10 hot); we run more warm traversals because we believe that 9 are not sufficient to provide meaningful results, especially for the large database.


Sheetal V. Kakkad