
   
Introduction

The hardware resources available to a network service determine its maximum request throughput and--under typical network conditions--a large share of the response delays perceived by its clients. As hardware performance advances, emphasis is shifting from server software performance (e.g., [19,26,27,37]) to improving the manageability and robustness of large-scale services [3,5,12,31]. This paper focuses on a key subproblem: automated on-demand resource provisioning for multiple competing services hosted by a shared server infrastructure--a utility. It applies to Web-based services in a shared hosting center or a Content Distribution Network (CDN).

The utility allocates each service a slice of its resources, including shares of memory, CPU time, and available throughput from storage units. Slices provide performance isolation and enable the utility to use its resources efficiently. The slices are chosen to allow each hosted service to meet service quality targets (e.g., response time) negotiated in Service Level Agreements (SLAs) with the utility. Slices vary dynamically to respond to changes in load and resource status. This paper addresses the provisioning problem: how much resource does a service need to meet SLA targets at its projected load level? A closely related aspect of utility resource allocation is assignment: which servers and storage units will provide the resources to host each service?

Previous work addresses various aspects of utility resource management, including mechanisms to enforce resource shares (e.g., [7,9,36]), policies to provision shares adaptively [12,21,39], admission control with probabilistically safe overbooking [4,6,34], scheduling to meet SLA targets or maximize yield [21,22,23,32], and utility data center architectures [5,25,30].

The key contribution of this paper is to demonstrate the potential of a new model-based approach to provisioning multiple resources that interact in complex ways. The premise of model-based resource provisioning (MBRP) is that internal models capturing service workload and behavior can enable the utility to predict the effects of changes to the workload intensity or resource allotment. Experimental results illustrate model-based dynamic provisioning of memory and storage shares for hosted Web services with static content. Given adequate models, this approach may generalize to a wide range of services including complex multi-tier services [29] with interacting components, or services with multiple functional stages [37]. Moreover, model-based provisioning is flexible enough to adjust to resource constraints or surpluses exposed during assignment.
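
To make the premise concrete, the following is a minimal sketch of model-based provisioning for a single resource. It is not one of the models developed in this paper: it substitutes a textbook M/M/1 queue, in which mean response time is 1 / (service rate - arrival rate), and it assumes the service rate scales linearly with the allotted share. The names FULL_SERVICE_RATE, predict_response_time, and provision are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical capacity: requests/s that one whole server could complete.
FULL_SERVICE_RATE = 1000.0


def predict_response_time(share, arrival_rate):
    """Predict mean response time (seconds) for a given resource share.

    Toy M/M/1 stand-in model: R = 1 / (mu - lambda), where mu is assumed
    to scale linearly with the share of the server (an assumption).
    """
    service_rate = share * FULL_SERVICE_RATE
    if service_rate <= arrival_rate:
        return math.inf  # the share cannot keep up with the offered load
    return 1.0 / (service_rate - arrival_rate)


def provision(arrival_rate, target_seconds, step=0.01):
    """Return the smallest share (a multiple of step, at most 1.0) whose
    predicted response time meets the target, or None if none does."""
    steps = round(1.0 / step)
    for i in range(1, steps + 1):
        share = i * step
        if predict_response_time(share, arrival_rate) <= target_seconds:
            return share
    return None
```

Under these assumptions, a call such as provision(200, 0.008) asks for the smallest share predicted to hold mean response time at or below 8 ms at 200 requests/s. The point of the sketch is only that the allocator consults a model to predict the effect of a candidate allotment, rather than discovering it by trial on the live system.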

This paper is organized as follows. Section 2 motivates the work and summarizes our approach. Section 3 outlines simple models for Web services; Section 4 describes a resource allocator based on the models, and demonstrates its behavior in various scenarios. Section 5 describes our prototype and presents experimental results. Section 6 sets our approach in context with related work, and Section 7 concludes.


Ronald Doyle
2003-01-20