Kamino: Efficient VM Allocation at Scale with Latency-Driven Cache-Aware Scheduling

David Domingo, Rutgers University; Hugo Barbalho and Marco Molinaro, Microsoft Research; Kuan Liu, Abhisek Pan, David Dion, and Thomas Moscibroda, Microsoft Azure; Sudarsun Kannan, Rutgers University; Ishai Menache, Microsoft Research

In virtual machine (VM) allocation systems, caching repetitive and similar VM allocation requests and their associated resolution rules is crucial for reducing computational costs and meeting strict latency requirements. While modern allocation systems distribute requests among multiple allocator agents and use caching to improve performance, current schedulers often neglect cache state and latency considerations when assigning each new request to an agent. Because the costs of cache hits and misses vary widely, and updating the caches adds processing overhead, simple load-balancing and cache-aware mechanisms result in high latencies. We introduce Kamino, a high-performance, latency-driven, and cache-aware request scheduling system aimed at minimizing end-to-end latencies. Kamino employs a novel, theoretically grounded scheduling algorithm that uses partial indicators from the cache state to assign each new request to the agent with the lowest estimated latency. Evaluation of Kamino using a high-fidelity simulator on large-scale production workloads shows a 42% reduction in average request latencies. Our deployment of Kamino in the control plane of a large public cloud confirms these improvements, with a 33% decrease in cache miss rates and a 17% reduction in memory usage.
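The core idea in the abstract, assigning each request to the agent with the lowest estimated latency rather than simply the least-loaded agent or the one holding a cached entry, can be illustrated with a minimal sketch. This is not Kamino's algorithm: the `Agent` class, the fixed hit and miss costs, and the use of a set of cached keys as the "partial indicator" of cache state are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical constants: a cache miss is assumed to cost far more than a hit.
HIT_COST_MS = 1.0
MISS_COST_MS = 20.0


@dataclass
class Agent:
    """Toy allocator agent (illustrative only, not Kamino's data model)."""
    name: str
    cached_keys: set = field(default_factory=set)  # partial view of the agent's cache
    queued_work_ms: float = 0.0                    # estimated pending work


def estimated_latency(agent: Agent, request_key: str) -> float:
    """Queueing delay plus service time, where service time depends on a likely hit."""
    service = HIT_COST_MS if request_key in agent.cached_keys else MISS_COST_MS
    return agent.queued_work_ms + service


def assign(agents: list, request_key: str) -> Agent:
    """Send the request to the agent with the lowest estimated latency."""
    best = min(agents, key=lambda a: estimated_latency(a, request_key))
    # Account for the new work and assume the agent caches the request afterwards.
    best.queued_work_ms = estimated_latency(best, request_key)
    best.cached_keys.add(request_key)
    return best


if __name__ == "__main__":
    # Agent "a" has the entry cached but is backed up; agent "b" is idle and cold.
    agents = [Agent("a", {"vm-small"}, 30.0), Agent("b")]
    print(assign(agents, "vm-small").name)
```

In this toy setup the request goes to the idle agent even though it misses the cache, because its estimated latency (20 ms) is lower than waiting behind the busy agent for a hit (31 ms). A purely cache-aware policy would have chosen the busy agent, which is the kind of trade-off the abstract says simple mechanisms handle poorly.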

OSDI '25 Open Access Sponsored by King Abdullah University of Science and Technology (KAUST)

BibTeX
@inproceedings {308754,
author = {David Domingo and Hugo Barbalho and Marco Molinaro and Kuan Liu and Abhisek Pan and David Dion and Thomas Moscibroda and Sudarsun Kannan and Ishai Menache},
title = {Kamino: Efficient {VM} Allocation at Scale with {Latency-Driven} {Cache-Aware} Scheduling},
booktitle = {19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25)},
year = {2025},
isbn = {978-1-939133-47-2},
address = {Boston, MA},
pages = {519--535},
url = {https://www.usenix.org/conference/osdi25/presentation/domingo},
publisher = {USENIX Association},
month = jul
}
