SMT QoS: Hardware Prototyping of Thread-level Performance Differentiation Mechanisms

Authors:

Andrew Herdrich, Ramesh Illikkal, Ravi Iyer, Ronak Singhal, Matt Merten, and Martin Dixon, Intel Corporation

Abstract:

Absolute throughput often fails to scale linearly with core count in chip multiprocessors (CMPs) due to contention in shared platform resources, including cache, memory bandwidth and buses. Adding simultaneous multithreading (SMT) to CMPs exacerbates this nonlinear scaling in two ways: it introduces contention for pipeline resources, and it increases the number of active threads in the system, which further increases contention in shared resources. The result is a loss of performance stability and fairness.

This work introduces and evaluates a new form of source-based execution rate control at the processor pipeline level, based on instruction fetch, instruction queue and reservation station partitioning between SMT threads. The efficacy of these controls is demonstrated through experiments with SPEC workloads on a modified test version of an Intel® codename Nehalem microprocessor. This new SMT rate control is presented as a critical building block to restoring the fairness and determinism in performance once inherent in simpler uniprocessors utilizing time-slicing schedulers, and is proposed for inclusion in future microprocessors supporting SMT.
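The abstract does not specify how the partitioning is implemented in hardware. As a rough illustration only of what source-based rate control at the fetch stage could mean, the following toy model splits the cycles of a single shared fetch port between SMT threads using a fixed weighted round-robin schedule. The function name, the `shares` parameter, and the one-instruction-per-granted-cycle assumption are all hypothetical and are not taken from the paper.

```python
def simulate_fetch_partition(shares, cycles):
    """Toy model: divide fetch cycles between SMT threads by fixed shares.

    shares -- hypothetical per-thread weights, e.g. (3, 1) grants thread 0
              three fetch cycles for every one granted to thread 1.
    cycles -- total number of fetch cycles to simulate.
    Returns the number of instructions fetched per thread, assuming one
    instruction per granted cycle (an assumption of this sketch).
    """
    # Repeating schedule of thread IDs, e.g. (3, 1) -> [0, 0, 0, 1].
    schedule = [tid for tid, share in enumerate(shares) for _ in range(share)]
    period = len(schedule)

    fetched = [0] * len(shares)
    for cycle in range(cycles):
        # Grant this cycle's fetch slot to the scheduled thread.
        fetched[schedule[cycle % period]] += 1
    return fetched


if __name__ == "__main__":
    high, low = simulate_fetch_partition((3, 1), 1000)
    print(f"high-priority thread: {high}, low-priority thread: {low}")
```

In this simplified model each thread's fetch rate follows its configured share directly, which is the kind of thread-level differentiation the abstract describes. Real pipelines also contend for instruction queue and reservation station entries, which this sketch ignores.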


BibTeX

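@conference {259237,
author = {Andrew Herdrich and Ramesh Illikkal and Ravi Iyer and Ronak Singhal and Matt Merten and Martin Dixon},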
title = {{SMT} {QoS}: Hardware Prototyping of Thread-level Performance Differentiation Mechanisms},
year = {2012},
address = {Berkeley, CA},
publisher = {USENIX Association},
month = jun
}
