Jiajian Zhang, Xi'an Jiaotong-Liverpool University and University of Liverpool; Fangyu Wu, Xi'an Jiaotong-Liverpool University; Hai Jiang, Beijing University of Posts and Telecommunications; Qiufeng Wang, Xi'an Jiaotong-Liverpool University; Genlang Chen and Chaoyi Pang, NingboTech University
GPU communication plays a pivotal role in collaborative computation across multiple devices. Despite advancements in inter-device communication fabrics and architectures, synchronization remains a significant challenge due to the manual coordination required between producers and consumers at the application level. In this work, we first reveal that traditional synchronization is a primary bottleneck in GPU communication, where consumers frequently poll for producer data availability. Specifically, early-started polling leads to the unnecessary occupation of computational resources. To address this issue, we propose Warp-level Interrupt-based Communication (WIC), a novel synchronization framework for GPU communication that introduces a fine-grained interruption mechanism at the warp level to replace repetitive polling. WIC preemptively stalls warps engaged in frequent polling and releases computational resources for other warps, thereby effectively overlapping producer-consumer synchronization with ongoing computations. Comprehensive experiments demonstrate that WIC significantly outperforms conventional polling methods by 1.13× on average across various applications with diverse communication patterns.
USENIX ATC '25 Open Access Sponsored by King Abdullah University of Science and Technology (KAUST)

author = {Jiajian Zhang and Fangyu Wu and Hai Jiang and Qiufeng Wang and Genlang Chen and Chaoyi Pang},
title = {{WIC}: Hiding {Producer-Consumer} Synchronization Delays with {Warp-Level} Interrupt-based {GPU} Communications},
booktitle = {2025 USENIX Annual Technical Conference (USENIX ATC 25)},
year = {2025},
isbn = {978-1-939133-48-9},
address = {Boston, MA},
pages = {889--904},
url = {https://www.usenix.org/conference/atc25/presentation/zhang-jiajian},
publisher = {USENIX Association},
month = jul
}