{APUNet}: Revitalizing {GPU} as Packet Processing Accelerator

Younghwan Go; Muhammad Asim Jamshed; YoungGyoun Moon; Changho Hwang; KyoungSoo Park

Younghwan Go, Muhammad Asim Jamshed, YoungGyoun Moon, Changho Hwang, and KyoungSoo Park, Korea Advanced Institute of Science and Technology (KAIST)

Many research works have recently experimented with GPU to accelerate packet processing in network applications. Most works have shown that GPU brings a significant performance boost when it is compared to the CPUonly approach, thanks to its highly-parallel computation capacity and large memory bandwidth. However, a recent work argues that for many applications, the key enabler for high performance is the inherent feature of GPU that automatically hides memory access latency rather than its parallel computation power. It also claims that CPU can outperform or achieve a similar performance as GPU if its code is re-arranged to run concurrently with memory access, employing optimization techniques such as group prefetching and software pipelining.

In this paper, we revisit the claim of the work and see if it can be generalized to a large class of network applications. Our findings with eight popular algorithms widely used in network applications show that (a) there are many compute-bound algorithms that do benefit from the parallel computation capacity of GPU while CPU-based optimizations fail to help, and (b) the relative performance advantage of CPU over GPU in most applications is due to data transfer bottleneck in PCIe communication of discrete GPU rather than lack of capacity of GPU itself. To avoid the PCIe bottleneck, we suggest employing integrated GPU in recent APU platforms as a cost-effective packet processing accelerator. We address a number of practical issues in fully exploiting the capacity of APU and show that network applications based on APU achieve multi-10 Gbps performance for many compute/memory-intensive algorithms.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX

@inproceedings {201544,
author = {Younghwan Go and Muhammad Asim Jamshed and YoungGyoun Moon and Changho Hwang and KyoungSoo Park},
title = {{APUNet}: Revitalizing {GPU} as Packet Processing Accelerator},
booktitle = {14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17)},
year = {2017},
isbn = {978-1-931971-37-9},
address = {Boston, MA},
pages = {83--96},
url = {https://www.usenix.org/conference/nsdi17/technical-sessions/presentation/go},
publisher = {USENIX Association},
month = mar
}

Download

APUNet: Revitalizing GPU as Packet Processing Accelerator

Open Access Media

Presentation Video

Presentation Audio