UNIFUZZ: A Holistic and Pragmatic Metrics-Driven Platform for Evaluating Fuzzers

Yuwei Li; Shouling Ji; Yuan Chen; Sizhuang Liang; Wei-Han Lee; Yueyao Chen; Chenyang Lyu; Chunming Wu; Raheem Beyah; Peng Cheng; Kangjie Lu; Ting Wang

Yuwei Li, Zhejiang University; Shouling Ji, Zhejiang University/Zhejiang University NGICS Platform; Yuan Chen, Zhejiang University; Sizhuang Liang, Georgia Institute of Technology; Wei-Han Lee, IBM Research; Yueyao Chen and Chenyang Lyu, Zhejiang University; Chunming Wu, Zhejiang University/Zhejiang Lab, Hangzhou, China; Raheem Beyah, Georgia Institute of Technology; Peng Cheng, Zhejiang University NGICS Platform/Zhejiang University; Kangjie Lu, University of Minnesota; Ting Wang, Pennsylvania State University

A flurry of fuzzing tools (fuzzers) has been proposed in the literature, aiming to detect software vulnerabilities effectively and efficiently. However, it remains challenging to compare fuzzers because of inconsistent benchmarks, performance metrics, and/or evaluation environments, which buries useful insights and impedes the discovery of promising fuzzing primitives. In this paper, we design and develop UNIFUZZ, an open-source, metrics-driven platform for assessing fuzzers in a comprehensive and quantitative manner. Specifically, UNIFUZZ has so far incorporated 35 usable fuzzers, a benchmark of 20 real-world programs, and six categories of performance metrics. We first systematically study the usability of existing fuzzers, find and fix a number of flaws, and integrate them into UNIFUZZ. Based on the study, we propose a collection of pragmatic performance metrics to evaluate fuzzers from six complementary perspectives. Using UNIFUZZ, we conduct in-depth evaluations of several prominent fuzzers, including AFL [1], AFLFast [2], Angora [3], Honggfuzz [4], MOPT [5], QSYM [6], T-Fuzz [7], and VUzzer64 [8]. We find that none of them outperforms the others across all target programs, and that using a single metric to assess a fuzzer's performance may lead to one-sided conclusions, which demonstrates the importance of comprehensive metrics. Moreover, we identify and investigate previously overlooked factors that may significantly affect a fuzzer's performance, including instrumentation methods and crash analysis tools. Our empirical results show that these factors are critical to the evaluation of a fuzzer. We hope that our findings can shed light on reliable fuzzing evaluation, so that promising fuzzing primitives can be discovered to inform future fuzzer designs.

BibTeX

@inproceedings {263814,
author = {Yuwei Li and Shouling Ji and Yuan Chen and Sizhuang Liang and Wei-Han Lee and Yueyao Chen and Chenyang Lyu and Chunming Wu and Raheem Beyah and Peng Cheng and Kangjie Lu and Ting Wang},
title = {{UNIFUZZ}: A Holistic and Pragmatic {Metrics-Driven} Platform for Evaluating Fuzzers},
booktitle = {30th USENIX Security Symposium (USENIX Security 21)},
year = {2021},
isbn = {978-1-939133-24-3},
pages = {2777--2794},
url = {https://www.usenix.org/conference/usenixsecurity21/presentation/li-yuwei},
publisher = {USENIX Association},
month = aug
}
