Extracting Protocol Format as State Machine via Controlled Static Loop Analysis

Authors: 

Qingkai Shi, Xiangzhe Xu, and Xiangyu Zhang, Purdue University

Abstract: 

Reverse engineering of protocol message formats is critical for many security applications. Mainstream techniques use dynamic analysis and inherit its low-coverage problem: the inferred message formats only reflect the features of their inputs. To achieve high coverage, we choose to use static program analysis to infer message formats from the implementation of protocol parsers. In this work, we focus on a class of extremely challenging protocols whose formats can be described through constraint-enhanced regular expressions and are parsed via finite state machines. Such state machines are often implemented as complicated parsing loops, which are inherently difficult to analyze via conventional static analysis. Our new technique extracts a sound state machine by regarding each loop iteration as a state and the dependency between loop iterations as state transitions. To achieve high, i.e., path-sensitive, precision but avoid path explosion, the analysis is controlled to merge as many paths as possible based on carefully designed rules. The evaluation results show that we can infer a state machine and, thus, the message formats, in five minutes with over 90% precision and recall, far better than the state of the art. We have also applied the state machines to enhance protocol fuzzers, which are improved by 20% to 230% in terms of coverage and detect ten more zero-days compared to baselines.
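
To make the setting concrete, the following is a minimal sketch of the kind of parser the abstract has in mind: a finite state machine written as a single parsing loop, where each iteration consumes one byte and the state carried into the next iteration determines how that byte is interpreted. The toy message format (`'#' DIGITS ':' PAYLOAD ';'`, with the payload length constrained by the digits), the `State` names, and the `parse` function are all assumptions made for illustration; they are not taken from the paper and do not represent its analysis.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical parser states for the toy format '#' DIGITS ':' PAYLOAD ';'
    START = auto()
    LENGTH = auto()
    PAYLOAD = auto()
    END = auto()
    DONE = auto()


def parse(msg: bytes) -> bytes:
    """Parse one toy message and return its payload.

    The length field constrains the payload size, which is the kind of
    constraint a plain regular expression cannot express on its own.
    """
    state = State.START
    remaining = 0          # payload bytes still expected
    seen_digit = False     # at least one length digit was read
    payload = bytearray()

    for byte in msg:       # each iteration handles exactly one byte
        if state is State.START:
            if byte != ord("#"):
                raise ValueError("expected '#'")
            state = State.LENGTH
        elif state is State.LENGTH:
            if ord("0") <= byte <= ord("9"):
                remaining = remaining * 10 + (byte - ord("0"))
                seen_digit = True
            elif byte == ord(":") and seen_digit:
                # An empty payload skips straight to the terminator.
                state = State.PAYLOAD if remaining > 0 else State.END
            else:
                raise ValueError("malformed length field")
        elif state is State.PAYLOAD:
            payload.append(byte)
            remaining -= 1
            if remaining == 0:
                state = State.END
        elif state is State.END:
            if byte != ord(";"):
                raise ValueError("expected ';'")
            state = State.DONE
        else:
            raise ValueError("trailing data after message")

    if state is not State.DONE:
        raise ValueError("truncated message")
    return bytes(payload)
```

For example, `parse(b"#3:abc;")` returns `b"abc"`, while `parse(b"#3:ab;")` raises `ValueError` because the `;` is consumed as the third payload byte and the terminator is then missing. Even in this small loop, the meaning of a byte depends on state written by earlier iterations, which hints at why such loops are hard for conventional static analysis.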

BibTeX
@inproceedings {291090,
author = {Qingkai Shi and Xiangzhe Xu and Xiangyu Zhang},
title = {Extracting Protocol Format as State Machine via Controlled Static Loop Analysis},
booktitle = {32nd USENIX Security Symposium (USENIX Security 23)},
year = {2023},
isbn = {978-1-939133-37-3},
address = {Anaheim, CA},
pages = {7019--7036},
url = {https://www.usenix.org/conference/usenixsecurity23/presentation/shi-qingkai},
publisher = {USENIX Association},
month = aug
}
