Jianqiang Wang, Shanghai Jiao Tong University; Siqi Ma, CSIRO DATA61; Yuanyuan Zhang and Juanru Li, Shanghai Jiao Tong University; Zheyu Ma, Northwestern Polytechnical University; Long Mai, Tiancheng Chen, and Dawu Gu, Shanghai Jiao Tong University
Memory corruption vulnerabilities are serious threats to software security and are often triggered by improper use of memory operation functions. Detecting memory corruption relies on identifying memory operation functions and examining how they manipulate memory. Distinguishing memory operation functions is challenging, however, because real-world software declares both standard and customized memory operation functions. In this paper, we propose NLP-EYE, an NLP-based memory corruption detection system. NLP-EYE automatically identifies memory operation functions through semantic-aware source code analysis. It first creates a programming-language-friendly corpus in order to parse function prototypes. Then, using a similarity comparison that draws on both semantic and syntactic information, NLP-EYE identifies and labels both standard and customized memory operation functions. Finally, it uses symbolic execution to check whether a memory operation causes incorrect memory usage.
Instead of analyzing data dependencies across the entire source code, NLP-EYE focuses only on the parts that perform memory operations. We evaluated NLP-EYE on seven real-world libraries and programs, including Vim, Git, and CPython. NLP-EYE identified 27 null pointer dereferences, two double-frees, and three use-after-frees that had not previously been discovered in the latest versions of the analysis targets.
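To make the identification step more concrete, the following is a minimal sketch of labeling customized memory operation functions by their names. It is not NLP-EYE's implementation: the keyword sets in `REFERENCE`, the helpers `split_identifier` and `label_function`, and the example function names are all hypothetical, and a plain token-overlap check stands in for the corpus-based segmentation and semantic/syntactic similarity comparison described above.

```python
import re

# Hypothetical reference vocabularies (assumption, not taken from the paper).
REFERENCE = {
    "alloc": {"alloc", "malloc", "calloc", "realloc", "new", "create"},
    "free": {"free", "release", "destroy", "delete", "dealloc"},
}


def split_identifier(name):
    """Split a snake_case or camelCase identifier into lowercase tokens."""
    tokens = []
    for part in re.split(r"[_\W]+", name):
        # Acronym runs, capitalized or lowercase words, and digit runs.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [t.lower() for t in tokens if t]


def label_function(name):
    """Return 'alloc', 'free', or None based on token overlap with REFERENCE."""
    tokens = set(split_identifier(name))
    for label, vocab in REFERENCE.items():
        if tokens & vocab:
            return label
    return None


if __name__ == "__main__":
    # Hypothetical function names, used only for illustration.
    for fn in ["createImageBuffer", "buf_release", "parse_header"]:
        print(fn, "->", label_function(fn))
```

A real system would need to handle names whose words are not separated by case or underscores, which is presumably where a corpus-driven segmentation and a learned similarity measure become necessary; this sketch only shows the overall shape of name-based labeling.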