DEEPVSA: Facilitating Value-set Analysis with Deep Learning for Postmortem Program Analysis

Authors: 

Wenbo Guo, Dongliang Mu, and Xinyu Xing, The Pennsylvania State University; Min Du and Dawn Song, University of California, Berkeley

Abstract: 

Value-set analysis (VSA) is one of the most powerful binary analysis tools and has been broadly adopted in many use cases, ranging from verifying software properties (e.g., variable range analysis) to identifying software vulnerabilities (e.g., buffer overflow detection). When used to facilitate data flow analysis in the context of postmortem program analysis, however, it exhibits an insufficient capability in handling memory alias identification. Technically speaking, this is because VSA needs to infer memory references based on the context of a control flow, but the accidental termination of a running program leaves behind incomplete control flow information, depriving memory alias analysis of the context it needs.

To address this issue, we propose a new technical approach. At a high level, this approach first employs a layer of instruction embedding along with a bi-directional sequence-to-sequence neural network to learn the machine code patterns pertaining to memory region accesses. Then, it utilizes the network to infer the memory regions that VSA fails to recognize. Since memory references to different regions naturally indicate a non-alias relationship, the proposed neural architecture can improve the ability of VSA to perform alias analysis. Different from previous research that utilizes deep learning for other binary analysis tasks, the neural network proposed in this work is fundamentally novel. Instead of simply using off-the-shelf neural networks, we introduce a new neural network architecture that captures the data dependency between and within instructions.
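The region-based reasoning in the paragraph above can be illustrated with a minimal Python sketch. It is not the DEEPVSA implementation: the `Region` labels, the `resolve_region` and `may_alias` helpers, and the example inputs are hypothetical names chosen for illustration, and the network's prediction is represented by a plain value rather than an actual model. The sketch shows only the decision rule that two memory references known to fall in different regions cannot alias, while any unknown region forces a conservative "may alias" answer.

```python
from enum import Enum
from typing import Optional


class Region(Enum):
    # Hypothetical region labels, assumed for illustration only.
    GLOBAL = "global"
    HEAP = "heap"
    STACK = "stack"


def resolve_region(vsa_region: Optional[Region],
                   predicted_region: Optional[Region]) -> Optional[Region]:
    """Prefer the region VSA derived; fall back to the predicted one.

    `predicted_region` stands in for the output of a learned model
    (an assumption of this sketch, not a real API).
    """
    return vsa_region if vsa_region is not None else predicted_region


def may_alias(region_a: Optional[Region], region_b: Optional[Region]) -> bool:
    """Return False only when both regions are known and differ."""
    if region_a is None or region_b is None:
        # Unknown region: conservatively assume the references may alias.
        return True
    # Same region: aliasing is still possible. Different regions: non-alias.
    return region_a == region_b


# VSA alone cannot place the second reference, so the pair stays ambiguous.
before = may_alias(Region.STACK, None)

# With a (hypothetical) predicted region filled in, the pair is ruled out.
after = may_alias(Region.STACK, resolve_region(None, Region.HEAP))
```

Note that the rule is one-directional: a shared region does not prove that two references alias, it only fails to rule the alias out.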

In this work, we implement our deep neural architecture as DEEPVSA, a neural-network-assisted alias analysis tool. To demonstrate the utility of this tool, we use it to analyze software crashes corresponding to 40 memory corruption vulnerabilities archived in the Offensive Security Exploit Database. We show that DEEPVSA can significantly improve VSA with respect to its capability in analyzing memory aliases and thus enhance the ability of security analysts to pinpoint the root cause of software crashes. In addition, we demonstrate that our proposed neural network outperforms state-of-the-art neural architectures broadly adopted in other binary analysis tasks. Last but not least, we show that DEEPVSA exhibits nearly no false positives when performing alias analysis.

BibTeX
@inproceedings {236242,
author = {Wenbo Guo and Dongliang Mu and Xinyu Xing and Min Du and Dawn Song},
title = {{DEEPVSA}: Facilitating Value-set Analysis with Deep Learning for Postmortem Program Analysis},
booktitle = {28th {USENIX} Security Symposium ({USENIX} Security 19)},
year = {2019},
address = {Santa Clara, CA},
url = {https://www.usenix.org/conference/usenixsecurity19/presentation/guo},
publisher = {{USENIX} Association},
}