DeepDi: Learning a Relational Graph Convolutional Network Model on Instructions for Fast and Accurate Disassembly

Authors: 

Sheng Yu, University of California Riverside and Deepbits Technology Inc.; Yu Qu, University of California Riverside; Xunchao Hu, Deepbits Technology Inc.; Heng Yin, University of California Riverside and Deepbits Technology Inc.

Abstract: 

Disassembly is the cornerstone of many binary analysis tasks. Traditional disassembly approaches (e.g., linear and recursive) are not accurate enough, while more sophisticated approaches (e.g., Probabilistic Disassembly, Datalog Disassembly, and XDA) have high overhead, which prevents their wide use in time-critical security practices. In this paper, we propose DEEPDI, a novel approach that achieves both accuracy and efficiency. The key idea of DEEPDI is to use a graph neural network model to capture and propagate instruction relations. Specifically, DEEPDI first uses superset disassembly to obtain a superset of instructions. It then constructs a graph model, called the Instruction Flow Graph, to capture different instruction relations, and uses a Relational Graph Convolutional Network to propagate instruction embeddings for accurate instruction classification. DEEPDI also provides heuristics to recover function entrypoints. We evaluate DEEPDI on several large-scale datasets containing real-world and obfuscated binaries. We show that DEEPDI is comparable or superior to state-of-the-art disassemblers in terms of accuracy, and is robust against unseen binaries, compilers, platforms, obfuscated binaries, and adversarial attacks. Its CPU version is two times faster than IDA Pro, and its GPU version is 350 times faster.
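The pipeline described in the abstract (superset disassembly, a relation-typed instruction graph, then relational graph convolution) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the four-opcode toy instruction set `TOY_ISA`, the three relation names (`fallthrough`, `jump`, `overlap`), the two-dimensional features, and the identity weight matrices. The update rule is the generic R-GCN form (a self-connection plus a per-relation mean over incoming neighbors, followed by ReLU), not DEEPDI's trained model.

```python
from collections import defaultdict

# Hypothetical toy ISA (not x86): opcode -> (mnemonic, length in bytes).
TOY_ISA = {0x01: ("nop", 1), 0x02: ("mov", 2), 0x03: ("jmp", 2), 0x04: ("ret", 1)}


def superset_disassemble(code):
    """Try to decode an instruction at every byte offset."""
    insns = {}
    for off in range(len(code)):
        entry = TOY_ISA.get(code[off])
        if entry is None or off + entry[1] > len(code):
            continue  # invalid opcode or truncated instruction
        mnemonic, length = entry
        target = None
        if mnemonic == "jmp":
            # The second byte is a signed 8-bit displacement.
            rel = code[off + 1] - 256 if code[off + 1] >= 128 else code[off + 1]
            target = off + length + rel
        insns[off] = (mnemonic, length, target)
    return insns


def build_relation_edges(insns):
    """Group directed (src, dst) edges by an assumed relation type."""
    edges = defaultdict(list)
    for off, (mnemonic, length, target) in sorted(insns.items()):
        if mnemonic not in ("jmp", "ret") and off + length in insns:
            edges["fallthrough"].append((off, off + length))
        if target in insns:
            edges["jump"].append((off, target))
        for inner in range(off + 1, off + length):
            if inner in insns:  # a candidate that starts inside this one
                edges["overlap"].append((off, inner))
    return dict(edges)


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rgcn_layer(features, edges, rel_weights, self_weight):
    """One generic R-GCN step: self term + per-relation mean of messages, then ReLU."""
    out = {}
    for node, h in features.items():
        acc = matvec(self_weight, h)
        for rel, pairs in edges.items():
            sources = [src for src, dst in pairs if dst == node]
            for src in sources:
                msg = matvec(rel_weights[rel], features[src])
                acc = [a + m / len(sources) for a, m in zip(acc, msg)]
        out[node] = [max(0.0, a) for a in acc]
    return out


# Usage: mov at 0, nop at 1 (inside the mov), jmp at 2 back to 0, ret at 4.
code = bytes([0x02, 0x01, 0x03, 0xFC, 0x04])
insns = superset_disassemble(code)
edges = build_relation_edges(insns)
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for learned weights
features = {off: [1.0, float(length)] for off, (_, length, _) in insns.items()}
hidden = rgcn_layer(features, edges, {rel: identity for rel in edges}, identity)
```

In a real system the features would be learned instruction embeddings and the per-relation weights would be trained, with a final classifier deciding which candidate offsets are true instructions; the sketch only shows how overlapping candidates and their relations feed a single propagation step.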


BibTeX
@inproceedings {277106,
author = {Sheng Yu and Yu Qu and Xunchao Hu and Heng Yin},
title = {{DeepDi}: Learning a Relational Graph Convolutional Network Model on Instructions for Fast and Accurate Disassembly},
booktitle = {31st USENIX Security Symposium (USENIX Security 22)},
year = {2022},
isbn = {978-1-939133-31-1},
address = {Boston, MA},
pages = {2709--2725},
url = {https://www.usenix.org/conference/usenixsecurity22/presentation/yu-sheng},
publisher = {USENIX Association},
month = aug
}
