Predicting the Resilience of Obfuscated Code Against Symbolic Execution Attacks via Machine Learning

Authors: 

Sebastian Banescu, Technische Universität München; Christian Collberg, University of Arizona; Alexander Pretschner, Technische Universität München

Abstract: 

Software obfuscation transforms code such that it is more difficult to reverse engineer. However, it is known that given enough resources, an attacker will successfully reverse engineer an obfuscated program. Therefore, an open challenge for software obfuscation is estimating the time an obfuscated program is able to withstand a given reverse engineering attack. This paper proposes a general framework for choosing the most relevant software features to estimate the effort of automated attacks. Our framework uses these software features to build regression models that can predict the resilience of different software protection transformations against automated attacks. To evaluate the effectiveness of our approach, we instantiate it in a case-study about predicting the time needed to deobfuscate a set of C programs, using an attack based on symbolic execution. To train regression models our system requires a large set of programs as input. We have therefore implemented a code generator that can generate large numbers of arbitrarily complex random C functions. Our results show that features such as the number of community structures in the graph representation of symbolic path-constraints, are far more relevant for predicting deobfuscation time than other features generally used to measure the potency of controlflow obfuscation (e.g. cyclomatic complexity). Our best model is able to predict the number of seconds of symbolic execution-based deobfuscation attacks with over 90% accuracy for 80% of the programs in our dataset, which also includes several realistic hash functions.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {203634,
author = {Sebastian Banescu and Christian Collberg and Alexander Pretschner},
title = {Predicting the Resilience of Obfuscated Code Against Symbolic Execution Attacks via Machine Learning},
booktitle = {26th USENIX Security Symposium (USENIX Security 17)},
year = {2017},
isbn = {978-1-931971-40-9},
address = {Vancouver, BC},
pages = {661--678},
url = {https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/banescu},
publisher = {USENIX Association},
month = aug
}

Presentation Video