Skip to main content
USENIX
  • Conferences
  • Students
Sign in
  • Overview
  • Symposium Organizers
  • At a Glance
  • Registration Information
    • Registration Discounts
    • Venue, Hotel, and Travel
  • Technical Sessions
  • Co-Located Workshops
  • Accepted Posters
  • Activities
    • Birds-of-a-Feather Sessions
    • Work-in-Progress Reports
  • Sponsorship
  • Students and Grants
  • Services
  • Questions?
  • Help Promote!
  • Flyer PDF
  • For Participants
  • Call for Papers
  • Past Symposia

sponsors

Gold Sponsor
Gold Sponsor
Gold Sponsor
Silver Sponsor
Bronze Sponsor
Bronze Sponsor
Bronze Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Industry Partner

twitter

Tweets by USENIXSecurity

usenix conference policies

  • Event Code of Conduct
  • Conference Network Policy
  • Statement on Environmental Responsibility Policy

You are here

Home ยป BYTEWEIGHT: Learning to Recognize Functions in Binary Code
Tweet

connect with us

http://twitter.com/usenixsecurity
https://www.facebook.com/usenixassociation
http://www.linkedin.com/groups/USENIX-Association-49559/about
https://plus.google.com/108588319090208187909/posts
http://www.youtube.com/user/USENIXAssociation

BYTEWEIGHT: Learning to Recognize Functions in Binary Code

Friday, August 1, 2014 - 10:30am
Authors: 

Tiffany Bao, Jonathan Burket, and Maverick Woo, Carnegie Mellon University; Rafael Turner, University of Chicago; David Brumley, Carnegie Mellon University

Abstract: 

Function identification is a fundamental challenge in reverse engineering and binary program analysis. For instance, binary rewriting and control flow integrity rely on accurate function detection and identification in binaries. Although many binary program analyses assume functions can be identified a priori, identifying functions in stripped binaries remains a challenge.

In this paper, we propose BYTEWEIGHT, a new automatic function identification algorithm. Our approach automatically learns key features for recognizing functions and can therefore easily be adapted to different platforms, new compilers, and new optimizations. We evaluated our tool against three well-known tools that feature function identification: IDA, BAP, and Dyninst. Our data set consists of 2,200 binaries created with three different compilers, with four different optimization levels, and across two different operating systems. In our experiments with 2,200 binaries, we found that BYTEWEIGHT missed 44,621 functions in comparison with the 266,672 functions missed by the industry-leading tool IDA. Furthermore, while IDA misidentified 459,247 functions, BYTEWEIGHT misidentified only 43,992 functions.

Tiffany Bao, Carnegie Mellon University

Jonathan Burket, Carnegie Mellon University

Maverick Woo, Carnegie Mellon University

Rafael Turner, University of Chicago

David Brumley, Carnegie Mellon University

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {184521,
author = {Tiffany Bao and Jonathan Burket and Maverick Woo and Rafael Turner and David Brumley},
title = {{BYTEWEIGHT}: Learning to Recognize Functions in Binary Code},
booktitle = {23rd USENIX Security Symposium (USENIX Security 14)},
year = {2014},
isbn = {978-1-931971-15-7},
address = {San Diego, CA},
pages = {845--860},
url = {https://www.usenix.org/conference/usenixsecurity14/technical-sessions/presentation/bao},
publisher = {USENIX Association},
month = aug,
}
Download
Bao PDF
View the slides

Presentation Video 

Presentation Audio

MP3 Download

Download Audio

  • Log in or    Register to post comments

Gold Sponsors

Silver Sponsors

Bronze Sponsors

Media Sponsors & Industry Partners

© USENIX

  • Privacy Policy
  • Contact Us