Skip to main content
USENIX
  • Conferences
  • Students
Sign in
  • Home
  • Attend
    • Venue, Hotel, and Travel
    • Students and Grants
    • Co-Located Workshops
  • Program
    • At a Glance
    • Technical Sessions
    • Poster Session
  • Activities
    • Birds-of-a-Feather Sessions
    • Poster Session
    • WiPs
  • Participate
    • Call for Papers
      • Important Dates
      • Symposium Organizers
      • Symposium Topics
      • Refereed Papers
      • Shadow PC
      • Symposium Activities
      • Submitting Papers
    • Instructions for Participants
  • Sponsorship
  • About
    • Symposium Organizers
    • Services
    • Questions
    • Help Promote!
    • Past Symposia
  • Home
  • Attend
    • Venue, Hotel, and Travel
    • Students and Grants
    • Co-Located Workshops
  • Program
  • Activities
  • Participate
    • Call for Papers
    • Instructions for Participants
  • Sponsorship
  • About
    • Symposium Organizers
    • Services
    • Questions
    • Help Promote!
    • Past Symposia

sponsors

Platinum Sponsor
Gold Sponsor
Gold Sponsor
Silver Sponsor
Silver Sponsor
Silver Sponsor
Bronze Sponsor
Bronze Sponsor
General Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Industry Partner
Industry Partner

help promote

USENIX Security '16 button

Get more
Help Promote graphics!

connect with usenix


  •  Twitter
  •  Facebook
  •  LinkedIn
  •  Google+
  •  YouTube

twitter

Tweets by USENIXSecurity

usenix conference policies

  • Event Code of Conduct
  • Conference Network Policy
  • Statement on Environmental Responsibility Policy

You are here

Home ยป De-anonymizing Programmers via Code Stylometry
Tweet

connect with us

De-anonymizing Programmers via Code Stylometry

Wednesday, August 12, 2015 - 4:00pm-4:30pm
Authors: 

Aylin Caliskan-Islam, Drexel University; Richard Harang, U.S. Army Research Laboratory; Andrew Liu, University of Maryland; Arvind Narayanan, Princeton University; Clare Voss, U.S. Army Research Laboratory; Fabian Yamaguchi, University of Goettingen; Rachel Greenstadt, Drexel University

Abstract: 

Source code authorship attribution is a significant privacy threat to anonymous code contributors. However, it may also enable attribution of successful attacks from code left behind on an infected system, or aid in resolving copyright, copyleft, and plagiarism issues in the programming fields. In this work, we investigate machine learning methods to de-anonymize source code authors of C/C++ using coding style. Our Code Stylometry Feature Set is a novel representation of coding style found in source code that reflects coding style from properties derived from abstract syntax trees.

Our random forest and abstract syntax tree-based approach attributes more authors (1,600 and 250) with significantly higher accuracy (94% and 98%) on a larger data set (Google Code Jam) than has been previously achieved. Furthermore, these novel features are robust, difficult to obfuscate, and can be used in other programming languages, such as Python. We also find that (i) the code resulting from difficult programming tasks is easier to attribute than easier tasks and (ii) skilled programmers (who can complete the more difficult tasks) are easier to attribute than less skilled programmers.

Aylin Caliskan-Islam, Drexel University

Richard Harang, U.S. Army Research Laboratory

Andrew Liu, University of Maryland

Arvind Narayanan, Princeton University

Clare Voss, U.S. Army Research Laboratory

Fabian Yamaguchi, University of Goettingen

Rachel Greenstadt, Drexel University

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {190844,
author = {Aylin Caliskan-Islam and Richard Harang and Andrew Liu and Arvind Narayanan and Clare Voss and Fabian Yamaguchi and Rachel Greenstadt},
title = {De-anonymizing Programmers via Code Stylometry},
booktitle = {24th USENIX Security Symposium (USENIX Security 15)},
year = {2015},
isbn = {978-1-939133-11-3},
address = {Washington, D.C.},
pages = {255--270},
url = {https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/caliskan-islam},
publisher = {USENIX Association},
month = aug,
}
Download
Caliskan-Islam PDF

Presentation Video 

Presentation Audio

MP3 Download

Download Audio

  • Log in or    Register to post comments

Open access to the USENIX Security '15 videos sponsored by Symantec.

Platinum Sponsors

Gold Sponsors

Silver Sponsors

Bronze Sponsors

General Sponsors

Media Sponsors & Industry Partners

Open Access Publishing Partner

© USENIX

  • Privacy Policy
  • Contact Us