High Accuracy and High Fidelity Extraction of Neural Networks

Matthew Jagielski; Nicholas Carlini; David Berthelot; Alex Kurakin; Nicolas Papernot

Matthew Jagielski, Northeastern University, Google Brain; Nicholas Carlini, David Berthelot, Alex Kurakin, and Nicolas Papernot, Google Brain

In a model extraction attack, an adversary steals a copy of a remotely deployed machine learning model, given oracle prediction access. We taxonomize model extraction attacks around two objectives: accuracy, i.e., performing well on the underlying learning task, and fidelity, i.e., matching the predictions of the remote victim classifier on any input.

To extract a high-accuracy model, we develop a learning-based attack exploiting the victim to supervise the training of an extracted model. Through analytical and empirical arguments, we then explain the inherent limitations that prevent any learning-based strategy from extracting a truly high-fidelity model, i.e., a functionally-equivalent model whose predictions are identical to those of the victim model on all possible inputs. Addressing these limitations, we expand on prior work to develop the first practical functionally-equivalent extraction attack for direct extraction (i.e., without training) of a model's weights.

We perform experiments both on academic datasets and a state-of-the-art image classifier trained with 1 billion proprietary images. In addition to broadening the scope of model extraction research, our work demonstrates the practicality of model extraction attacks against production-grade systems.
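
The two objectives named in the abstract can be made concrete with a small sketch. It is an illustration only, not the paper's evaluation code: the threshold classifiers `victim` and `extracted`, the sample `inputs`, and the ground-truth `labels` are all hypothetical. Measuring fidelity as label agreement on a finite sample is also an assumption, since the abstract speaks of agreement on any input.

```python
from typing import Callable, Sequence

Classifier = Callable[[float], int]


def accuracy(model: Classifier, inputs: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of inputs on which the model predicts the true label."""
    correct = sum(1 for x, y in zip(inputs, labels) if model(x) == y)
    return correct / len(inputs)


def fidelity(extracted: Classifier, victim: Classifier, inputs: Sequence[float]) -> float:
    """Fraction of inputs on which the extracted model agrees with the victim.

    Assumption: agreement is measured on predicted labels over a finite sample.
    """
    agree = sum(1 for x in inputs if extracted(x) == victim(x))
    return agree / len(inputs)


# Hypothetical 1-D threshold classifiers standing in for real networks.
def victim(x: float) -> int:
    return int(x > 0.5)


def extracted(x: float) -> int:
    return int(x > 0.4)


# Hypothetical evaluation sample; the true labels follow a 0.4 threshold.
inputs = [0.1, 0.3, 0.45, 0.55, 0.7, 0.9]
labels = [0, 0, 1, 1, 1, 1]

victim_accuracy = accuracy(victim, inputs, labels)
extracted_accuracy = accuracy(extracted, inputs, labels)
extracted_fidelity = fidelity(extracted, victim, inputs)

print("victim accuracy:   ", victim_accuracy)
print("extracted accuracy:", extracted_accuracy)
print("extracted fidelity:", extracted_fidelity)
```

In this toy setup the extracted model is correct on every sample yet disagrees with the victim on one input, so its accuracy is higher than its fidelity. The two objectives reward different behavior: a high-fidelity copy would also have to reproduce the victim's mistakes.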

BibTeX

@inproceedings {251526,
author = {Matthew Jagielski and Nicholas Carlini and David Berthelot and Alex Kurakin and Nicolas Papernot},
title = {High Accuracy and High Fidelity Extraction of Neural Networks},
booktitle = {29th USENIX Security Symposium (USENIX Security 20)},
year = {2020},
isbn = {978-1-939133-17-5},
pages = {1345--1362},
url = {https://www.usenix.org/conference/usenixsecurity20/presentation/jagielski},
publisher = {USENIX Association},
month = aug
}
