Mitigating Membership Inference Attacks by Self-Distillation Through a Novel Ensemble Architecture


Xinyu Tang, Saeed Mahloujifar, and Liwei Song, Princeton University; Virat Shejwalkar, Milad Nasr, and Amir Houmansadr, University of Massachusetts Amherst; Prateek Mittal, Princeton University


Membership inference attacks are a key measure to evaluate privacy leakage in machine learning (ML) models. It is important to train ML models that have high membership privacy while largely preserving their utility. In this work, we propose a new framework to train privacy-preserving models that induce similar behavior on member and non-member inputs to mitigate membership inference attacks. Our framework, called SELENA, has two major components. The first component and the core of our defense is a novel ensemble architecture for training. This architecture, which we call Split-AI, splits the training data into random subsets and trains a model on each subset of the data. We use an adaptive inference strategy at test time: our ensemble architecture aggregates the outputs of only those models that did not contain the input sample in their training data. Our second component, Self-Distillation, (self-)distills the training dataset through our Split-AI ensemble, without using any external public datasets. We prove that our Split-AI architecture defends against a family of membership inference attacks; however, unlike differential privacy (DP), our defense does not provide provable guarantees against all possible attackers. This enables us to improve the utility of models compared to DP. Through extensive experiments on major benchmark datasets, we show that SELENA presents a superior trade-off between (empirical) membership privacy and utility compared to the state-of-the-art empirical privacy defenses. In particular, SELENA incurs no more than a 3.9% drop in classification accuracy compared to the undefended model while reducing the membership inference attack advantage by a factor of up to 3.7 compared to MemGuard and a factor of up to 2.1 compared to adversarial regularization.
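
The two components described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `CentroidModel` stands in for a real sub-model, the parameter names `k` (number of sub-models) and `l` (number of sub-models that do not see a given sample) are hypothetical, membership is looked up by training index, and averaging `l` randomly chosen sub-models for non-member queries is an assumption made only for illustration.

```python
import math
import random


def assign_held_out(n_samples, k, l, rng):
    """For each training sample, pick the l (of k) sub-models that will NOT train on it."""
    return [rng.sample(range(k), l) for _ in range(n_samples)]


class CentroidModel:
    """Toy stand-in for a sub-model (assumption, for illustration only)."""

    def fit(self, X, y, n_classes):
        dim = len(X[0])
        sums = [[0.0] * dim for _ in range(n_classes)]
        counts = [0] * n_classes
        for x, c in zip(X, y):
            counts[c] += 1
            for j, v in enumerate(x):
                sums[c][j] += v
        self.centroids = [[s / max(n, 1) for s in row] for row, n in zip(sums, counts)]
        return self

    def predict_proba(self, x):
        # Softmax over negative squared distances to each class centroid.
        logits = [-sum((a - b) ** 2 for a, b in zip(x, c)) for c in self.centroids]
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]


def train_split_ai(X, y, k, l, n_classes, rng):
    """Split-AI training: sub-model m is fit only on samples that did not hold it out."""
    held_out = assign_held_out(len(X), k, l, rng)
    models = []
    for m in range(k):
        keep = [i for i in range(len(X)) if m not in held_out[i]]
        models.append(CentroidModel().fit([X[i] for i in keep], [y[i] for i in keep], n_classes))
    return models, held_out


def average(prob_lists):
    """Element-wise mean of several probability vectors."""
    return [sum(col) / len(prob_lists) for col in zip(*prob_lists)]


def split_ai_predict(models, held_out, x, train_index=None, l=None, rng=None):
    """Adaptive inference: a member is scored only by sub-models that never saw it."""
    if train_index is not None:
        chosen = held_out[train_index]
    else:
        # Assumption: a non-member query is scored by l randomly chosen sub-models.
        chosen = rng.sample(range(len(models)), l)
    return average([models[m].predict_proba(x) for m in chosen])


def self_distillation_targets(models, held_out, X):
    """Soft labels for the training set, produced by Split-AI itself (no public data)."""
    return [split_ai_predict(models, held_out, x, train_index=i) for i, x in enumerate(X)]
```

In this sketch, a final model would then be trained on the original training inputs paired with the soft labels returned by `self_distillation_targets`, so that the deployed model never needs to look up whether a query was a training member.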


@inproceedings {280000,
author = {Xinyu Tang and Saeed Mahloujifar and Liwei Song and Virat Shejwalkar and Milad Nasr and Amir Houmansadr and Prateek Mittal},
title = {Mitigating Membership Inference Attacks by {Self-Distillation} Through a Novel Ensemble Architecture},
booktitle = {31st USENIX Security Symposium (USENIX Security 22)},
year = {2022},
isbn = {978-1-939133-31-1},
address = {Boston, MA},
pages = {1433--1450},
url = {},
publisher = {USENIX Association},
month = aug
}
