The Adverse Effects of Omitting Records in Differential Privacy: How Sampling and Suppression Degrade the Privacy–Utility Tradeoff

Àlex Miranda-Pascual, Karlsruhe Institute of Technology and Universitat Politècnica de Catalunya; Javier Parra-Arnau, Universitat Politècnica de Catalunya; Thorsten Strufe, Karlsruhe Institute of Technology

Sampling is renowned for its privacy amplification in differential privacy (DP), and is often assumed to improve the utility of a DP mechanism by allowing a reduction in noise. In this paper, we show that this assumption is flawed: when utility is measured at equal privacy levels, sampling as preprocessing consistently incurs a utility penalty caused by omitting records. This holds across all canonical DP mechanisms (Laplace, Gaussian, exponential, and report noisy max) as well as recent applications of sampling, such as clustering.

Extending this analysis, we investigate suppression as a generalized method of choosing, or omitting, records. We develop a theoretical analysis of this technique and derive privacy bounds for arbitrary suppression strategies under unbounded approximate DP. We find that our tested suppression strategy also fails to improve the privacy–utility tradeoff. Surprisingly, uniform sampling emerges as one of the best suppression methods, even though it still degrades the tradeoff. Our results call into question common preprocessing assumptions in DP practice.
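
The comparison at equal privacy levels can be illustrated with a small sketch for a single counting query. This is not the paper's analysis. It assumes Poisson sampling with rate q, the standard amplification bound ε' = ln(1 + q(e^ε − 1)) for unbounded pure DP, a sensitivity-1 count released with the Laplace mechanism, and rescaling of the noisy sample count by 1/q. All function names are hypothetical.

```python
import math


def amplified_epsilon(eps_base: float, q: float) -> float:
    """Privacy level after Poisson sampling with rate q (standard
    amplification bound for unbounded pure DP; an assumption here)."""
    return math.log(1.0 + q * (math.exp(eps_base) - 1.0))


def base_epsilon_for_target(eps_target: float, q: float) -> float:
    """Invert the bound: the largest eps the base mechanism may use so
    that the sampled pipeline still satisfies eps_target."""
    return math.log(1.0 + (math.exp(eps_target) - 1.0) / q)


def count_mse_without_sampling(eps_target: float) -> float:
    """Laplace mechanism on a sensitivity-1 count: variance 2 / eps^2."""
    return 2.0 / eps_target ** 2


def count_mse_with_sampling(eps_target: float, q: float, n_matching: int) -> float:
    """Error of (sample_count + Laplace(1 / eps_base)) / q at the same
    overall privacy level eps_target.

    The estimator is unbiased, so its MSE is the sum of two variances:
    the rescaled Laplace noise and the Poisson sampling error.
    """
    eps_base = base_epsilon_for_target(eps_target, q)
    noise_var = 2.0 / (q * eps_base) ** 2
    sampling_var = n_matching * (1.0 - q) / q
    return noise_var + sampling_var


if __name__ == "__main__":
    eps, n = 1.0, 1000
    print("no sampling:", count_mse_without_sampling(eps))
    for q in (0.1, 0.5, 0.9):
        print(f"q={q}:", count_mse_with_sampling(eps, q, n))
```

Under these assumptions the base mechanism does get to use a larger ε, but once the answer is rescaled by 1/q the noise term is no smaller than without sampling, and the sampling error is added on top.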
