POLICYCOMP: Counterpart Comparison of Privacy Policies Uncovers Overbroad Personal Data Collection Practices


Lu Zhou, Xidian University and Shanghai Jiao Tong University; Chengyongxiao Wei, Tong Zhu, and Guoxing Chen, Shanghai Jiao Tong University; Xiaokuan Zhang, George Mason University; Suguo Du, Hui Cao, and Haojin Zhu, Shanghai Jiao Tong University


Because mobile apps' privacy policies are usually complex, various tools have been developed to check whether privacy policies contain contradictions and whether they are consistent with the apps' behaviors. However, to the best of our knowledge, no prior work answers whether the personal data collection practices (PDCPs) in an app's privacy policy are necessary for the given purposes (i.e., whether they comply with the principle of data minimization). Though defined by most existing privacy regulations and laws, such as GDPR, the principle of data minimization has been translated into different privacy practices depending on the context (e.g., various developers and targeted users). As a result, developers can collect any personal data claimed in their privacy policies as long as they receive authorization from users.

Currently, auditing the necessity of personal data collection relies mainly on legal experts who manually review each app in its specific context, which does not scale to millions of apps. In this study, we take the first step toward automatically investigating whether PDCPs in an app's privacy policy are overbroad from the perspective of counterpart comparison. Our basic insight is that, if an app claims to collect much more personal data in its privacy policy than most of its counterparts, it is more likely to be conducting overbroad collection. To this end, we propose POLICYCOMP, an automatic framework for detecting overbroad PDCPs. We use POLICYCOMP to perform a large-scale analysis of 10,042 privacy policies and flag 48.29% of PDCPs as overbroad. We shared our findings with 2,000 app developers and received 52 responses, 39 of which acknowledged our findings and took action (e.g., removing overbroad PDCPs).
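The counterpart-comparison insight can be illustrated with a toy sketch. This is not the POLICYCOMP implementation, whose pipeline is not described in this abstract; the function name `flag_overbroad`, the `min_support` threshold, the app names, and the data types are all hypothetical, and the sketch assumes each policy has already been reduced to a set of claimed data types for apps with comparable functionality.

```python
from typing import Dict, List, Set, Tuple


def flag_overbroad(
    policies: Dict[str, Set[str]],
    min_support: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[Tuple[str, str, float]]:
    """Flag (app, data_type) pairs that fewer than min_support of counterparts also claim."""
    flagged = []
    for app, claimed in policies.items():
        # Counterparts are all other apps in the same (assumed comparable) group.
        counterparts = [other for name, other in policies.items() if name != app]
        if not counterparts:
            continue
        for data_type in sorted(claimed):
            # Fraction of counterparts whose policy also claims this data type.
            support = sum(data_type in other for other in counterparts) / len(counterparts)
            if support < min_support:
                flagged.append((app, data_type, support))
    return flagged


# Made-up example data: four weather apps and the data types their policies claim.
policies = {
    "weather_a": {"location"},
    "weather_b": {"location", "device_id"},
    "weather_c": {"location", "device_id"},
    "weather_d": {"location", "device_id", "contacts", "call_log"},
}

for app, data_type, support in flag_overbroad(policies):
    print(f"{app}: {data_type} (claimed by {support:.0%} of counterparts)")
```

In this made-up data, only `weather_d` is flagged, for `call_log` and `contacts`, because none of its counterparts claim to collect them. A flag here indicates that a practice is unusual among peers, not that it is unlawful.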


@inproceedings {285365,
author = {Lu Zhou and Chengyongxiao Wei and Tong Zhu and Guoxing Chen and Xiaokuan Zhang and Suguo Du and Hui Cao and Haojin Zhu},
title = {{POLICYCOMP}: Counterpart Comparison of Privacy Policies Uncovers Overbroad Personal Data Collection Practices},
booktitle = {32nd USENIX Security Symposium (USENIX Security 23)},
year = {2023},
isbn = {978-1-939133-37-3},
address = {Anaheim, CA},
pages = {1073--1090},
url = {https://www.usenix.org/conference/usenixsecurity23/presentation/zhou-lu},
publisher = {USENIX Association},
month = aug
}
