Actions Speak Louder than Words: Entity-Sensitive Privacy Policy and Data Flow Analysis with PoliCheck

Authors: 

Benjamin Andow, IBM T.J. Watson Research Center; Samin Yaseer Mahmud, Justin Whitaker, William Enck, and Bradley Reaves, North Carolina State University; Kapil Singh, IBM T.J. Watson Research Center; Serge Egelman, U.C. Berkeley; ICSI; AppCensus Inc.

Abstract: 

Identifying privacy-sensitive data leaks by mobile applications has been a topic of great research interest for the past decade. Technically, such data flows are not “leaks” if they are disclosed in a privacy policy. To address this limitation in automated analysis, recent work has combined program analysis of applications with analysis of privacy policies to determine the flow-to-policy consistency, and hence violations thereof. However, this prior work has a fundamental weakness: it does not differentiate the entity (e.g., first-party vs. third-party) receiving the privacy-sensitive data. In this paper, we propose POLICHECK, which formalizes and implements an entity-sensitive flow-to-policy consistency model. We use POLICHECK to study 13,796 applications and their privacy policies and find that up to 42.4% of applications either incorrectly disclose or omit disclosing their privacy-sensitive data flows. Our results also demonstrate the significance of considering entities: without considering entity, prior approaches would falsely classify up to 38.4% of applications as having privacy-sensitive data flows consistent with their privacy policies. These false classifications include data flows to third-parties that are omitted (e.g., the policy states only the first-party collects the data type), incorrect (e.g., the policy states the third-party does not collect the data type), and ambiguous (e.g., the policy has conflicting statements about the data type collection). By defining a novel automated, entity-sensitive flow-to-policy consistency analysis, POLICHECK provides the highest-precision method to date to determine if applications properly disclose their privacy-sensitive behaviors.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {247632,
author = {Benjamin Andow and Samin Yaseer Mahmud and Justin Whitaker and William Enck and Bradley Reaves and Kapil Singh and Serge Egelman},
title = {Actions Speak Louder than Words: {Entity-Sensitive} Privacy Policy and Data Flow Analysis with {PoliCheck}},
booktitle = {29th USENIX Security Symposium (USENIX Security 20)},
year = {2020},
isbn = {978-1-939133-17-5},
pages = {985--1002},
url = {https://www.usenix.org/conference/usenixsecurity20/presentation/andow},
publisher = {USENIX Association},
month = aug
}

Presentation Video