Session 1: Semi-Automated Extraction of Data Practice Statements from Natural Language Privacy Policies

1:00 pm–2:15 pm


Privacy policies are known to be long and difficult to read and understand. This session will provide an overview of crowdsourcing, machine learning and natural language processing techniques developed to extract data practice statements from privacy policies. It will include a discussion of major findings and introduce several large-scale data sets and interactive web-based tools that have been released, or will soon be released, to the research community on our Explore website. One of these tools relies on automated annotation techniques to interactively generate privacy reports for any of the Alexa top 10,000 websites, including questions about the opt-out choices available to users. The session will also feature group exercises on using the tools and a discussion of opportunities to conduct large-scale analyses of privacy policies.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone.

@conference {205185,
title = {Session 3: Personalized Privacy Assistants and Infrastructure for {IoT}},
year = {2017},
address = {Santa Clara, CA},
publisher = {USENIX Association},
month = jul,