PEPR '22 Conference Program

All the times listed below are in Pacific Daylight Time (PDT).

Thursday, June 23

8:00 am–9:00 am

Continental Breakfast

9:00 am–9:15 am

Opening Remarks

Divya Sharma, Google, and Blase Ur, University of Chicago

9:15 am–10:45 am

Differential Privacy

Negotiating Privacy/Utility Trade-Offs under Differential Privacy

Thursday, 9:15 am–9:35 am

Gerome Miklau, CEO/Founder, Tumult Labs and Professor, University of Massachusetts Amherst

Available Media

Differential privacy is a model of privacy protection currently being adopted by commercial enterprises and government institutions. Using differential privacy, data custodians can share data in new ways while quantifying the potential privacy loss incurred by individuals present in the data. However, the choice to limit privacy loss in the model of differential privacy must be weighed against the impact on the accuracy of the data shared.

The full complexity of making choices about privacy/utility trade-offs has rarely been considered by the research community. Using a real case study of Internal Revenue Service data shared with the Department of Education, we describe the social and technical challenges faced by data custodians as they negotiate with data consumers to establish standards for data release.
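
The privacy/utility trade-off at the heart of such negotiations can be made concrete with the Laplace mechanism, a basic building block of many differentially private releases. The sketch below is a minimal, hypothetical illustration; the count, sensitivity, and epsilon values are invented and are not drawn from the IRS/Department of Education case study.

```python
import numpy as np

rng = np.random.default_rng(0)
true_count = 12_345    # hypothetical statistic a custodian wants to release
sensitivity = 1        # adding or removing one person changes a count by at most 1

# Laplace mechanism: smaller epsilon means stronger privacy but a noisier answer.
for epsilon in (0.01, 0.1, 1.0):
    noise = rng.laplace(scale=sensitivity / epsilon, size=10_000)
    print(f"epsilon={epsilon:5}: expected absolute error ~ {np.abs(noise).mean():.1f}")
```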

Gerome Miklau, Tumult Labs and University of Massachusetts Amherst

Gerome Miklau is CEO and co-founder of Tumult Labs, whose mission is expanding the use and sharing of data while respecting individual privacy. He is also a Professor in the College of Information and Computer Sciences at the University of Massachusetts Amherst, where his research focuses on private, secure, and equitable data management. He received the ACM PODS Test-of-Time Award in 2020 and 2012, the Best Paper Award at the International Conference of Database Theory in 2013, and an NSF CAREER Award in 2007.

Publishing Wikipedia Project Usage Data with Strong Privacy Protections and without Tracking

Thursday, 9:35 am–10:00 am

Hal Triedman, Privacy Engineer, Wikimedia Foundation

Available Media

The Wikimedia Foundation places a premium on protecting reader and editor privacy while being maximally transparent and releasing data to support users, developers, and researchers. Datasets containing sensitive information have historically been kept private. Differential privacy offers a promising approach to safely releasing sensitive datasets, which would be valuable, for example, for editors looking to identify articles important to their language and region. This talk will discuss differentially private approaches to releasing data, the technical and social challenges that we have faced so far (Wikimedia doesn’t track users, so enforcing a maximum number of data points per person is at best approximate), and proposed solutions that avoid tracking users while maintaining privacy guarantees. We want to share findings with the rest of the field and enable differential privacy without tracking cookies.

Hal Triedman, Wikimedia Foundation

Hal is a privacy engineer with the Wikimedia Foundation, implementing and researching issues related to privacy, transparency, and algorithmic fairness. He is interested in what institutional accountability should look like in a world of open, differentially private datasets, and hopes to put tools for differential privacy in the hands of more developers and analysts.

Expanding Differentially Private Solutions: A Python Case Study

Thursday, 10:00 am–10:15 am

Vadym Doroshenko, Google

Available Media

Differential privacy has proved hard to apply in practice: there are many subtle, easy-to-get-wrong implementation nuances, and it can be difficult to preserve data utility, choose optimal parameters, and interpret results.

In this talk we introduce PipelineDP, a tool that allows Python developers to produce differentially private versions of their data processing pipelines. PipelineDP was designed with both small and massive applications in mind: it can be run locally or on scalable data processing frameworks such as Apache Spark or Apache Beam. Our aspiration is that PipelineDP allows differential privacy to be deployed in production by engineers who are not experts in DP.
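
As a rough sketch of what such a pipeline looks like, the example below computes a differentially private count per partition using PipelineDP's local backend. The class and parameter names follow the project's public examples at the time of writing and may differ between versions; the record schema (user_id, page) is hypothetical.

```python
from collections import namedtuple

import pipeline_dp

# Hypothetical records: which user visited which page.
Row = namedtuple("Row", ["user_id", "page"])
rows = [Row("u1", "Main_Page"), Row("u1", "Privacy"), Row("u2", "Main_Page")]

# Local execution; the same pipeline can target Apache Spark or Apache Beam backends.
backend = pipeline_dp.LocalBackend()
budget = pipeline_dp.NaiveBudgetAccountant(total_epsilon=1.0, total_delta=1e-6)
engine = pipeline_dp.DPEngine(budget, backend)

params = pipeline_dp.AggregateParams(
    noise_kind=pipeline_dp.NoiseKind.LAPLACE,
    metrics=[pipeline_dp.Metrics.COUNT],
    max_partitions_contributed=3,        # per-user contribution bounding
    max_contributions_per_partition=1)

extractors = pipeline_dp.DataExtractors(
    privacy_id_extractor=lambda row: row.user_id,
    partition_extractor=lambda row: row.page,
    value_extractor=lambda row: 1)

dp_counts = engine.aggregate(rows, params, extractors)
budget.compute_budgets()   # finalize the privacy budget before reading results
print(list(dp_counts))
```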

Vadym Doroshenko, Google

Vadym Doroshenko is a software engineer at Google, where he works on building anonymization infrastructure and helping teams apply anonymization. He is the tech lead of PipelineDP (pipelinedp.io). He is passionate about differential privacy research and about bringing it to production. He received his PhD in mathematics from Taras Shevchenko National University of Kyiv.

Designing an Open-Source Platform for Differentially Private Analytics That Is Usable, Scalable, and Extensible

Thursday, 10:15 am–10:30 am

Michael Hay, Tumult Labs and Colgate University

Available Media

This talk will present Tumult Analytics, a soon-to-be-open-source platform for SQL-like analytics with configurable differential privacy guarantees. It is currently used by a variety of organizations -- including the US Census Bureau, US Internal Revenue Service, and Wikimedia -- to publicly share aggregate statistics about populations of interest. The platform’s design emphasizes usability, especially for data scientists who may be new to differential privacy, but also scalability and expressivity so that it can power production use cases that produce hundreds of millions of statistics with tight privacy accounting. The platform is supported by a multi-layer architecture consisting of a user-friendly dataframe-like interface on top of an extensible privacy framework on top of Apache Spark. This talk will describe the platform with a focus on usability features that help programmers write deployable DP programs safely and quickly.

Michael Hay, Tumult Labs and Colgate University

Michael Hay is the Founder/CTO of Tumult Labs, a startup that helps organizations safely release data using differential privacy, and an Associate Professor of Computer Science at Colgate University. He was previously a Research Data Scientist at the US Census Bureau and a Computing Innovation Fellow at Cornell University. He holds a Ph.D. from the University of Massachusetts Amherst and a bachelor's degree from Dartmouth College.

Compiling Python Programs into Differentially Private Ones

Thursday, 10:30 am–10:45 am

Johan Leduc and Nicolas Grislain, Sarus Technologies

Available Media

Working with privacy-sensitive data today, whether in healthcare, insurance, or any other industry, is a complex and slow process often involving long manual reviews by compliance teams. The recent development of differential privacy has helped standardize what privacy protection means. As such, it has the potential to unlock the automation and scaling of data analysis on privacy-sensitive data. To help realize this promise, we designed and built a framework in which an analyst can write data analysis jobs with common data-science tools and languages (SQL, numpy, pandas, scikit-learn) and have them compiled into differentially private jobs executed remotely on the sensitive data. In this talk, we will describe how a user expresses their job declaratively in Python and how their Python code is analyzed and compiled before it is run and a result is eventually returned.

Johan Leduc, Sarus

Johan Leduc is a senior data scientist at Sarus Technologies. He graduated from Ecole Polytechnique in 2014. He started his career in the energy sector and switched to data science in 2019. He joined Sarus (YC W22) in 2020 as the first employee and has been working on private synthetic data generation and private data analysis.

Nicolas Grislain, Sarus

Nicolas Grislain is Chief Science Officer at Sarus Technologies. He graduated from École Normale Supérieure de Lyon in Mathematics and Computer Science. Nicolas started his career in economics and finance modeling at the French Treasury and then at Société Générale. He co-founded his first company, AlephD, in 2012, where he also led Research and Development. AlephD was acquired by Yahoo in 2016. In 2020 he co-founded Sarus Technologies (YC W22) with the same founding team as AlephD.

10:45 am–11:15 am

Break with Refreshments

11:15 am–12:40 pm

Threat Modeling

Privacy Threat Modeling

Thursday, 11:15 am–11:40 am

Cara Bloom, MITRE

Available Media

This applied research talk will discuss the privacy threat modeling gap, challenges and opportunities of privacy threat modeling in practice, and a new qualitative threat model currently under development. In privacy risk management, there are well-respected methods for modeling vulnerabilities and consequences (or harms), but there is no commonly used model or lexicon for characterizing privacy threats. We will discuss the gap in privacy risk modeling, how privacy threat-informed defense could better protect systems from privacy harms, and a working definition for a “privacy attack.” Then we will present a draft qualitative threat model – the Privacy Threat Taxonomy – developed to fill this gap in privacy risk modeling. This model was generated iteratively and collaboratively using a dataset of almost 150 non-breach privacy events, which includes directed, accidental, and passive attacks on systems. We will also discuss how practitioners can incorporate a threat model into their privacy risk management program.

Cara Bloom, MITRE

Cara Bloom is a Senior Cybersecurity and Privacy Scientist at MITRE where she leads research teams on privacy threat modeling and measuring privacy expectations for emerging technologies. She has provided privacy and cybersecurity expertise on international data protection legislation, autonomous and connected vehicle technology, and primary research on data de-identification and blockchain for identity. Cara has presented at the USENIX Symposium on Usable Privacy and Security, the FTC Data Privacy Day Conference, and the IAPP Global Summit. She holds an MS in Information Security Policy from Carnegie Mellon University and has experience at the Federal Trade Commission and CMU’s CyLab Security and Privacy Institute.

Privacy Audits 101

Thursday, 11:40 am–11:55 am

Lauren Reid, President and Principal Consultant, The Privacy Pro

Available Media

Most privacy engineers, lawyers, regulators, and scholars are aware of the importance of privacy audits, but few have been through one as an auditor or auditee. There are often assumptions about what a privacy audit entails and what the results can tell us about the strength of a privacy program or product design. Participants may be surprised to learn about the limitations of a privacy audit and the resources required to complete one. This session will address privacy audit frameworks, methodologies for testing controls, and interpreting audit results. Participants will come away with important points to consider when deciding whether they need an audit and, if so, which kind and when.

Lauren Reid, The Privacy Pro

Lauren Reid is President of the boutique privacy and data ethics consulting firm, The Privacy Pro. The Privacy Pro helps global companies implement practical solutions to protect data, comply with laws, and most importantly, respect people. Lauren has over 15 years of global privacy experience, having worked in several countries and industries. She was the Director of Data Governance and Privacy for Sidewalk Labs, Alphabet’s smart city portfolio company. She also led the National Privacy Advisory Services practice for KPMG Canada and was Senior Manager accountable for strategic privacy initiatives at Bank of Montreal, one of Canada’s largest financial institutions.

Privacy Design Flaws

Thursday, 11:55 am–12:15 pm

Eivind Arvesen, Sector Alarm

Available Media

Annoyingly, many things can go wrong when it comes to privacy engineering. Even more annoyingly, they often do. To make matters worse, many issues are more complex than mere bugs in the implementation. This talk will address flaws – defects on an architectural or design level – that result in privacy issues. Using real-world examples, we will demonstrate how common flaws lead to undesirable outcomes, and how to intentionally build systems with privacy in mind in order to avoid these flaws when designing a new system or reviewing an existing system.

Eivind Arvesen, Sector Alarm

Eivind is a security and privacy professional who currently heads up the Cyber Security department at one of Europe's leading providers of monitored home security solutions. He holds a master’s degree in computer science (with a focus on machine learning), and has previously worked as a software developer and architect, both in-house and as a consultant, on projects ranging from product MVPs to critical infrastructure. In 2020, he was part of a government-appointed expert panel tasked with evaluating the Norwegian COVID-19 app.

Privacy Shift Left: A Machine-Assisted Threat Modeling Approach

Thursday, 12:15 pm–12:40 pm

Kristen Tan, Comcast NBCUniversal

Available Media

As cybersecurity and privacy have become a core part of product development, there has been a push to shift their implementation left (earlier) in the product development lifecycle. One facet of shifting left is employing threat modeling to identify areas of potential risk in a system’s architecture. While beneficial, threat modeling can be labor- and time-intensive. To address this, tools, some of which are open-source, are being developed to automate aspects of the process. These tools have primarily focused on security, but privacy threat detection functionality is being introduced. This talk presents a comparative evaluation of six of these open-source tools. It then introduces possible sources for use in developing a custom library of privacy threats. Finally, it ties the two discussions together by walking through an example of how detection capability for a specific privacy threat can be introduced into one of the six tools.
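
To give a flavor of what a custom privacy threat library can look like, here is a small, hypothetical sketch in which LINDDUN-style rules are checked against elements of a system model. The rules, categories, and element schema are invented for illustration; the open-source tools evaluated in the talk use much richer models.

```python
from dataclasses import dataclass

@dataclass
class DataFlow:
    name: str
    data_categories: set   # e.g. {"email", "location"}
    encrypted: bool
    retention_days: int

# A tiny, invented privacy threat library, loosely inspired by LINDDUN categories.
def linkability(flow):
    if {"email", "location"} <= flow.data_categories:
        return "Linkability: an identifier travels together with behavioral data"

def disclosure(flow):
    if not flow.encrypted:
        return "Disclosure of information: personal data is sent unencrypted"

def non_compliance(flow):
    if flow.retention_days > 365:
        return "Non-compliance: retention exceeds the stated policy"

THREAT_RULES = [linkability, disclosure, non_compliance]

flows = [DataFlow("app -> analytics", {"email", "location"},
                  encrypted=False, retention_days=730)]

for flow in flows:
    for rule in THREAT_RULES:
        finding = rule(flow)
        if finding:
            print(f"[{flow.name}] {finding}")
```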

Kristen Tan, Comcast NBCUniversal

Kristen Tan is a CORE Technology Associate at Comcast NBCUniversal with an M.S. in Computer Science from Stevens Institute of Technology. She is currently rotating at Comcast Cable on an Accessibility team, but prior to this, she rotated on Comcast Cable’s Cybersecurity Research team. Her research during that rotation focused on the emerging field of Privacy Engineering and how it fits into the world of Cybersecurity. Previous engagements in both academia and industry have also given her experience in robotics, smart home technologies, and cloud infrastructure in a production environment. She has co-authored two peer-reviewed publications and looks forward to continuing to write about and share her work going forward.

12:40 pm–1:40 pm

Lunch

1:40 pm–2:10 pm

Networking and Birds-of-a-Feather Sessions

2:10 pm–3:35 pm

Consent

Consent on the Fly: Developing Ethical Verbal Consent for Voice Assistants

Thursday, 2:10 pm–2:25 pm

William Seymour, King's College London

Available Media

The conundrum of how voice assistants should broker consent to share data with third party software has proven to be a tough problem without a clear solution. Contemporary assistants often require users to switch to other devices in order to navigate permissions menus for their otherwise hands-free voice assistant, as with smartphone apps. More in line with modern smartphones, Alexa now offers "voice-forward consent", allowing users to grant skills access to personal data mid-conversation using speech.

While more usable and convenient than opening a companion app, asking for consent 'on the fly' can undermine several concepts core to the consent process. The intangible nature of voice interfaces also blurs the boundary between the parts of an interaction controlled by third-party developers and those controlled by the underlying platform. This talk draws on the GDPR and work on consent in HCI and Ubicomp to outline and address the problems with brokering consent verbally.

William Seymour, King's College London

William Seymour is a researcher at King's College London, exploring ways of explaining the privacy and security behaviours of AI assistants as part of the EPSRC-funded Secure AI Assistants project. Before this he was part of a project funded by the UK data protection regulator on the future of data protection in smart homes. William received his PhD in Cyber Security from the University of Oxford for research exploring ethical issues with IoT devices.

Informing the Design of Cookie Consent Interfaces with Research

Thursday, 2:25 pm–2:50 pm

Lorrie Cranor, Carnegie Mellon University

Available Media

Websites frequently deploy cookie consent banners to comply with regulatory requirements. However, many of these consent banners do not actually meet regulatory requirements and may even be considered misleading or classified as dark patterns. Often well-intentioned practitioners use templates from popular cookie-management platforms and assume they are doing the right thing. In this talk I will walk through some of the common compliance mistakes that have been highlighted by privacy advocates and researchers, and provide evidence from large-scale online user studies at Carnegie Mellon University to demonstrate the impact of simple cookie consent banner design changes. We examined the impact on the decision users make, as well as on their comprehension, sentiment, and other factors. I will also talk about some of the commonly used terminology (some of which has been recommended by regulators) that may actually confuse and mislead users.

Lorrie Cranor, Carnegie Mellon University

Lorrie Faith Cranor is the Director and Bosch Distinguished Professor of the CyLab Security and Privacy Institute and FORE Systems Professor of Computer Science and of Engineering and Public Policy at Carnegie Mellon University. She is also co-director of the Collaboratory Against Hate: Research and Action Center. She directs the CyLab Usable Privacy and Security Laboratory (CUPS) and co-directs the Privacy Engineering masters program. In 2016 she served as Chief Technologist at the US Federal Trade Commission and previously she co-founded Wombat Security Technologies. She is a fellow of the ACM, IEEE, and AAAS and a member of the ACM CHI Academy.

Panel

DNS Privacy Vs.

Thursday, 2:50 pm–3:35 pm

Mallory Knodel, Center for Democracy & Technology, and Shivan Kaul Sahib, Brave Software

Available Media

There is a need to catalogue and treat the predominant emerging tensions that impact the public interest, introduced by DNS privacy measures. One of the most recent examples is that when internet protocols like DNS-over-HTTPS made DNS lookup more private for the user, it initially had an adverse effect on internet measurements, consolidated provision among large ISPs and service providers, made abuse mitigation harder, broke browser add-ons for accessibility tools, and provoked internet shutdowns and censorship measures. And yet, it is in the public interest and the interest of the internet protocol standards community to continue pushing privacy-respecting protocols. The panel is a call to the community to properly research tensions as they emerge, informed as much as possible by the effects on end users. A draft paper summarizes these emerging tensions in the case of Private DNS and points towards better mitigations in the public interest.

Mallory Knodel, Center for Democracy & Technology

Mallory Knodel is the CTO at the Center for Democracy & Technology in Washington, DC. She is the co-chair of the Human Rights and Protocol Considerations research group of the Internet Research Task Force, co-chair of the Stay Home Meet Only Online working group of the IETF and an advisor to the Freedom Online Coalition. Mallory takes a human rights, people-centred approach to technology implementation and cybersecurity policy advocacy. Originally from the US, she has worked with grassroots organisations around the world. She has used free software throughout her professional career and considers herself a public interest technologist. She holds a BS in Physics and Mathematics and an MA in Science Education.

Shivan Kaul Sahib, Brave Software

Shivan Kaul Sahib works on privacy at Brave Software, where he focuses on shipping privacy features in the browser and conducting privacy reviews across the company. He is active in the IETF and W3C and previously worked on DNS traffic encryption and consent tooling. He has a keen interest in public interest technology and was previously a fellow at ARTICLE 19, a free expression charity in the UK. He studied Software Engineering at McGill University.

3:35 pm–4:05 pm

Break with Refreshments

4:05 pm–5:40 pm

Privacy Labels

Helping Mobile App Developers Create Accurate Privacy Labels

Thursday, 4:05 pm–4:30 pm

Jack Gardner and Akshath Jain, Carnegie Mellon University

Available Media

This talk is based on research conducted in collaboration with Yuanyuan Feng, Kayla Reiman, Zhi Lin, and Norman Sadeh.

Apple and Google recently began requiring developers to disclose their data collection and use practices to generate a “privacy label” for their applications. The use of mobile application Software Development Kits (SDKs) and third-party libraries, coupled with a typical lack of expertise in privacy, makes it challenging for developers to accurately report their data collection and use practices. In this presentation we discuss the design and evaluation of a tool to help iOS developers generate privacy labels. The tool combines static code analysis to identify likely data collection and use practices with interactive functionality designed to prompt developers to elucidate analysis results and carefully reflect on their applications' data practices. We conducted semi-structured interviews with iOS developers as they used an initial version of the tool. We discuss how these results motivated us to develop an enhanced software tool, Privacy Label Wiz, that more closely resembles interactions developers reported to be most useful in our semi-structured interviews. We present findings from our interviews and the enhanced tool motivated by our study. We also outline future directions for software tools to better assist developers communicating their mobile application’s data practices.
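
As a simplified illustration of how static analysis can seed a privacy label, the sketch below scans Swift sources for framework imports and maps them to candidate data categories. The mapping, file layout, and project path are hypothetical; the tool described in the talk performs far deeper analysis of data collection and use.

```python
import re
from pathlib import Path

# Hypothetical mapping from imported iOS frameworks to data categories a developer
# might need to declare; a real tool would also inspect API calls and third-party SDKs.
FRAMEWORK_HINTS = {
    "CoreLocation": "Location",
    "Contacts": "Contact Info",
    "HealthKit": "Health & Fitness",
    "AdSupport": "Identifiers (advertising)",
}

def suggest_label_entries(project_dir: str):
    suggestions = set()
    for source in Path(project_dir).rglob("*.swift"):
        text = source.read_text(errors="ignore")
        for framework, category in FRAMEWORK_HINTS.items():
            if re.search(rf"^\s*import\s+{framework}\b", text, re.MULTILINE):
                suggestions.add((framework, category))
    return suggestions

for framework, category in sorted(suggest_label_entries("./MyApp")):
    print(f"{framework}: consider declaring '{category}' in the privacy label")
```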

Jack Gardner, Carnegie Mellon University

Jack Gardner is a Master’s student in Privacy Engineering at Carnegie Mellon University. He is studying what resources are available to mobile application developers and how software tools can support them as they work to achieve privacy compliance. Jack’s other interests include integrating privacy into the software development lifecycle and promoting user adoption of privacy enhancing technologies.

Akshath Jain, Carnegie Mellon University

Akshath Jain is currently a Master’s student at Carnegie Mellon studying Computer Science. Having recently finished his undergraduate work there as well, he’s investigating how developers and users interact with privacy policies and their associated privacy labels. He’s also researching how automation strategies can help generate privacy labels to ensure compliance with privacy regulations. In his free time, you can often find him biking and playing tennis.

Creating Effective Labels

Thursday, 4:30 pm–4:45 pm

Shakthi Gopavaram, Indeed.com

Available Media

This talk is based on research conducted in collaboration with Ece Gumusel, Peter Caven, Jayati Dev, and L Jean Camp.

This talk is motivated in part by President Biden's Executive Order on Improving the Nation's Cybersecurity, which required the National Institute of Standards and Technology (NIST) to issue labeling guidance for software and Internet of Things (IoT) devices. The focus of this presentation is on how to create more effective and usable labels. A consumer's perception of a label is influenced by the usability of the interaction, the embedded information, and common psychological biases in decision-making. Often, even technically proficient and knowledgeable consumers are not able to make accurate comparisons across products' provision of security or privacy. In this presentation, we integrate research from behavioral economics, usable security, and risk communication design to inform the creation of effective labels.

Shakthi Gopavaram, Indeed.com

Shakthidhar Gopavaram is a doctoral candidate in Computer Science at Indiana University's Luddy School of Informatics, Computing, and Engineering, graduating in May of 2022. He is a full-stack developer with expertise in data science, human-subjects experiment design, and labeling systems. He is currently working at Indeed.com.

Smart Home Privacy

Three Years of Crowdsourcing Network Traffic from Smart Homes

Thursday, 4:45 pm–5:10 pm

Danny Yuxing Huang, New York University

Available Media

In April 2019, we released IoT Inspector, an open-source tool for everyday consumers to visualize their smart home network traffic and identify potential security and privacy risks. Since then, 5,500+ users have used IoT Inspector to monitor 55,000+ devices and shared a subset of the traffic with us, effectively contributing to the largest known non-proprietary dataset of smart home network traffic.

In this talk, we will discuss some of the practical challenges we faced while maintaining and operating IoT Inspector in the past three years. Unlike our IMWUT/Ubicomp 2020 paper (which focuses on the initial design and preliminary data), this talk will highlight unexpected real-world issues, including identifying devices, communicating risks, allowing users to take actions, incentivizing users, respecting user privacy, and ensuring label quality. We hope our lessons will benefit researchers interested in studying real-world security and privacy issues through crowdsourcing.

Danny Yuxing Huang, New York University

Danny Yuxing Huang is an Assistant Professor at New York University’s Electrical and Computer Engineering Department. He is generally interested in building networking systems to crowdsource hidden security and privacy issues from real-world consumers. Before joining New York University, he was a postdoctoral fellow at Princeton University’s Center for Information Technology Policy. He obtained his PhD from University of California, San Diego.

Informing the Design of Privacy Awareness Mechanisms for Users and Bystanders in Smart Homes

Thursday, 5:10 pm–5:25 pm

Yaxing Yao, University of Maryland, Baltimore County

Available Media

The opaque data practices in smart home devices have raised significant privacy concerns for smart home users and bystanders. One way to learn about the data practices is through privacy-related notifications. However, how to deliver these notifications to users and bystanders and increase their awareness of data practices is not clear. In this talk, we present our recent research on users' and bystanders' responses to four mechanisms that improve privacy awareness. We will demo a subset of the mechanisms, present our findings, discuss the conflicting expectations between users and bystanders, and draw implications for researchers and practitioners on how to design privacy awareness mechanisms for users and bystanders in smart homes.

Yaxing Yao, University of Maryland, Baltimore County

Yaxing Yao is an Assistant Professor in the Department of Information Systems at the University of Maryland, Baltimore County. His research focuses on understanding privacy risks and people’s privacy concerns in emerging technologies and contexts (e.g., smart homes, social VR), then designing and evaluating privacy mechanisms to protect people’s privacy. He is particularly interested in the tension among different stakeholders in these technologies (e.g., users and bystanders) and seeks to support the privacy needs of all stakeholders. He is also interested in the privacy issues in underrepresented populations, such as children and people with disabilities.

Privacy-Preserving Protocols for Smart Camera Systems and Other IoT Devices

Thursday, 5:25 pm–5:40 pm

Yohan Beugin, The Pennsylvania State University

Available Media

Smart camera systems are used by millions of consumers to monitor their homes and businesses. However, the architecture and design of current commercial systems require users to relinquish control over their data to untrusted third parties, without users' knowledge or explicit consent---thereby violating users' privacy.

In this talk, we will present how we designed a privacy-preserving smart camera system that returns control of the system to users while still enabling popular features found in commercial systems. In the second part of the talk, we will show how our techniques and protocols can also be extended to other IoT devices that record real-time data and time series, such as temperature, humidity levels, and heart rate.

Yohan Beugin, The Pennsylvania State University

Yohan Beugin is a Ph.D. student in the Department of Computer Science and Engineering at the Pennsylvania State University. He is a member of the Systems and Internet Infrastructure Security Laboratory and is advised by Prof. Patrick McDaniel. He received his M.S. in Computer Science from Penn State as well as his Diplôme d’Ingénieur (M.S. and B.S. in Engineering Sciences) from the French engineering school École Centrale de Lyon. His research focuses on the security and privacy of computer systems. He is mainly interested in building more secure, privacy-preserving, and trustworthy systems.

5:40 pm–7:10 pm

Conference Reception

Sponsored by Ethyca

Friday, June 24

8:00 am–9:00 am

Continental Breakfast

9:00 am–10:20 am

Privacy Research

Data Privacy Vocabulary (DPV): Concepts for Legal Compliance

Friday, 9:00 am–9:15 am

Harshvardhan J. Pandit, ADAPT Centre, Trinity College Dublin, and Chair, W3C Data Privacy Vocabularies and Controls Community Group (DPVCG)

Available Media

The Data Privacy Vocabulary (DPV) enables expressing machine-readable metadata about the use and processing of personal data for declarative use in systems, documents, processes, and logic for assisting with requirements of legal compliance, such as the General Data Protection Regulation (GDPR). The DPV is the most extensive collection of its kind, a community effort, an output of W3C DPVCG, and is open and available at https://w3id.org/dpv.

Harshvardhan J. Pandit, ADAPT Centre, Trinity College Dublin, and W3C Data Privacy Vocabularies and Controls Community Group (DPVCG)

Harsh(vardhan) is a Postdoctoral Researcher at Trinity College Dublin exploring the application of semantics to real-world challenges associated with privacy risks, legal and regulatory compliance, and consent. His PhD (Computer Science, Trinity College Dublin) explored the application of linked data and semantic web technologies towards GDPR compliance, with a particular focus on consent and provenance. He currently co-chairs the W3C Data Privacy Vocabularies and Controls Community Group (DPVCG), which works on creating interoperable vocabularies for personal data handling based on legal and practical requirements, and the W3C Consent Community Group (CONSENT), which has recently started its work on improving the experience of digital consent. He also contributes to ISO/IEC efforts on consent and privacy standardisation through the National Standards Authority of Ireland.

Integrating Differential Privacy and Contextual Integrity

Friday, 9:15 am–9:30 am

Sebastian Benthall, New York University

Available Media

Differential privacy (DP) is an algorithmic privacy technique that incorporates noise parameters and probabilistic uncertainty, but provides little to no guidance as to the choice of these model parameters in practice. Contextual integrity (CI) instead theorizes privacy as appropriate information flow based on norms that inhere in social contexts. We propose a hybrid theory of DP and CI that can better inform privacy by design in practice. We augment the CI framework with an additional information norm parameter, transmission property, which denotes the quantitative form of the information flow, such as “with differential privacy” or “with 95% confidence”. We use this method to develop a way to define assumptions about contextual purposes and societal values, and to solve for the optimal information norms. DP’s continuous information design can support the purposes of some social spheres better than the coarse-grained information flows understood by CI. We apply this framework to three cases: the U.S. census, medical data sharing, and federated learning.
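
One way to picture the proposed augmentation is as an extra, quantitative field on the usual CI norm tuple. The data structure below is only an illustrative sketch, not the authors' formalism, and the example values are invented.

```python
from dataclasses import dataclass

@dataclass
class InformationNorm:
    sender: str
    recipient: str
    subject: str
    attribute: str
    transmission_principle: str   # the qualitative condition on the flow
    transmission_property: tuple  # proposed quantitative form of the flow

# Hypothetical census-style norm: the flow is deemed appropriate only when released
# with (epsilon, delta)-differential privacy.
census_norm = InformationNorm(
    sender="respondent",
    recipient="statistical agency",
    subject="respondent",
    attribute="household size",
    transmission_principle="published only as aggregate statistics",
    transmission_property=("with differential privacy", {"epsilon": 1.0, "delta": 1e-9}),
)
print(census_norm)
```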

Sebastian Benthall, New York University

Dr. Sebastian Benthall is a Senior Research Fellow at the Information Law Institute at New York University School of Law, as well as a National Science Foundation Postdoctoral Research Fellow in Social, Behavioral, and Economic Sciences.

A Closer Look: Evaluating Location Privacy Empirically

Friday, 9:30 am–9:50 am

Liyue Fan, UNC Charlotte

Available Media

The breach of users’ location privacy can be catastrophic. To provide users with privacy protection, numerous location privacy methods have been developed. While several literature surveys exist in this field, the lack of comparative empirical evaluations imposes challenges for adopting location privacy by applications and researchers in a wide range of domains. This talk presents our recent study, which fills the gap by evaluating location privacy with real-world datasets. For utility evaluation, we consider various types of measures, such as distortion-based and count-based measures, as well as individuals' mobility patterns; for privacy protection evaluation, we design two empirical privacy risk measures via inference and re-identification attacks. Furthermore, we study the computational overheads incurred by location privacy. The results show that it is possible to strike a balance between utility and privacy when sharing location data with untrusted servers.
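
The flavor of such an evaluation can be sketched with a toy experiment: perturb each user's location, then compute a distortion-based utility measure and a nearest-neighbor re-identification rate. The noise model, scales, and synthetic data below are invented for illustration and are not the mechanisms or datasets studied in the talk.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: one "home" location per user, in planar coordinates (meters).
homes = rng.uniform(0, 10_000, size=(200, 2))

def perturb(points, scale):
    """Hypothetical obfuscation: isotropic Gaussian noise with the given scale."""
    return points + rng.normal(0, scale, size=points.shape)

for scale in (50, 200, 1_000):
    released = perturb(homes, scale)
    # Utility: average distortion between true and released locations.
    distortion = np.linalg.norm(released - homes, axis=1).mean()
    # Privacy risk: re-identify users by matching each released point to the nearest home.
    dists = np.linalg.norm(released[:, None, :] - homes[None, :, :], axis=2)
    reidentified = (dists.argmin(axis=1) == np.arange(len(homes))).mean()
    print(f"noise {scale:>5} m: distortion {distortion:7.1f} m, "
          f"re-identification rate {reidentified:.0%}")
```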

Liyue Fan, UNC Charlotte

Dr. Liyue Fan is an Assistant Professor in Computer Science at the University of North Carolina at Charlotte. With a background in mathematics and computer science, Dr. Fan's research is at the intersection of data privacy and spatio-temporal databases. She was named one of the "Rising Stars in EECS". Her current research activities are supported by the National Science Foundation and UNC Charlotte.

Building a Privacy Testbed

Friday, 9:50 am–10:05 am

Dr. Partha Das Chowdhury and Professor Awais Rashid, University of Bristol

Available Media

How do we assure that a software application preserves privacy properties – whether with regard to contact tracing in a pandemic, information flows from AI assistants, or leakage from client-side scanning of end-to-end encryption services? Formal verification is one means, but it is often not within the purview of most software developers and intractable for end-users and regulators. Privacy researchers also require easy and repeatable ways of generating data on privacy properties of applications and third-party APIs. We present experiences of building a first-of-its-kind privacy testbed as part of the UK’s National Research Centre on Privacy, Harm Reduction and Adversarial Influence Online. We will discuss our software-defined networking architecture, design challenges encountered, how these have been overcome, and ongoing work on integrating the captured traffic with privacy frameworks. We will also discuss plans for the testbed to be widely accessible to the privacy research community.

Partha Das Chowdhury, University of Bristol

Dr. Partha Das Chowdhury is a Research Associate in the REPHRAIN centre at the University of Bristol, UK. His research interests are in privacy enhancing technologies, the intersection of PETs and the social sciences, secure software development, security protocols, and the adoption of tools to protect citizens from online harm. He has published at IEEE SecDev, ESORICS SecPre, and the Security Protocols Workshop held annually at Cambridge, UK. He has over a decade of industrial technology implementation experience for various client organisations, including Tata Steel, city traffic management systems, and mining majors. He was associated with the Centre for the Fourth Industrial Revolution, an initiative of the World Economic Forum, as one of the contributors to the Blockchain toolkit. He was invited as an expert to the Commonwealth Working Group on Interception of Communication and Related Matters at Marlborough House, London in 2005. He also served as an invited panellist at the Vizag Fintech Summit 2018, an annual event of the Government of Andhra Pradesh, India. Partha is a recipient of the Army Commander's commendation, Eastern Command, Indian Army, 2021.

Awais Rashid, University of Bristol

Professor Awais Rashid is Director of the REPHRAIN Centre, which includes more than 80 researchers across 11 universities as well as 25 partners from government, industry, and third sector organisations. He is also Director of the EPSRC Centre for Doctoral Training in Trust, Identity, Privacy & Security at Scale, with a minimum intake of 10 doctoral students per year. His research interests are in secure software development, security of cyber-physical systems, and human factors. He has published on security and privacy in software systems at major venues such as the International Conference on Software Engineering, PoPETS, ACM CHI, and USENIX SOUPS, as well as journals such as IEEE Transactions on Software Engineering, IEEE Transactions on Information Forensics and Security, ACM Transactions on Privacy and Security, and ACM Transactions on Software Engineering and Methodology. Rashid has extensive experience in building testbeds. He led the development of the state-of-the-art testbed on cyber-physical systems, the leading facility in the UK (currently in its 3rd generation), which underpins multiple major programmes of research.

How Developers (Don't) Think about Gender Privacy

Friday, 10:05 am–10:20 am

Elijah Bouma-Sims, Carnegie Mellon University

Available Media

Disclosing gender online can present serious accessibility and privacy concerns for members of marginalized groups. Despite these issues, non-inclusive (and likely unnecessary) gender disclosure forms are widespread in computing applications. Developers who implement gender disclosure forms may be unaware of the privacy implications of using gender in programming. I will present results from an interview study with developers and a complementary analysis of Reddit posts from developer-focused subforums. I will conclude by discussing how changes in software engineering education could improve the status quo in order to better respect users.

Elijah Bouma-Sims, Carnegie Mellon University

Hi! I'm Elijah and I am a PhD student in Societal Computing at the Carnegie Mellon School of Computer Science. I completed a B.S. in Computer Engineering and a B.A. in History from North Carolina State University in 2021. My research interests lie in usable security and privacy, particularly for marginalized groups.

10:20 am–10:50 am

Break with Refreshments

10:50 am–12:25 pm

Privacy at Scale

Data Mapping at a Billion Dollar Self-Driving Vehicle Startup

Friday, 10:50 am–11:15 am

Marc-Antoine Paré, Cruise

Available Media

At Cruise, Privacy Engineering faces a large-scale data mapping challenge: maintaining visibility on petabytes of sensitive data collected daily, including location, imagery, and copious metadata. If that weren’t enough, data mapping must be done in the context of a rapidly changing product as we launch self-driving cars to the public. This talk describes how we did it.

In more detail, we summarize the background research that informed our approach to data mapping. Then, we cover key aspects of the technical implementation: a custom web application, high accuracy detectors for Cruise-specific sensitive data, and a validation stack for machine-assisted label verification. Finally, we present the technical privacy risk mitigations we have built on top of data mapping. These pieces all fit together into a foundational capability for Cruise Privacy Engineering, greatly aiding our ability to understand the flow of sensitive data in a way that is not possible without automation.

Marc-Antoine Paré, Cruise

Marc-Antoine Paré is a Senior Privacy Engineer at Cruise. Previously, he was the technical lead for the Department of Energy’s “Energy Data Vault”, which brought differential privacy to the energy efficiency sector.

Bringing Content Blocking to the Masses: Dealing with Filter List Development, Maintenance, and Compatibility for 50 Million Users

Friday, 11:15 am–11:40 am

Shivan Kaul Sahib and Anton Lazarev, Brave Software

Available Media

Research has demonstrated the privacy, security, performance and UX benefits of filter-list-based content filtering on the Web. However, content blocking tools also break websites, and are (mostly) maintained by volunteers, who have a range of expertise and motivation. As a result, content blocking tools have ended up being mostly used by expert (or at least non-beginner) users; for “normies”, the web isn’t as nice as it could be.

This talk will discuss how Brave closes this gap and brings filter list content blocking to its 50 million DAU, newbies and experts alike. We’ll focus on how Brave uses a combination of domain-specific engineering, crowd engagement and research (much of it published in venues like USENIX) to improve the coverage and compatibility of filter lists.
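
As a toy illustration of how filter-list rules drive blocking, the sketch below handles just two common rule forms: `||host^` network rules and `domain##selector` cosmetic rules. Real engines, including Brave's Rust-based adblocker, support a far richer syntax and are heavily optimized; the rules here are invented.

```python
from urllib.parse import urlparse

FILTER_LIST = [
    "||tracker.example^",            # network rule: block requests to this host
    "news.example##.sponsored-box",  # cosmetic rule: hide this element on news.example
]

network_rules = [r[2:-1] for r in FILTER_LIST if r.startswith("||") and r.endswith("^")]
cosmetic_rules = [tuple(r.split("##", 1)) for r in FILTER_LIST if "##" in r]

def should_block(request_url: str) -> bool:
    host = urlparse(request_url).hostname or ""
    return any(host == rule or host.endswith("." + rule) for rule in network_rules)

def hidden_selectors(page_host: str):
    return [sel for dom, sel in cosmetic_rules
            if page_host == dom or page_host.endswith("." + dom)]

print(should_block("https://tracker.example/pixel.gif"))  # True
print(should_block("https://cdn.example/app.js"))         # False
print(hidden_selectors("news.example"))                   # ['.sponsored-box']
```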

Shivan Kaul Sahib, Brave Software

Shivan Kaul Sahib works on privacy at Brave Software, where he focuses on shipping privacy features in the browser and conducting privacy reviews across the company. He is active in the IETF and W3C and previously worked on DNS traffic encryption and consent tooling. He has a keen interest in public interest technology and was previously a fellow at ARTICLE 19, a free expression charity in the UK. He studied Software Engineering at McGill University.

Anton Lazarev, Brave Software

Anton Lazarev is a Research Engineer at Brave Software. He is responsible for many of Brave’s content blocking features, including CNAME uncloaking and cosmetic filtering in Brave’s Rust-based adblocker. In his free time, he is active in the open source community, frequently contributing to software that respects users’ interests. He studied Computer Engineering and Computer Science at Northeastern University.

Lyft and the California Consumer Privacy Act

Friday, 11:40 am–12:00 pm

Shankar Garikapati and Alejo Sutro, Lyft

Available Media

Data export, management, and deletion are complex processes with multiple feedback loops. This is exacerbated in a microservices-based environment. In this talk, we will discuss the Lyft data export and erasure architectures and their interactions with our data management systems. The solutions we developed took into account Lyft’s dynamic and fast-paced product and engineering environment. To make this system operationally efficient, our goal was to seamlessly integrate with Legal, Fraud, and Safety. The system also needed to respect user data rights while protecting against fraudulent activities and safety concerns. We’ll share some insights into the hardest technical challenges and surprises we had to overcome.

Shankar Garikapati, Lyft

Shankar Garikapati is a staff engineer on the Data Privacy team at Lyft. Shankar joined Lyft as a security engineer and got a lucky break into privacy when Lyft had to implement the CCPA mandate. He helped design and build various privacy-related systems at Lyft.

Alejo Sutro, Lyft

Alejo Grigera Sutro is a staff privacy analyst on the Data Privacy team at Lyft. He started his privacy journey at Google over 10 years ago, where he managed a team of Privacy Engineers specializing in data collection, storage, and access control. He’s a Certified Information Privacy Professional (US, T) with 4 U.S. patents and 2 publications. He joined Lyft in 2021 to help build the future of privacy in transportation. In his spare time you can find Alejo refurbishing technology for kids in need and teaching courses on financial literacy.

Automating Product Deprecation

Friday, 12:00 pm–12:25 pm

Will Shackleton, Meta (Facebook)

Available Media

Heavily integrated products like social networks are composed of many interconnected products and features. Over time, new features are built and deployed, and eventually their usage can drop to a point where some of these features need to be turned off and removed. Meta is establishing a set of best practices and processes for deprecating products to ensure that these operations are conducted safely and rigorously.

This talk will address how Meta handles “Product Deprecation”, the process of turning off an interconnected feature and deleting its code and user data safely and automatically. Topics we will discuss include: how we automate removal of dead data, why we need automated dead code removal as well, and how we have provided engineers with tooling to understand how code and data are being used to allow them to make decisions that unblock these automated removal systems.

Will Shackleton, Meta

Will Shackleton is a Software Engineer on the Privacy Infrastructure Team at Meta in London. He has spent the past few years building automation to remove code and data, as well as building tooling to help engineers remove products safely and effectively. Will also maintains Facebook’s Tor Onion Service. In his free time he plays baritone saxophone in a jazz band in London.

12:25 pm–1:30 pm

Networking Lunch

1:30 pm–2:40 pm

Designing for Privacy

Identifying Personal Data by Fusing Elasticsearch with Neural Networks

Friday, 1:30 pm–1:55 pm

Rakshit Wadhwa and Ryan Turner, Twitter

Available Media

A critical aspect of incorporating data privacy is the process of classifying personal or sensitive data. At Twitter, the personal data protection (PDP) annotation system provides a solution to automatically classify columns of data in databases. Listen in as we explore the annotation system that combines Elasticsearch queries with a neural network to provide probabilistically calibrated predictions on PDP data types for every column.
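
In the spirit of the fusion the talk describes, the sketch below combines a lexical (search-style) score with a model score and calibrates the combination with logistic regression. The features, labels, and values are invented; this is not Twitter's annotation system.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical per-column features: a lexical score (how well the column name matches a
# curated query for "email address") and a neural model score over sampled column values.
lexical_score = np.array([0.9, 0.1, 0.7, 0.0, 0.8, 0.2])
model_score   = np.array([0.8, 0.3, 0.4, 0.1, 0.9, 0.2])
is_email      = np.array([1,   0,   1,   0,   1,   0])   # columns known to hold emails

X = np.column_stack([lexical_score, model_score])

# Logistic regression over the two scores yields probabilistically calibrated predictions.
fusion = LogisticRegression().fit(X, is_email)
new_column = np.array([[0.6, 0.7]])
print(fusion.predict_proba(new_column)[:, 1])   # calibrated P(column contains emails)
```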

Rakshit Wadhwa, Twitter

Rakshit Wadhwa is a Senior Software Engineer on the Privacy Tooling and Infra team at Twitter, working on data privacy challenges within Twitter. Ryan Turner is a Senior ML Researcher on the Cortex team at Twitter, with a special interest in the intersection of ML and privacy.

Differentially Private Algorithms for 2020 Decennial Census Detailed DHC Race & Ethnicity

Friday, 1:55 pm–2:15 pm

Samuel Haney, Tumult Labs, and Rachel Marks, U.S. Census Bureau

Available Media

This talk describes proposed differentially private (DP) algorithms that the U.S. Census Bureau is considering for releasing the Detailed Demographic and Housing Characteristics (DDHC) Race & Ethnicity tabulations from the 2020 Census. The tabulations contain hundreds of millions of statistics (counts) of demographic and housing characteristics for the entire U.S. population crossed by detailed races and tribes at varying levels of geography. We will describe a differentially private algorithm that adaptively chooses what statistics to release and what noise to add to the statistics based on the size of the population group. We will highlight our methodology of engaging with key stakeholders to iteratively elicit requirements for privacy and fitness for use, as well as design and tune a differentially private algorithm that meets these requirements.
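
The idea of adapting both the released statistics and the noise to the size of a population group can be sketched as a simple tiered policy. The cut-offs, budgets, and statistics below are invented for illustration and are not the Census Bureau's proposed parameters.

```python
import numpy as np

rng = np.random.default_rng(7)

def noisy(value, epsilon, sensitivity=1.0):
    """Laplace mechanism for a single count."""
    return value + rng.laplace(scale=sensitivity / epsilon)

def adaptive_release(ages):
    total = len(ages)
    if total < 50:            # small group: release only a noisy total
        return {"total": noisy(total, 0.5)}
    if total < 5_000:         # medium group: add a coarse adult/child split
        adults = int(np.sum(ages >= 18))
        return {"total": noisy(total, 0.5), "adults": noisy(adults, 0.5)}
    # large group: finer age breakdown
    hist, _ = np.histogram(ages, bins=[0, 18, 45, 65, 120])
    return {"total": noisy(total, 1.0),
            "age_histogram": [noisy(int(c), 1.0) for c in hist]}

for size in (30, 800, 20_000):
    ages = rng.integers(0, 100, size)
    print(size, adaptive_release(ages))
```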

Samuel Haney, Tumult Labs

Sam is a Scientist at Tumult Labs, where he works on creating systems for provably private data release. Before Tumult, he completed his PhD at Duke University, where he worked on differential privacy and graph algorithms.

Rachel Marks, U.S. Census Bureau

Rachel is chief of the Racial Statistics Branch in the Census Bureau’s Population Division where she leads a research team that analyzes data on race and ethnicity from the 2020 Census, 2020 Island Areas Census, American Community Survey, and the Current Population Survey. Rachel has conducted extensive outreach, presentations, and workshops with various stakeholder groups throughout her career and was a lead researcher for the 2015 National Content Test, which examined alternative ways to collect data on race and ethnicity.

Data Structures for Data Privacy: Lessons Learned in Production

Friday, 2:15 pm–2:40 pm

Dr. Rebecca Bilbro and Dr. Benjamin Bengfort, Rotational Labs

Available Media

In this talk we present a decentralized messaging protocol and storage system in use across North America, Europe, and Southeast Asia. Built at the behest of an international nonprofit working group, the protocol and system are designed to address a unique problem at the intersection of financial crime regulation, distributed ledger technology, and user privacy. In our talk we discuss the many lessons learned in the process of architecting, implementing, and fostering adoption of this system. We present the Secure Envelope, a data structure that employs a combination of methods (symmetric and asymmetric encryption, mTLS, protocol buffers, etc.) to safeguard data privacy both at rest and in flight.
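
One common way to combine the primitives listed above is envelope (hybrid) encryption: encrypt the payload with a fresh symmetric key, then wrap that key so only the intended recipient can recover it. The sketch below uses the Python cryptography package and is a conceptual illustration only, not the working group's actual Secure Envelope wire format.

```python
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

OAEP = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()), algorithm=hashes.SHA256(), label=None)

# Recipient's long-lived key pair (in practice exchanged over mTLS-authenticated channels).
recipient_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

def seal(payload: bytes, recipient_public):
    data_key = AESGCM.generate_key(bit_length=256)   # fresh symmetric key per envelope
    nonce = os.urandom(12)
    ciphertext = AESGCM(data_key).encrypt(nonce, payload, None)
    wrapped_key = recipient_public.encrypt(data_key, OAEP)   # only the recipient can unwrap
    return {"ciphertext": ciphertext, "nonce": nonce, "wrapped_key": wrapped_key}

def open_envelope(envelope, recipient_private):
    data_key = recipient_private.decrypt(envelope["wrapped_key"], OAEP)
    return AESGCM(data_key).decrypt(envelope["nonce"], envelope["ciphertext"], None)

message = b'{"originator": "...", "beneficiary": "..."}'
sealed = seal(message, recipient_key.public_key())
assert open_envelope(sealed, recipient_key) == message
```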

Rebecca Bilbro, Rotational Labs

Dr. Rebecca Bilbro is a teacher, speaker, and author who earned her doctorate in 2011 from the University of Illinois, Urbana-Champaign, where her research centered on communication and visualization in Engineering. A veteran of startups from public sector to media & entertainment to enterprise security, Rebecca specializes in machine learning optimization and API development in distributed data systems. As CTO of Rotational Labs, she hopes to take distributed systems from theory into practice, making them more accessible, understandable and tunable for everyday developers.

Benjamin Bengfort, Rotational Labs

Dr. Benjamin Bengfort is an experienced systems engineer, programmer and data scientist who earned his doctorate in Computer Science from the University of Maryland in 2018. Driven by a desire to build large systems with many users that have a global impact, Benjamin takes pride in solutions where many small interactions combine to create complex dynamics. As the CEO of Rotational Labs, his goal is to apply advanced distributed computing, networking, open source software, education, and machine learning to solutions that allow us to collaborate more effectively around the world to solve big problems.

2:40 pm–3:10 pm

Break with Refreshments

3:10 pm–4:40 pm

Panel

Fueling the Emerging Privacy Tech Industry

Friday, 3:10 pm–3:55 pm

Lourdes M. Turrecha, CEO & Founder, The Rise of Privacy Tech, and Nishant Bhajaria, Director of Privacy Engineering & Architecture, Uber

Available Media

In this panel, Lourdes M. Turrecha and Nishant Bhajaria will present the foundational TROPT Defining the Privacy Tech Landscape Whitepaper 2021, which defines and categorizes the emerging privacy tech industry. In addition to defining each category they've identified, they go over the customer persona(s), basic features, and current limitations of available solutions. They help privacy tech buyer-users articulate their technical privacy pain points, coach them to advocate internally for the privacy tech budget they need and deserve, and teach them how to identify available privacy tech solutions to their biggest technical privacy problems. This panel helps privacy technologists, privacy engineers, and other privacy domain experts find privacy tech opportunities, from landing privacy tech advisory, consulting, and in-house roles, to cofounding a privacy tech startup and even angel investing in them.

Lourdes M. Turrecha, The Rise of Privacy Tech

Lourdes M. Turrecha is a privacy, cybersecurity, and data protection strategist, advisor, and investor. Lourdes is the founder of The Rise of Privacy Tech (TROPT), an initiative with a mission to fuel privacy innovation by bringing together privacy tech founders, investors, and expert-advisors to bridge the existing tech-capital-expertise gaps in privacy tech. Lourdes is the primary author of the foundational TROPT Defining the Privacy Tech Landscape Whitepaper 2021 and the organizer of the TROPT Privacy Tech Summit 2022. Lourdes is currently focusing on defining and categorizing the privacy tech landscape and advising privacy tech startups on their privacy strategy, covering product design and market positioning in the nascent privacy tech landscape. Prior to starting PIX & TROPT, Lourdes worked in Silicon Valley for the enterprise cybersecurity leader Palo Alto Networks and in Big Law, advising 100+ startups, tech and Fortune companies, and other global organizations on their data protection obligations and managing incident and data breach response.

Nishant Bhajaria, Uber

Nishant Bhajaria leads the Technical Privacy and Strategy teams for Uber. He heads a large team that includes data scientists, engineers, privacy experts and others as they seek to improve data privacy for the customers and the company. His role has significant levels of cross-functional visibility and impact. Previously he worked in compliance, data protection, security, and privacy at Google. He was also the head of privacy engineering at Netflix. He is a well-known expert in the field of data privacy, has developed numerous courses on the topic, and has spoken extensively at conferences and podcasts.

Privacy Case Studies

A Way Forward: What We Know (or Not) about CSAM & Privacy

Friday, 3:55 pm–4:10 pm

Tatiana Renae Ringenberg, Indiana University Bloomington, and Lorraine G. Kisselburgh, Purdue University

Available Media

Vulnerable, exploited, and unempowered people are the most in need of privacy and digital agency. Conversely, approaches to protecting the most vulnerable have often harmed the very people they seek to protect. Current approaches, including Apple’s systematic mass surveillance and proposals to break end-to-end encryption in the UK, must be evaluated not only in terms of potential harm mitigated but also in terms of risks of future harm. The growing accessibility of child sexual abuse material (CSAM) is one component of this problem. Thwarting stalking, human trafficking, sextortion, and other forms of criminal harassment and surveillance requires a coordinated effort that centers on prevention, harm mitigation, and recovery for survivors. By focusing on designing for survivors and the at-risk rather than enforcement and punishment, we can identify the general risk to privacy and the specific risks to vulnerable populations. A design approach that centers on harm mitigation can align with the entirety of the Code of Ethics and Professional Conduct of the ACM, without sacrificing one type of harm to detect another.

Tatiana Ringenberg, Indiana University Bloomington

Tatiana Ringenberg is a Postdoctoral Fellow in Informatics at Indiana University Bloomington as part of the Computing Innovation Fellows program. Her research interests revolve around applying natural language processing to societal challenges online. Her specific interests include manipulative processes used to harm vulnerable populations, annotation methodologies for textual social science datasets, biases in law enforcement investigations and triage, and online crime resilience.

Lorraine Kisselburgh, Purdue University

Lorraine Kisselburgh is a fellow in the Center for Education and Research in Information Security (CERIAS), lecturer in the Center for Entrepreneurship, and former professor of media, technology, and society at Purdue University. Her research focuses on the social implications of emerging technologies, including privacy, ethics, and collaboration; technological and cultural contexts of social interaction; and gender in STEM contexts.

Privacy Firefighting: Incident Management Lessons from (Literal) Fires

Friday, 4:10 pm–4:25 pm

Katie Hufker, Meta

Available Media

At Meta, we always do our best to build robust systems that honor users’ privacy. However, sometimes things go wrong and we need to respond to events that may lead to privacy vulnerabilities and incidents. This sometimes forces us into incident response mode as we must quickly act to determine what’s broken, fix it, and ensure our systems end up in a better state. After spending time putting out literal fires as a volunteer firefighter, I’ve realized that there are many similarities between the two even if the tools and problems are different. Let’s talk about what we can learn from the fire service around things like incident organization and communication to improve our privacy firefighting.

Katie Hufker, Meta

Katie Hufker has been an engineer on Meta’s Privacy team for over three years, where she helps find and address emerging privacy issues and assists with Meta’s Privacy Incident Review program. Outside of work, Katie volunteers as a firefighter/EMT, where she deals with similarly critical incidents in the physical world. She has a bachelor's in Computer Science and a master's in Biomedical Informatics from Stanford University.

Privacy and Respectful Discourse in AI Chatbots

Friday, 4:25 pm–4:40 pm

Jayati Dev, Indiana University Bloomington

Available Media

Chatbots’ ability to engage in fluid and complex discussions creates new challenges in privacy engineering. Starting with a brief overview of general-purpose AI chatbots, we present a case study of information disclosure in such bots by conducting a thematic analysis of 37 conversational logs from a publicly available general-purpose chatbot (Cleverbot). Our work provides developers and privacy researchers with an analysis of how chatbot relationship-building leads to the information disclosure topics observed in our sample. We discuss (i) information disclosure mechanisms around the type of information users are willing to share and (ii) the types of sensitive data that can be gathered in an open-ended conversation. Our findings provide general insights into the conversational expectations and disclosures of users, indicating a need for privacy-preserving design in chatbots through improved data handling. We close with recommendations for integrating privacy-preserving AI principles into conversational chatbot design and providing examples of how they can work in practice.

Jayati Dev, Indiana University Bloomington

Jayati Dev is a doctoral candidate in Security Informatics, with a minor in Human-Computer Interaction design at Indiana University Bloomington. Her graduate research experiences in privacy include a fellowship in Google Public Policy; cryptographic protocol implementation on Intel processors at Indian Statistical Institute funded by Microsoft Research, and as a lead doctoral researcher in a National Science Foundation multi-year investigation into privacy in IoT. Her current research focus is human-centered design for enhanced privacy and security in conversational applications and IoT devices, especially for culturally distinct populations.

4:40 pm–4:50 pm

Closing Remarks

Divya Sharma, Google, and Blase Ur, University of Chicago