Identifying Personal Data by Fusing Elasticsearch with Neural Networks

Friday, June 24, 2022 - 1:30 pm–1:55 pm

Rakshit Wadhwa and Ryan Turner, Twitter


A critical part of building data privacy into a system is classifying personal or sensitive data. At Twitter, the personal data protection (PDP) annotation system automatically classifies columns of data in databases. Listen in as we explore the annotation system, which combines Elasticsearch queries with a neural network to provide probabilistically calibrated predictions of PDP data types for every column.

Rakshit Wadhwa, Twitter

Rakshit Wadhwa is a Senior Software Engineer on the Privacy Tooling and Infra team at Twitter, working on data privacy challenges within Twitter.

Ryan Turner, Twitter

Ryan Turner is a Senior ML Researcher on the Cortex team at Twitter, with a special interest in the intersection of ML and privacy.

@conference {280292,
author = {Rakshit Wadhwa and Ryan Turner},
title = {Identifying Personal Data by Fusing Elasticsearch with Neural Networks},
year = {2022},
address = {Santa Clara, CA},
publisher = {USENIX Association},
month = jun
}
