Helping Users Automatically Find and Manage Sensitive, Expendable Files in Cloud Storage


Mohammad Taha Khan, University of Illinois at Chicago / Washington & Lee University; Christopher Tran and Shubham Singh, University of Illinois at Chicago; Dimitri Vasilkov, University of Chicago; Chris Kanich, University of Illinois at Chicago; Blase Ur, University of Chicago; Elena Zheleva, University of Illinois at Chicago


With the ubiquity of data breaches, forgotten-about files stored in the cloud create latent privacy risks. We take a holistic approach to help users identify sensitive, unwanted files in cloud storage. We first conducted 17 qualitative interviews to characterize factors that make humans perceive a file as sensitive, useful, and worthy of either protection or deletion. Building on our findings, we conducted a primarily quantitative online study. We showed 108 long-term users of Google Drive or Dropbox a selection of files from their accounts. They labeled and explained these files' sensitivity, usefulness, and desired management (whether they wanted to keep, delete, or protect them). For each file, we collected many metadata and content features, building a training dataset of 3,525 labeled files. We then built Aletheia, which predicts a file's perceived sensitivity and usefulness, as well as its desired management. Aletheia improves over state-of-the-art baselines by 26% to 159%, predicting users' desired file-management decisions with 79% accuracy. Notably, predicting subjective perceptions of usefulness and sensitivity led to a 10% absolute accuracy improvement in predicting desired file-management decisions. Aletheia's performance validates a human-centric approach to feature selection when using inference techniques on subjective security-related tasks. It also improves upon the state of the art in minimizing the attack surface of cloud accounts.


@inproceedings {272222,
author = {Mohammad Taha Khan and Christopher Tran and Shubham Singh and Dimitri Vasilkov and Chris Kanich and Blase Ur and Elena Zheleva},
title = {Helping Users Automatically Find and Manage Sensitive, Expendable Files in Cloud Storage},
booktitle = {30th USENIX Security Symposium (USENIX Security 21)},
year = {2021},
isbn = {978-1-939133-24-3},
pages = {1145--1162},
url = {},
publisher = {USENIX Association},
month = aug
}
