Towards a General Video-based Keystroke Inference Attack


Zhuolin Yang, Yuxin Chen, and Zain Sarwar, University of Chicago; Hadleigh Schwartz, Columbia University; Ben Y. Zhao and Haitao Zheng, University of Chicago


A large collection of research literature has identified the privacy risks of keystroke inference attacks that use statistical models to extract content typed onto a keyboard. Yet existing attacks cannot operate in realistic settings, and rely on strong assumptions of labeled training data, knowledge of keyboard layout, carefully placed sensors or data from other side-channels. This paper describes experiences developing and evaluating a general, video-based keystroke inference attack that operates in common public settings using a single commodity camera phone, with no pretraining, no keyboard knowledge, no local sensors, and no side-channels. We show that using a self-supervised approach, noisy finger tracking data from a video can be processed, labeled and filtered to train DNN keystroke inference models that operate accurately on the same video. Using IRB approved user studies, we validate attack efficacy across a variety of environments, keyboards, and content, and users with different typing behaviors and abilities. Our project website is located at:

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

@inproceedings {285471,
author = {Zhuolin Yang and Yuxin Chen and Zain Sarwar and Hadleigh Schwartz and Ben Y. Zhao and Haitao Zheng},
title = {Towards a General Video-based Keystroke Inference Attack},
booktitle = {32nd USENIX Security Symposium (USENIX Security 23)},
year = {2023},
isbn = {978-1-939133-37-3},
address = {Anaheim, CA},
pages = {141--158},
url = {},
publisher = {USENIX Association},
month = aug