"OK, Siri" or "Hey, Google": Evaluating Voiceprint Distinctiveness via Content-based PROLE Score

Authors: 

Ruiwen He, Xiaoyu Ji, and Xinfeng Li, Zhejiang University; Yushi Cheng, Tsinghua University; Wenyuan Xu, Zhejiang University

Abstract: 

A voiceprint is the distinctive pattern of human voices that is spectrographically produced and has been widely used for authentication in the voice assistants. This paper investigates the impact of speech contents on the distinctiveness of voiceprint, and has obtained answers to three questions by studying 2457 speakers and 14,600,000 test samples: 1) What are the influential factors that determine the distinctiveness of voiceprints? 2) How to quantify the distinctiveness of voiceprints for given words, e.g., wake-up words in commercial voice assistants? 3) How to construct wake-up words whose voiceprints have high distinctiveness levels. To answer those questions, we break down voiceprint into phones, and experimentally obtain the correlation between the false recognition rates and the richness of the phone types, the order, the length, and the elements of the phones. Then, we define PROLE Score that can be easily calculated based on speech content yet can reflect the voice distinctiveness. Under the guidance of PROLE Score, we tested 30 wake-up words of 19 commercial voice assistant products, e.g., "Hey, Siri'', "OK, Google'' and "Nihao, Xiaona'' in both English and Chinese. Finally, we provide recommendations for both users and manufacturers, on selecting secure voiceprint words.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {277122,
author = {Ruiwen He and Xiaoyu Ji and Xinfeng Li and Yushi Cheng and Wenyuan Xu},
title = {"{{{{{OK}}}}}, Siri" or "Hey, Google": Evaluating Voiceprint Distinctiveness via Content-based {PROLE} Score},
booktitle = {31st USENIX Security Symposium (USENIX Security 22)},
year = {2022},
isbn = {978-1-939133-31-1},
address = {Boston, MA},
pages = {1131--1148},
url = {https://www.usenix.org/conference/usenixsecurity22/presentation/he-ruiwen},
publisher = {USENIX Association},
month = aug
}

Presentation Video