SOTER: Guarding Black-box Inference for General Neural Networks at the Edge

Authors: 

Tianxiang Shen, Ji Qi, Jianyu Jiang, Xian Wang, Siyuan Wen, Xusheng Chen, and Shixiong Zhao, The University of Hong Kong; Sen Wang and Li Chen, Huawei Technologies; Xiapu Luo, The Hong Kong Polytechnic University; Fengwei Zhang, Southern University of Science and Technology (SUSTech); Heming Cui, The University of Hong Kong

Abstract: 

The prosperity of AI and edge computing has pushed more and more well-trained DNN models onto third-party edge devices to compose mission-critical applications. This necessitates protecting model confidentiality on untrusted devices while using a co-located accelerator (e.g., a GPU) to speed up model inference locally. Recently, the community has sought to improve security with CPU trusted execution environments (TEEs). However, existing solutions either run an entire model in the TEE, suffering extremely high inference latency, or take a partition-based approach, handcrafting a partial model via parameter-obfuscation techniques to run on an untrusted GPU; the latter achieves lower inference latency at the expense of both the integrity of the partitioned computations outside the TEE and the accuracy of the obfuscated parameters.

We propose SOTER, the first partition-based system that achieves model confidentiality, integrity, low inference latency, and high accuracy. Our key observation is that many inference operators in DNN models share an associativity property. SOTER therefore automatically transforms a major fraction of associative operators into parameter-morphed, confidentiality-preserving operators that execute on the untrusted GPU, and exploits associativity in the TEE to fully restore their outputs to the accurate results. On top of this, SOTER designs an oblivious fingerprinting technique to safely detect integrity breaches of morphed operators outside the TEE, ensuring correct inference executions. Experimental results on six prevalent models in the three most popular categories show that, even with stronger model protection, SOTER achieves performance comparable to partition-based baselines while retaining the same high accuracy as insecure inference.
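To make the associativity idea concrete, here is a minimal, hypothetical sketch (not SOTER's actual implementation) of morphing a linear operator: the TEE scales the secret weights by a random secret scalar before offloading, the untrusted GPU runs the morphed operator, and the TEE restores the exact result, since (c·W)x = c·(Wx).

```python
# Toy linear operator y = W @ x, written with plain Python lists.
# All names (matvec, W_morphed, c) are illustrative, not from SOTER.
def matvec(W, x):
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

W = [[1.0, 2.0], [3.0, 4.0]]   # secret model weights (stay in the TEE)
x = [5.0, 6.0]                 # inference input

# --- Inside the TEE: morph parameters before offloading ---
c = 7.0                        # secret random scalar chosen by the TEE
W_morphed = [[c * w for w in row] for row in W]  # what the GPU receives

# --- On the untrusted GPU: run the morphed operator ---
y_morphed = matvec(W_morphed, x)   # equals (c*W) @ x

# --- Back inside the TEE: restore the accurate result ---
# Associativity of scalar multiplication: (c*W) @ x == c * (W @ x),
# so dividing by c recovers the true output with no accuracy loss.
y = [v / c for v in y_morphed]

assert all(abs(a - b) < 1e-9 for a, b in zip(y, matvec(W, x)))
```

The GPU only ever sees `W_morphed`, yet the TEE's cheap restoration step yields exactly the same output as running the unmorphed operator, which is the accuracy advantage over lossy obfuscation schemes.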

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {280798,
author = {Tianxiang Shen and Ji Qi and Jianyu Jiang and Xian Wang and Siyuan Wen and Xusheng Chen and Shixiong Zhao and Sen Wang and Li Chen and Xiapu Luo and Fengwei Zhang and Heming Cui},
title = {{SOTER}: Guarding Black-box Inference for General Neural Networks at the Edge},
booktitle = {2022 USENIX Annual Technical Conference (USENIX ATC 22)},
year = {2022},
isbn = {978-1-939133-29-68},
address = {Carlsbad, CA},
pages = {723--738},
url = {https://www.usenix.org/conference/atc22/presentation/shen},
publisher = {USENIX Association},
month = jul
}

Presentation Video