Squirrel: A Scalable Secure Two-Party Computation Framework for Training Gradient Boosting Decision Tree

Authors: 

Wen-jie Lu and Zhicong Huang, Alibaba Group; Qizhi Zhang, Ant Group; Yuchen Wang, Alibaba Group; Cheng Hong, Ant Group

Abstract: 

Gradient Boosting Decision Tree (GBDT) and its variants are widely used in industry, due to their high efficiency as well as strong interpretability. Secure multi-party computation allows multiple data owners to compute a function jointly while keeping their input private. In this work, we present Squirrel, a secure two-party GBDT training framework on a vertically split dataset, where two data owners each hold different features of the same data samples. Squirrel is private against semi-honest adversaries, and no sensitive intermediate information is revealed during the training process. Squirrel is also scalable to datasets with millions of samples even under a Wide Area Network (WAN).

Squirrel achieves its high performance via several novel co-designs of the GBDT algorithms and advanced cryptography. Especially, 1) we propose a new mechanism to hide the sample distribution on each node using oblivious transfer. 2) We propose a highly optimized method for secure gradient aggregation using two lattice-based homomorphic encryption schemes. Our empirical results show that our method can be three orders of magnitude faster than the existing approaches. 3) We propose a novel protocol to evaluate the sigmoid function on secretly shared values, showing 19×-200×-fold improvements over two existing methods. Combining all these improvements, Squirrel costs less than 6 seconds per tree on a dataset with 50 thousands samples which outperforms Pivot (VLDB 2020) by more than 28×. We also show that Squirrel can scale up to datasets with more than one million samples, e.g., about 90 seconds per tree over a WAN.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {287210,
author = {Wen-jie Lu and Zhicong Huang and Qizhi Zhang and Yuchen Wang and Cheng Hong},
title = {Squirrel: A Scalable Secure {Two-Party} Computation Framework for Training Gradient Boosting Decision Tree},
booktitle = {32nd USENIX Security Symposium (USENIX Security 23)},
year = {2023},
isbn = {978-1-939133-37-3},
address = {Anaheim, CA},
pages = {6435--6451},
url = {https://www.usenix.org/conference/usenixsecurity23/presentation/lu},
publisher = {USENIX Association},
month = aug
}

Presentation Video