RegexScalpel: Regular Expression Denial of Service (ReDoS) Defense by Localize-and-Fix

Authors: 

Yeting Li, CAS-KLONAT, Institute of Information Engineering, Chinese Academy of Sciences; University of Chinese Academy of Sciences; SKLCS, Institute of Software, Chinese Academy of Sciences; Yecheng Sun, SKLCS, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Zhiwu Xu, College of Computer Science and Software Engineering, Shenzhen University; Jialun Cao, The Hong Kong University of Science and Technology; Yuekang Li, School of Computer Science and Engineering, Nanyang Technological University; Rongchen Li, SKLCS, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Haiming Chen, SKLCS, Institute of Software, Chinese Academy of Sciences; CAS-KLONAT, Institute of Information Engineering, Chinese Academy of Sciences; Shing-Chi Cheung, The Hong Kong University of Science and Technology; Yang Liu, School of Computer Science and Engineering, Nanyang Technological University; Yang Xiao, CAS-KLONAT, Institute of Information Engineering, Chinese Academy of Sciences; University of Chinese Academy of Sciences

Abstract: 

The Regular expression Denial of Service (ReDoS) is a class of denial of service attacks that exploit vulnerable regular expressions (regexes) whose execution time can be superlinearly related to input sizes. A common approach of defending ReDoS attacks is to repair the vulnerable regexes. Techniques have been recently proposed to synthesize repaired regexes using program-by-example (PBE) techniques. However, these existing techniques may generate regexes, which are not semantically equivalent or similar to the original ones, or are still vulnerable to ReDoS attacks.

To address the challenges, we propose RegexScalpel, an automatic regex repair framework that adopts a localize-andfix strategy. RegexScalpel first localizes the vulnerabilities by leveraging fine-grained vulnerability patterns proposed by us to analyze their vulnerable patterns, the source (i.e., the pathological sub-regexes), and the root causes (e.g., the overlapping sub-regexes). Then, RegexScalpel targets to fix the pathological sub-regexes according to our predefined repair patterns and the localized vulnerability information. Furthermore, our repair patterns ensure that the repair regexes are semantically either equivalent to or similar to the original ones. Our iterative repair method also keeps out vulnerabilities of the repaired regexes. With an experiment on a total number of 448 vulnerable regexes, we demonstrate that RegexScalpel can outperform all existing automatic regexes fixing techniques by fixing 348 more regexes than the best existing work. Also, we adopted RegexScalpel to detect ten popular projects including Python and NLTK, and revealed 16 vulnerable regexes.We then applied RegexScalpel to successfully repair all of them, and these repairs were merged into the later release by the maintainers, resulting in 8 confirmed CVEs.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {281448,
author = {Yeting Li and Yecheng Sun and Zhiwu Xu and Jialun Cao and Yuekang Li and Rongchen Li and Haiming Chen and Shing-Chi Cheung and Yang Liu and Yang Xiao},
title = {{RegexScalpel}: Regular Expression Denial of Service ({{{{{ReDoS}}}}}) Defense by {Localize-and-Fix}},
booktitle = {31st USENIX Security Symposium (USENIX Security 22)},
year = {2022},
isbn = {978-1-939133-31-1},
address = {Boston, MA},
pages = {4183--4200},
url = {https://www.usenix.org/conference/usenixsecurity22/presentation/li-yeting},
publisher = {USENIX Association},
month = aug
}

Presentation Video