USENIX Security '22 has three submission deadlines. Prepublication versions of the accepted papers from the winter submission deadline are available below. The full program will be available soon.
Mojtaba Zaheri, Yossi Oren, and Reza Curtmola, New Jersey Institute of Technology
Targeted deanonymization attacks let a malicious website discover whether a website visitor bears a certain public identifier, such as an email address or a Twitter handle. These attacks were previously considered to rely on several assumptions, limiting their practical impact. In this work, we challenge these assumptions and show the attack surface for deanonymization attacks is drastically larger than previously considered. We achieve this by using the cache side channel for our attack, instead of relying on cross-site leaks. This makes our attack oblivious to recently proposed software-based isolation mechanisms, including cross-origin resource policies (CORP), cross-origin opener policies (COOP) and SameSite cookie attribute. We evaluate our attacks on multiple hardware microarchitectures, multiple operating systems and multiple browser versions, including the highly-secure Tor Browser, and demonstrate practical targeted deanonymization attacks on major sites, including Google, Twitter, LinkedIn, TikTok, Facebook, Instagram and Reddit. Our attack runs in less than 3 seconds in most cases, and can be scaled to target an exponentially large amount of users.
To stop these attacks, we present a full-featured defense deployed as a browser extension. To minimize the risk to vulnerable individuals, our defense is already available on the Chrome and Firefox app stores. We have also responsibly disclosed our findings to multiple tech vendors, as well as to the Electronic Frontier Foundation. Finally, we provide guidance to websites and browser vendors, as well as to users who cannot install the extension.
"They Look at Vulnerability and Use That to Abuse You'': Participatory Threat Modelling with Migrant Domestic Workers
Julia Słupska and Selina Cho, University of Oxford; Marissa Begonia, Voice of Domestic Workers; Ruba Abu-Salma, King's College London; Nayanatara Prakash, University of Oxford; Mallika Balakrishnan, Migrants Organise
The needs of marginalised groups like migrant domestic workers (MDWs) are often ignored in digital privacy and security research. If considered, MDWs are treated as 'bystanders' or even as threats rather than as targets of surveillance and legitimate security subjects in their own right. Using participatory threat modelling (PTM) as a method of incorporating marginalised populations' experiences, we designed and conducted five workshops with MDWs (n=32) in the UK to identify threats to their privacy and security. We found that MDWs named government surveillance, scams and harassment, and employer monitoring (in this order) as the primary threats to their privacy and security. We also examined the methods MDWs used to stay safe online, such as configuring the privacy settings of their online accounts and creating on- and offline community support networks. Based on our findings, we developed and disseminated a digital privacy and security guide with links to further resources that MDWs can refer to. We conclude by arguing that security research must consider broader social structures like gendered work and racialised border policy that foster insecurity in the lives of MDWs. We also present the key lessons of our work, including considering data sharing from the perspective of stakeholders who do not own technology devices but are affected by them, and reflecting on how security research can stop enabling harmful forms of surveillance.
Huiying Li, Shawn Shan, and Emily Wenger, University of Chicago; Jiayun Zhang, Fudan University; Haitao Zheng and Ben Y. Zhao, University of Chicago
Deep learning systems are known to be vulnerable to adversarial examples. In particular, query-based black-box attacks do not require knowledge of the deep learning model, but can compute adversarial examples over the network by submitting queries and inspecting returns. Recent work largely improves the efficiency of those attacks, demonstrating their practicality on today's ML-as-a-service platforms.
We propose Blacklight, a new defense against query-based black-box adversarial attacks. Blacklight is driven by a fundamental insight: to compute adversarial examples, these attacks perform iterative optimization over the network, producing queries highly similar in the input space. Thus Blacklight detects query-based black-box attacks by detecting highly similar queries, using an efficient similarity engine operating on probabilistic content fingerprints. We evaluate Blacklight against eight state-of-the-art attacks, across a variety of models and image classification tasks. Blacklight identifies them all, often after only a handful of queries. By rejecting all detected queries, Blacklight prevents any attack from completing, even when persistent attackers continue to submit queries after banned accounts or rejected queries. Blacklight is also robust against several powerful countermeasures, including an optimal black-box attack that approximates white-box attacks in efficiency. Finally, we illustrate how Blacklight generalizes to other domains like text classification.
Themis: Accelerating the Detection of Route Origin Hijacking by Distinguishing Legitimate and Illegitimate MOAS
Lancheng Qin, Tsinghua University; Dan Li, Tsinghua University and Zhongguancun Laboratory; Ruifeng Li, Tsinghua Shenzhen International Graduate School; Kang Wang, Tsinghua University
Route hijacking is one of the most severe security problems in today's Internet, and route origin hijacking is the most common. While origin hijacking detection systems are already available, they suffer from tremendous pressures brought by frequent legitimate Multiple origin ASes (MOAS) conflicts. They detect MOAS conflicts on the control plane and then identify origin hijackings by data-plane probing or even manual verification. However, legitimate changes in prefix ownership can also cause MOAS conflicts, which are the majority of MOAS conflicts daily. Massive legitimate MOAS conflicts consume many resources for probing and identification, resulting in high verification costs and high verification latency in practice. In this paper, we propose a new origin hijacking system Themis to accelerate the detection of origin hijacking. Based on the ground truth dataset we built, we analyze the characteristics of different MOAS conflicts and train a classifier to filter out legitimate MOAS conflicts on the control plane. The accuracy and recall of the MOAS classifier are 95.49% and 99.20%, respectively. Using the MOAS classifier, Themis reduces 56.69% of verification costs than Argus, the state-of-the-art, and significantly accelerates the detection when many concurrent MOAS conflicts occur. The overall accuracy of Themis is almost the same as Argus.
Qi-An Fu, Dept. of Comp. Sci. and Tech., Institute for AI, Tsinghua-Bosch Joint ML Center, THBI Lab, BNRist Center, Tsinghua University, Beijing, China; Yinpeng Dong, Dept. of Comp. Sci. and Tech., Institute for AI, Tsinghua-Bosch Joint ML Center, THBI Lab, BNRist Center, Tsinghua University, Beijing, China; RealAI; Hang Su, Dept. of Comp. Sci. and Tech., Institute for AI, Tsinghua-Bosch Joint ML Center, THBI Lab, BNRist Center, Tsinghua University, Beijing, China; Peng Cheng Laboratory; Tsinghua University-China Mobile Communications Group Co., Ltd. Joint Institute; Jun Zhu, Dept. of Comp. Sci. and Tech., Institute for AI, Tsinghua-Bosch Joint ML Center, THBI Lab, BNRist Center, Tsinghua University, Beijing, China; RealAI; Peng Cheng Laboratory; Tsinghua University-China Mobile Communications Group Co., Ltd. Joint Institute; Chao Zhang, Institute for Network Science and Cyberspace / BNRist, Tsinghua University
Adversarial attacks can fool deep learning models by imposing imperceptible perturbations onto natural examples, which have provoked concerns in various security-sensitive applications. Among them, decision-based black-box attacks are practical yet more challenging, where the adversary can only acquire the final classification labels by querying the target model without access to the model's details. Under this setting, existing works usually rely on heuristics and exhibit unsatisfactory performance in terms of query efficiency and attack success rate. To better understand the rationality of these heuristics and further improve over existing methods, we propose AutoDA to automatically discover decision-based iterative adversarial attack algorithms. In our approach, we construct a generic search space of attack algorithms and develop an efficient search algorithm to explore this space. Although we adopt a small and fast model to efficiently evaluate and discover qualified attack algorithms during the search, extensive experiments demonstrate that the discovered algorithms are simple yet query-efficient when attacking larger models on the CIFAR-10 and ImageNet datasets. They achieve comparable performance with the human-designed state-of-the-art decision-based iterative attack methods consistently.
Vincent Cheval, Inria Paris; Charlie Jacomme, CISPA Helmholtz Center for Information Security; Steve Kremer, Université de Lorraine LORIA & Inria Nancy; Robert Künnemann, CISPA Helmholtz Center for Information Security
Symbolic security protocol verifiers have reached a high degree of automation and maturity. Today, experts can model real-world protocols, but this often requires model-specific encodings and deep insight into the strengths and weaknesses of each of those tools. With SAPIC+, we introduce a protocol verification platform that lifts this burden and permits choosing the right tool for the job, at any development stage. We build on the existing compiler from SAPIC to TAMARIN, and extend it with automated translations from SAPIC+ to PROVERIF and DEEPSEC, as well as powerful, protocol-independent optimizations of the existing translation. We prove each part of these translations sound. A user can thus, with a single SAPIC+ file, verify reachability and equivalence properties on the specified protocol, either using PROVERIF, TAMARIN or DEEPSEC. Moreover, the soundness of the translation allows to directly assume results proven by another tool which allows to exploit the respective strengths of each tool. We demonstrate our approach by analyzing various existing models. This includes a large case study of the 5G authentication protocols, previously analyzed in TAMARIN. Encoding this model in SAPIC+ we demonstrate the effectiveness of our approach. Moreover, we study four new case studies: the LAKE-EDHOC and the Privacy-Pass protocols, both under standardization, the SSH protocol with the agent-forwarding feature, and the recent KEMTLS protocol, a post-quantum version of the main TLS key exchange.
Harjot Kaur, Leibniz University Hannover; Sabrina Amft, CISPA Helmholtz Center for Information Security; Daniel Votipka, Tufts University; Yasemin Acar, Max Planck Institute for Security and Privacy and George Washington University; Sascha Fahl, CISPA Helmholtz Center for Information Security and Leibniz University Hannover
Studying developers is an important aspect of usable security and privacy research. In particular, studying security development challenges such as the usability of security APIs, the secure use of information sources during development or the effectiveness of IDE security plugins raised interest in recent years. However, recruiting skilled participants with software development experience is particularly challenging, and it is often not clear what security researchers can expect from certain participant samples, which can make research results hard to compare and interpret. Hence, in this work, we study for the first time opportunities and challenges of different platforms to recruit participants with software development experience for security development studies. First, we identify popular recruitment platforms in 59 papers. Then, we conduct a comparative online study with 706 participants based on self-reported software development experience across six recruitment platforms. Using an online questionnaire, we investigate participants' programming and security experiences, skills and knowledge. We find that participants across all samples report rich general software development and security experience, skills, and knowledge. Based on our results, we recommend developer recruitment from Upwork for practical coding studies and Amazon MTurk along with a pre-screening survey to reduce additional noise for larger studies. Both of these, along with Freelancer, are also recommended for security studies. We conclude the paper by discussing the impact of our results on future security development studies.
Chengbin Pang and Tiantai Zhang, Nanjing University; Ruotong Yu, University of Utah; Bing Mao, Nanjing University; Jun Xu, University of Utah
Modern disassembly tools often rely on empirical evaluations to validate their performance and discover their limitations, thus promoting long-term evolvement. To support the empirical evaluation, a foundation is the right approach to collect the ground truth knowledge. However, there has been no unanimous agreement on the approach we should use. Most users pick an approach based on their experience or will, regardless of the properties that the approach presents.
In this paper, we perform a study on the approaches to building the ground truth for binary disassembly, aiming to shed light on the right way for the future. We first provide a taxonomy of the approaches used by past research, which unveils five major mechanisms behind those approaches. Following the taxonomy, we summarize the properties of the five mechanisms from two perspectives: (i) the coverage and precision of the ground truth produced by the mechanisms and (ii) the applicable scope of the mechanisms (e.g., what disassembly tasks and what types of binaries are supported). The summarization, accompanied by quantitative evaluations, illustrates that many mechanisms are ill-suited to support the generation of disassembly ground truth. The mechanism best serving today's need is to trace the compiling process of the target binaries to collect the ground truth information.
Observing that the existing tool to trace the compiling process can still miss ground truth results and can only handle x86/x64 binaries, we extend the tool to avoid overlooking those results and support ARM32/AArch64/MIPS32/MIPS64 binaries. We envision that our extension will make the tool a better foundation to enable universal, standard ground truth for binary disassembly.
Jinyu Gu, Bojun Zhu, Mingyu Li, Wentai Li, Yubin Xia, and Haibo Chen, Shanghai Jiao Tong University
The monolithic programming model has been favored for high compatibility and easing the programming for SGX enclaves, i.e., running the secure code with all dependent libraries or even library OSes (LibOSes). Yet, it inevitably bloats the trusted computing base (TCB) and thus deviates from the goal of high security. Introducing ﬁne-grained isolation can effectively mitigate TCB bloating while existing solutions face performance issues. We observe that the off-the-shelf Intel MPK is a perfect match for efﬁcient intra-enclave isolation. Nonetheless, the trust models between MPK and SGX are incompatible by design. We hence propose LIGHTENCLAVE, which embraces non-intrusive extensions on existing SGX hardware to incorporate MPK securely and allows multiple light-enclaves isolated within one enclave. Experiments show that LIGHTENCLAVE incurs up to 4% overhead when separating secret SSL keys for server applications and can signiﬁcantly improve the performance of Graphene-SGX and Occlum by reducing the communication and runtime overhead, respectively.
Shawn Shan, Arjun Nitin Bhagoji, Haitao Zheng, and Ben Y. Zhao, University of Chicago
In adversarial machine learning, new defenses against attacks on deep learning systems are routinely broken soon after their release by more powerful attacks. In this context, forensic tools can offer a valuable complement to existing defenses, by tracing back a successful attack to its root cause, and offering a path forward for mitigation to prevent similar attacks in the future.
In this paper, we describe our efforts in developing a forensic traceback tool for poison attacks on deep neural networks. We propose a novel iterative clustering and pruning solution that trims "innocent" training samples, until all that remains is the set of poisoned data responsible for the attack. Our method clusters training samples based on their impact on model parameters, then uses an efficient data unlearning method to prune innocent clusters. We empirically demonstrate the efficacy of our system on three types of dirty-label (backdoor) poison attacks and three types of clean-label poison attacks, across domains of computer vision and malware classification. Our system achieves over 98.4% precision and 96.8% recall across all attacks. We also show that our system is robust against four anti-forensics measures specifically designed to attack it.
Peter Mayer, Karlsruhe Institute of Technology; Collins W. Munyendo, The George Washington University; Michelle L. Mazurek, University of Maryland, College Park; Adam J. Aviv, The George Washington University
We quantitatively investigated the current state of Password Manager (PM) usage and general password habits at a large, private university in the United States. Building on prior qualitative findings from SOUPS 2019, we survey n=277 faculty, staff, and students, finding that 77% of our participants already use PMs, but users of third-party PMs, as opposed to browser-based PMs, were significantly less likely to reuse their passwords across accounts. The largest factor encouraging PM adoption is perceived ease-of-use, indicating that communication and institutional campaigns should focus more on usability factors. Additionally, our work indicates the need for design improvements for browser-based PMs to encourage less password reuse as they are more widely adopted.
Henry Birge-Lee, Princeton University; Joel Wanner, ETH Zürich; Grace H. Cimaszewski, Princeton University; Jonghoon Kwon, ETH Zürich; Liang Wang, Princeton University; François Wirz, ETH Zürich; Prateek Mittal, Princeton University; Adrian Perrig, ETH Zürich; Yixin Sun, University of Virginia
Adversaries can exploit inter-domain routing vulnerabilities to intercept communication and compromise the security of critical Internet applications. Meanwhile the deployment of secure routing solutions such as Border Gateway Protocol Security (BGPsec) and Scalability, Control and Isolation On Next-generation networks (SCION) are still limited. How can we leverage emerging secure routing backbones and extend their security properties to the broader Internet?
We design and deploy an architecture to bootstrap secure routing. Our key insight is to abstract the secure routing backbone as a virtual Autonomous System (AS), called Secure Backbone AS (SBAS). While SBAS appears as one AS to the Internet, it is a federated network where routes are exchanged between participants using a secure backbone. SBAS makes BGP announcements for its customers' IP prefixes at multiple locations (referred to as Points of Presence or PoPs) allowing traffic from non-participating hosts to be routed to a nearby SBAS PoP (where it is then routed over the secure backbone to the true prefix owner). In this manner, we are the first to integrate a federated secure non-BGP routing backbone with the BGP-speaking Internet.
We present a real-world deployment of our architecture that uses SCIONLab to emulate the secure backbone and the PEERING framework to make BGP announcements to the Internet. A combination of real-world attacks and Internet-scale simulations shows that SBAS substantially reduces the threat of routing attacks. Finally, we survey network operators to better understand optimal governance and incentive models.
Sunwoo Kim, Samsung Research; Young Min Kim, Jaewon Hur, and Suhwan Song, Seoul National University; Gwangmu Lee, EPFL; Byoungyoung Lee, Seoul National University
Universal cross-site scripting (UXSS) is a browser vulnerability, making a vulnerable browser execute an attacker's script on any web pages loaded by the browser. UXSS is considered a far more severe vulnerability than well-studied cross-site scripting (XSS). This is because the impact of UXSS is not limited to a web application, but it impacts each and every web application as long as a victim user runs a vulnerable browser. We find that UXSS vulnerabilities are difficult to find, especially through fuzzing, for the following two reasons. First, it is challenging to detect UXSS because it is a semantic vulnerability. In order to detect UXSS, one needs to understand the complex interaction semantics between web pages. Second, it is difficult to generate HTML inputs that trigger UXSS since one needs to drive the browser to perform complex interactions and navigations.
This paper proposes FuzzOrigin, a browser fuzzer designed to detect UXSS vulnerabilities. FuzzOrigin addresses the above two challenges by (i) designing an origin sanitizer with a static origin tagging mechanism and (ii) prioritizing origin-update operations through generating chained-navigation operations handling dedicated events. We implemented FuzzOrigin, which works with most modern browsers, including Chrome, Firefox, Edge, and Safari. During the evaluation, FuzzOrigin discovered four previously unknown UXSS vulnerabilities, one in Chrome and three in Firefox, all of which have been confirmed by the vendors. FuzzOrigin is responsible for finding one out of two UXSS vulnerabilities in Chrome reported in 2021 and all three in Firefox, highlighting its strong effectiveness in finding new UXSS vulnerabilities.
Zekun Shen, Ritik Roongta, and Brendan Dolan-Gavitt, NYU
Peripheral hardware in modern computers is typically assumed to be secure and not malicious, and device drivers are implemented in a way that trusts inputs from hardware. However, recent vulnerabilities such as Broadpwn have demonstrated that attackers can exploit hosts through vulnerable peripherals, highlighting the importance of securing the OS-peripheral boundary. In this paper, we propose a hardware-free concolic-augmented fuzzer targeting WiFi and Ethernet drivers, and a technique for generating high-quality initial seeds, which we call golden seeds, that allow fuzzing to bypass difficult code constructs during driver initialization. Compared to prior work using symbolic execution or greybox fuzzing, Drifuzz is more successful at automatically finding inputs that allow network interfaces to be fully initialized, and improves fuzzing coverage by 214% (3.1×) in WiFi drivers and 60% (1.6×) for Ethernet drivers. During our experiments with fourteen PCI and USB network drivers, we find twelve previously unknown bugs, two of which were assigned CVEs.
Mohannad Ismail, Virginia Tech; Andrew Quach, Oregon State University; Christopher Jelesnianski, Virginia Tech; Yeongjin Jang, Oregon State University; Changwoo Min, Virginia Tech
ARM is becoming more popular in desktops and data centers, opening a new realm in terms of security attacks against ARM. ARM has released Pointer Authentication, a new hardware security feature that is intended to ensure pointer integrity with cryptographic primitives.
In this paper, we utilize Pointer Authentication (PA) to build a novel scheme to completely prevent any misuse of security-sensitive pointers. We propose PACTIGHT to tightly seal these pointers. PACTIGHT utilizes a strong and unique modifier that addresses the current issues with the state-of-the-art PA defense mechanisms. We implement four defenses based on the PACTIGHT mechanism. Our security and performance evaluation results show that PACTIGHT defenses are more efficient and secure. Using real PA instructions, we evaluated PACTIGHT on 30 different applications, including NGINX web server, with an average performance overhead of 4.07% even when enforcing our strongest defense. PACTIGHT demonstrates its effectiveness and efficiency with real PA instructions on real hardware.
Lawrence Roy, Stanislav Lyakhov, Yeongjin Jang, and Mike Rosulek, Oregon State University
Public-key authentication in SSH reveals more information about the participants' keys than is necessary. (1) The server can learn a client's entire set of public keys, even keys generated for other servers. (2) The server learns exactly which key the client uses to authenticate, and can further prove this fact to a third party. (3) A client can learn whether the server recognizes public keys belonging to other users. Each of these problems lead to tangible privacy violations for SSH users.
In this work we introduce a new public-key authentication method for SSH that reveals essentially the minimum possible amount of information. With our new method, the server learns only whether the client knows the private key for some authorized public key. If multiple keys are authorized, the server does not learn which one the client used. The client cannot learn whether the server recognizes public keys belonging to other users. Unlike traditional SSH authentication, our method is fully deniable.
Our method supports existing SSH keypairs of all standard flavors—RSA, ECDSA, EdDSA. It does not require users to generate new key material. As in traditional SSH authentication, clients and servers can use a mixture of different key flavors in a single authentication session.
We integrated our new authentication method into OpenSSH, and found it to be practical and scalable. For a typical client and server with at most 10 ECDSA/EdDSA keys each, our protocol requires 9 kB of communication and 12.4 ms of latency. Even for a client with 20 keys and server with 100 keys, our protocol requires only 12 kB of communication and 26.7 ms of latency.
Estimating Incidental Collection in Foreign Intelligence Surveillance: Large-Scale Multiparty Private Set Intersection with Union and Sum
Anunay Kulshrestha and Jonathan Mayer, Princeton University
Section 702 of the Foreign Intelligence Surveillance Act authorizes U.S. intelligence agencies to intercept communications content without obtaining a warrant. While Section 702 requires targeting foreigners abroad for intelligence purposes, agencies "incidentally" collect communications to or from Americans and can search that data for purposes beyond intelligence gathering. For over a decade, members of Congress and civil society organizations have called on the U.S. Intelligence Community (IC) to estimate the scale of incidental collection. Senior intelligence officials have acknowledged the value of quantitative transparency for incidental collection, but the IC has not identified a satisfactory estimation method that respects individual privacy, protects intelligence sources and methods, and imposes minimal burden on IC resources.
In this work, we propose a novel approach to estimating incidental collection using secure multiparty computation (MPC). The IC possesses records about the parties to intercepted communications, and communications services possess country-level location for users. By combining these datasets with MPC, it is possible to generate an automated aggregate estimate of incidental collection that maintains confidentiality for intercepted communications and user locations.
We formalize our proposal as a new variant of private set intersection, which we term multiparty private set intersection with union and sum (MPSIU-Sum). We then design and evaluate an efficient MPSIU-Sum protocol, based on elliptic curve cryptography and partially homomorphic encryption. Our protocol performs well at the large scale necessary for estimating incidental collection in Section 702 surveillance.
IHOP: Improved Statistical Query Recovery against Searchable Symmetric Encryption through Quadratic Optimization
Simon Oya and Florian Kerschbaum, University of Waterloo
Effective query recovery attacks against Searchable Symmetric Encryption (SSE) schemes typically rely on auxiliary ground-truth information about the queries or dataset. Query recovery is also possible under the weaker statistical auxiliary information assumption, although statistical-based attacks achieve lower accuracy and are not considered a serious threat. In this work we present IHOP, a statistical-based query recovery attack that formulates query recovery as a quadratic optimization problem and reaches a solution by iterating over linear assignment problems. We perform an extensive evaluation with five real datasets, and show that IHOP outperforms all other statistical-based query recovery attacks under different parameter and leakage configurations, including the case where the client uses some access-pattern obfuscation defenses. In some cases, our attack achieves almost perfect query recovery accuracy. Finally, we use IHOP in a frequency-only leakage setting where the client's queries are correlated, and show that our attack can exploit query dependencies even when PANCAKE, a recent frequency-hiding defense by Grubbs et al., is applied. Our findings indicate that statistical query recovery attacks pose a severe threat to privacy-preserving SSE schemes.
Maya Dotan, Saar Tochner, Aviv Zohar, and Yossi Gilad, The Hebrew University of Jerusalem
Payment channel networks (PCNs) provide a faster and cheaper alternative to transactions recorded on the blockchain. Clients can trustlessly establish payment channels with relays by locking coins and then send signed payments that shift coin balances over the network's channels. Although payments are never published, anyone can track a client's payment by monitoring changes in coin balances over the network's channels. We present Twilight, the first PCN that provides a rigorous differential privacy guarantee to its users. Relays in Twilight run a noisy payment processing mechanism that hides the payments they carry. This mechanism increases the relay's cost, so Twilight combats selfish relays that wish to avoid it, using a trusted execution environment (TEE) that ensures they follow its protocol. The TEE does not store the channel's state, which minimizes the trusted computing base. Crucially, Twilight ensures that even if a relay breaks the TEE's security, it cannot break the integrity of the PCN. We analyze Twilight in terms of privacy and cost and study the trade-off between them. We implement Twilight using Intel's SGX framework and evaluate its performance using relays deployed on two continents. We show that a route consisting of 4 relays handles 820 payments/sec.
Olga Gkountouna, University of Liverpool; Katerina Doka, National Technical University of Athens; Mingqiang Xue, Tower Research; Jianneng Cao, Bank Jago; Panagiotis Karras, Aarhus University
How can we orchestrate an one-off sharing of informative data about individuals, while bounding the risk of disclosing sensitive information to an adversary who has access to the global distribution of such information and to personal identifiers? Despite intensive efforts, current privacy protection techniques fall short of this objective. Differential privacy provides strong guarantees regarding the privacy risk incurred by one's participation in the data at the cost of high information loss and is vulnerable to learning-based attacks exploiting correlations among data. Syntactic anonymization bounds the risk on specific sensitive information incurred by data publication, yet typically resorts to a superfluous clustering of individuals into groups that forfeits data utility.
In this paper, we develop algorithms for disclosure control that abide to sensitive-information-oriented syntactic privacy guarantees and gain up to 77% in utility against current methods. We achieve this feat by recasting data heterogeneously, via bipartite matching, rather than homogeneously via clustering. We show that our methods resist adversaries who know the employed algorithm and its parameters. Our experimental study featuring synthetic and real data, as well as real learning and data analysis tasks, shows that these methods enhance data utility with a runtime overhead that is small and reducible by data partitioning, while the β-likeness guarantee with heterogeneous generalization staunchly resists machine-learning-based attacks, hence offers practical value.
Timothy Trippel and Kang G. Shin, University of Michigan; Alex Chernyakhovsky, Garret Kelly, and Dominic Rizzo, Google, LLC; Matthew Hicks, Virginia Tech
Hardware flaws are permanent and potent: hardware cannot be patched once fabricated, and any flaws may undermine even formally verified software executing on top. Consequently, verification time dominates implementation time. The gold standard in hardware Design Verification (DV) is dynamic random testing, due to its scalability to large designs. However, given its undirected nature, this technique is inefficient.
Instead of making incremental improvements to existing dynamic hardware verification approaches, we leverage the observation that existing software fuzzers already provide such a solution, and hence adapt them for hardware verification. Specifically, we translate RTL hardware to a software model and fuzz that model directly. The central challenge we address is how to mitigate the differences between the hardware and software execution models. This includes: 1) how to represent test cases, 2) what is the hardware equivalent of a crash, 3) what is an appropriate coverage metric, and 4) how to create a general-purpose fuzzing harness for hardware.
To evaluate our approach, we design, implement, and open-source a Hardware Fuzzing Pipeline that enables fuzzing hardware at scale, using only open-source tools. Using our pipeline, we fuzz five IP blocks from Google's OpenTitan Root-of-Trust chip, four SiFive TileLink peripherals, three RISC-V CPUs, and an FFT accelerator. Our experiments reveal a two orders-of-magnitude reduction in run time to achieve similar Finite State Machine coverage over traditional dynamic verification schemes, and 26.70% better HDL line coverage than prior work. Moreover, with our bus-centric harness, we achieve over 83% HDL line coverage in four of the five OpenTitan IPs we study—without any initial seeds—and are able to detect all bugs (four synthetic from Hack@DAC and one real) implanted across all five OpenTitan IPs we study, with less than 10 hours of fuzzing.
Yufei Chen, Xi'an Jiaotong University & City University of Hong Kong; Chao Shen, Xi'an Jiaotong University; Cong Wang, City University of Hong Kong; Yang Zhang, CISPA Helmholtz Center for Information Security
Transfer learning has become a common solution to address training data scarcity in practice. It trains a specified student model by reusing or fine-tuning early layers of a well-trained teacher model that is usually publicly available. However, besides utility improvement, the transferred public knowledge also brings potential threats to model confidentiality, and even further raises other security and privacy issues.
In this paper, we present the first comprehensive investigation of the teacher model exposure threat in the transfer learning context, aiming to gain a deeper insight into the tension between public knowledge and model confidentiality. To this end, we propose a teacher model fingerprinting attack to infer the origin of a student model, i.e., the teacher model it transfers from. Specifically, we propose a novel optimization-based method to carefully generate queries to probe the student model to realize our attack. Unlike existing model reverse engineering approaches, our proposed fingerprinting method neither relies on fine-grained model outputs, e.g., posteriors, nor auxiliary information of the model architecture or training dataset. We systematically evaluate the effectiveness of our proposed attack. The empirical results demonstrate that our attack can accurately identify the model origin with few probing queries. Moreover, we show that the proposed attack can serve as a stepping stone to facilitating other attacks against machine learning models, such as model stealing.
Birds of a Feather Flock Together: How Set Bias Helps to Deanonymize You via Revealed Intersection Sizes
Xiaojie Guo, Ye Han, Zheli Liu, Ding Wang, and Yan Jia, Nankai University; Jin Li, Guangzhou University
Secure two-party protocols that compute intersection-related statistics have attracted much attention from the industry. These protocols enable two organizations to jointly compute a function (e.g., count and sum) over the intersection of their sets without explicitly revealing this intersection. However, most of such protocols will reveal the intersection size of the two sets in the end. In this work, we are interested in how well an attacker can leverage the revealed intersection sizes to infer some elements' membership of one organization's set. Even disclosing an element's membership of one organization's set to the other organization may violate privacy regulations (e.g., GDPR) since such an element is usually used to identify a person between two organizations. We are the first to study this set membership leakage in intersection-size-revealing protocols. We propose two attacks, namely, baseline attack and feature-aware attack, to evaluate this leakage in realistic scenarios. In particular, our feature-aware attack exploits the realistic set bias that elements with specific features are more likely to be the members of one organization's set. The results show that our two attacks can infer 2.0 ∼ 72.7 set members on average in three realistic scenarios. If the set bias is not weak, the feature-aware attack will outperform the baseline one. For example, in COVID-19 contact tracing, the feature-aware attack can find 25.9 tokens of infected patients in 135 protocol invocations, 1.5 × more than the baseline attack. We discuss how such results may cause negative real-world impacts and propose possible defenses against our attacks.
Xuewei Feng, Department of Computer Science and Technology & BNRist, Tsinghua University; Qi Li, Institute for Network Sciences and Cyberspace & BNRist, Tsinghua University and Zhongguancun Lab; Kun Sun, Department of Information Sciences and Technology & CSIS, George Mason University; Zhiyun Qian, UC Riverside; Gang Zhao, Department of Computer Science and Technology & BNRist, Tsinghua University; Xiaohui Kuang, Beijing University of Posts and Telecommunications; Chuanpu Fu, Department of Computer Science and Technology & BNRist, Tsinghua University; Ke Xu, Department of Computer Science and Technology & BNRist, Tsinghua University and Zhongguancun Lab
ICMP redirect is a mechanism that allows an end host to dynamically update its routing decisions for particular destinations. Previous studies show that ICMP redirect may be exploited by attackers to manipulate the routing of victim traffic. However, it is widely believed that ICMP redirect attacks are not a real-world threat since they can only occur under specific network topologies (e.g., LAN). In this paper, we conduct a systematic study on the legitimacy check mechanism of ICMP and uncover a fundamental gap between the check mechanism and stateless protocols, resulting in a wide range of vulnerabilities. In particular, we find that off-path attackers can utilize a suite of stateless protocols (e.g., UDP, ICMP, GRE, IPIP and SIT) to easily craft evasive ICMP error messages, thus revitalizing ICMP redirect attacks to cause serious damage in the real world, particularly, on the wide-area network. First, we show that off-path attackers can conduct a stealthy DoS attack by tricking various public servers on the Internet into mis-redirecting their traffic into black holes with a single forged ICMP redirect message. For example, we reveal that more than 43K popular websites on the Internet are vulnerable to this DoS attack. In addition, we identify 54.47K open DNS resolvers and 186 Tor nodes on the Internet are vulnerable as well. Second, we show that, by leveraging ICMP redirect attacks against NATed networks, off-path attackers in the same NATed network can perform a man-in-the-middle (MITM) attack to intercept the victim traffic. Finally, we develop countermeasures to throttle the attacks.
Sunil Manandhar and Kaushal Kafle, William & Mary; Benjamin Andow, Google LLC; Kapil Singh, IBM T.J. Watson Research Center; Adwait Nadkarni, William & Mary
Smart home devices transmit highly sensitive usage information to servers owned by vendors or third-parties as part of their core functionality. Hence, it is necessary to provide users with the context in which their device data is collected and shared, to enable them to weigh the benefits of deploying smart home technology against the resulting loss of privacy. As privacy policies are generally expected to precisely convey this information, we perform a systematic and data-driven analysis of the current state of smart home privacy policies, with a particular focus on three key questions: (1) how hard privacy policies are for consumers to obtain, (2) how existing policies describe the collection and sharing of device data, and (3) how accurate these descriptions are when compared to information derived from alternate sources. Our analysis of 596 smart home vendors, affecting 2, 442 smart home devices yields 17 findings that impact millions of users, demonstrate gaps in existing smart home privacy policies, as well as challenges and opportunities for automated analysis.
Agnieszka Dutkowska-Zuk, Lancaster University; Austin Hounsel, Princeton University; Amy Morrill, University of Chicago; Andre Xiong, Princeton University; Marshini Chetty and Nick Feamster, University of Chicago
Virtual Private Networks (VPNs) are often used to protect online users' privacy, but many VPNs do not guarantee privacy and may even compromise user privacy through leakage of traffic flows, data collection and sharing, and so forth. In this paper, we aim to understand the extent to which people are aware of privacy and security risks when using VPNs as well as how they use and adopt VPNs in the first place. To do so, we conducted a study of 729 VPN users in the United States about their VPN usage habits and preferences. Our study comprised 32 in-person interviews with university students, a survey of 349 university students and a survey of 348 general VPN users on Prolific. We have three main findings. First, although a general population of VPN users primarily use VPNs to improve privacy and security, students are additionally concerned with access to content (e.g., circumvention of geographic restrictions). Second, both groups concluded that VPNs collect data about them, exposing gaps both in mental models about how VPNs work and awareness of the risks of data collection. Finally, most users learned about VPNs in high school or college and use free VPNs but feel safer using VPNs provided by their institutions. These results could form the basis of future research, awareness campaigns, and regulatory activity.
Xudong Pan, Mi Zhang, Beina Sheng, Jiaming Zhu, and Min Yang, Fudan University
The vulnerability of deep neural networks (DNN) to backdoor (trojan) attacks is extensively studied for the image domain. In a backdoor attack, a DNN is modified to exhibit expected behaviors under attacker-specified inputs (i.e., triggers). Exploring the backdoor vulnerability of DNN in natural language processing (NLP), recent studies are limited to using specially added words/phrases as the trigger pattern (i.e., word-based triggers), which distorts the semantics of the base sentence, causes perceivable abnormality in linguistic features and can be eliminated by potential defensive techniques.
In this paper, we present LiMnguistic Style-Motivated backdoor attack (LISM), the first hidden trigger backdoor attack which exploits implicit linguistic styles for backdooring NLP models. Besides the basic requirements on attack success rate and normal model performance, LISM realizes the following advanced design goals compared with previous word-based backdoor: (a) LISM weaponizes text style transfer models to learn to generate sentences with an attacker-specified linguistic style (i.e., trigger style), which largely preserves the malicious semantics of the base sentence and reveals almost no abnormality exploitable by detection algorithms. (b) Each base sentence is dynamically paraphrased to hold the trigger style, which has almost no dependence on common words or phrases and therefore evades existing defenses which exploit the strong correlation between trigger words and misclassification. Extensive evaluation on 5 popular model architectures, 3 real-world security-critical tasks, 3 trigger styles and 3 potential countermeasures strongly validates the effectiveness and the stealthiness of LISM.
Matheus E. Garbelini, Vaibhav Bedi, and Sudipta Chattopadhyay, Singapore University of Technology and Design; Sumei Sun and Ernest Kurniawan, Institute for Infocomm Research, A*Star
In this paper we propose, design and evaluate a systematic directed fuzzing framework to automatically discover implementation bugs in arbitrary Bluetooth Classic (BT) devices. The core of our fuzzer is the first over-the-air approach that takes full control of the BT controller baseband from the host. This enables us to intercept and modify arbitrary packets, as well as to inject packets out-of-order in lower layers of closed-source BT stack, i.e., Link Manager Protocol (LMP) and Baseband. To systematically guide our fuzzing process, we propose an extensible and novel rule-based approach to automatically construct the protocol state machine during normal over-the-air communication. In particular, by writing a simple set of rules to identify protocol messages, we can dynamically construct an abstracted protocol state machine, fuzz packets resulting from a state and validate responses from target devices. As of today, we have fuzzed 13 BT devices from 11 vendors and we have discovered a total of 18 unknown implementation flaws, with 24 common vulnerability exposures (CVEs) assigned. Furthermore, our discoveries were awarded with six bug bounties from certain vendors. Finally, to show the broader applicability of our framework beyond BT, we have extended our approach to fuzz other wireless protocols, which additionally revealed 6 unknown bugs in certain Wi-Fi and BLE Host stacks.
Brian Kondracki, Johnny So, and Nick Nikiforakis, Stony Brook University
Since its creation, Certificate Transparency (CT) has served as a vital component of the secure web. However, with the increase in TLS adoption, CT has essentially become a defacto log for all newly-created websites, announcing to the public the existence of web endpoints, including those that could have otherwise remained hidden. As a result, web bots can use CT to probe websites in real time, as they are created. Little is known about these bots, their behaviors, and their intentions.
In this paper we present CTPOT, a distributed honeypot system which creates new TLS certificates for the purpose of advertising previously non-existent domains, and records the activity generated towards them from a number of network vantage points. Using CTPOT, we create 4,657 TLS certificates over a period of ten weeks, attracting 1.5 million web requests from 31,898 unique IP addresses. We find that CT bots occupy a distinct subset of the overall web bot population, with less than 2% overlap between IP addresses of CT bots and traditional host-scanning web bots. By creating certificates with varying content types, we are able to further sub-divide the CT bot population into subsets of varying intentions, revealing a stark contrast in malicious behavior among these groups. Finally, we correlate observed bot IP addresses into campaigns using the file paths requested by each bot, and find 105 malicious campaigns targeting the domains we advertise. Our findings shed light onto the CT bot ecosystem, revealing that it is not only distinct to that of traditional IP-based bots, but is composed of numerous entities with varying targets and behaviors.
Fangming Gu and Qingli Guo, Institute of Information Engineering, Chinese Academy of Sciences and School of Cyber Security, University of Chinese Academy of Sciences; Lian Li, Institute of Computing Technology, Chinese Academy of Sciences and School of Computer Science and Technology, University of Chinese Academy of Sciences; Zhiniang Peng, Sangfor Technologies Inc and Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; Wei Lin, Xiaobo Yang, and Xiaorui Gong, Institute of Information Engineering, Chinese Academy of Sciences and School of Cyber Security, University of Chinese Academy of Sciences
The Microsoft Component Object Model (COM) is the foundation for many key Microsoft technologies and we develop COMRace, the first data race vulnerability detection tool for commercial off-the-shelf COM objects. COMRace targets a severe but previously overlooked flaw in the COM threading model, which makes COM objects prone to data race attacks. In COMRace, we apply static binary analyses to identify thread-unsafe interface methods in off-the-shelf COM binaries, then further verify binary analyses results with automatically synthesized proof-of-concept exploits (PoC). We have applied COMRace to 10,420 registered COM objects on the windows platform and the tool reports 186 vulnerable interface methods. COMRace automatically synthesizes 234 PoCs for 256 selected method pairs (82 unsafe methods) with conflict accesses, and there are 194 PoCs triggering race conditions. Furthermore, 145 PoCs lead to critical memory corruptions, exposing 26 vulnerabilities confirmed by the Common Vulnerabilities and Exposures (CVE) database.
Alejandro Cuevas, Carnegie Mellon University; Fieke Miedema, Delft University of Technology; Kyle Soska, University of Illinois Urbana Champaign and Hikari Labs, Inc.; Nicolas Christin, Carnegie Mellon University and Hikari Labs, Inc.; Rolf van Wegberg, Delft University of Technology
A number of recent studies have investigated online anonymous ("dark web") marketplaces. Almost all leverage a "measurement-by-proxy" design, in which researchers scrape market public pages, and take buyer reviews as a proxy for actual transactions, to gain insights into market size and revenue. Yet, we do not know if and how this method biases results.
We build a framework to reason about marketplace measurement accuracy, and use it to contrast estimates projected from scrapes of Hansa Market with data from a back-end database seized by the police. We further investigate, by simulation, the impact of scraping frequency, consistency and rate-limits. We find that, even with a decent scraping regimen, one might miss approximately 46% of objects—with scraped listings differing significantly from not-scraped listings on price, views and product categories. This bias also impacts revenue calculations. We find Hansa's total market revenue to be US $50M, which projections based on our scrapes underestimate by a factor of four. Simulations further show that studies based on one or two scrapes are likely to suffer from a very poor coverage (on average, 14% to 30%, respectively).
A high scraping frequency is crucial to achieve reliable coverage, even without a consistent scraping routine. When high-frequency scraping is difficult, e.g., due to deployed anti-scraping countermeasures, innovative scraper design, such as scraping most popular listings first, helps improve coverage. Finally, abundance estimators can provide insights on population coverage when population sizes are unknown.
Andreas Kogler, Graz University of Technology; Jonas Juffinger, Graz University of Technology and Lamarr Security Research; Salman Qazi and Yoongu Kim, Google; Moritz Lipp, Amazon Web Services; Nicolas Boichat, Google; Eric Shiu, Rivos; Mattias Nissler, Google; Daniel Gruss, Graz University of Technology
Rowhammer is a vulnerability in modern DRAM where repeated accesses to one row (the aggressor) give off electrical disturbance whose cumulative effect flips the bits in an adjacent row (the victim). Consequently, Rowhammer defenses presuppose the adjacency of aggressor-victim pairs, including those in LPDDR4 and DDR4, most notably TRR.
In this paper, we present Half-Double, an escalation of Rowhammer to rows beyond immediate neighbors. Using Half-Double, we induce errors in a victim by combining many accesses to a distance-2 row with just a few to a distance-1 row. Our experiments show that the cumulative effect of these leads to a sufficient electrical disturbance in the victim row, inducing bit flips. We demonstrate the practical relevance of Half-Double in a proof-of-concept attack on a fully up-to-date system. We use side channels, a new technique called BlindHammering, a new spraying technique, and a Spectre attack in our end-to-end Half-Double Attack. On recent Chromebooks with ECC- and TRR-protected LPDDR4x memory, the attack takes less than 45 minutes on average.
Pietro Borrello, Sapienza University of Rome; Andreas Kogler and Martin Schwarzl, Graz University of Technology; Moritz Lipp, Amazon Web Services; Daniel Gruss, Graz University of Technology; Michael Schwarz, CISPA Helmholtz Center for Information Security
CPU vulnerabilities undermine the security guarantees provided by software- and hardware-security improvements. While the discovery of transient-execution attacks increased the interest in CPU vulnerabilities on a microarchitectural level, architectural CPU vulnerabilities are still understudied.
In this paper, we systematically analyze existing CPU vulnerabilities showing that CPUs suffer from vulnerabilities whose root causes match with those in complex software. We show that transient-execution attacks and architectural vulnerabilities often arise from the same type of bug and identify the blank spots. Investigating the blank spots, we focus on architecturally improperly initialized data locations.
We discover ÆPIC Leak, the first architectural CPU bug that leaks stale data from the microarchitecture without using a side channel. ÆPIC Leak works on all recent Sunny- Cove-based Intel CPUs (i.e., Ice Lake and Alder Lake). It architecturally leaks stale data incorrectly returned by reading undefined APIC-register ranges. ÆPIC Leak samples data transferred between the L2 and last-level cache, including SGX enclave data, from the superqueue. We target data in use, e.g., register values and memory loads, as well as data at rest, e.g., SGX-enclave data pages. Our end-to-end attack extracts AES-NI, RSA, and even the Intel SGX attestation keys from enclaves within a few seconds. We discuss mitigations and conclude that the only short-term mitigations for ÆPIC Leak are to disable APIC MMIO or not rely on SGX.
Yingchen Wang, University of Texas at Austin; Riccardo Paccagnella and Elizabeth Tang He, University of Illinois Urbana-Champaign; Hovav Shacham, University of Texas at Austin; Christopher W. Fletcher, University of Illinois Urbana-Champaign; David Kohlbrenner, University of Washington
Power side-channel attacks exploit data-dependent variations in a CPU's power consumption to leak secrets. In this paper, we show that on modern Intel (and AMD) x86 CPUs, power side-channel attacks can be turned into timing attacks that can be mounted without access to any power measurement interface. Our discovery is enabled by dynamic voltage and frequency scaling (DVFS). We find that, under certain circumstances, DVFS-induced variations in CPU frequency depend on the current power consumption (and hence, data) at the granularity of milliseconds. Making matters worse, these variations can be observed by a remote attacker, since frequency differences translate to wall time differences!
The frequency side channel is theoretically more powerful than the software side channels considered in cryptographic engineering practice today, but it is difficult to exploit because it has a coarse granularity. Yet, we show that this new channel is a real threat to the security of cryptographic software. First, we reverse engineer the dependency between data, power, and frequency on a modern x86 CPU—finding, among other things, that differences as seemingly minute as a set bit's position in a word can be distinguished through frequency changes. Second, we describe a novel chosen-ciphertext attack against (constant-time implementations of) SIKE, a post-quantum key encapsulation mechanism, that amplifies a single key-bit guess into many thousands of high- or low-power operations, allowing full key extraction via remote timing.
Jean-Luc Watson, Sameer Wagh, and Raluca Ada Popa, University of California, Berkeley
Secure multi-party computation (MPC) is an essential tool for privacy-preserving machine learning (ML). However, secure training of large-scale ML models currently requires a prohibitively long time to complete. Given that large ML inference and training tasks in the plaintext setting are significantly accelerated by Graphical Processing Units (GPUs), this raises the natural question: can secure MPC leverage GPU acceleration? A few recent works have studied this question in the context of accelerating specific components or protocols, but do not provide a general-purpose solution. Consequently, MPC developers must be both experts in cryptographic protocol design and proficient at low-level GPU kernel development to achieve good performance on any new protocol implementation.
We present Piranha, a general-purpose, modular platform for accelerating secret sharing-based MPC protocols using GPUs. Piranha allows the MPC community to easily leverage the benefits of a GPU without requiring GPU expertise. Piranha contributes a three-layer architecture: (1) a device layer that can independently accelerate secret-sharing protocols by providing integer-based kernels absent in current general-purpose GPU libraries, (2) a modular protocol layer that allows developers to maximize utility of limited GPU memory with in-place computation and iterator-based support for non-standard memory access patterns, and (3) an application layer that allows applications to remain completely agnostic to the underlying protocols they use.
To demonstrate the benefits of Piranha, we implement 3 state-of-the-art linear secret sharing MPC protocols for secure NN training: 2-party SecureML (IEEE S&P '17), 3-party Falcon (PETS '21), and 4-party FantasticFour (USENIX Security '21). Compared to their CPU-based implementations, the same protocols implemented on top of Piranha's protocol-agnostic acceleration exhibit a 16-48x decrease in training time. For the first time, Piranha demonstrates the feasibility of training a realistic neural network (e.g. VGG), end-to-end, using MPC in a little over one day. Piranha is open source and available at https://github.com/ucbrise/piranha.
Ju Chen, UC Riverside; Wookhyun Han, KAIST; Mingjun Yin, Haochen Zeng, and Chengyu Song, UC Riverside; Byoungyoung Lee, Seoul National University; Heng Yin, UC Riverside; Insik Shin, KAIST
Concolic execution is a powerful program analysis technique for systematically exploring execution paths. Compared to random-mutation-based fuzzing, concolic execution is especially good at exploring paths that are guarded by complex and tight branch predicates. The drawback, however, is that concolic execution engines are much slower than native execution. While recent advances in concolic execution have significantly reduced its performance overhead, our analysis shows that state-of-the-art concolic executors overlook the overhead for managing symbolic expressions. Based on the observation that concolic execution can be modeled as a special form of dynamic data-flow analysis, we propose to leverage existing highly-optimized data-flow analysis frameworks to implement concolic executors. To validate this idea, we implemented a prototype SYMSAN based on the data-flow sanitizer of LLVM and evaluated it against the state-of-the-art concolic executors SymCC and SymQEMU with three sets of programs: nbench, the DARPA Cyber Grand Challenge dataset, and real-world applications from Google's Fuzzbench and binutils. The results showed that SYMSAN has a much lower overhead for managing symbolic expressions. The reduced overhead can also lead to faster concolic execution and improved code coverage.
Zirui Neil Zhao, University of Illinois Urbana-Champaign; Adam Morrison, Tel Aviv University; Christopher W. Fletcher and Josep Torrellas, University of Illinois Urbana-Champaign
Microarchitectural side channels are a pressing security threat. These channels are created when programs modulate hardware resources in a secret data-dependent fashion. They are broadly classified as being either stateful or stateless (also known as contention-based), depending on whether they leave behind a trace for attackers to later observe. Common wisdom suggests that stateful channels are significantly easier to monitor than stateless ones, and hence have received the most attention.
In this paper, we present a novel stateless attack that shows this common wisdom is not always true. Our attack, called Binoculars, exploits unexplored interactions between in-flight page walk operations and other memory operations. Unlike other stateless channels, Binoculars creates significant timing perturbations—up to 20,000 cycles stemming from a single dynamic instruction—making it easy to monitor. We show how these perturbations are address dependent, enabling Binoculars to leak more virtual address bits in victim memory operations than any prior channel. Binoculars needs no shared memory between the attacker and the victim.
Using Binoculars, we design both covert- and side-channel attacks. Our covert channel achieves a high capacity of 1116 KB/s on a Cascade Lake-X machine. We then design a sidechannel attack that steals keys from OpenSSL's side-channel resistant ECDSA by learning the ECDSA nonce k. Binoculars' ability to significantly amplify subtle behaviors, e.g., orderings of stores, is crucial for this attack to succeed because the nonce changes after each run. Finally, we fully break kernel ASLR.
Flavien Solt, ETH Zurich; Ben Gras, Intel Corporation; Kaveh Razavi, ETH Zurich
Dynamic Information Flow Tracking (dynamic IFT) is a well-known technique with many security applications such as analyzing the behavior of a system given an input and detecting security violations. While there are many widely used open dynamic IFT solutions that scale to large software, the same level of support is unfortunately lacking for hardware. This gap is becoming more pronounced with the increasing complexity of open-source hardware and the plethora of recent hardware attacks.
We introduce CellIFT, a new design point in the space of dynamic IFT for hardware. CellIFT leverages the logical macrocell abstraction (e.g., an adder) to achieve scalability, precision and completeness when instrumenting a given Register Transfer Level (RTL) hardware design. Cell-level dynamic IFT does not suffer from the scalability problems that are inherent to lower levels of abstraction such as gates, yet it achieves completeness given the limited number of cell types. We show the versatility of CellIFT by instrumenting five distinct RISC-V designs, one of which is a complete SoC. The only existing complete solution already fails to instrument two of these designs. Our extensive evaluation using microbenchmarks and standard RISC-V benchmarks on the instrumented designs shows that CellIFT is 21× to 61× faster than the state of the art in terms of simulation runtime without losing precision. We further show-case concrete applications of CellIFT in four scenarios by detecting: 1) sources of microarchitectural information leakage, 2) microarchitectural bugs such as Meltdown, 3) speculative vulnerabilities such as Spectre-BCB, and 4) SoC-wide architectural design flaws. We release CellIFT as open source to enable RTL-level security research for the wider community.
MOVERY: A Precise Approach for Modified Vulnerable Code Clone Discovery from Modified Open-Source Software Components
Seunghoon Woo, Hyunji Hong, Eunjin Choi, and Heejo Lee, Korea University
Vulnerabilities inherited from third-party open-source software (OSS) components can compromise the entire software security. However, discovering propagated vulnerable code is challenging as it proliferates with various code syntaxes owing to the OSS modifications, more specifically, internal (e.g., OSS updates) and external modifications of OSS (e.g., code changes that occur during the OSS reuse).
In this paper, we present MOVERY, a precise approach for discovering vulnerable code clones (VCCs) from modified OSS components. By considering the oldest vulnerable function and extracting only core vulnerable and patch lines from security patches, MOVERY generates vulnerability and patch signatures that effectively address OSS modifications. For scalability, MOVERY reduces the search space of the target software by focusing only on the codes borrowed from other OSS projects. Finally, MOVERY determines that the function is VCC when it matches the vulnerability signature and is distinctive from the patch signature.
When we applied MOVERY on ten popular software selected from diverse domains, we observed that 91% of the discovered VCCs had different code syntax from the disclosed vulnerable function. Nonetheless, MOVERY discovered VCCs at least 2.5 times more than those discovered in existing techniques, with much higher accuracy: MOVERY discovered 415 VCCs with 96% precision and 96% recall, whereas two recent VCC discovery techniques, which hardly consider internal and external OSS modifications, discovered only 163 and 72 VCCs with at most 77% precision and 38% recall.
Sebastian Roth, CISPA Helmholtz Center for Information Security; Stefano Calzavara, Università Ca' Foscari Venezia; Moritz Wilhelm, CISPA Helmholtz Center for Information Security; Alvise Rabitti, Università Ca' Foscari Venezia; Ben Stock, CISPA Helmholtz Center for Information Security
To mitigate a myriad of Web attacks, modern browsers support client-side security policies shipped through HTTP response headers. To enforce these defenses, the server needs to communicate them to the client, a seemingly straightforward process. However, users may access the same site in variegate ways, e.g., using different User-Agents, network access methods, or language settings. All these usage scenarios should enforce the same security policies, otherwise a security lottery would take place: depending on specific client characteristics, different levels of Web application security would be provided to users (inconsistencies). We formalize security guarantees provided through four popular mechanisms and apply this to measure the prevalence of inconsistencies in the security policies of top sites across different client characteristics. Based on our insights, we investigate the security implications of both deterministic and non-deterministic inconsistencies, and show how even prominent services are affected by them.
Jinsheng Ba, National University of Singapore; Marcel Böhme, Monash University and MPI-SP; Zahra Mirzamomen, Monash University; Abhik Roychoudhury, National University of Singapore
Many protocol implementations are reactive systems, where the protocol process is in continuous interaction with other processes and the environment. If a bug can be exposed only in a certain state, a fuzzer needs to provide a specific sequence of events as inputs that would take protocol into this state before the bug is manifested. We call these bugs as "stateful" bugs. Usually, when we are testing a protocol implementation, we do not have a detailed formal specification of the protocol to rely upon. Without knowledge of the protocol, it is inherently difficult for a fuzzer to discover such stateful bugs. A key challenge then is to cover the state space without an explicit specification of the protocol. Finding stateful bugs in protocol implementations would thus involve partially uncovering the state space of the protocol. Fuzzing stateful software systems would need to incorporate strategies for state identification. Such state identification may follow from manual guidance, or from automatic analysis.
In this work, we posit that manual annotations for state identification can be avoided for stateful protocol fuzzing. Specifically, we rely on a programmatic intuition that the state variables used in protocol implementations often appear in enum type variables whose values (the state names) come from named constants. In our analysis of the Top-50 most widely used open-source protocol implementations, we found that every implementation uses state variables that are assigned named constants (with easy to comprehend names such as INIT, READY) to represent the current state. In this work, we propose to automatically identify such state variables and track the sequence of values assigned to them during fuzzing to produce a "map" of the explored state space.
Our experiments confirm that our stateful fuzzer discovers stateful bugs twice as fast as the baseline greybox fuzzer that we extended. Starting from the initial state, our fuzzer exercises one order of magnitude more state/transition sequences and covers code two times faster than the baseline fuzzer. Several zero-day bugs in prominent protocol implementations were found by our fuzzer, and 8 CVEs have been assigned.
Philipp Jeitner, Fraunhofer Institute for Secure Information Technology SIT and National Research Center for Applied Cybersecurity ATHENE; Haya Shulman, Fraunhofer Institute for Secure Information Technology SIT, National Research Center for Applied Cybersecurity ATHENE, and Goethe-Universität Frankfurt; Lucas Teichmann, Fraunhofer Institute for Secure Information Technology SIT; Michael Waidner, Fraunhofer Institute for Secure Information Technology SIT, National Research Center for Applied Cybersecurity ATHENE, and Technische Universität Darmstadt
We explore the security of residential routers and find a range of critical vulnerabilities. Our evaluations show that 10 out of 36 popular routers are vulnerable to injections of fake records via misinterpretation of special characters. We also find that in 15 of the 36 routers the mechanisms, that are meant to prevent cache poisoning attacks, can be circumvented.
In our Internet-wide study with an advertisement network, we identified and analyzed 976 residential routers used by web clients, out of which more than 95% were found vulnerable to our attacks. Overall, vulnerable routers are prevalent and are distributed among 177 countries and 4830 networks.
To understand the core factors causing the vulnerabilities we perform black- and white-box analyses of the routers. We find that many problems can be attributed to incorrect assumptions on the protocols' behaviour and the Internet, misunderstanding of the standard recommendations, bugs, and simplified DNS software implementations.
We provide recommendations to mitigate our attacks. We also set up a tool to enable everyone to evaluate the security of their routers at https://xdi-attack.net/.
Harshad Sathaye, Northeastern University; Martin Strohmeier and Vincent Lenders, armasuisse; Aanjhan Ranganathan, Northeastern University
Today, there is limited knowledge about the behavior of UAVs under GPS spoofing attacks in a real-world environment, in particular considering the interplay between the UAV's software as well as other equipped navigation aids and vision sensors. This work aims to understand the feasibility and requirements of fully controlling a UAV's movements by spoofing GPS signals alone. We enumerate the challenges in accomplishing a complete UAV takeover through GPS spoofing and controlling it without crashing. We design and implement a Real-time GPS Signal Generator (RtGSG) that can be configured to generate any arbitrary trajectory and is capable of making changes to GPS signals in real-time through user input, e.g., using a keyboard or joystick. We evaluate RtGSG on popular commercial UAVs from DJI and Autel through over-the-air spoofing experiments in a controlled chamber. We explore generic and UAV-specific GPS spoofing strategies in order to best achieve complete maneuvering control (e.g., velocity and direction). This work highlights that, although COTS UAVs remain vulnerable to GPS spoofing attacks, a complete takeover and control of the UAV requires careful manipulation of the spoofing signals in real-time. Finally, we release our implementation to the scientific community for further research.
Johannes Krupp, CISPA Helmholtz Center for Information Security; Ilya Grishchenko, University of California, Santa Barbara; Christian Rossow, CISPA Helmholtz Center for Information Security
Amplification DDoS attacks remain a prevalent and serious threat to the Internet, with recent attacks reaching the Tbps range. However, all amplification attack vectors known to date were either found by researchers through laborious manual analysis or could only be identified postmortem following large attacks. Ideally though, an attack vector is discovered and mitigated before the first attack can occur.
To this end, we present AmpFuzz, the first systematic approach to find amplification vectors in UDP services in a protocol-agnostic way. AmpFuzz is based on the state-of-the-art greybox fuzzing boosted by a novel technique to make fuzzing UDP-aware, which significantly increases performance. We evaluate AmpFuzz on 28 Debian network services, where we (re-)discover 7 known and 6 previously unreported amplification vulnerabilities.
Tobias Cloosters, University of Duisburg-Essen; Johannes Willbold, Ruhr-Universität Bochum; Thorsten Holz, CISPA Helmholtz Center for Information Security; Lucas Davi, University of Duisburg-Essen
Intel's Software Guard Extensions (SGX) provide a nonintrospectable trusted execution environment (TEE) to protect security-critical code from a potentially malicious OS. This protection can only be effective if the individual enclaves are secure, which is already challenging in regular software, and this becomes even more difficult for enclaves as the entire environment is potentially malicious. As such, many enclaves expose common vulnerabilities, e.g., memory corruption and SGXspecific vulnerabilities like null-pointer dereferences. While fuzzing is a popular technique to assess the security of software, dynamically analyzing enclaves is challenging as enclaves are meant to be non-introspectable. Further, they expect an allocated multi-pointer structure as input instead of a plain buffer.
In this paper, we present SGXFUZZ, a coverage-guided fuzzer that introduces a novel binary input structure synthesis method to expose enclave vulnerabilities even without source-code access. To obtain code coverage feedback from enclaves, we show how to extract enclave code from distribution formats. We also present an enclave runner that allows execution of the extracted enclave code as a user-space application at native speed, while emulating all relevant environment interactions of the enclave. We use this setup to fuzz enclaves using a state-of-the-art snapshot fuzzing engine that deploys our novel structure synthesis stage. This stage synthesizes multi-layer pointer structures and size fields incrementally on-the-fly based on fault signals. Furthermore, it matches the expected input format of the enclave without any prior knowledge. We evaluate our approach on 30 open- and closed-source enclaves and found a total of 79 new bugs and vulnerabilities.
Moritz Schloegel, Tim Blazytko, Moritz Contag, Cornelius Aschermann, and Julius Basler, Ruhr-Universität Bochum; Thorsten Holz, CISPA Helmholtz Center for Information Security; Ali Abbasi, Ruhr-Universität Bochum
Software obfuscation is a crucial technology to protect intellectual property and manage digital rights within our society. Despite its huge practical importance, both commercial and academic state-of-the-art obfuscation methods are vulnerable to a plethora of automated deobfuscation attacks, such as symbolic execution, taint analysis, or program synthesis. While several enhanced obfuscation techniques were recently proposed to thwart taint analysis or symbolic execution, they either impose a prohibitive runtime overhead or can be removed in an automated way (e.g., via compiler optimizations). In general, these techniques suffer from focusing on a single attack vector, allowing an attacker to switch to other, more effective techniques, such as program synthesis. In this work, we present Loki, an approach for software obfuscation that is resilient against all known automated deobfuscation attacks. To this end, we use and efficiently combine multiple techniques, including a generic approach to synthesize formally verified expressions of arbitrary complexity. Contrary to state-of-the-art approaches that rely on a few hardcoded generation rules, our expressions are more diverse and harder to pattern match against. We show that even the state-of-the-art approach on Mixed-Boolean Arithmetic (MBA) deobfuscation fails to simplify them. Moreover, Loki protects against previously unaccounted attack vectors such as program synthesis, for which it reduces the success rate to merely 19%. In a comprehensive evaluation, we show that our design incurs significantly less overhead while providing a much stronger protection level compared to existing works.
Hongbin Liu, Jinyuan Jia, and Neil Zhenqiang Gong, Duke University
Contrastive learning pre-trains an image encoder using a large amount of unlabeled data such that the image encoder can be used as a general-purpose feature extractor for various downstream tasks. In this work, we propose PoisonedEncoder, a data poisoning attack to contrastive learning. In particular, an attacker injects carefully crafted poisoning inputs into the unlabeled pre-training data, such that the downstream classifiers built based on the poisoned encoder for multiple target downstream tasks simultaneously classify attacker-chosen, arbitrary clean inputs as attacker-chosen, arbitrary classes. We formulate our data poisoning attack as a bilevel optimization problem, whose solution is the set of poisoning inputs; and we propose a contrastive-learning-tailored method to approximately solve it. Our evaluation on multiple datasets shows that PoisonedEncoder achieves high attack success rates while maintaining the testing accuracy of the downstream classifiers built upon the poisoned encoder for non-attacker-chosen inputs. We also evaluate five defenses against PoisonedEncoder, including one pre-processing, three in-processing, and one post-processing defenses. Our results show that these defenses can decrease the attack success rate of PoisonedEncoder, but they also sacrifice the utility of the encoder or require a large clean pre-training dataset.
Avinash Sudhodanan, Independent Researcher; Andrew Paverd, Microsoft Security Response Center
The ubiquity of user accounts in websites and online services makes account hijacking a serious security concern. Although previous research has studied various techniques through which an attacker can gain access to a victim's account, relatively little attention has been directed towards the process of account creation. The current trend towards federated authentication (e.g., Single Sign-On) adds an additional layer of complexity because many services now support both the classic approach in which the user directly sets a password, and the federated approach in which the user authenticates via an identity provider.
Inspired by previous work on preemptive account hijacking [Ghasemisharif et al., USENIX SEC 2018], we show that there exists a whole class of account pre-hijacking attacks. The distinctive feature of these attacks is that the attacker performs some action before the victim creates an account, which makes it trivial for the attacker to gain access after the victim has created/recovered the account. Assuming a realistic attacker who knows only the victim's email address, we identify and discuss five different types of account pre-hijacking attacks.
To ascertain the prevalence of such vulnerabilities in the wild, we analyzed 75 popular services and found that at least 35 of these were vulnerable to one or more account pre-hijacking attacks. Whilst some of these may be noticed by attentive users, others were completely undetectable from the victim's perspective. Finally, we investigated the root cause of these vulnerabilities and present a set of security requirements to prevent such vulnerabilities arising in future.
Viet Tung Hoang, Cong Wu, and Xin Yuan, Florida State University
Distinguished Paper Award Winner
System logs are crucial for forensic analysis, but to be useful, they need to be tamper-proof. To protect the logs, a number of secure logging systems have been proposed from both academia and the industry. Unfortunately, except for the recent KennyLoggings construction, all other logging systems are broken by an attack of Paccagnella et al. (CCS 2020). In this work, we build a secure logging system that improves KennyLoggings in several fronts: adoptability, security, and performance. Our key insight for performance gain is to use AES on a fixed, known key. While this trick is widely used in secure distributed computing, this is the first time it has found an application in the area of symmetric-key cryptography.
Investigating State-of-the-Art Practices for Fostering Subjective Trust in Online Voting through Interviews
Karola Marky, Leibniz University Hannover and University of Glasgow; Paul Gerber and Sebastian Günther, Technical University of Darmstadt; Mohamed Khamis, University of Glasgow; Maximilian Fries and Max Mühlhäuser, Technical University of Darmstadt
Ensuring voters' subjective trust is key to adopting any voting system. Consequently, researchers, experts, and policymakers have proposed and implemented practices to foster the trust of voters in online voting. State-of-the-art practices include security features, public information, or evaluations. However, it remains unclear how these practices affect the voters' subjective trust. Through interviews with 26 participants, this work presents the first analysis of voters' perceptions considering state-of-the-art practices that help voters determine their trust in Internet voting. Among our results, we show practices, such as expert evaluations, that we identified as mandatory. Further, we found practices, such as individual verifiability, that facilitate trust. Others, such as vote updating, have a negative impact due to unfamiliarity. We, furthermore, report misconceptions, discuss ways to address them through different information interfaces or as part of the voting software. Finally, we list recommendations for the specific realization of expedient practices to inform developers and policymakers.
Yunang Chen, Yue Gao, Nick Ceccio, Rahul Chatterjee, Kassem Fawaz, and Earlence Fernandes, University of Wisconsin–Madison
Business Collaboration Platforms like Microsoft Teams and Slack enable teamwork by supporting text chatting and third-party resource integration. A user can access online file storage, make video calls, and manage a code repository, all from within the platform, thus making them a hub for sensitive communication and resources. The key enabler for these productivity features is a third-party application model. We contribute an experimental security analysis of this model and the third-party apps. Performing this analysis is challenging because commercial platforms and their apps are closed-source systems. Our analysis methodology is to systematically investigate different types of interactions possible between apps and users. We discover that the access control model in these systems violates two fundamental security principles: least privilege and complete mediation. These violations enable a malicious app to exploit the confidentiality and integrity of user messages and third-party resources connected to the platform. We construct proof-of-concept attacks that can: (1) eavesdrop on user messages without having permission to read those messages; (2) launch fake video calls; (3) automatically merge code into repositories without user approval or involvement. Finally, we provide an analysis of countermeasures that systems like Slack and Microsoft Teams can adopt today.
Ben Burgess, Princeton University; Avi Ginsberg, Georgetown Law; Edward W. Felten, Princeton University; Shaanan Cohney, University of Melbourne
Educators are rapidly switching to remote proctoring and examination software for their testing needs, both due to the COVID-19 pandemic and the expanding virtualization of the education sector. State boards are increasingly utilizing these software packages for high stakes legal and medical licensing exams. Three key concerns arise with the use of these complex programs: exam integrity, exam procedural fairness, and exam-taker security and privacy.
We conduct the first technical analysis of each of these concerns through a case study of four primary proctoring suites used in U.S. law school and state attorney licensing exams. We reverse engineer these proctoring suites and find that despite promises of high-security, all their anti-cheating measures can be trivially bypassed and can pose significant user security risks.
We evaluate current facial recognition classifiers alongside the classifier used by Examplify, the legal exam proctoring suite with the largest market share, to ascertain their accuracy and determine whether faces with certain skin tones are more readily flagged for cheating. Finally, we offer recommendations to improve the integrity and fairness of the remotely proctored exam experience.
Pool Inference Attacks on Local Differential Privacy: Quantifying the Privacy Guarantees of Apple's Count Mean Sketch in Practice
Andrea Gadotti, Imperial College London; Florimond Houssiau, Alan Turing Institute; Meenatchi Sundaram Muthu Selva Annamalai and Yves-Alexandre de Montjoye, Imperial College London
Behavioral data generated by users’ devices, ranging from emoji use to pages visited, are collected at scale to improve apps and services. These data, however, contain fine-grained records and can reveal sensitive information about individual users. Local differential privacy has been used by companies as a solution to collect data from users while preserving privacy. We here first introduce pool inference attacks, where an adversary has access to a user’s obfuscated data, defines pools of objects, and exploits the user’s polarized behavior in multiple data collections to infer the user’s preferred pool. Second, we instantiate this attack against Count Mean Sketch, a local differential privacy mechanism proposed by Apple and deployed in iOS and Mac OS devices, using a Bayesian model. Using Apple’s parameters for the privacy loss ε, we then consider two specific attacks: one in the emojis setting— where an adversary aims at inferring a user’s preferred skin tone for emojis — and one against visited websites — where an adversary wants to learn the political orientation of a user from the news websites they visit. In both cases, we show the attack to be much more effective than a random guess when the adversary collects enough data. We find that users with high polarization and relevant interest are significantly more vulnerable, and we show that our attack is well-calibrated, allowing the adversary to target such vulnerable users. We finally validate our results for the emojis setting using user data from Twitter. Taken together, our results show that pool inference attacks are a concern for data protected by local differential privacy mechanisms with a large ε, emphasizing the need for additional technical safeguards and the need for more research on how to apply local differential privacy for multiple collections.
Igibek Koishybayev and Aleksandr Nahapetyan, North Carolina State University; Raima Zachariah, Independent Researcher; Siddharth Muralee, Purdue University; Bradley Reaves and Alexandros Kapravelos, North Carolina State University; Aravind Machiry, Purdue University
Continuous integration and deployment (CI/CD) has revolutionized software development and maintenance. Commercial CI/CD platforms provide services for specifying and running CI/CD actions. However, they present a security risk in their own right, given their privileged access to secrets, infrastructure, and ability to fetch and execute arbitrary code.
In this paper, we study the security of the newly popular GitHub CI platform. We first identify four fundamental security properties that must hold for any CI/CD system: Admittance Control, Execution Control, Code Control, and Access to Secrets. We then examine if GitHub CI enforces these properties in comparison with the other five popular CI/CD platforms. We perform a comprehensive analysis of 447,238 workflows spanning 213,854 GitHub repositories. We made several disturbing observations. Our analysis shows that 99.8% of workflows are overprivileged and have read-write access (instead of read-only) to the repository. In addition, 23.7% of workflows are triggerable by a pull_request and use code from the underlying repository. An attacker can exploit these workflows and execute arbitrary code as part of the workflow. Due to the modular nature of workflows, we find that 99.7% of repositories in our dataset execute some externally developed plugin, called "Actions" , for various purposes. We found that 97% of repositories execute at least one Action that does not originate with a verified creator, and 18% of repositories in our dataset execute at least one Action with missing security updates. These represent potential attack vectors that can be used to compromise the execution of workflows, consequently leading to supply chain attacks. This work highlights the systemic risks inherent in CI/CD platforms like GitHub CI; we also present our own Github action, GWChecker, which functions as an early warning system for bad practices that violate the identified security properties.
Bahruz Jabiyev, Steven Sprecher, Anthony Gavazzi, and Tommaso Innocenti, Northeastern University; Kaan Onarlioglu, Akamai Technologies; Engin Kirda, Northeastern University
HTTP/2 adoption is rapidly climbing. However, in practice, Internet communications still rarely happen over end-to-end HTTP/2 channels. This is due to Content Delivery Networks and other reverse proxies, ubiquitous and necessary components of the Internet ecosystem, which only support HTTP/2 on the client's end, but not the forward connection to the origin server. Instead, proxy technologies predominantly rely on HTTP/2-to-HTTP/1 protocol conversion between the two legs of the connection.
We present the first systematic exploration of HTTP/2-to-HTTP/1 protocol conversion anomalies and their security implications. We develop a novel grammar-based fuzzer for HTTP/2, experiment with 12 popular reverse proxy technologies & CDNs through HTTP/2 frame sequence and content manipulation, and discover a plethora of novel web application attack vectors that lead to Request Blackholing, Denial-of-Service, Query-of-Death, and Request Smuggling attacks.
Kinan Dak Albab, Brown University; Rawane Issa and Mayank Varia, Boston University; Kalman Graffi, Honda Research Institute Europe
Private Information Retrieval (PIR) allows several clients to query a database held by one or more servers, such that the contents of their queries remain private. Prior PIR schemes have achieved sublinear communication and computation by leveraging computational assumptions, federating trust among many servers, relaxing security to permit differentially private leakage, refactoring effort into an offline stage to reduce online costs, or amortizing costs over a large batch of queries.
In this work, we present an efficient PIR protocol that combines all of the above techniques to achieve constant amortized communication and computation complexity in the size of the database and constant client work. We leverage differentially private leakage in order to provide better trade-offs between privacy and efficiency. Our protocol achieves speed-ups up to and exceeding 10x in practical settings compared to state of the art PIR protocols, and can scale to batches with hundreds of millions of queries on cheap commodity AWS machines. Our protocol builds upon a new secret sharing scheme that is both incremental and non-malleable, which may be of interest to a wider audience. Our protocol provides security up to abort against malicious adversaries that can corrupt all but one party.
J. Alex Halderman, University of Michigan
Distinguished Paper Award Winner
In November 2020, Antrim County, Michigan published unofficial election results that misstated totals in the presidential race and other contests by up to several thousand votes. Antrim subsequently issued a series of corrections, and the certified presidential results were confirmed by a hand count. Nevertheless, Antrim was repeatedly cited by the former President as evidence of widespread fraud, and it remains a centerpiece of conspiracy theories about the 2020 election. At the request of the Michigan Secretary of State and Attorney General, I performed a forensic investigation of the incident. Using data from the election system, I precisely reproduce the major anomalies, explain their cause, and verify they have been corrected. However, I also uncover other errors affecting specific down-ballot contests that have not been corrected, despite the unusual attention focused on the results, one of which may have changed the outcome of a local contest. Based on this analysis, I refute false claims and disinformation about the incident, concluding that it was not the result of a security breach but rather a series of operator errors compounded by inadequate procedures and insufficiently defensive software design. These events offer lessons for election administration and highlight the value of rigorously investigating election technology incidents for enhancing accuracy and public trust.
How Are Your Zombie Accounts? Understanding Users' Practices and Expectations on Mobile App Account Deletion
Yijing Liu, Yan Jia, Qingyin Tan, and Zheli Liu, Nankai University; Luyi Xing, Indiana University Bloomington
Account deletion is an important way for users to exercise their right to delete. However, little work has been done to evaluate the usability of account deletion in mobile apps. In this paper, we conducted a 647-participants online survey covering two countries along with an additional 20-participants on-site interview to explore users' awareness, practices, and expectations for mobile app account deletion. The studies were based on the account deletion model we proposed, which was summarized from an empirical measurement covering 60 mobile apps. The results reveal that although account deletion is highly demanded, users commonly keep zombie app accounts in practice due to the lack of awareness. Moreover, users' understandings and expectations of account deletion are different from the current design of apps in many aspects. Our findings indicate that current ruleless implementations made consumers feel inconvenienced during the deletion process, especially the hidden entry and complex operation steps, which even blocked a non-negligible number of users exercising account deletion. Finally, we provide some design recommendations for making mobile app account deletion more usable for consumers.
Sujaya Maiyya, Seif Ibrahim, Caitlin Scarberry, Divyakant Agrawal, and Amr El Abbadi, UC Santa Barbara; Huijia Lin and Stefano Tessaro, University of Washington; Victor Zakhary, Oracle
Privacy and security challenges due to the outsourcing of data storage and processing to third-party cloud providers are well known. With regard to data privacy, Oblivious RAM (ORAM) schemes provide strong privacy guarantees by not only hiding the contents of the data (by encryption) but also obfuscating the access patterns of the outsourced data. But most existing ORAM datastores are not fault tolerant in that if the external storage server (which stores encrypted data) or the trusted proxy (which stores the encryption key and other metadata) crashes, an application loses all of its data. To achieve fault tolerance, we propose QuORAM, the first ORAM datastore to replicate data with a quorum-based replication protocol. QuORAM's contributions are three-fold: (i) it obfuscates access patterns to provide obliviousness guarantees, (ii) it replicates data using a novel lock-free and decentralized replication protocol to achieve fault tolerance, and (iii) it guarantees linearizable semantics. Experimentally evaluating QuORAM highlights counter-intuitive results: QuORAM incurs negligible cost to achieve obliviousness when compared to an insecure fault-tolerant replicated system; QuORAM's peak throughput is 2.4x of its non-replicated baseline; and QuORAM performs 33.2x better in terms of throughput than an ORAM datastore that relies on CockroachDB, an open-source geo-replicated database, for fault tolerance.
Marina Sanusi Bohuk, Cornell University; Mazharul Islam, University of Wisconsin–Madison; Suleman Ahmad, Cloudflare; Michael Swift, University of Wisconsin–Madison; Thomas Ristenpart, Cornell Tech; Rahul Chatterjee, University of Wisconsin–Madison
Passwords remain the primary way to authenticate users online. Yet little is known about the characteristics of login requests submitted to login systems due to the sensitivity of monitoring submitted passwords. This means we don't have answers to basic questions, such as how often users submit a password similar to their actual password, whether users often resubmit the same incorrect password, how many users utilize passwords known to be in a public breach, and more. Whether we can build and deploy measurement infrastructure to safely answer such questions is, itself, an open question.
We offer a system, called Gossamer, that enables securely logging information about login attempts, including carefully chosen statistics about submitted passwords. We provide a simulation-based approach for tuning the security-utility trade-offs for storing different password-derived statistics. This enables us to gather useful measurements while reducing risk even in the unlikely case of complete compromise of the measurement system. We worked closely with two large universities and deployed Gossamer to perform a measurement study that observed 34 million login requests over a seven month period. The measurements we gather provide insight into the use of breached credentials, password usability, and other characteristics of the submitted login requests.
Ruoyu Wu, Purdue University; Taegyu Kim, The Pennsylvania State University; Dave (Jing) Tian, Antonio Bianchi, and Dongyan Xu, Purdue University
The usage of Deep Neural Networks (DNNs) has steadily increased in recent years. Especially when used in edge devices, dedicated DNN compilers are used to compile DNNs into binaries. Many security applications (such as DNN model extraction, white-box adversarial sample generation, and DNN model patching and hardening) are possible when a DNN model is accessible. However, these techniques cannot be applied to compiled DNNs. Unfortunately, no dedicated decompiler exists that is able to recover a high-level representation of a DNN starting from its compiled binary code.
To address this issue, we propose DnD, the first compiler- and ISA-agnostic DNN decompiler. DnD uses symbolic execution, in conjunction with a dedicated loop analysis, to lift the analyzed binary code into a novel intermediate representation, able to express the high-level mathematical DNN operations in a compiler- and ISA-agnostic way. Then, DnD matches the extracted mathematical DNN operations with template mathematical DNN operations, and it recovers hyper-parameters and parameters of all the identified DNN operators, as well as the overall DNN topology. Our evaluation shows that DnD can perfectly recover different DNN models, extracting them from binaries compiled by two different compilers (Glow and TVM) for three different ISAs (Thumb, AArch64, and x86-64). Moreover, DnD enables extracting the DNN models used by real-world micro-controllers and attacking them using white-box adversarial machine learning techniques.
Zenong Zhang and Zach Patterson, University of Texas at Dallas; Michael Hicks, University of Maryland and Amazon; Shiyi Wei, University of Texas at Dallas
Distinguished Paper Award Winner
Fuzz testing is an active area of research with proposed improvements published at a rapid pace. Such proposals are assessed empirically: Can they be shown to perform better than the status quo? Such an assessment requires a benchmark of target programs with well-identified, realistic bugs. To ease the construction of such a benchmark, this paper presents FIXREVERTER, a tool that automatically injects realistic bugs in a program. FIXREVERTER takes as input a bugfix pattern which contains both code syntax and semantic conditions. Any code site that matches the specified syntax is undone if the semantic conditions are satisfied, as checked by static analysis, thus (re)introducing a likely bug. This paper focuses on three bugfix patterns, which we call conditional-abort, conditional-execute, and conditional-assign, based on a study of fixes in a corpus of Common Vulnerabilities and Exposures (CVEs). Using FIXREVERTER we have built REVBUGBENCH, which consists of 10 programs into which we have injected nearly 8,000 bugs; the programs are taken from FuzzBench and Binutils, and represent common targets of fuzzing evaluations. We have integrated REVBUGBENCH into the FuzzBench service, and used it to evaluate five fuzzers. Fuzzing performance varies by fuzzer and program, as desired/expected. Overall, 219 unique bugs were reported, 19% of which were detected by just one fuzzer.
Rawane Issa, Nicolas Alhaddad, and Mayank Varia, Boston University
End-to-end encryption provides strong privacy protections to billions of people, but it also complicates efforts to moderate content that can seriously harm people. To address this concern, Tyagi et al. [CRYPTO 2019] introduced the concept of asymmetric message franking (AMF) so that people can report abusive content to a moderator, while otherwise retaining end-to-end privacy by default and compatibility with anonymous communication systems like Signal's sealed sender.
In this work, we provide a new construction for asymmetric message franking called Hecate that is faster, more secure, and introduces additional functionality compared to Tyagi et al. First, our construction uses fewer invocations of standardized crypto primitives and operates in the plain model. Second, on top of AMF's accountability and deniability requirements, we also add forward and backward secrecy. Third, we combine AMF with source tracing, another approach to content moderation that has previously been considered only in the setting of non-anonymous networks. Source tracing allows for messages to be forwarded, and a report only identifies the original source who created a message. To provide anonymity for senders and forwarders, we introduce a model of AMF with preprocessing whereby every client authenticates with the moderator out-of-band to receive a token that they later consume when sending a message anonymously.
Daniel Townley, Peraton Labs; Kerem Arıkan, Yu David Liu, and Dmitry Ponomarev, Binghamton University; Oğuz Ergin, TOBB University of Economics and Technology
The security of isolated execution architectures such as Intel SGX has been significantly threatened by the recent emergence of side-channel attacks. Cache side-channel attacks allow adversaries to leak secrets stored inside isolated enclaves without having direct access to the enclave memory. In some cases, secrets can be leaked even without having the knowledge of the victim application code or having OS-level privileges. We propose the concept of Composable Cachelets (CC), a new scalable strategy to dynamically partition the last-level cache (LLC) for completely isolating enclaves from other applications and from each other. CC supports enclave isolation in caches with the capability to dynamically readjust the cache capacity as enclaves are created and destroyed. We present a cache-aware and enclave-aware operational semantics to help rigorously establish security properties of CC, and we experimentally demonstrate that CC thwarts side-channel attacks on caches with modest performance and complexity impact.
Kaihang Ji, Jun Zeng, Yuancheng Jiang, and Zhenkai Liang, National University of Singapore; Zheng Leong Chua, Independent Researcher; Prateek Saxena and Abhik Roychoudhury, National University of Singapore
Dynamic Information Flow Tracking (DIFT) forms the foundation of a wide range of security and privacy analyses. The main challenges faced by DIFT techniques are performance and scalability. Due to the large number of states in a program, the number of data flows can be prohibitively large and efficiently performing interactive data flow analysis queries using existing approaches is challenging. In this paper, we identify that DIFT under dependency-based information flow rules can be cast as linear transformations over taint state. This enables a novel matrix-based representation, which we call FlowMatrix, to represent DIFT operations concisely and makes it practical to adopt GPUs as co-processors for DIFT analysis. FlowMatrix provides efficient support for interactive DIFT query operations. We design a DIFT query system and prototype it on commodity GPUs. Our evaluation shows that our prototype outperforms CPU-based baseline by 5.6 times and enables rapid response to a DIFT queries. It has two to three orders of magnitude higher throughput compared to typical DIFT analysis solutions. We also demonstrate the efficiency and efficacy of new DIFT query operations.
End-to-Same-End Encryption: Modularly Augmenting an App with an Efficient, Portable, and Blind Cloud Storage
Long Chen, Institute of Software, Chinese Academy of Sciences; Ya-Nan Li and Qiang Tang, The University of Sydney; Moti Yung, Google & Columbia University
The cloud has become pervasive, and we ask: how can we protect cloud data against the cloud itself? For messaging Apps, facilitating user-to-user private communication via a cloud server, security has been formulated and solved efficiently via End-to-End encryption, building on existing channels between end-users via servers (i.e., exploiting TLS, certificates, and encryption, without the need to program new primitives). However, the analogous problem for Apps employing servers for storing and retrieving end-user data privately, solving the analogous "privacy from the server itself" (cloud-blind storage) where (1) based on existing messaging/infrastructure and (2) allowing user mobility, is, in fact, still open. Existing proposals, like password protected secret sharing (PPSS), target end-to-same-end encryption of storage, but need new protocols, whereas most popular commercial cloud storage services are not programmable. Namely, they lack the simplicity needed for being portable over any cloud storage service.
Here, we propose a novel system for storing private data in the cloud storage with the help of a key server (necessary given the requirements). In our system, the user data will be secure from any of: the cloud server, the key server, or any illegitimate users, while the authenticated user can access the data on any devices just via a correct passphrase. The most attractive feature of our system is that it does not require the cloud storage server to support any newly programmable operations, except the existing client login and the data storing. Moreover, our system is simply built on top of the existing App password login system, so the user only needs one passphrase to login the App and access his secure storage. The security of our protocol, in turn, is proved under our rigorous models, and the efficiency is further demonstrated by real-world network experiments over Amazon S3. We remark that a very preliminary variant, based on our principles, was deployed by Snapchat in their My Eyes Only module, serving hundreds of millions of users!
Yu Liang, Pennsylvania State University; Song Liu, Pennsylvania State University and Qi-AnXin Tech. Research Institute; Hong Hu, Pennsylvania State University
Database management systems (DBMSs) are critical components of modern data-intensive applications. Developers have adopted many testing techniques to detect DBMS bugs such as crashes and assertion failures. However, most previous efforts cannot detect logical bugs that make the DBMS return incorrect results. Recent work proposed several oracles to identify incorrect results, but they rely on rule-based expression generation to synthesize queries without any guidance.
In this paper, we propose to combine coverage-based guidance, validity-oriented mutations and oracles to detect logical bugs in DBMS systems. Specifically, we first design a set of general APIs to decouple the logic of fuzzers and oracles, so that developers can easily port fuzzing tools to test DBMSs and write new oracles for existing fuzzers. Then, we provide validity-oriented mutations to generate high-quality query statements in order to find more logical bugs. Our prototype, SQLRight, outperforms existing tools that only rely on oracles or code coverage. In total, SQLRight detects 18 logical bugs from two well-tested DBMSs, SQLite and MySQL. All bugs have been confirmed and 14 of them have been fixed.
Phakpoom Chinprutthiwong, Jianwei Huang, and Guofei Gu, SUCCESS Lab, Texas A&M University
Client-side web attacks are one of the major battlefields for cybercriminals today. To mitigate such attacks, researchers have proposed numerous defenses that can be deployed on a server or client. Server-side defenses can be easily deployed and modified by web developers, but it lacks the context of client-side attacks such as DOM-XSS attacks. On the other hand, client-side defenses, especially in the form of modified browsers or browser extensions, require constant vendor support or user involvement to be up to date.
In this work, we explore the feasibility of using a new execution context, the service worker context, as a platform for web security defense development that is programmable, browser agnostic, and runs at the client side without user involvement. To this end, we propose and develop SWAPP (Service Worker APplication Platform), a framework for implementing security mechanisms inside a service worker. As the service worker is supported by most browsers, our framework is compatible with most clients. Furthermore, SWAPP is designed to enable the extensibility and programmability of the apps. We demonstrate the versatility of SWAPP by implementing various apps that can mitigate web attacks including a recent side-channel attack targeting websites that deploy a service worker. SWAPP allows websites to offload a part of the security tasks from the server to the client and also enables the possibility to deploy or retrofit emerging security features/prototypes before they are officially supported by browsers. Finally, we evaluate the performance overhead of our framework and show that deploying defenses on a service worker is a feasible option.
Miles Dai, MIT; Riccardo Paccagnella, University of Illinois at Urbana-Champaign; Miguel Gomez-Garcia, MIT; John McCalpin, Texas Advanced Computing Center; Mengjia Yan, MIT
This paper studies microarchitectural side-channel attacks and mitigations on the on-chip mesh interconnect used in modern, server-class Intel processors. We find that, though difficult to exploit, the mesh interconnect can be abused by an adversary even when known attack vectors inside the cores and caches are closed. We then present novel, non-invasive mitigation mechanisms to interconnect side-channel attacks and offer insights to guide the design of future defenses.
Our analysis starts by thoroughly reverse engineering the mesh interconnect to reveal, for the first time, the precise conditions under which it is susceptible to contention. We show that an attacker can use these conditions to build a cross-core covert channel with a capacity of over 1.5 Mbps. We then demonstrate the feasibility of side-channel attacks that leak keys from vulnerable cryptographic implementations by monitoring mesh interconnect contention. Finally, we present an analytical model to quantify the vulnerability levels of different victim and attacker placements on the chip and use the results to design a software-only mitigation mechanism.
Marcel Maehren and Philipp Nieting, Ruhr University Bochum; Sven Hebrok, Paderborn University; Robert Merget, Ruhr University Bochum; Juraj Somorovsky, Paderborn University; Jörg Schwenk, Ruhr University Bochum
Although the newest versions of TLS are considered secure, flawed implementations may undermine the promised security properties. Such implementation flaws result from the TLS specifications' complexity, with exponentially many possible parameter combinations. Combinatorial Testing (CT) is a technique to tame this complexity, but it is hard to apply to TLS due to semantic dependencies between the parameters and thus leaves the developers with a major challenge referred to as the test oracle problem: Determining if the observed behavior of software is correct for a given test input.
In this work, we present TLS-Anvil, a test suite based on CT that can efficiently and systematically test parameter value combinations and overcome the oracle problem by dynamically extracting an implementation-specific input parameter model (IPM) that we constrained based on TLS specific parameter value interactions. Our approach thus carefully restricts the available input space, which in return allows us to reliably solve the oracle problem for any combination of values generated by the CT algorithm.
We evaluated TLS-Anvil with 13 well known TLS implementations, including OpenSSL, BoringSSL, and NSS. Our evaluation revealed two new exploits in MatrixSSL, five issues directly influencing the cryptographic operations of a session, as well as 15 interoperability issues, 116 problems related to incorrect alert handling, and 100 other issues across all tested libraries.
Kevin Burk, Fabio Pagani, Christopher Kruegel, and Giovanni Vigna, UC Santa Barbara
Human analysts must reverse engineer binary programs as a prerequisite for a number of security tasks, such as vulnerability analysis, malware detection, and firmware re-hosting. Existing studies of human reversers and the processes they follow are limited in size and often use qualitative metrics that require subjective evaluation.
In this paper, we reframe the problem of reverse engineering binaries as the problem of perfect decompilation, which is the process of recovering, from a binary program, source code that, when compiled, produces binary code that is identical to the original binary. This gives us a quantitative measure of understanding, and lets us examine the reversing process programmatically.
We developed a tool, called Decomperson, that supported a group of reverse engineers during a large-scale security competition designed to collect information about the participants' reverse engineering process, with the well-defined goal of achieving perfect decompilation. Over 150 people participated, and we collected more than 35,000 code submissions, the largest manual reverse engineering dataset to date. This includes snapshots of over 300 successful perfect decompilation attempts. In this paper, we show how perfect decompilation allows programmatic analysis of such large datasets, providing new insights into the reverse engineering process.
George Kappos and Haaroon Yousaf, University College London and IC3; Rainer Stütz and Sofia Rollet, AIT - Austrian Institute of Technology; Bernhard Haslhofer, Complexity Science Hub Vienna; Sarah Meiklejohn, University College London and IC3
One of the defining features of Bitcoin and the thousands of cryptocurrencies that have been derived from it is a globally visible transaction ledger. While Bitcoin uses pseudonyms as a way to hide the identity of its participants, a long line of research has demonstrated that Bitcoin is not anonymous. This has been perhaps best exemplified by the development of clustering heuristics, which have in turn given rise to the ability to track the flow of bitcoins as they are sent from one entity to another.
In this paper, we design a new heuristic that is designed to track a certain type of flow, called a peel chain, that represents many transactions performed by the same entity; in doing this, we implicitly cluster these transactions and their associated pseudonyms together. We then use this heuristic to both validate and expand the results of existing clustering heuristics. We also develop a machine learning-based validation method and, using a ground-truth dataset, evaluate all our approaches and compare them with the state of the art. Ultimately, our goal is to not only enable more powerful tracking techniques but also call attention to the limits of anonymity in these systems.
George Arnold Sullivan, University of California, San Diego; Jackson Sippe, University of Colorado Boulder; Nadia Heninger, University of California, San Diego; Eric Wustrow, University of Colorado Boulder
It is well known in the cryptographic literature that the most common digital signature schemes used in practice can fail catastrophically in the presence of faults during computation. We use passive and active network measurements to analyze organically-occuring faults in billions of digital signatures generated by tens of millions of hosts.We find that a persistent rate of apparent hardware faults in unprotected implementations has resulted in compromised certificate RSA private keys for years. The faulty signatures we observed allowed us to compute private RSA keys associated with a top-10 Alexa site, several browser-trusted wildcard certificates for organizations that used a popular VPN product, and a small sporadic population of other web sites and network devices. These measurements illustrate the fragility of RSA PKCS#1v1.5 signature padding and provide insight on the risks faced by unprotected implementations on hardware at Internet scale.
Johannes Wikner and Kaveh Razavi, ETH Zurich
Modern operating systems rely on software defenses against hardware attacks. These defenses are, however, as good as the assumptions they make on the underlying hardware. In this paper, we invalidate some of the key assumptions behind retpoline, a widely deployed mitigation against Spectre Branch Target Injection (BTI) that converts vulnerable indirect branches to protected returns. We present RETBLEED, a new Spectre-BTI attack that leaks arbitrary kernel memory on fully patched Intel and AMD systems. Two insights make RETBLEED possible: first, we show that return instructions behave like indirect branches under certain microarchitecture-dependent conditions, which we reverse engineer. Our dynamic analysis framework discovers many exploitable return instructions inside the Linux kernel, reachable through unprivileged system calls. Second, we show how an unprivileged attacker can arbitrarily control the predicted target of such return instructions by branching into kernel memory. RETBLEED leaks privileged memory at the rate of 219 bytes/s on Intel Coffee Lake and 3.9 kB/s on AMD Zen 2.
Jean-Pierre Smith and Luca Dolfi, ETH Zurich; Prateek Mittal, Princeton University; Adrian Perrig, ETH Zurich
Website fingerprinting attacks, which analyse the metadata of encrypted network communication to identify visited websites, have been shown to be effective on privacy-enhancing technologies including virtual private networks (VPNs) and encrypted proxies. Despite this, VPNs are still undefended against these attacks, leaving millions of users vulnerable. Proposed defences against website fingerprinting require cooperation between the client and a remote endpoint to reshape the network traffic, thereby hindering deployment.
We observe that the rapid and wide-spread deployment of QUIC and HTTP/3 creates an exciting opportunity to build website-fingerprinting defences directly into client applications, such as browsers, without requiring any changes to web servers, VPNs, or the deployment of new network services. We therefore design and implement the QCSD framework, which leverages QUIC and HTTP/3 to emulate existing website-fingerprinting defences by bidirectionally adding cover traffic and reshaping connections solely from the client. As case studies, we emulate both the FRONT and Tamaraw defences solely from the client and collected several datasets of live-defended traffic on which we evaluated modern machine-learning based attacks. Our results demonstrate the promise of this approach in shaping connections towards client-orchestrated defences, thereby removing a primary barrier to the deployment of website-fingerprinting defences.
Javad Ghareh Chamani and Dimitrios Papadopoulos, Hong Kong University of Science and Technology; Mohammadamin Karbasforushan and Ioannis Demertzis, UC Santa Cruz
We focus on the problem of Dynamic Searchable Encryption (DSE) with efficient (optimal/quasi-optimal) search in the presence of deletions. Towards that end, we first propose OSSE, the first DSE scheme that can achieve asymptotically optimal search time, linear to the result size and independent of any prior deletions, improving the previous state of the art by a multiplicative logarithmic factor. We then propose our second scheme LLSE, that achieves a sublogarithmic search overhead (loglogi_w, where i_w is the number or prior insertions for a keyword) compared to the optimal achieved by OSSE. While this is slightly worse than our first scheme, it still outperforms prior works, while also achieving faster deletions and asymptotically smaller server storage. Both schemes have standard leakage profiles and are forward-and-backward private. Our experimental evaluation is very encouraging as it shows our schemes consistently outperform the prior state-of-the-art DSE by 1.2-6.6x in search computation time, while also requiring just a single roundtrip to receive the search result. Even compared with prior simpler and very efficient constructions in which all deleted records are returned as part of the result, our OSSE achieves better performance for deletion rates ranging from 45-55%, while the previous state-of-the-art quasi-optimal scheme achieves this for 65-75% deletion rates.
Michael Harrity, Kevin Bock, Frederick Sell, and Dave Levin, University of Maryland
The censorship arms race has recently gone through a transformation, thanks to recent efforts showing that new ways to evade censorship can be discovered in an automated fashion. However, all of these prior automated efforts operate by manipulating TCP/IP headers; while impressive, deploying these have proven challenging, as header modifications often require greater privileges than are available to censorship circumvention apps. In that line of work, the application layer has gone largely unexplored. This is not without reason: the space of application messages is much larger and far less structured than TCP/IP headers.
In this paper, we present the first techniques to automate the discovery of new censorship evasion techniques purely in the application layer. We present a general solution and apply it specifically to HTTP and DNS censorship in China, India, and Kazakhstan. Our automated techniques discovered a total of 77 unique evasion strategies for HTTP and 9 for DNS, all of which require only application-layer modifications, making them easier to incorporate into apps and deploy. We analyze these strategies and shed new light into the inner workings of the censors. We find that the success of application-layer strategies can depend heavily on the type and version of the destination server. Surprisingly, a large class of our evasion strategies exploit instances in which censors are more RFCcompliant than popular application servers. We have made our code publicly available.
Bodong Zhao, Zheming Li, Shisong Qin, Zheyu Ma, and Ming Yuan, Institute for Network Science and Cyberspace / BNRist, Tsinghua University; Wenyu Zhu, Department of Electronic Engineering, Tsinghua University; Zhihong Tian, Guangzhou University; Chao Zhang, Institute for Network Science and Cyberspace / BNRist, Tsinghua University and Zhongguancun Lab
Coverage-guided fuzzing has achieved great success in finding software vulnerabilities. Existing coverage-guided fuzzers generally favor test cases that hit new code, and discard ones that exercise the same code. However, such a strategy is not optimum. A new test case exercising the same code could be better than a previous test case, as it may trigger new program states useful for code exploration and bug discovery.
In this paper, we assessed the limitation of coverage-guided fuzzing solutions and proposed a state-aware fuzzing solution StateFuzz to address this issue. First, we model program states with values of state-variables and utilize static analysis to recognize such variables. Then, we instrument target programs to track such variables' values and infer program state transition at runtime. Lastly, we utilize state information to prioritize test cases that can trigger new states, and apply a three-dimension feedback mechanism to fine-tune the evolutionary direction of coverage-guided fuzzers. We have implemented a prototype of StateFuzz, and evaluated it on Linux upstream drivers and Android drivers. Evaluation results show that StateFuzz is effective at discovering both new code and vulnerabilities. It finds 18 unknown vulnerabilities and 2 known but unpatched vulnerabilities, and reaches 19% higher code coverage and 32% higher state coverage than the state-of-the-art fuzzer Syzkaller.
Daniel Günther and Maurice Heymann, Technical University of Darmstadt; Benny Pinkas, Bar-Ilan University; Thomas Schneider, Technical University of Darmstadt
Multi-Server Private Information Retrieval (PIR) is a cryptographic protocol that allows a client to securely query a database entry from n ≥ 2 servers of which less than t can collude, s.t. the servers learn no information about the query. Highly efficient PIR could be used for large-scale applications like Compromised Credential Checking (C3) (USENIX Security'19), which allows users to check whether their credentials have been leaked in a data breach. However, state-of-the art PIR schemes are not efficient enough for fast online responses at this scale.
In this work, we introduce Client-Independent Preprocessing (CIP) PIR that moves (t −1)/n of the online computation to a local, client independent, preprocessing phase suitable for efficient batch precomputations. The online performance of CIP-PIR improves linearly with the number of servers n. We show that large-scale applications like C3 with PIR are practical by implementing our CIP-PIR scheme using a parallelized CPU implementation. To the best of our knowledge, this is the first multi-server PIR scheme whose preprocessing phase is completely independent of the client, and where online performance simultaneously improves with the number of servers n. In addition, we accelerate for the first time the huge amount of XOR operations in multi-server PIR with GPUs. Our GPUbased CIP-PIR achieves an improvement up to factor 2.1× over our CPU-based implementation for n = 2 servers, and enables a client to query an entry in a 25 GB database within less than 1 second.
Yeting Li, CAS-KLONAT, Institute of Information Engineering, Chinese Academy of Sciences; University of Chinese Academy of Sciences; SKLCS, Institute of Software, Chinese Academy of Sciences; Yecheng Sun, SKLCS, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Zhiwu Xu, College of Computer Science and Software Engineering, Shenzhen University; Jialun Cao, The Hong Kong University of Science and Technology; Yuekang Li, School of Computer Science and Engineering, Nanyang Technological University; Rongchen Li, SKLCS, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Haiming Chen, SKLCS, Institute of Software, Chinese Academy of Sciences; CAS-KLONAT, Institute of Information Engineering, Chinese Academy of Sciences; Shing-Chi Cheung, The Hong Kong University of Science and Technology; Yang Liu, School of Computer Science and Engineering, Nanyang Technological University; Yang Xiao, CAS-KLONAT, Institute of Information Engineering, Chinese Academy of Sciences; University of Chinese Academy of Sciences
The Regular expression Denial of Service (ReDoS) is a class of denial of service attacks that exploit vulnerable regular expressions (regexes) whose execution time can be superlinearly related to input sizes. A common approach of defending ReDoS attacks is to repair the vulnerable regexes. Techniques have been recently proposed to synthesize repaired regexes using program-by-example (PBE) techniques. However, these existing techniques may generate regexes, which are not semantically equivalent or similar to the original ones, or are still vulnerable to ReDoS attacks.
To address the challenges, we propose RegexScalpel, an automatic regex repair framework that adopts a localize-andfix strategy. RegexScalpel first localizes the vulnerabilities by leveraging fine-grained vulnerability patterns proposed by us to analyze their vulnerable patterns, the source (i.e., the pathological sub-regexes), and the root causes (e.g., the overlapping sub-regexes). Then, RegexScalpel targets to fix the pathological sub-regexes according to our predefined repair patterns and the localized vulnerability information. Furthermore, our repair patterns ensure that the repair regexes are semantically either equivalent to or similar to the original ones. Our iterative repair method also keeps out vulnerabilities of the repaired regexes. With an experiment on a total number of 448 vulnerable regexes, we demonstrate that RegexScalpel can outperform all existing automatic regexes fixing techniques by fixing 348 more regexes than the best existing work. Also, we adopted RegexScalpel to detect ten popular projects including Python and NLTK, and revealed 16 vulnerable regexes.We then applied RegexScalpel to successfully repair all of them, and these repairs were merged into the later release by the maintainers, resulting in 8 confirmed CVEs.