Terrapin Attack: Breaking {SSH} Channel Integrity By Sequence Number Manipulation

Fabian Bäumer; Weiran Liu; Marcus Brinkmann; Liqiang Peng; Jörg Schwenk; Marten Oltrogge; Thomas Johansson; Simin Feng; Yasemin Acar; Dan Shumow; Guangdong Bai; Oliver Wiese; Marc Stevens; Tianwei Zhang; Sascha Fahl; Adam Suhl; Simon Oya; Yiyang Song; Yang Guo; Ehsan Amjadian; Fangzhou Dong; Florian Kerschbaum; Xavier Maso; Giovanni Vigna; Wei Huo; Christopher Kruegel; Qi Li; Adam Doupé; Ruoyu Wang; Yan Shoshitaishvili; Aravind Machiry; Ruoyu Wang

Papers and Proceedings

The full Proceedings published by USENIX for the symposium are available for download below. Individual papers can also be downloaded from their respective presentation pages. Copyright to the individual works is retained by the author[s].

Proceedings Front Matter
Proceedings Cover | Title Page and List of Organizers | Message from the Program Co-Chairs | Table of Contents

Full Proceedings PDFs
USENIX Security '24 Full Proceedings (PDF, 717.5 MB)
USENIX Security '24 Proceedings Interior (PDF, 714.3 MB, best for mobile devices)
USENIX Security '24 Errata Slip #1 (PDF)
USENIX Security '24 Full Artifact Appendices Proceedings (PDF, 15.12 MB)
USENIX Security '24 Artifact Appendices Proceedings Interior (PDF, 14.37 MB, best for mobile devices)

Attendee Files

USENIX Security '24 Attendee List (PDF)

Wednesday, August 14

7:00 am–8:45 am

Continental Breakfast

Grand Ballroom Foyer

8:45 am–9:15 am

Opening Remarks and Awards

Program Co-Chairs: Davide Balzarotti, Eurecom; Wenyuan Xu, Zhejiang University
Salon EFGH

9:15 am–10:15 am

Keynote Address

Salon EFGH

A World Where We Trust Hard-Won Lessons in Security Research, Technology, and People

David Brumley, ForAllSecure and CMU

Available Media

Great ideas shouldn't remain confined to papers; they should transform the world. What does it take for our research to make a real-world impact? Are there guiding principles, and do they influence how we conduct fundamental research?

In this keynote, I will share my journey of understanding the principles that bridge the gap between fundamental research and the practical implementation of safer software and systems. Through real-world examples and case studies, I will discuss how I learned to replace "it's more secure" with compelling, actionable arguments. I will delve into adoption challenges that unveiled research gems and share candid moments when my academic hubris was dismantled by industry realities.

This journey has led me to identify four key principles that, I believe, are crucial for ensuring that innovative ideas transition successfully to the broader community and not get stuck as just a great research paper. Join me to explore these principles and how I believe they can help us all build a world with computers we trust.

David Brumley is the CEO of ForAllSecure and a full professor at Carnegie Mellon University. His research focuses on novel program analysis and verification techniques that prove the presence of bugs and vulnerabilities. He has published numerous academic papers, won several test-of-time and achievement awards, competed and won the DARPA Cyber Grand Challenge, and holds a black badge.

10:15 am–11:15 am

Coffee and Tea Break

Grand Ballroom Foyer

11:15 am–12:15 pm

Track 1

User Studies I: Social Media Platforms

Session Chair: Allison McDonald

Salon AB

"I feel physically safe but not politically safe": Understanding the Digital Threats and Safety Practices of OnlyFans Creators

Ananta Soneji, Arizona State University; Vaughn Hamilton, Max Planck Institute for Software Systems; Adam Doupé, Arizona State University; Allison McDonald, Boston University; Elissa M. Redmiles, Georgetown University

Available Media

OnlyFans is a subscription-based social media platform with over 1.5 million content creators and 150 million users worldwide. OnlyFans creators primarily produce intimate content for sale on the platform. As such, they are distinctly positioned as content creators and sex workers. Through a qualitative interview study with OnlyFans creators (n=43), building on an existing framework of online hate and harassment, we shed light on the nuanced threats they face and their safety practices. Additionally, we examine the impact of factors such as stigma, prominence, and platform policies on shaping the threat landscape for OnlyFans creators and detail the preemptive practices they undertake to protect themselves. Leveraging these results, we synthesize opportunities to address the challenges of sexual content creators.

"I chose to fight, be brave, and to deal with it": Threat Experiences and Security Practices of Pakistani Content Creators

Lea Gröber, CISPA Helmholtz Center for Information Security and Saarland University; Waleed Arshad and Shanza, Lahore University of Management Sciences; Angelica Goetzen, Max Planck Institute for Software Systems; Elissa M. Redmiles, Georgetown University; Maryam Mustafa, Lahore University of Management Sciences; Katharina Krombholz, CISPA Helmholtz Center for Information Security

Available Media

Content creators are exposed to elevated risks compared to the general Internet user. This study explores the threat landscape that creators in Pakistan are exposed to, how they protect themselves, and which support structures they rely on. We conducted a semi-structured interview study with 23 creators from diverse backgrounds who create content on various topics. Our data suggests that online threats frequently spill over into the offline world, especially for gender minorities. Creating content on sensitive topics like politics, religion, and human rights is associated with elevated risks. We find that defensive mechanisms and external support structures are non-existent, lacking, or inadequately adjusted to the sociocultural context of Pakistan.

Disclaimer: This paper contains quotes describing harmful experiences relating to sexual and physical assault, eating disorders, and extreme threats of violence.

Investigating Moderation Challenges to Combating Hate and Harassment: The Case of Mod-Admin Power Dynamics and Feature Misuse on Reddit

Madiha Tabassum, Northeastern University; Alana Mackey, Wellesley College; Ashley Schuett, George Washington University; Ada Lerner, Northeastern University

Available Media

Social media platforms often rely on volunteer moderators to combat hate and harassment and create safe online environments. In the face of challenges combating hate and harassment, moderators engage in mutual support with one another. We conducted a qualitative content analysis of 115 hate and harassment-related threads from r/ModSupport and r/modhelp, two major subreddit forums for this type of mutual support. We analyze the challenges moderators face; complex tradeoffs related to privacy, utility, and harassment; and major challenges in the relationship between moderators and platform admins. We also present the first systematization of how platform features (including especially security, privacy, and safety features) are misused for online abuse, and drawing on this systematization we articulate design themes for platforms that want to resist such misuse.

"Did They F***ing Consent to That?": Safer Digital Intimacy via Proactive Protection Against Image-Based Sexual Abuse

Lucy Qin, Georgetown University; Vaughn Hamilton, Max Planck Institute for Software Systems; Sharon Wang, University of Washington; Yigit Aydinalp and Marin Scarlett, European Sex Workers Rights Alliance; Elissa M. Redmiles, Georgetown University

Available Media

As many as 8 in 10 adults share intimate content such as nude or lewd images. Sharing such content has significant benefits for relationship intimacy and body image, and can offer employment. However, stigmatizing attitudes and a lack of technological mitigations put those sharing such content at risk of sexual violence. An estimated 1 in 3 people have been subjected to image-based sexual abuse (IBSA), a spectrum of violence that includes the nonconsensual distribution or threat of distribution of consensually-created intimate content (also called NDII). In this work, we conducted a rigorous empirical interview study of 52 European creators of intimate content to examine the threats they face and how they defend against them, situated in the context of their different use cases for intimate content sharing and their choice of technologies for storing and sharing such content. Synthesizing our results with the limited body of prior work on technological prevention of NDII, we offer concrete next steps for both platforms and security & privacy researchers to work toward safer intimate content sharing through proactive protection.

Content Warning: This work discusses sexual violence, specifically, the harms of image-based sexual abuse (particularly in Sections 2 and 6).

Track 2

Hardware Security I: Attacks and Defense

Session Chair: Stefan Mangard

Salon CD

AttackGNN: Red-Teaming GNNs in Hardware Security Using Reinforcement Learning

Vasudev Gohil, Texas A&M University; Satwik Patnaik, University of Delaware; Dileep Kalathil and Jeyavijayan Rajendran, Texas A&M University

Available Media

Machine learning has shown great promise in addressing several critical hardware security problems. In particular, researchers have developed novel graph neural network (GNN)-based techniques for detecting intellectual property (IP) piracy, detecting hardware Trojans (HTs), and reverse engineering circuits, to name a few. These techniques have demonstrated outstanding accuracy and have received much attention in the community. However, since these techniques are used for security applications, it is imperative to evaluate them thoroughly and ensure they are robust and do not compromise the security of integrated circuits.

In this work, we propose AttackGNN, the first red-team attack on GNN-based techniques in hardware security. To this end, we devise a novel reinforcement learning (RL) agent that generates adversarial examples, i.e., circuits, against the GNN-based techniques. We overcome three challenges related to effectiveness, scalability, and generality to devise a potent RL agent. We target five GNN-based techniques for four crucial classes of problems in hardware security: IP piracy, detecting/localizing HTs, reverse engineering, and hardware obfuscation. Through our approach, we craft circuits that fool all GNNs considered in this work. For instance, to evade IP piracy detection, we generate adversarial pirated circuits that fool the GNN-based defense into classifying our crafted circuits as not pirated. For attacking HT localization GNN, our attack generates HT-infested circuits that fool the defense on all tested circuits. We obtain a similar 100% success rate against GNNs for all classes of problems.

INSIGHT: Attacking Industry-Adopted Learning Resilient Logic Locking Techniques Using Explainable Graph Neural Network

Lakshmi Likhitha Mankali, New York University; Ozgur Sinanoglu, New York University Abu Dhabi; Satwik Patnaik, University of Delaware

Available Media

Logic locking is a hardware-based solution that protects against hardware intellectual property (IP) piracy. With the advent of powerful machine learning (ML)-based attacks, in the last 5 years, researchers have developed several learning resilient locking techniques claiming superior security guarantees. However, these security guarantees are the result of evaluation against existing ML-based attacks having critical limitations, including (i) black-box operation, i.e., does not provide any explanations, (ii) are not practical, i.e., nonconsideration of approaches followed by the semiconductor industry, and (iii) are not broadly applicable, i.e., evaluate the security of a specific logic locking technique.

In this work, we question the security provided by learning resilient locking techniques by developing an attack (INSIGHT) using an explainable graph neural network (GNN). INSIGHT recovers the secret key without requiring scan-access, i.e., in an oracle-less setting for 7 unbroken learning resilient locking techniques, including 2 industry-adopted logic locking techniques. INSIGHT achieves an average key-prediction accuracy (KPA) of2.87×,1.75×,and1.67× higher than existing ML-based attacks. We demonstrate the efficacy of INSIGHT by evaluating locked designs ranging from widely used academic suites (ISCAS-85, ITC-99) to larger designs, such as MIPS, Google IBEX, and mor1kx processors. We perform 2 practical case studies: (i) recovering secret keys of locking techniques used in a widely used commercial EDA tool (Synopsys TestMAX) and (ii) showcasing the ramifications of leaking the secret key for an image processing application. We will open-source our artifacts to foster research on developing learning resilient locking techniques.

Eye of Sauron: Long-Range Hidden Spy Camera Detection and Positioning with Inbuilt Memory EM Radiation

Qibo Zhang and Daibo Liu, Hunan University; Xinyu Zhang, University of California San Diego; Zhichao Cao, Michigan State University; Fanzi Zeng, Hongbo Jiang, and Wenqiang Jin, Hunan University

Available Media

In this paper, we present ESauron — the first proof-of-concept system that can detect diverse forms of spy cameras (i.e., wireless, wired and offline devices) and quickly pinpoint their locations. The key observation is that, for all spy cameras, the captured raw images must be first digested (e.g., encoding and compression) in the video-capture devices before transferring to target receiver or storage medium. This digestion process takes place in an inbuilt read-write memory whose operations cause electromagnetic radiation (EMR). Specifically, the memory clock drives a variable number of switching voltage regulator activities depending on the workloads, causing fluctuating currents injected into memory units, thus emitting EMR signals at the clock frequency. Whenever the visual scene changes, bursts of video data processing (e.g., video encoding) suddenly aggravate the memory workload, bringing responsive EMR patterns. ESauron can detect spy cameras by intentionally stimulating scene changes and then sensing the surge of EMRs even from a considerable distance. We implemented a proof-of-concept prototype of the ESauron by carefully designing techniques to sense and differentiate memory EMRs, assert the existence of spy cameras, and pinpoint their locations. Experiments with 50 camera products show that ESauron can detect all spy cameras with an accuracy of 100% after only 4 stimuli, the detection range can exceed 20 meters even in the presence of blockages, and all spy cameras can be accurately located.

Improving the Ability of Thermal Radiation Based Hardware Trojan Detection

Ting Su, Yaohua Wang, Shi Xu, Lusi Zhang, Simin Feng, Jialong Song, Yiming Liu, Yongkang Tang, Yang Zhang, Shaoqing Li, Yang Guo, and Hengzhu Liu, National University of Defense Technology

Available Media

Hardware Trojans (HTs) pose a significant and growing threat to the field of hardware security. Several side-channel techniques, including power and electromagnetic radiation (EMR), have been proposed for HT detection, constrained by reliance on the golden chip or test vectors. In response, researchers advocate for the use of thermal radiation (TR) to identify HTs. However, existing TR-based methods are designed for the ideal HT that can fully occupy at least one pixel on the thermal radiation map (TRM). In reality, HTs may occupy multiple pixels, substantially diminishing occupancy in each pixel, thereby reducing the accuracy of existing detection methods. This challenge is exacerbated by the noise caused by the thermal camera. To this end, this paper introduces a countermeasure named noise based pixel occupation enhancement (NICE), aiming to improve the ability of TR-based HT detection. The key insight of NICE is that noise can vary the pixel occupation of HTs while disrupting HT detection. Consequently, the noise can be exploited to statistically find out the largest pixel occupation among the variations, thereby enhancing HT detection accuracy. Experimental results on a 0.13 μm Digital Signal Processing (DSP) show that the detection rate of NICE exceeds the existing TR-based method by more than 47%, reaching 91.81%, while maintaining a false alarm rate of less than 9%. Both metrics of NICE are comparable to the existing power-based and EMR-based methods, eliminating the need for the golden chip and test vectors.

Track 3

System Security I: OS

Session Chair: Hamed Okhravi

Salon E

Endokernel: A Thread Safe Monitor for Lightweight Subprocess Isolation

Fangfei Yang, Rice University; Bumjin Im, Amazon.com; Weijie Huang, Rice University; Kelly Kaoudis, Trail of Bits; Anjo Vahldiek-Oberwagner, Intel Labs; Chia-Che Tsai, Texas A&M University; Nathan Dautenhahn, Riverside Research

Available Media

Compartmentalization decomposes applications into isolated components, effectively confining the scope of potential security breaches. Recent approaches nest the protection monitor within processes for efficient memory isolation at the cost of security. However, these systems lack solutions for efficient multithreaded safety and neglect kernel semantics that can be abused to bypass the monitor.

The Endokernel is an intra-process security monitor that isolates memory at subprocess granularity. It ensures backwards-compatible and secure emulation of system interfaces, a task uniquely challenging due to the need to analyze OS and hardware semantics beyond mere interface usability. We introduce an inside-out methodology where we identify core OS primitives that allow bypass and map that back to the interfaces that depend on them. This approach led to the identification of several missing policies as well as aided in developing a fine-grained locking approach to deal with complex thread safety when inserting a monitor between the OS and the application. Results indicate that we can achieve fast isolation while greatly enhancing security and maintaining backwards-compatibility, and also showing a new method for systematically finding gaps in policies.

HIVE: A Hardware-assisted Isolated Execution Environment for eBPF on AArch64

Peihua Zhang, SKLP, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Chenggang Wu, SKLP, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Zhongguancun Laboratory; Xiangyu Meng, Northwestern Polytechnical University; Yinqian Zhang, Southern University of Science and Technology; Mingfan Peng, Shiyang Zhang, and Bing Hu, SKLP, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Mengyao Xie, SKLP, Institute of Computing Technology, CAS; Yuanming Lai and Yan Kang, SKLP, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Zhe Wang, SKLP, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Zhongguancun Laboratory

Available Media

eBPF has become a critical component in Linux. To ensure kernel security, BPF programs are statically verified before being loaded and executed in the kernel. However, the state-of-the-art eBPF verifier has both security and complexity issues. To this end, we choose to look at BPF programs from a new perspective and regard them as a new type of kernel-mode application, thus an isolation-based rather than a verificationbased approach is needed. In this paper, we propose HIVE, an isolation execution environment for BPF programs on AArch64. To provide the equivalent security guarantees, we systematize the security aims of the eBPF verifier and categorize two types of pointers in eBPF: the inclusive type pointer that points to BPF objects and the exclusive type pointer that points to kernel objects. For the former, HIVE compartmentalizes all BPF memory from the kernel and de-privileges the memory accesses in the BPF programs by leveraging the load/store unprivileged instructions; for the latter, HIVE utilizes the pointer authentication feature to enforce access controls of kernel objects. Evaluation results show that HIVE is not only efficient but also supports complex BPF programs.

BUDAlloc: Defeating Use-After-Free Bugs by Decoupling Virtual Address Management from Kernel

Junho Ahn, Jaehyeon Lee, Kanghyuk Lee, Wooseok Gwak, Minseong Hwang, and Youngjin Kwon, KAIST

Available Media

Use-after-free bugs are an important class of vulnerabilities that often pose serious security threats. To prevent or detect use-after-free bugs, one-time allocators have recently gained attention for their better performance scalability and immediate detection of use-after-free bugs compared to garbage collection approaches. This paper introduces BUDAlloc, a one-time-allocator for detecting and protecting use-after-free bugs in unmodified binaries. The core idea is co-designing a user-level allocator and kernel by separating virtual and physical address management. The user-level allocator manages virtual address layout, eliminating the need for system calls when creating virtual alias, which is essential for reducing internal fragmentation caused by the one-time-allocator. BUDAlloc customizes the kernel page fault handler with eBPF for batching unmap requests when freeing objects. In SPEC CPU 2017, BUDAlloc achieves a 15% performance improvement over DangZero and reduces memory overhead by 61% compared to FFmalloc.

Page-Oriented Programming: Subverting Control-Flow Integrity of Commodity Operating System Kernels with Non-Writable Code Pages

Seunghun Han, The Affiliated Institute of ETRI, Chungnam National University; Seong-Joong Kim, Wook Shin, and Byung Joon Kim, The Affiliated Institute of ETRI; Jae-Cheol Ryou, Chungnam National University

Available Media

This paper presents a novel attack technique called page-oriented programming, which reuses existing code gadgets by remapping physical pages to the virtual address space of a program at runtime. The page remapping vulnerabilities may lead to data breaches or may damage kernel integrity. Therefore, manufacturers have recently released products equipped with hardware-assisted guest kernel integrity enforcement. This paper extends the notion of the page remapping attack to another type of code-reuse attack, which can not only be used for altering or sniffing kernel data but also for building and executing malicious code at runtime. We demonstrate the effectiveness of this attack on state-of-the-art hardware and software, where control-flow integrity policies are enforced, thus highlighting its capability to render most legacy systems vulnerable.

Track 4

Network Security I: DDoS

Session Chair: Ben Stock

Salon F

SmartCookie: Blocking Large-Scale SYN Floods with a Split-Proxy Defense on Programmable Data Planes

Sophia Yoo, Xiaoqi Chen, and Jennifer Rexford, Princeton University

Available Media

Despite decades of mitigation efforts, SYN flooding attacks continue to increase in frequency and scale, and adaptive adversaries continue to evolve. Meanwhile, volumes of benign traffic in modern networks are also growing rampantly. As a result, network providers, which run thousands of servers and process 100s of Gbps of traffic, find themselves urgently requiring defenses that are secure against adaptive adversaries, scalable against large volumes of traffic, and highly performant for benign applications. Unfortunately, existing defenses local to a single device (e.g., purely software-based or hardware-based) are failing to keep up with growing attacks and struggle to provide performance, security, or both. In this paper, we present SmartCookie, the first system to run cryptographically secure SYN cookie checks on high-speed programmable switches, for both security and performance. Our novel split-proxy defense leverages emerging programmable switches to block 100% of SYN floods in the switch data plane and also uses state-of-the-art kernel technologies such as eBPF to enable scalability for serving benign traffic. SmartCookie defends against adaptive adversaries at two orders of magnitude greater attack traffic than traditional CPU-based software defenses, blocking attacks of 136.9 Mpps without packet loss. We also achieve 2x-6.5x lower end-to-end latency for benign traffic compared to existing switch-based hardware defenses.

Loopy Hell(ow): Inﬁnite Traffic Loops at the Application Layer

Yepeng Pan, Anna Ascheman, and Christian Rossow, CISPA Helmholtz Center for Information Security

Available Media

Denial-of-Service (DoS) attacks have long been a persistent threat to network infrastructures. Existing attack primitives require attackers to continuously send traffic, such as in SYN floods, amplification attacks, or application-layer DoS. In contrast, we study the threat of application-layer traffic loops, which are an almost cost-free attack primitive alternative. Such loops exist, e.g., if two servers consider messages sent to each other as malformed and respond with errors that again trigger error messages. Attackers can send a single IP-spoofed loop trigger packet to initiate an infinite loop among two servers. But despite the severity of traffic loops, to the best of our knowledge, they have never been studied in greater detail.

In this paper, we thus investigate the threat of application-layer traffic loops. To this end, we propose a systematic approach to identify loops among real servers. Our core idea is to learn the response functions of all servers of a given application-layer protocol, encode this knowledge into a loop graph, and finally, traverse the graph to spot looping server pairs. Using the proposed method, we examined traffic loops among servers running both popular (DNS, NTP, and TFTP) and legacy (Daytime, Time, Active Users, Chargen, QOTD, and Echo) UDP protocols and confirmed the prevalence of traffic loops. In total, we identified approximately 296k servers in IPv4 vulnerable to traffic loops, providing attackers the opportunity to abuse billions of loop pairs.

Zero-setup Intermediate-rate Communication Guarantees in a Global Internet

Marc Wyss and Adrian Perrig, ETH Zurich

Available Media

Network-targeting volumetric DDoS attacks remain a major threat to Internet communication. Unfortunately, existing solutions fall short of providing forwarding guarantees to the important class of short-lived intermediate-rate communication such as web traffic in a secure, scalable, light-weight, low-cost, and incrementally deployable fashion. To overcome those limitations we design Z-Lane, a system achieving those objectives by ensuring bandwidth isolation among authenticated traffic from (groups of) autonomous systems, thus safeguarding intermediate-rate communication against even the largest volumetric DDoS attacks. Our evaluation on a global testbed and our high-speed implementation on commodity hardware demonstrate Z-Lane's effectiveness and scalability.

Towards an Effective Method of ReDoS Detection for Non-backtracking Engines

Weihao Su, Hong Huang, and Rongchen Li, Key Laboratory of System Software (Chinese Academy of Sciences) and State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Haiming Chen, Key Laboratory of System Software (Chinese Academy of Sciences) and State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences; Tingjian Ge, Miner School of Computer & Information Sciences, University of Massachusetts, Lowell

Available Media

Regular expressions (regexes) are a fundamental concept across the fields of computer science. However, they can also induce the Regular expression Denial of Service (ReDoS) attacks, which are a class of denial of service attacks, caused by super-linear worst-case matching time. Due to the severity and prevalence of ReDoS attacks, the detection of ReDoS-vulnerable regexes in software is thus vital. Although various ReDoS detection approaches have been proposed, these methods have focused mainly on backtracking regex engines, leaving the problem of ReDoS vulnerability detection on non-backtracking regex engines largely open.

To address the above challenges, in this paper, we first systematically analyze the major causes that could contribute to ReDoS vulnerabilities on non-backtracking regex engines. We then propose a novel type of ReDoS attack strings that builds on the concept of simple strings. Next we propose EvilStrGen, a tool for generating attack strings for ReDoS-vulnerable regexes on non-backtracking engines. It is based on a novel incremental determinisation algorithm with heuristic strategies to lazily find the k-simple strings without explicit construction of finite automata. We evaluate EvilStrGen against six state-of-the-art approaches on a broad range of publicly available datasets containing 736,535 unique regexes. The results illustrate the significant efficacy of our tool. We also apply our tool to 85 intensively-tested projects, and have identified 34 unrevealed ReDoS vulnerabilities.

Track 5

ML I: Federated Learning

Session Chair: Songze Li

Salon G

FAMOS: Robust Privacy-Preserving Authentication on Payment Apps via Federated Multi-Modal Contrastive Learning

Yifeng Cai, Key Laboratory of High Confidence Software Technologies (PKU), Ministry of Education; School of Computer Science, Peking University; Ziqi Zhang, Department of Computer Science, University of Illinois Urbana-Champaign; Jiaping Gui, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University; Bingyan Liu, School of Computer Science, Beijing University of Posts and Telecommunications; Xiaoke Zhao, Ruoyu Li, and Zhe Li, Ant Group; Ding Li, Key Laboratory of High Confidence Software Technologies (PKU), Ministry of Education; School of Computer Science, Peking University

Available Media

The rise of mobile payment apps necessitates robust user authentication to ensure legitimate user access. Traditional methods, like passwords and biometrics, are vulnerable once a device is compromised. To overcome these limitations, modern solutions utilize sensor data to achieve user-agnostic and scalable behavioral authentication. However, existing solutions face two problems when deployed to real-world applications. First, it is not robust to noisy background activities. Second, it faces the risks of privacy leakage as it relies on centralized training with users' sensor data.

In this paper, we introduce FAMOS, a novel authentication framework based on federated multi-modal contrastive learning. The intuition of FAMOS is to fuse multi-modal sensor data and cluster the representation of one user's data by the action category so that we can eliminate the influence of background noise and guarantee the user's privacy. Furthermore, we incorporate FAMOS with federated learning to enhance performance while protecting users' privacy. We comprehensively evaluate FAMOS using real-world datasets and devices. Experimental results show that FAMOS is efficient and accurate for real-world deployment. FAMOS has an F1-Score of 0.91 and an AUC of 0.97, which are 42.19% and 27.63% higher than the baselines, respectively.

Efficient Privacy Auditing in Federated Learning

Hongyan Chang, National University of Singapore; Brandon Edwards, Intel Corporation; Anindya S. Paul, University of Florida; Reza Shokri, National University of Singapore

Available Media

We design a novel efficient membership inference attack to audit privacy risks in federated learning. Our approach involves computing the slope of specific model performance metrics (e.g., model's output and its loss) across FL rounds to differentiate members from non-members. Since these metrics are automatically computed during the FL process, our solution imposes negligible overhead and can be seamlessly integrated without disrupting training. We validate the effectiveness and superiority of our method over prior work across a wide range of FL settings and real-world datasets.

Defending Against Data Reconstruction Attacks in Federated Learning: An Information Theory Approach

Qi Tan, Department of Computer Science and Technology, Tsinghua University; Qi Li, Institute for Network Science and Cyberspace, Tsinghua University; Yi Zhao, School of Cyberspace Science and Technology, Beijing Institute of Technology; Zhuotao Liu, Institute for Network Science and Cyberspace, Tsinghua University; Xiaobing Guo, Lenovo Research; Ke Xu, Department of Computer Science and Technology, Tsinghua University

Available Media

Federated Learning (FL) trains a black-box and high-dimensional model among different clients by exchanging parameters instead of direct data sharing, which mitigates the privacy leak incurred by machine learning. However, FL still suffers from membership inference attacks (MIA) or data reconstruction attacks (DRA). In particular, an attacker can extract the information from local datasets by constructing DRA, which cannot be effectively throttled by existing techniques, e.g., Differential Privacy (DP).

In this paper, we aim to ensure a strong privacy guarantee for FL under DRA. We prove that econstruction errors under DRA are constrained by the information acquired by an attacker, which means that constraining the transmitted information can effectively throttle DRA. To quantify the information leakage incurred by FL, we establish a channel model, which depends on the upper bound of joint mutual information between the local dataset and multiple transmitted parameters. Moreover, the channel model indicates that the transmitted information can be constrained through data space operation, which can improve training efficiency and the model accuracy under constrained information. According to the channel model, we propose algorithms to constrain the information transmitted in a single round of local training. With a limited number of training rounds, the algorithms ensure that the total amount of transmitted information is limited. Furthermore, our channel model can be applied to various privacy-enhancing techniques (such as DP) to enhance privacy guarantees against DRA. Extensive experiments with real-world datasets validate the effectiveness of our methods.

Lotto: Secure Participant Selection against Adversarial Servers in Federated Learning

Zhifeng Jiang and Peng Ye, Hong Kong University of Science and Technology; Shiqi He, University of Michigan; Wei Wang, Hong Kong University of Science and Technology; Ruichuan Chen, Nokia Bell Labs; Bo Li, Hong Kong University of Science and Technology

Available Media

In Federated Learning (FL), common privacy-enhancing techniques, such as secure aggregation and distributed differential privacy, rely on the critical assumption of an honest majority among participants to withstand various attacks. In practice, however, servers are not always trusted, and an adversarial server can strategically select compromised clients to create a dishonest majority, thereby undermining the system's security guarantees. In this paper, we present Lotto, an FL system that addresses this fundamental, yet underexplored issue by providing secure participant selection against an adversarial server. Lotto supports two selection algorithms: random and informed. To ensure random selection without a trusted server, Lotto enables each client to autonomously determine their participation using verifiable randomness. For informed selection, which is more vulnerable to manipulation, Lotto approximates the algorithm by employing random selection within a refined client pool. Our theoretical analysis shows that Lotto effectively aligns the proportion of server-selected compromised participants with the base rate of dishonest clients in the population. Large-scale experiments further reveal that Lotto achieves time-to-accuracy performance comparable to that of insecure selection methods, indicating a low computational overhead for secure selection.

Track 6

Security Analysis I: Source Code and Binary

Session Chair: Manuel Egele

Salon H

Ahoy SAILR! There is No Need to DREAM of C: A Compiler-Aware Structuring Algorithm for Binary Decompilation

Zion Leonahenahe Basque, Ati Priya Bajaj, Wil Gibbs, Jude O'Kain, Derron Miao, Tiffany Bao, Adam Doupé, Yan Shoshitaishvili, and Ruoyu Wang, Arizona State University

Available Media

Contrary to prevailing wisdom, we argue that the measure of binary decompiler success is not to eliminate all gotos or reduce the complexity of the decompiled code but to get as close as possible to the original source code. Many gotos exist in the original source code (the Linux kernel version 6.1 contains 3,754) and, therefore, should be preserved during decompilation, and only spurious gotos should be removed.

Fundamentally, decompilers insert spurious gotos in decompilation because structuring algorithms fail to recover C-style structures from binary code. Through a quantitative study, we find that the root cause of spurious gotos is compiler-induced optimizations that occur at all optimization levels (17% in non-optimized compilation). Therefore, we believe that to achieve high-quality decompilation, decompilers must be compiler-aware to mirror (and remove) the goto-inducing optimizations.

In this paper, we present a novel structuring algorithm called SAILR that mirrors the compilation pipeline of GCC and precisely inverts goto-inducing transformations. We build an open-source decompiler on angr (the angr decompiler) and implement SAILR as well as otherwise-unavailable prior work (Phoenix, DREAM, and rev.ng's Combing) and evaluate them, using a new metric of how close the decompiled code structure is to the original source code, showing that SAILR markedly improves on prior work. In addition, we find that SAILR performs well on binaries compiled with non-GCC compilers, which suggests that compilers similarly implement goto-inducing transformations.

A Taxonomy of C Decompiler Fidelity Issues

Luke Dramko and Jeremy Lacomis, Carnegie Mellon University; Edward J. Schwartz, Carnegie Mellon University Software Engineering Institute; Bogdan Vasilescu and Claire Le Goues, Carnegie Mellon University

Available Media

Decompilation is an important part of analyzing threats in computer security. Unfortunately, decompiled code contains less information than the corresponding original source code, which makes understanding it more difficult for the reverse engineers who manually perform threat analysis. Thus, the fidelity of decompiled code to the original source code matters, as it can influence reverse engineers' productivity. There is some existing work in predicting some of the missing information using statistical methods, but these focus largely on variable names and variable types. In this work, we more holistically evaluate decompiler output from C-language executables and use our findings to inform directions for future decompiler development. More specifically, we use open-coding techniques to identify defects in decompiled code beyond missing names and types. To ensure that our study is robust, we compare and evaluate four different decompilers. Using thematic analysis, we build a taxonomy of decompiler defects. Using this taxonomy to reason about classes of issues, we suggest specific approaches that can be used to mitigate fidelity issues in decompiled code.

D-Helix: A Generic Decompiler Testing Framework Using Symbolic Differentiation

Muqi Zou, Arslan Khan, Ruoyu Wu, Han Gao, Antonio Bianchi, and Dave (Jing) Tian, Purdue University

Available Media

Decompilers, one of the widely used security tools, transform low-level binary programs back into their high-level source representations, such as C/C++. While state-of-the-art decompilers try to generate more human-readable outputs, for instance, by eliminating goto statements in their decompiled code, the correctness of a decompilation process is largely ignored due to the complexity of decompilers, e.g., involving hundreds of heuristic rules. As a result, outputs from decompilers are often not accurate, which affects the effectiveness of downstream security tasks.

In this paper, we propose D-HELIX, a generic decompiler testing framework that can automatically vet the decompilation correctness on the function level. D-HELIX uses RECOMPILER to compile the decompiled code at the functional level. It then uses SYMDIFF to compare the symbolic model of the original binary with the one of the decompiled code, detecting potential errors introduced by the decompilation process. D-HELIX further provides TUNER to help debug the incorrect decompilation via toggling decompilation heuristic rules automatically. We evaluated D-HELIX on Ghidra and angr using 2,004 binaries and object files ending up with 93K decompiled functions in total. D-HELIX detected 4,515 incorrectly decompiled functions, reproduced 8 known bugs, found 17 distinct previously unknown bugs within these two decompilers, and fixed 7 bugs automatically.

SymFit: Making the Common (Concrete) Case Fast for Binary-Code Concolic Execution

Zhenxiao Qi, Jie Hu, Zhaoqi Xiao, and Heng Yin, UC Riverside

Available Media

Concolic execution is a powerful technique in software testing, as it can systematically explore the code paths and is capable of traversing complex branches. It combines concrete execution for environment modeling and symbolic execution for path exploration. While significant research efforts in concolic execution have been directed toward the improvement of symbolic execution and constraint solving, our study pivots toward the often overlooked yet most common aspect: concrete execution. Our analysis shows that state-of-the-art binary concolic executors have largely overlooked the overhead in the execution of concrete instructions. In light of this observation, we propose optimizations to make the common (concrete) case fast. To validate this idea, we develop the prototype, SymFit, and evaluate it on standard benchmarks and real-world applications. The results showed that the performance of pure concrete execution is much faster than the baseline SymQEMU, and is comparable to the vanilla QEMU. Moreover, we showed that the fast symbolic tracing capability of SymFit can significantly improve the efficiency of crash deduplication.

Track 7

Crypto I: Secret Key Exchange

Session Chair: Cas Cremers

Salon IJK

K-Waay: Fast and Deniable Post-Quantum X3DH without Ring Signatures

Daniel Collins and Loïs Huguenin-Dumittan, EPFL; Ngoc Khanh Nguyen, King’s College London; Nicolas Rolin, Spuerkeess; Serge Vaudenay, EPFL

Available Media

The Signal protocol and its X3DH key exchange core are regularly used by billions of people in applications like WhatsApp but are unfortunately not quantum-secure. Thus, designing an efficient and post-quantum secure X3DH alternative is paramount. Notably, X3DH supports asynchronicity, as parties can immediately derive keys after uploading them to a central server, and deniability, allowing parties to plausibly deny having completed key exchange. To satisfy these constraints, existing post-quantum X3DH proposals use ring signatures (or equivalently a form of designated-verifier signatures) to provide authentication without compromising deniability as regular signatures would. Existing ring signature schemes, however, have some drawbacks. Notably, they are not generally proven secure in the quantum random oracle model (QROM) and so the quantum security of parameters that are proposed is unclear and likely weaker than claimed. In addition, they are generally slower than standard primitives like KEMs.

In this work, we propose an efficient, deniable and post-quantum X3DH-like protocol that we call K-Waay, that does not rely on ring signatures. At its core, K-Waay uses a split-KEM, a primitive introduced by Brendel et al. [SAC 2020], to provide Diffie-Hellman-like implicit authentication and secrecy guarantees. Along the way, we revisit the formalism of Brendel et al. and identify that additional security properties are required to prove a split-KEM-based protocol secure. We instantiate split-KEM by building a protocol based on the Frodo key exchange protocol relying on the plain LWE assumption: our proofs might be of independent interest as we show it satisfies our novel unforgeability and deniability security notions. Finally, we complement our theoretical results by thoroughly benchmarking both K-Waay and existing X3DH protocols. Our results show even when using plain LWE and a conservative choice of parameters that K-Waay is significantly faster than previous work.

Diffie-Hellman Picture Show: Key Exchange Stories from Commercial VoWiFi Deployments

Gabriel K. Gegenhuber and Florian Holzbauer, University of Vienna; Philipp É. Frenzel, SBA Research; Edgar Weippl, University of Vienna and Christian Doppler Laboratory for Security and Quality Improvement in the Production System Lifecycle (CDL-SQI); Adrian Dabrowski, CISPA Helmholtz Center for Information Security

Available Media

Voice over Wi-Fi (VoWiFi) uses a series of IPsec tunnels to deliver IP-based telephony from the subscriber's phone (User Equipment, UE) into the Mobile Network Operator's (MNO) core network via an Internet-facing endpoint, the Evolved Packet Data Gateway (ePDG). IPsec tunnels are set up in phases. The first phase negotiates the cryptographic algorithm and parameters and performs a key exchange via the Internet Key Exchange protocol, while the second phase (protected by the above-established encryption) performs the authentication. An insecure key exchange would jeopardize the later stages and the data's security and confidentiality.

In this paper, we analyze the phase 1 settings and implementations as they are found in phones as well as in commercially deployed networks worldwide. On the UE side, we identified a recent 5G baseband chipset from a major manufacturer that allows for fallback to weak, unannounced modes and verified it experimentally. On the MNO side –among others– we identified 13 operators (totaling an estimated 140 million subscribers) on three continents that all use the same globally static set of ten private keys, serving them at random. Those not-so-private keys allow the decryption of the shared keys of every VoWiFi user of all those operators. All these operators deployed their core network from one common manufacturer.

Formal verification of the PQXDH Post-Quantum key agreement protocol for end-to-end secure messaging

Karthikeyan Bhargavan, Cryspen; Charlie Jacomme, Inria Nancy Grand-Est, Université de Lorraine, LORIA, France; Franziskus Kiefer, Cryspen; Rolfe Schmidt, Signal Messenger

Available Media

The Signal Messenger recently introduced a new asynchronous key agreement protocol called PQXDH (PostQuantum Extended Diffie-Hellman) that seeks to provide post-quantum forward secrecy, in addition to the authentication and confidentiality guarantees already provided by the previous X3DH (Extended Diffie-Hellman) protocol. More precisely, PQXDH seeks to protect the confidentiality of messages against harvest-now-decrypt-later attacks.

In this work, we formally specify the PQXDH protocol and analyze its security using two formal verification tools, PROVERIF and CRYPTOVERIF. In particular, we ask whether PQXDH preserves the guarantees of X3DH, whether it provides post-quantum forward secrecy, and whether it can be securely deployed alongside X3DH. Our analysis identifies several flaws and potential vulnerabilities in the PQXDH specification, although these vulnerabilities are not exploitable in the Signal application, thanks to specific implementation choices which we describe in this paper. To prove the security of the current implementation, our analysis notably highlighted the need for an additional binding property of the KEM, which we formally define and prove for Kyber.

We collaborated with the protocol designers to develop an updated protocol specification based on our findings, where each change was formally verified and validated with a security proof. This work identifies some pitfalls that the community should be aware of when upgrading protocols to be post-quantum secure. It also demonstrates the utility of using formal verification hand-in-hand with protocol design.

SWOOSH: Efficient Lattice-Based Non-Interactive Key Exchange

Phillip Gajland, Max Planck Institute for Security and Privacy, Ruhr University Bochum; Bor de Kock, NTNU - Norwegian University of Science and Technology, Trondheim, Norway; Miguel Quaresma, Max Planck Institute for Security and Privacy; Giulio Malavolta, Bocconi University, Max Planck Institute for Security and Privacy; Peter Schwabe, Max Planck Institute for Security and Privacy, Radboud University

Available Media

The advent of quantum computers has sparked significant interest in post-quantum cryptographic schemes, as a replacement for currently used cryptographic primitives. In this context, lattice-based cryptography has emerged as the leading paradigm to build post-quantum cryptography. However, all existing viable replacements of the classical Diffie-Hellman key exchange require additional rounds of interactions, thus failing to achieve all the benefits of this protocol. Although earlier work has shown that lattice-based Non-Interactive Key Exchange (NIKE) is theoretically possible, it has been considered too inefficient for real-life applications. In this work, we challenge this folklore belief and provide the first evidence against it. We construct an efficient lattice-based NIKE whose security is based on the standard module learning with errors (M-LWE) problem in the quantum random oracle model. Our scheme is obtained in two steps: (i) A passively-secure construction that achieves a strong notion of correctness, coupled with (ii) a generic compiler that turns any such scheme into an actively-secure one. To substantiate our efficiency claim, we provide an optimised implementation of our passively-secure construction in Rust and Jasmin. Our implementation demonstrates the scheme's applicability to real-world scenarios, yielding public keys of approximately 220 KBs. Moreover, the computation of shared keys takes fewer than 12 million cycles on an Intel Skylake CPU, offering a post-quantum security level exceeding 120 bits.

12:15 pm–1:45 pm

Lunch (on your own)

1:45 pm–2:45 pm

Track 1

Social Issues I: Phishing and Password

Session Chair: Gianluca Stringhini

Salon AB

PhishDecloaker: Detecting CAPTCHA-cloaked Phishing Websites via Hybrid Vision-based Interactive Models

Xiwen Teoh, Shanghai Jiao Tong University; National University of Singapore; Yun Lin, Shanghai Jiao Tong University; Ruofan Liu, Zhiyong Huang, and Jin Song Dong, National University of Singapore

Available Media

Phishing is a cybersecurity attack based on social engineering that incurs significant financial losses and erodes societal trust. While phishing detection techniques are emerging, attackers continually strive to bypass state-of-the-arts. Recent phishing campaigns have shown that emerging phishing attacks adopt CAPTCHA-based cloaking techniques, marking a new round of cat-and-mouse game. Our study shows that phishing websites, hardened by CAPTCHA-cloaking, can compromise all known state-of-the-art industrial and academic detectors with almost zero cost.

In this work, we develop PhishDecloaker, an AI-powered solution to soften the shield of the CAPTCHA-cloaking used by phishing websites. PhishDecloaker is designed to mimic human behaviors to solve the CAPTCHAs, allowing modern security-crawlers to see the uncloaked phishing content. Technically, PhishDecloaker orchestrates five deep computer vision models to detect the existence of CAPTCHAs, analyze its type, and solve the challenge in an interactive manner. We conduct extensive experiments to evaluate PhishDecloaker in terms of its effectiveness, efficiency, and robustness against potential adversaries. The results show that PhishDecloaker (1) recovers the phishing detection rate of many state-of-theart phishing detectors from 0% to up to on average 74.25% on diverse CAPTCHA-cloaked phishing websites (2) generalizes to unseen CAPTCHA (with precision of 86% and recall of 69%), and (3) is robust against various adversaries such as FGSM, JSMA, PGD, DeepFool, and DPatch, which allows the existing phishing detectors to achieve new state-of-the-art performance on CAPTCHA-cloaked phishing webpages. Our field study over 30 days shows that PhishDecloaker can help us uniquely discover 7.6% more phishing websites cloaked by CAPTCHAs, raising alarm of the emergence of CAPTCHA-cloaked features in the modern phishing campaigns.

Less Defined Knowledge and More True Alarms: Reference-based Phishing Detection without a Pre-defined Reference List

Ruofan Liu, Shanghai Jiao Tong University/National University of Singapore; Yun Lin, Shanghai Jiao Tong University; Xiwen Teoh, National University of Singapore; Gongshen Liu, Shanghai Jiao Tong University; Zhiyong Huang and Jin Song Dong, National University of Singapore

Available Media

Phishing, a pervasive form of social engineering attack that compromises user credentials, has led to significant financial losses and undermined public trust. Modern phishing detection has gravitated to reference-based methods for their explainability and robustness against zero-day phishing attacks. These methods maintain and update predefined reference lists to specify domain-brand relationships, alarming phishing websites by the inconsistencies between their domain (e.g., payp0l.com) and intended brand (e.g., PayPal). However, the curated lists are largely limited by their lack of comprehensiveness and high maintenance costs in practice.

In this work, we present PhishLLM as a novel reference-based phishing detector that operates without an explicit pre-defined reference list. Our rationale lies in that modern LLMs have encoded far more extensive brand-domain information than any predefined list. Further, the detection of many webpage semantics such as credential-taking intention analysis is more like a linguistic problem, but they are processed as a vision problem now. Thus, we design PhishLLM to decode (or retrieve) the domain-brand relationships from LLM and effectively parse the credential-taking intention of a webpage, without the cost of maintaining and updating an explicit reference list. Moreover, to control the hallucination of LLMs, we introduce a search-engine-based validation mechanism to remove the misinformation. Our extensive experiments show that PhishLLM significantly outperforms state-of-the-art solutions such as Phishpedia and PhishIntention, improving the recall by 21% to 66%, at the cost of negligible precision. Our field studies show that PhishLLM discovers (1) 6 times more zero-day phishing webpages compared to existing approaches such as PhishIntention and (2) close to 2 times more zero-day phishing webpages even if it is enhanced by DynaPhish. Our code is available at https://github.com/code-philia/PhishLLM/.

In Wallet We Trust: Bypassing the Digital Wallets Payment Security for Free Shopping

Raja Hasnain Anwar, University of Massachusetts Amherst; Syed Rafiul Hussain, Pennsylvania State University; Muhammad Taqi Raza, University of Massachusetts Amherst

Available Media

Digital wallets are a new form of payment technology that provides a secure and convenient way of making contactless payments through smart devices. In this paper, we study the security of financial transactions made through digital wallets, focusing on the authentication, authorization, and access control security functions. We find that the digital payment ecosystem supports the decentralized authority delegation which is susceptible to a number of attacks. First, an attacker adds the victim's bank card into their (attacker's) wallet by exploiting the authentication method agreement procedure between the wallet and the bank. Second, they exploit the unconditional trust between the wallet and the bank, and bypass the payment authorization. Third, they create a trap door through different payment types and violate the access control policy for the payments. The implications of these attacks are of a serious nature where the attacker can make purchases of arbitrary amounts by using the victim's bank card, despite these cards being locked and reported to the bank as stolen by the victim. We validate these findings in practice over major US banks (notably Chase, AMEX, Bank of America, and others) and three digital wallet apps (ApplePay, GPay, and PayPal). We have disclosed our findings to all the concerned parties. Finally, we propose remedies for fixing the design flaws to avoid these and other similar attacks.

The Impact of Exposed Passwords on Honeyword Efficacy

Zonghao Huang, Duke University; Lujo Bauer, Carnegie Mellon University; Michael K. Reiter, Duke University

Available Media

Honeywords are decoy passwords that can be added to a credential database; if a login attempt uses a honeyword, this indicates that the site's credential database has been leaked. In this paper we explore the basic requirements for honeywords to be effective, in a threat model where the attacker knows passwords for the same users at other sites. First, we show that for user-chosen (vs. algorithmically generated, i.e., by a password manager) passwords, existing honeyword-generation algorithms do not simultaneously achieve false-positive and false-negative rates near their ideals of ≈0 and ≈ 1/1+n, respectively, in this threat model, where n is the number of honeywords per account. Second, we show that for users leveraging algorithmically generated passwords, state-of-the-art methods for honeyword generation will produce honeywords that are not sufficiently deceptive, yielding many false negatives. Instead, we find that only a honeyword-generation algorithm that uses the same password generator as the user can provide deceptive honeywords in this case. However, when the defender's ability to infer the generator from the (one) account password is less accurate than the attacker's ability to infer the generator from potentially many, this deception can again wane. Taken together, our results provide a cautionary note for the state of honeyword research and pose new challenges to the field.

Track 2

Side Channel I: Transient Execution

Session Chair: Daniel Gruss

Salon CD

InSpectre Gadget: Inspecting the Residual Attack Surface of Cross-privilege Spectre v2

Sander Wiebing, Alvise de Faveri Tron, Herbert Bos, and Cristiano Giuffrida, Vrije Universiteit Amsterdam

Distinguished Paper Award Winner

Available Media

Spectre v2 is one of the most severe transient execution vulnerabilities, as it allows an unprivileged attacker to lure a privileged (e.g., kernel) victim into speculatively jumping to a chosen gadget, which then leaks data back to the attacker. Spectre v2 is hard to eradicate. Even on last-generation Intel CPUs, security hinges on the unavailability of exploitable gadgets. Nonetheless, with (i) deployed mitigations—eIBRS, no-eBPF, (Fine)IBT—all aimed at hindering many usable gadgets, (ii) existing exploits relying on now-privileged features (eBPF), and (iii) recent Linux kernel gadget analysis studies reporting no exploitable gadgets, the common belief is that there is no residual attack surface of practical concern.

In this paper, we challenge this belief and uncover a significant residual attack surface for cross-privilege Spectre-v2 attacks. To this end, we present InSpectre Gadget, a new gadget analysis tool for in-depth inspection of Spectre gadgets. Unlike existing tools, ours performs generic constraint analysis and models knowledge of advanced exploitation techniques to accurately reason over gadget exploitability in an automated fashion. We show that our tool can not only uncover new (unconventionally) exploitable gadgets in the Linux kernel, but that those gadgets are sufficient to bypass all deployed Intel mitigations. As a demonstration, we present the first native Spectre-v2 exploit against the Linux kernel on last-generation Intel CPUs, based on the recent BHI variant and able to leak arbitrary kernel memory at 3.5 kB/sec. We also present a number of gadgets and exploitation techniques to bypass the recent FineIBT mitigation, along with a case study on a 13th Gen Intel CPU that can leak kernel memory at 18 bytes/sec.

Shesha: Multi-head Microarchitectural Leakage Discovery in new-generation Intel Processors

Anirban Chakraborty, Nimish Mishra, and Debdeep Mukhopadhyay, Indian Institute of Technology Kharagpur

Available Media

Transient execution attacks have been one of the widely explored microarchitectural side channels since the discovery of Spectre and Meltdown. However, much of the research has been driven by manual discovery of new transient paths through well-known speculative events. Although a few attempts exist in literature on automating transient leakage discovery, such tools focus on finding variants of known transient attacks and explore a small subset of instruction set. Further, they take a random fuzzing approach that does not scale as the complexity of search space increases. In this work, we identify that the search space of bad speculation is disjointedly fragmented into equivalence classes and then use this observation to develop a framework named Shesha, inspired by Particle Swarm Optimization, which exhibits faster convergence rates than state-of-the-art fuzzing techniques for automatic discovery of transient execution attacks. We then use Shesha to explore the vast search space of extensions to the x86 Instruction Set Architecture (ISAs), thereby focusing on previously unexplored avenues of bad speculation. As such, we report five previously unreported transient execution paths in Instruction Set Extensions (ISEs) on new generation of Intel processors. We then perform extensive reverse engineering of each of the transient execution paths and provide root-cause analysis. Using the discovered transient execution paths, we develop attack building blocks to exhibit exploitable transient windows. Finally, we demonstrate data leakage from Fused Multiply-Add instructions through SIMD buffer and extract victim data from various cryptographic implementations.

BeeBox: Hardening BPF against Transient Execution Attacks

Di Jin, Alexander J. Gaidis, and Vasileios P. Kemerlis, Brown University

Available Media

The Berkeley Packet Filter (BPF) has emerged as the de-facto standard for carrying out safe and performant, user-specified computation(s) in kernel space. However, BPF also increases the attack surface of the OS kernel disproportionately, especially under the presence of transient execution vulnerabilities. In this work, we present BeeBox: a new security architecture that hardens BPF against transient execution attacks, allowing the OS kernel to expose eBPF functionality to unprivileged users and applications. At a high level, BeeBox sandboxes the BPF runtime against speculative code execution in an SFI-like manner. Moreover, by using a combination of static analyses and domain-specific properties, BeeBox selectively elides enforcement checks, improving performance without sacrificing security. We implemented a prototype of BeeBox for the Linux kernel that supports popular features of eBPF (e.g., BPF maps and helper functions), and evaluated it both in terms of effectiveness and performance, demonstrating resilience against prevalent transient execution attacks (i.e., Spectre-PHT and Spectre-STL) with low overhead. On average, BeeBox incurs 20% overhead in the Katran benchmark, while the current mitigations of Linux incur 112% overhead. Lastly, BeeBox exhibits less than 1% throughput degradation in end-to-end, real-world settings that include seccomp-BPF and packet filtering.

SpecLFB: Eliminating Cache Side Channels in Speculative Executions

Xiaoyu Cheng, School of Cyber Science and Engineering, Southeast University, Nanjing, Jiangsu, China; Jiangsu Province Engineering Research Center of Security for Ubiquitous Network, China; Fei Tong, School of Cyber Science and Engineering, Southeast University, Nanjing, Jiangsu, China; Jiangsu Province Engineering Research Center of Security for Ubiquitous Network, China; Purple Mountain Laboratories, Nanjing, Jiangsu, China; Hongyu Wang, State Key Laboratory of Power Equipment Technology, School of Electrical Engineering, Chongqing University, China; Wiscom System Co., LTD, Nanjing, China; Zhe Zhou and Fang Jiang, School of Cyber Science and Engineering, Southeast University, Nanjing, Jiangsu, China; Jiangsu Province Engineering Research Center of Security for Ubiquitous Network, China; Yuxing Mao, State Key Laboratory of Power Equipment Technology, School of Electrical Engineering, Chongqing University, China

Available Media

Cache side-channel attacks based on speculative executions are powerful and difficult to mitigate. Existing hardware defense schemes often require additional hardware data structures, data movement operations and/or complex logical computations, resulting in excessive overhead of both processor performance and hardware resources. To this end, this paper proposes SpecLFB, which utilizes the microarchitecture component, Line-Fill-Buffer, integrated with a proposed mechanism for load security check to prevent the establishment of cache side channels in speculative executions. To ensure the correctness and immediacy of load security check, a structure called ROB unsafe mask is designed for SpecLFB to track instruction state. To further reduce processor performance overhead, SpecLFB narrows down the protection scope of unsafe speculative loads and determines the time at which they can be deprotected as early as possible. SpecLFB has been implemented in the open-source RISC-V core, SonicBOOM, as well as in Gem5. For the enhanced SonicBOOM, its register-transfer-level (RTL) code is generated, and an FPGA hardware prototype burned with the core and running a Linux-kernel-based operating system is developed. Based on the evaluations in terms of security guarantee, performance overhead, and hardware resource overhead through RTL simulation, FPGA prototype experiment, and Gem5 simulation, it shows that SpecLFB effectively defends against attacks. It leads to a hardware resource overhead of only 0.6% and the performance overhead of only 1.85% and 3.20% in the FPGA prototype experiment and Gem5 simulation, respectively.

Track 3

Mobile Security I

Session Chair: Sven Bugiel

Salon E

Towards Privacy-Preserving Social-Media SDKs on Android

Haoran Lu, Yichen Liu, Xiaojing Liao, and Luyi Xing, Indiana University Bloomington

Available Media

Integration of third-party SDKs are essential in the development of mobile apps. However, the rise of in-app privacy threat against mobile SDKs— called cross-library data harvesting (XLDH), targets social media/platform SDKs (called social SDKs) that handles rich user data. Given the widespread integration of social SDKs in mobile apps, XLDH presents a significant privacy risk, as well as raising pressing concerns regarding legal compliance for app developers, social media/platform stakeholders, and policymakers. The emerging XLDH threat, coupled with the increasing demand for privacy and compliance in line with societal expectations, introduces unique challenges that cannot be addressed by existing protection methods against privacy threats or malicious code on mobile platforms. In response to the XLDH threats, in our study, we generalize and define the concept of privacy-preserving social SDKs and their in-app usage, characterize fundamental challenges for combating the XLDH threat and ensuring privacy in design and utilizaiton of social SDKs. We introduce a practical, clean-slate design and end-to-end systems, called PESP, to facilitate privacy-preserving social SDKs. Our thorough evaluation demonstrates its satisfactory effectiveness, performance overhead and practicability for widespread adoption.

UIHash: Detecting Similar Android UIs through Grid-Based Visual Appearance Representation

Jiawei Li, Beihang University; National University of Singapore; Jian Mao, Beihang University; Tianmushan Laboratory; Hangzhou Innovation Institute, Beihang University; Jun Zeng, National University of Singapore; Qixiao Lin and Shaowen Feng, Beihang University; Zhenkai Liang, National University of Singapore

Available Media

User interfaces (UIs) is the main channel for users to interact with mobile apps. As such, attackers often create similar-looking UIs to deceive users, causing various security problems, such as spoofing and phishing. Prior studies identify these similar UIs based on their layout trees or screenshot images. These techniques, however, are susceptible to being evaded. Guided by how users perceive UIs and the features they prioritize, we design a novel grid-based UI representation to capture UI visual appearance while maintaining robustness against evasion. We develop an approach, UIHash, to detect similar Android UIs by comparing their visual appearance. It divides the UI into a #-shaped grid and abstracts UI controls across screen regions, then calculates UI similarity through a neural network architecture that includes a convolutional neural network and a Siamese network. Our evaluation shows that UIHash achieves an F1-score of 0.984 in detection, outperforming existing tree-based methods and image-based methods. Moreover, we have discovered evasion techniques that circumvent existing detection approaches.

Racing for TLS Certificate Validation: A Hijacker's Guide to the Android TLS Galaxy

Sajjad Pourali and Xiufen Yu, Concordia University; Lianying Zhao, Carleton University; Mohammad Mannan and Amr Youssef, Concordia University

Available Media

Besides developers' code, current Android apps usually integrate code from third-party libraries, all of which may include code for TLS validation. We analyze well-known improper TLS certificate validation issues in popular Android apps, and attribute the validation issues to the offending code/party in a fine-grained manner, unlike existing work labelling an entire app for validation failures. Surprisingly, we discovered a widely used practice of overriding the global default validation functions with improper validation logic, or simply performing no validation at all, affecting the entire app's TLS connections, which we call validation hijacking. We design and implement an automated dynamic analysis tool called Marvin to identify TLS validation failures, including validation hijacking, and the responsible parties behind such dangerous practice. We use Marvin to analyze 6315 apps from a Chinese app store and Google Play, and find many occurrences of insecure TLS certificate validation instances (55.7% of the Chinese apps and 4.6% of the Google Play apps). Validation hijacking happens in 34.3% of the insecure apps from the Chinese app store and 20.0% of insecure Google Play apps. A network attacker can exploit these insecure connections in various ways, e.g., to compromise PII, app login and SSO credentials, to launch phishing and other content modification attacks, including code injection. We found that most of these vulnerabilities are related to third-party libraries used by the apps, not the app code created by app developers. The technical root cause enabling validation hijacking appears to be the specific modifications made by Google in the OkHttp library integrated with the Android OS, which is used by many developers by default, without being aware of its potential dangers. Overall, our findings provide valuable insights into the responsible parties for TLS validation issues in Android, including the validation hijacking problem.

DVa: Extracting Victims and Abuse Vectors from Android Accessibility Malware

Haichuan Xu, Mingxuan Yao, and Runze Zhang, Georgia Institute of Technology; Mohamed Moustafa Dawoud, German International University; Jeman Park, Kyung Hee University; Brendan Saltaformaggio, Georgia Institute of Technology

Available Media

The Android accessibility (a11y) service is widely abused by malware to conduct on-device monetization fraud. Existing mitigation techniques focus on malware detection but overlook providing users evidence of abuses that have already occurred and notifying victims to facilitate defenses. We developed DVa, a malware analysis pipeline based on dynamic victim-guided execution and abuse-vector-guided symbolic analysis, to help investigators uncover a11y malware's targeted victims, victim-specific abuse vectors, and persistence mechanisms. We deployed DVa to investigate Android devices infected with 9,850 a11y malware. From the extractions, DVa uncovered 215 unique victims targeted with an average of 13.9 abuse routines. DVa also extracted six persistence mechanisms empowered by the a11y service.

Track 4

Web Security I

Session Chair: Andrei Sabelfeld

Salon F

SoK: State of the Krawlers – Evaluating the Effectiveness of Crawling Algorithms for Web Security Measurements

Aleksei Stafeev and Giancarlo Pellegrino, CISPA Helmholtz Center for Information Security

Available Media

Web crawlers are tools widely used in web security measurements whose performance and impact have been limitedly studied so far. In this paper, we bridge this gap. Starting from the past 12 years of the top security, web measurement, and software engineering literature, we categorize and decompose in building blocks crawling techniques and methodologic choices. We then reimplement and patch crawling techniques and integrate them into Arachnarium, a framework for comparative evaluations, which we use to run one of the most comprehensive experimental evaluations against nine real and two benchmark web applications and top 10K CrUX websites to assess the performance and adequacy of algorithms across three metrics (code, link, and JavaScript source coverage). Finally, we distill 14 insights and lessons learned. Our results show that despite a lack of clear and homogeneous descriptions hindering reimplementations, proposed and commonly used crawling algorithms offer a lower coverage than randomized ones, indicating room for improvement. Also, our results show a complex relationship between experiment parameters, the study's domain, and the available computing resources, where no single best-performing crawler configuration exists. We hope our results will guide future researchers when setting up their studies.

Vulnerability-oriented Testing for RESTful APIs

Wenlong Du and Jian Li, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University; Yanhao Wang, Independent Researcher; Libo Chen, Ruijie Zhao, and Junmin Zhu, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University; Zhengguang Han, QI-ANXIN Technology Group; Yijun Wang and Zhi Xue, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University

Available Media

With the increasing popularity of APIs, ensuring their security has become a crucial concern. However, existing security testing methods for RESTful APIs usually lack targeted approaches to identify and detect security vulnerabilities. In this paper, we propose VOAPI2, a vulnerability-oriented API inspection framework designed to directly expose vulnerabilities in RESTful APIs, based on our observation that the type of vulnerability hidden in an API interface is strongly associated with its functionality. By leveraging this insight, we first track commonly used strings as keywords to identify APIs' functionality. Then, we generate a stateful and suitable request sequence to inspect the candidate API function within a targeted payload. Finally, we verify whether vulnerabilities exist or not through feedback-based testing. Our experiments on real-world APIs demonstrate the effectiveness of our approach, with significant improvements in vulnerability detection compared to state-of-the-art methods. VOAPI2 discovered 7 zero-day and 19 disclosed bugs on seven real-world RESTful APIs, and 23 of them have been assigned CVE IDs. Our findings highlight the importance of considering APIs' functionality when discovering their bugs, and our method provides a practical and efficient solution for securing RESTful APIs.

Web Platform Threats: Automated Detection of Web Security Issues With WPT

Pedro Bernardo and Lorenzo Veronese, TU Wien; Valentino Dalla Valle and Stefano Calzavara, Università Ca' Foscari Venezia; Marco Squarcina, TU Wien; Pedro Adão, Instituto Superior Técnico, Universidade de Lisboa, and Instituto de Telecomunicações; Matteo Maffei, TU Wien

Available Media

Client-side security mechanisms implemented by Web browsers, such as cookie security attributes and the Mixed Content policy, are of paramount importance to protect Web applications. Unfortunately, the design and implementation of such mechanisms are complicated and error-prone, potentially exposing Web applications to security vulnerabilities. In this paper, we present a practical framework to formally and automatically detect security flaws in client-side security mechanisms. In particular, we leverage Web Platform Tests (WPT), a popular cross-browser test suite, to automatically collect browser execution traces and match them against Web invariants, i.e., intended security properties of Web mechanisms expressed in first-order logic. We demonstrate the effectiveness of our approach by validating 9 invariants against the WPT test suite, discovering violations with clear security implications in 104 tests for Firefox, Chromium and Safari. We disclosed the root causes of these violations to browser vendors and standard bodies, which resulted in 8 individual reports and one CVE on Safari.

Rise of Inspectron: Automated Black-box Auditing of Cross-platform Electron Apps

Mir Masood Ali, Mohammad Ghasemisharif, Chris Kanich, and Jason Polakis, University of Illinois Chicago

Available Media

Browser-based cross-platform applications have become increasingly popular as they allow software vendors to sidestep two major issues in the app ecosystem. First, web apps can be impacted by the performance deterioration affecting browsers, as the continuous adoption of diverse and complex features has led to bloating. Second, re-developing or porting apps to different operating systems and execution environments is a costly, error-prone process. Instead, frameworks like Electron allow the creation of standalone apps for different platforms using JavaScript code (e.g., reused from an existing web app) and by incorporating a stripped down and configurable browser engine. Despite the aforementioned advantages, these apps face significant security and privacy threats that are either non-applicable to traditional web apps (due to the lack of access to certain system-facing APIs) or ineffective against them (due to countermeasures already baked into browsers). In this paper we present Inspectron, an automated dynamic analysis framework that audits packaged Electron apps for potential security vulnerabilities stemming from developers' deviation from recommended security practices. Our study reveals a multitude of insecure practices and problematic trends in the Electron app ecosystem, highlighting the gap filled by Inspectron as it provides extensive and comprehensive auditing capabilities for developers and researchers.

Track 5

LLM for Security

Session Chair: Shuang Hao

Salon G

KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection

Yuexin Li, Chengyu Huang, and Shumin Deng, National University of Singapore; Mei Lin Lock, NCS Cyber Special Ops-R&D; Tri Cao, National University of Singapore; Nay Oo and Hoon Wei Lim, NCS Cyber Special Ops-R&D; Bryan Hooi, National University of Singapore

Available Media

Phishing attacks have inflicted substantial losses on individuals and businesses alike, necessitating the development of robust and efficient automated phishing detection approaches. Reference-based phishing detectors (RBPDs), which compare the logos on a target webpage to a known set of logos, have emerged as the state-of-the-art approach. However, a major limitation of existing RBPDs is that they rely on a manually constructed brand knowledge base, making it infeasible to scale to a large number of brands, which results in false negative errors due to the insufficient brand coverage of the knowledge base. To address this issue, we propose an automated knowledge collection pipeline, using which we collect a large-scale multimodal brand knowledge base, KnowPhish, containing 20k brands with rich information about each brand. KnowPhish can be used to boost the performance of existing RBPDs in a plug-and-play manner. A second limitation of existing RBPDs is that they solely rely on the image modality, ignoring useful textual information present in the webpage HTML. To utilize this textual information, we propose a Large Language Model (LLM)-based approach to extract brand information of webpages from text. Our resulting multimodal phishing detection approach, KnowPhish Detector (KPD), can detect phishing webpages with or without logos. We evaluate KnowPhish and KPD on a manually validated dataset, and a field study under Singapore's local context, showing substantial improvements in effectiveness and efficiency compared to state-of-the-art baselines.

Exploring ChatGPT's Capabilities on Vulnerability Management

Peiyu Liu and Junming Liu, Zhejiang University NGICS Platform; Lirong Fu, Hangzhou Dianzi University; Kangjie Lu, University of Minnesota; Yifan Xia, Zhejiang University NGICS Platform; Xuhong Zhang, Zhejiang University and Jianghuai Advance Technology Center; Wenzhi Chen, Zhejiang University; Haiqin Weng, Ant Group; Shouling Ji, Zhejiang University; Wenhai Wang, Zhejiang University NGICS Platform

Available Media

Recently, ChatGPT has attracted great attention from the code analysis domain. Prior works show that ChatGPT has the capabilities of processing foundational code analysis tasks, such as abstract syntax tree generation, which indicates the potential of using ChatGPT to comprehend code syntax and static behaviors. However, it is unclear whether ChatGPT can complete more complicated real-world vulnerability management tasks, such as the prediction of security relevance and patch correctness, which require an all-encompassing understanding of various aspects, including code syntax, program semantics, and related manual comments.

In this paper, we explore ChatGPT's capabilities on 6 tasks involving the complete vulnerability management process with a large-scale dataset containing 70,346 samples. For each task, we compare ChatGPT against SOTA approaches, investigate the impact of different prompts, and explore the difficulties. The results suggest promising potential in leveraging ChatGPT to assist vulnerability management. One notable example is ChatGPT's proficiency in tasks like generating titles for software bug reports. Furthermore, our findings reveal the difficulties encountered by ChatGPT and shed light on promising future directions. For instance, directly providing random demonstration examples in the prompt cannot consistently guarantee good performance in vulnerability management. By contrast, leveraging ChatGPT in a self-heuristic way—extracting expertise from demonstration examples itself and integrating the extracted expertise in the prompt is a promising research direction. Besides, ChatGPT may misunderstand and misuse the information in the prompt. Consequently, effectively guiding ChatGPT to focus on helpful information rather than the irrelevant content is still an open problem.

Large Language Models for Code Analysis: Do LLMs Really Do Their Job?

Chongzhou Fang, Ning Miao, and Shaurya Srivastav, University of California, Davis; Jialin Liu, Temple University; Ruoyu Zhang, Ruijie Fang, Asmita, Ryan Tsang, and Najmeh Nazari, University of California, Davis; Han Wang, Temple University; Houman Homayoun, University of California, Davis

Available Media

Large language models (LLMs) have demonstrated significant potential in the realm of natural language understanding and programming code processing tasks. Their capacity to comprehend and generate human-like code has spurred research into harnessing LLMs for code analysis purposes. However, the existing body of literature falls short in delivering a systematic evaluation and assessment of LLMs' effectiveness in code analysis, particularly in the context of obfuscated code.

This paper seeks to bridge this gap by offering a comprehensive evaluation of LLMs' capabilities in performing code analysis tasks. Additionally, it presents real-world case studies that employ LLMs for code analysis. Our findings indicate that LLMs can indeed serve as valuable tools for automating code analysis, albeit with certain limitations. Through meticulous exploration, this research contributes to a deeper understanding of the potential and constraints associated with utilizing LLMs in code analysis, paving the way for enhanced applications in this critical domain.

PentestGPT: Evaluating and Harnessing Large Language Models for Automated Penetration Testing

Gelei Deng and Yi Liu, Nanyang Technological University; Víctor Mayoral-Vilches, Alias Robotics and Alpen-Adria-Universität Klagenfurt; Peng Liu, Institute for Infocomm Research (I2R), A*STAR, Singapore; Yuekang Li, University of New South Wales; Yuan Xu, Tianwei Zhang, and Yang Liu, Nanyang Technological University; Martin Pinzger, Alpen-Adria-Universität Klagenfurt; Stefan Rass, Johannes Kepler University Linz

Distinguished Artifact Award Winner

Available Media

Penetration testing, a crucial industrial practice for ensuring system security, has traditionally resisted automation due to the extensive expertise required by human professionals. Large Language Models (LLMs) have shown significant advancements in various domains, and their emergent abilities suggest their potential to revolutionize industries. In this work, we establish a comprehensive benchmark using real-world penetration testing targets and further use it to explore the capabilities of LLMs in this domain. Our findings reveal that while LLMs demonstrate proficiency in specific sub-tasks within the penetration testing process, such as using testing tools, interpreting outputs, and proposing subsequent actions, they also encounter difficulties maintaining a whole context of the overall testing scenario.

Based on these insights, we introduce PENTESTGPT, an LLM-empowered automated penetration testing framework that leverages the abundant domain knowledge inherent in LLMs. PENTESTGPT is meticulously designed with three self-interacting modules, each addressing individual sub-tasks of penetration testing, to mitigate the challenges related to context loss. Our evaluation shows that PENTESTGPT not only outperforms LLMs with a task-completion increase of 228.6% compared to the GPT-3.5 model among the benchmark targets, but also proves effective in tackling real-world penetration testing targets and CTF challenges. Having been open-sourced on GitHub, PENTESTGPT has garnered over 6,500 stars in 12 months and fostered active community engagement, attesting to its value and impact in both the academic and industrial spheres.

Track 6

Fuzzing I: Software

Session Chair: Thorsten Holz

Salon H

OptFuzz: Optimization Path Guided Fuzzing for JavaScript JIT Compilers

Jiming Wang and Yan Kang, SKLP, Institute of Computing Technology, CAS & University of Chinese Academy of Sciences; Chenggang Wu, SKLP, Institute of Computing Technology, CAS & University of Chinese Academy of Sciences & Zhongguancun Laboratory; Yuhao Hu, Yue Sun, and Jikai Ren, SKLP, Institute of Computing Technology, CAS & University of Chinese Academy of Sciences; Yuanming Lai and Mengyao Xie, SKLP, Institute of Computing Technology, CAS; Charles Zhang, Tsinghua University; Tao Li, Nankai University; Zhe Wang, SKLP, Institute of Computing Technology, CAS & University of Chinese Academy of Sciences & Zhongguancun Laboratory

Available Media

Just-In-Time (JIT) compiler is a core component of JavaScript engines, which takes a snippet of JavaScript code as input and applies a series of optimization passes on it and then transforms it to machine code. The optimization passes often have some assumptions (e.g., variable types) on the target JavaScript code, and therefore will yield vulnerabilities if the assumptions do not hold. To discover such bugs, it is essential to thoroughly test different optimization passes, but previous work fails to do so and mainly focused on exploring code coverage. In this paper, we present the first optimization path guided fuzzing solution for JavaScript JIT compilers, namely OptFuzz, which focuses on exploring optimization path coverage. Specifically, we utilize an optimization trunk path metric to approximate the optimization path coverage, and use it as a feedback to guide seed preservation and seed scheduling of the fuzzing process. We have implemented a prototype of OptFuzz and evaluated it on 4 mainstream JavaScript engines. On earlier versions of JavaScript engines, OptFuzz found several times more bugs than baseline solutions. On the latest JavaScript engines, OptFuzz discovered 36 unknown bugs, while baseline solutions found none.

Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing

Asmita, University of California, Davis; Yaroslav Oliinyk and Michael Scott, NetRise; Ryan Tsang, Chongzhou Fang, and Houman Homayoun, University of California, Davis

Available Media

BusyBox, an open-source software bundling over 300 essential Linux commands into a single executable, is ubiquitous in Linux-based embedded devices. Vulnerabilities in BusyBox can have far-reaching consequences, affecting a wide array of devices. This research, driven by the extensive use of BusyBox, delved into its analysis. The study revealed the prevalence of older BusyBox versions in real-world embedded products, prompting us to conduct fuzz testing on BusyBox. Fuzzing, a pivotal software testing method, aims to induce crashes that are subsequently scrutinized to uncover vulnerabilities. Within this study, we introduce two techniques to fortify software testing. The first technique enhances fuzzing by leveraging Large Language Models (LLM) to generate target-specific initial seeds. Our study showed a substantial increase in crashes when using LLM-generated initial seeds, highlighting the potential of LLM to efficiently tackle the typically labor-intensive task of generating target-specific initial seeds. The second technique involves repurposing previously acquired crash data from similar fuzzed targets before initiating fuzzing on a new target. This approach streamlines the time-consuming fuzz testing process by providing crash data directly to the new target before commencing fuzzing. We successfully identified crashes in the latest BusyBox target without conducting traditional fuzzing, emphasizing the effectiveness of LLM and crash reuse techniques in enhancing software testing and improving vulnerability detection in embedded systems. Additionally, manual triaging was performed to identify the nature of crashes in the latest BusyBox.

Towards Generic Database Management System Fuzzing

Yupeng Yang and Yongheng Chen, Georgia Institute of Technology; Rui Zhong, Palo Alto Networks; Jizhou Chen and Wenke Lee, Georgia Institute of Technology

Available Media

Database Management Systems play an indispensable role in modern cyberspace. While multiple fuzzing frameworks have been proposed in recent years to test relational (SQL) DBMSs to improve their security, non-relational (NoSQL) DBMSs have yet to experience the same scrutiny and lack an effective testing solution in general. In this work, we identify three limitations of existing approaches when extended to fuzz the DBMSs effectively in general: being non-generic, using static constraints, and generating loose data dependencies. Then, we propose effective solutions to address these limitations. We implement our solutions into an end-to-end fuzzing framework, BUZZBEE, which can effectively fuzz both relational and non-relational DBMSs. BUZZBEE successfully discovered 40 vulnerabilities in eight DBMSs of four different data models, of which 25 have been fixed with 4 new CVEs assigned. In our evaluation, BUZZBEE outperforms state-of-the-art generic fuzzers by up to 177% in terms of code coverage and discovers 30x more bugs than the second-best fuzzer for non-relational DBMSs, while achieving comparable results with specialized SQL fuzzers for the relational counterpart.

HYPERPILL: Fuzzing for Hypervisor-bugs by Leveraging the Hardware Virtualization Interface

Alexander Bulekov, EPFL, Boston University, and Amazon; Qiang Liu, EPFL and Zhejiang University; Manuel Egele, Boston University; Mathias Payer, EPFL

Distinguished Paper Award Winner

Available Media

The security guarantees of cloud computing depend on the isolation guarantees of the underlying hypervisors. Prior works have presented effective methods for automatically identifying vulnerabilities in hypervisors. However, these approaches are limited in scope. For instance, their implementation is typically hypervisor-specific and limited by requirements for detailed grammars, access to source-code, and assumptions about hypervisor behaviors. In practice, complex closed-source and recent open-source hypervisors are often not suitable for off-the-shelf fuzzing techniques.

HYPERPILL introduces a generic approach for fuzzing arbitrary hypervisors. HYPERPILL leverages the insight that although hypervisor implementations are diverse, all hypervisors rely on the identical underlying hardware-virtualization interface to manage virtual-machines. To take advantage of the hardware-virtualization interface, HYPERPILL makes a snapshot of the hypervisor, inspects the snapshotted hardware state to enumerate the hypervisor's input-spaces, and leverages feedback-guided snapshot-fuzzing within an emulated environment to identify vulnerabilities in arbitrary hypervisors. In our evaluation, we found that beyond being the first hypervisor-fuzzer capable of identifying vulnerabilities in arbitrary hypervisors across all major attack-surfaces (i.e., PIO/MMIO/Hypercalls/DMA), HYPERPILL also outperforms state-of-the-art approaches that rely on access to source-code, due to the granularity of feedback provided by HYPERPILL's emulation-based approach. In terms of coverage, HYPERPILL outperformed past fuzzers for 10/12 QEMU devices, without the API hooking or source-code instrumentation techniques required by prior works. HYPERPILL identified 26 new bugs in recent versions of QEMU, Hyper-V, and macOS Virtualization Framework across four device-categories

Track 7

Differential Privacy I

Session Chair: Aysajan Abidin

Salon IJK

Less is More: Revisiting the Gaussian Mechanism for Differential Privacy

Tianxi Ji, Texas Tech University; Pan Li, Case Western Reserve University

Available Media

Differential privacy (DP) via output perturbation has been a de facto standard for releasing query or computation results on sensitive data. Different variants of the classic Gaussian mechanism have been developed to reduce the magnitude of the noise and improve the utility of sanitized query results. However, we identify that all existing Gaussian mechanisms suffer from the curse of full-rank covariance matrices, and hence the expected accuracy losses of these mechanisms equal the trace of the covariance matrix of the noise. Particularly, for query results with multiple entries, in order to achieve DP, the expected accuracy loss of the classic Gaussian mechanism, that of the analytic Gaussian mechanism, and that of the Matrix-Variate Gaussian (MVG) mechanism are lower bounded by terms that scales linearly with the number of entries.

To lift this curse, we design a Rank-1 Singular Multivariate Gaussian (R1SMG) mechanism. It achieves DP on high dimension query results by perturbing the results with noise following a singular multivariate Gaussian distribution, whose covariance matrix is a randomly generated rank-1 positive semi-definite matrix. In contrast, the classic Gaussian mechanism and its variants all consider deterministic full-rank covariance matrices. Our idea is motivated by a clue from Dwork et al.'s seminal work on the classic Gaussian mechanism that has been ignored in the literature: when projecting multivariate Gaussian noise with a full-rank covariance matrix onto a set of orthonormal basis, only the coefficient of a single basis can contribute to the privacy guarantee.

This paper makes the following technical contributions.

(i) The R1SMG mechanisms achieves DP guarantee on high dimension query results in, while its expected accuracy loss is lower bounded by a term that is on a lower order of magnitude by at least the dimension of query results compared with that of the classic Gaussian mechanism, of the analytic Gaussian mechanism, and of the MVG mechanism.

(ii) Compared with other mechanisms, the R1SMG mechanism is more stable and less likely to generate noise with large magnitude that overwhelms the query results, because the kurtosis and skewness of the nondeterministic accuracy loss introduced by this mechanism is larger than that introduced by other mechanisms.

Relation Mining Under Local Differential Privacy

Kai Dong, Zheng Zhang, Chuang Jia, Zhen Ling, Ming Yang, and Junzhou Luo, Southeast University; Xinwen Fu, University of Massachusetts Lowell

Available Media

Existing local differential privacy (LDP) techniques enable untrustworthy aggregators to perform only very simple data mining tasks on distributed private data, including statistical estimation and frequent item mining. There is currently no general LDP method that discovers relations between items. The main challenge lies in the curse of dimensionality, as the quantity of values to be estimated in mining relations is the square of the quantity of values to be estimated in mining item-level knowledge, leading to a considerable decrease in the final estimation accuracy. We propose LDP-RM, the first relation mining method under LDP. It represents items and relations in a matrix and utilizes singular value decomposition and low rank approximation to reduce the number of values to estimate from O(k²) to O(r), where k is the number of all considered items, and r < k is a parameter determined by the aggregator, signifying the rank of the approximation. LDP-RM serves as a fundamental privacy-preserving method for enabling various complex data mining tasks.

Gradients Look Alike: Sensitivity is Often Overestimated in DP-SGD

Anvith Thudi and Hengrui Jia, University of Toronto and Vector Institute; Casey Meehan, University of California, San Diego; Ilia Shumailov, University of Oxford; Nicolas Papernot, University of Toronto and Vector Institute

Available Media

Differentially private stochastic gradient descent (DP-SGD) is the canonical approach to private deep learning. While the current privacy analysis of DP-SGD is known to be tight in some settings, several empirical results suggest that models trained on common benchmark datasets leak significantly less privacy for many datapoints. Yet, despite past attempts, a rigorous explanation for why this is the case has not been reached. Is it because there exist tighter privacy upper bounds when restricted to these dataset settings, or are our attacks not strong enough for certain datapoints? In this paper, we provide the first per-instance (i.e., "data-dependent") DP analysis of DP-SGD. Our analysis captures the intuition that points with similar neighbors in the dataset enjoy better data-dependent privacy than outliers. Formally, this is done by modifying the per-step privacy analysis of DP-SGD to introduce a dependence on the distribution of model updates computed from a training dataset. We further develop a new composition theorem to effectively use this new per-step analysis to reason about an entire training run. Put all together, our evaluation shows that this novel DP-SGD analysis allows us to now formally show that DP-SGD leaks significantly less privacy for many datapoints (when trained on common benchmarks) than the current data-independent guarantee. This implies privacy attacks will necessarily fail against many datapoints if the adversary does not have sufficient control over the possible training datasets.

DPAdapter: Improving Differentially Private Deep Learning through Noise Tolerance Pre-training

Zihao Wang, Rui Zhu, and Dongruo Zhou, Indiana University Bloomington; Zhikun Zhang, Zhejiang University; John Mitchell, Stanford University; Haixu Tang and XiaoFeng Wang, Indiana University Bloomington

Available Media

Recent developments have underscored the critical role of differential privacy (DP) in safeguarding individual data for training machine learning models. However, integrating DP oftentimes incurs significant model performance degradation due to the perturbation introduced into the training process, presenting a formidable challenge in the differentially private machine learning (DPML) field. To this end, several mitigative efforts have been proposed, typically revolving around formulating new DPML algorithms or relaxing DP definitions to harmonize with distinct contexts. In spite of these initiatives, the diminishment induced by DP on models, particularly large-scale models, remains substantial and thus, necessitates an innovative solution that adeptly circumnavigates the consequential impairment of model utility.

In response, we introduce DPAdapter, a pioneering technique designed to amplify the model performance of DPML algorithms by enhancing parameter robustness. The fundamental intuition behind this strategy is that models with robust parameters are inherently more resistant to the noise introduced by DP, thereby retaining better performance despite the perturbations. DPAdapter modifies and enhances the sharpness-aware minimization (SAM) technique, utilizing a two-batch strategy to provide a more accurate perturbation estimate and an efficient gradient descent, thereby improving parameter robustness against noise. Notably, DPAdapter can act as a plug-and-play component and be combined with existing DPML algorithms to further improve their performance. Our experiments show that DPAdapter vastly enhances state-of-the-art DPML algorithms, increasing average accuracy from 72.92% to 77.09% with a privacy budget of ϵ = 4.

2:45 pm–3:15 pm

Coffee and Tea Break

Grand Ballroom Foyer

3:15 pm–4:15 pm

Track 1

Deepfake and Synthesis

Session Chair: Franziska Roesner

Salon AB

Double Face: Leveraging User Intelligence to Characterize and Recognize AI-synthesized Faces

Matthew Joslin, Xian Wang, and Shuang Hao, University of Texas at Dallas

Available Media

Artificial Intelligence (AI) techniques have advanced to generate face images of nonexistent yet photorealistic persons. Despite positive applications, AI-synthesized faces have been increasingly abused to deceive users and manipulate opinions, such as AI-generated profile photos for fake accounts. Deception using generated realistic-appearing images raises severe trust and security concerns. So far, techniques to analyze and recognize AI-synthesized face images are limited, mainly relying on off-the-shelf classification methods or heuristics of researchers' individual perceptions.

As a complement to existing analysis techniques, we develop a novel approach that leverages crowdsourcing annotations to analyze and defend against AI-synthesized face images. We aggregate and characterize AI-synthesis artifacts annotated by multiple users (instead of by individual researchers or automated systems). Our quantitative findings systematically identify where the synthesis artifacts are likely to be located and what characteristics the synthesis patterns have. We further incorporate user annotated regions into an attention learning approach to detect AI-synthesized faces. Our work sheds light on involving human factors to enhance defense against AI-synthesized face images.

SoK: The Good, The Bad, and The Unbalanced: Measuring Structural Limitations of Deepfake Media Datasets

Seth Layton, Tyler Tucker, Daniel Olszewski, Kevin Warren, Kevin Butler, and Patrick Traynor, University of Florida

Available Media

Deepfake media represents an important and growing threat not only to computing systems but to society at large. Datasets of image, video, and voice deepfakes are being created to assist researchers in building strong defenses against these emerging threats. However, despite the growing number of datasets and the relative diversity of their samples, little guidance exists to help researchers select datasets and then meaningfully contrast their results against prior efforts. To assist in this process, this paper presents the first systematization of deepfake media. Using traditional anomaly detection datasets as a baseline, we characterize the metrics, generation techniques, and class distributions of existing datasets. Through this process, we discover significant problems impacting the comparability of systems using these datasets, including unaccounted-for heavy class imbalance and reliance upon limited metrics. These observations have a potentially profound impact should such systems be transitioned to practice - as an example, we demonstrate that the widely-viewed best detector applied to a typical call center scenario would result in only 1 out of 333 flagged results being a true positive. To improve reproducibility and future comparisons, we provide a template for reporting results in this space and advocate for the release of model score files such that a wider range of statistics can easily be found and/or calculated. Through this, and our recommendations for improving dataset construction, we provide important steps to move this community forward.

Can I Hear Your Face? Pervasive Attack on Voice Authentication Systems with a Single Face Image

Nan Jiang, Bangjie Sun, and Terence Sim, National University of Singapore; Jun Han, KAIST

Available Media

We present Foice, a novel deepfake attack against voice authentication systems. Foice generates a synthetic voice of the victim from just a single image of the victim's face, without requiring any voice sample. This synthetic voice is realistic enough to fool commercial authentication systems. Since face images are generally easier to obtain than voice samples, Foice effectively makes it easier for an attacker to mount large-scale attacks. The key idea lies in learning the partial correlation between face and voice features and adding to that a face-independent voice feature sampled from a Gaussian distribution. We demonstrate the effectiveness of Foice with a comprehensive set of real-world experiments involving ten offline participants and an online dataset of 1029 unique individuals. By evaluating eight state-of-the-art systems, including WeChat's Voiceprint and Microsoft Azure, we show that all these systems are vulnerable to Foice attack.

dp-promise: Differentially Private Diffusion Probabilistic Models for Image Synthesis

Haichen Wang and Shuchao Pang, Nanjing University of Science and Technology; Zhigang Lu, James Cook University; Yihang Rao and Yongbin Zhou, Nanjing University of Science and Technology; Minhui Xue, CSIRO's Data61

Available Media

Utilizing sensitive images (e.g., human faces) for training DL models raises privacy concerns. One straightforward solution is to replace the private images with synthetic ones generated by deep generative models. Among all image synthesis methods, diffusion models (DMs) yield impressive performance. Unfortunately, recent studies have revealed that DMs incur privacy challenges due to the memorization of the training instances. To preserve the existence of a single private sample of DMs, many works have explored to apply DP on DMs from different perspectives. However, existing works on differentially private DMs only consider DMs as regular deep models, such that they inject unnecessary DP noise in addition to the forward process noise in DMs, damaging the model utility. To address the issue, this paper proposes Differentially Private Diffusion Probabilistic Models for Image Synthesis, dp-promise, which theoretically guarantees approximate DP by leveraging the DM noise during the forward process. Extensive experiments demonstrate that, given the same privacy budget, dp-promise outperforms the state-of-the-art on the image quality of differentially private image synthesis across the standard metrics and datasets.

Track 2

Hardware Security II: Architecture and Microarchitecture

Session Chair: Yuval Yarom

Salon CD

DMAAUTH: A Lightweight Pointer Integrity-based Secure Architecture to Defeat DMA Attacks

Xingkai Wang, Wenbo Shen, Yujie Bu, Jinmeng Zhou, and Yajin Zhou, Zhejiang University

Available Media

IOMMU has been introduced to thwart DMA attacks. However, the performance degradation prevents it from being enabled on most systems. Even worse, recent studies show that IOMMU is still vulnerable to sub-page and deferred invalidation attacks, posing threats to systems with IOMMU enabled.

This paper aims to provide a lightweight and secure solution to defend against DMA attacks. Based on our measurement and characterizing of DMA behavior, we propose DMAAUTH, a lightweight pointer integrity-based hardware-software co-design architecture. DMAAUTH utilizes a novel technique named Arithmetic-capable Pointer AuthentiCation (APAC), which protects the DMA pointer integrity while supporting pointer arithmetic. It also places a dedicated hardware named Authenticator on the bus to authenticate all the DMA transactions. Combining APAC, per-mapping metadata, and the Authenticator, DMAAUTH achieves strict byte-grained spatial protection and temporal protection.

We implement DMAAUTH on a real FPGA hardware board. Specifically, we first realize a PCIe-customizable SoC on real FPGA, based on which we implement hardware version DMAAUTH and conduct a thorough evaluation. We also implement DMAAUTH on both ARM and RISC-V emulators to demonstrate its cross-architecture capability. Our evaluation shows that DMAAUTH is faster and safer than IOMMU while being transparent to devices, drivers, and IOMMU.

Bending microarchitectural weird machines towards practicality

Ping-Lun Wang, Riccardo Paccagnella, Riad S. Wahby, and Fraser Brown, Carnegie Mellon University

Available Media

A large body of work has demonstrated attacks that rely on the difference between CPUs' nominal instruction set architectures and their actual (microarchitectural) implementations. Most of these attacks, like Spectre, bypass the CPU's data-protection boundaries. A recent line of work considers a different primitive, called a microarchitectural weird machine (µWM), that can execute computations almost entirely using microarchitectural side effects. While µWMs would seem to be an extremely powerful tool, e.g., for obfuscating malware, thus far they have seen very limited application. This is because prior µWMs must be hand-crafted by experts, and even then have trouble reliably executing complex computations.

In this work, we show that µWMs are a practical, near-term threat. First, we design a new µWM architecture, Flexo, that improves performance by 1–2 orders of magnitude and reduces circuit size by 75–87%, dramatically improving the applicability of µWMs to complex computation. Second, we build the first compiler from a high-level language to µWMs, letting experts craft automatic optimizations and non-experts construct state-of-the-art obfuscated computations. Finally, we demonstrate the practicality of our approach by extending the popular UPX packer to encrypt its payload and use a µWM for decryption, frustrating malware analysis.

GoFetch: Breaking Constant-Time Cryptographic Implementations Using Data Memory-Dependent Prefetchers

Boru Chen, University of Illinois Urbana-Champaign; Yingchen Wang, University of Texas at Austin; Pradyumna Shome, Georgia Institute of Technology; Christopher Fletcher, University of California, Berkeley; David Kohlbrenner, University of Washington; Riccardo Paccagnella, Carnegie Mellon University; Daniel Genkin, Georgia Institute of Technology

Available Media

Microarchitectural side-channel attacks have shaken the foundations of modern processor design. The cornerstone defense against these attacks has been to ensure that security-critical programs do not use secret-dependent data as addresses. Put simply: do not pass secrets as addresses to, e.g., data memory instructions. Yet, the discovery of data memory-dependent prefetchers (DMPs)—which turn program data into addresses directly from within the memory system—calls into question whether this approach will continue to remain secure.

This paper shows that the security threat from DMPs is significantly worse than previously thought and demonstrates the first end-to-end attacks on security-critical software using the Apple m-series DMP. Undergirding our attacks is a new understanding of how DMPs behave which shows, among other things, that the Apple DMP will activate on behalf of any victim program and attempt to "leak" any cached data that resembles a pointer. From this understanding, we design a new type of chosen-input attack that uses the DMP to perform end-to-end key extraction on popular constant-time implementations of classical (OpenSSL Diffie-Hellman Key Exchange, Go RSA decryption) and post-quantum cryptography (CRYSTALS-Kyber and CRYSTALS-Dilithium).

CacheWarp: Software-based Fault Injection using Selective State Reset

Ruiyi Zhang, Lukas Gerlach, Daniel Weber, and Lorenz Hetterich, CISPA Helmholtz Center for Information Security; Youheng Lü, Independent; Andreas Kogler, Graz University of Technology; Michael Schwarz, CISPA Helmholtz Center for Information Security

Available Media

AMD SEV is a trusted-execution environment (TEE), providing confidentiality and integrity for virtual machines (VMs). With AMD SEV, it is possible to securely run VMs on an untrusted hypervisor. While previous attacks demonstrated architectural shortcomings of earlier SEV versions, AMD claims that SEV-SNP prevents all attacks on the integrity.

In this paper, we introduce CacheWarp, a new software-based fault attack on AMD SEV-ES and SEV-SNP, exploiting the possibility to architecturally revert modified cache lines of guest VMs to their previous (stale) state. Unlike previous attacks on the integrity, CacheWarp is not mitigated on the newest SEV-SNP implementation, and it does not rely on specifics of the guest VM. CacheWarp only has to interrupt the VM at an attacker-chosen point to invalidate modified cache lines without them being written back to memory. Consequently, the VM continues with architecturally stale data. In 3 case studies, we demonstrate an attack on RSA in the Intel IPP crypto library, recovering the entire private key, logging into an OpenSSH server without authentication, and escalating privileges to root via the sudo binary. While we implement a software-based mitigation proof-of-concept, we argue that mitigations are difficult, as the root cause is in the hardware.

Track 3

System Security II: OS Kernel

Session Chair: Shweta Shinde

Salon E

MOAT: Towards Safe BPF Kernel Extension

Hongyi Lu, Research Institute of Trustworthy Autonomous Systems, Southern University of Science and Technology, and Hong Kong University of Science and Technology; Shuai Wang, Hong Kong University of Science and Technology; Yechang Wu and Wanning He, Southern University of Science and Technology; Fengwei Zhang, Southern University of Science and Technology and Research Institute of Trustworthy Autonomous Systems

Available Media

The Linux kernel extensively uses the Berkeley Packet Filter (BPF) to allow user-written BPF applications to execute in the kernel space. The BPF employs a verifier to check the security of user-supplied BPF code statically. Recent attacks show that BPF programs can evade security checks and gain unauthorized access to kernel memory, indicating that the verification process is not flawless. In this paper, we present MOAT, a system that isolates potentially malicious BPF programs using Intel Memory Protection Keys (MPK). Enforcing BPF program isolation with MPK is not straightforward; MOAT is designed to alleviate technical obstacles, such as limited hardware keys and the need to protect a wide variety of BPF helper functions. We implement MOAT on Linux (ver. 6.1.38), and our evaluation shows that MOAT delivers low-cost isolation of BPF programs under mainstream use cases, such as isolating a BPF packet filter with only 3% throughput loss.

SeaK: Rethinking the Design of a Secure Allocator for OS Kernel

Zicheng Wang, University of Colorado Boulder & Nanjing University; Yicheng Guang, Nanjing University; Yueqi Chen, University of Colorado Boulder; Zhenpeng Lin, Northwestern University; Michael Le, IBM Research; Dang K Le, Northwestern University; Dan Williams, Virginia Tech; Xinyu Xing, Northwestern University; Zhongshu Gu and Hani Jamjoom, IBM Research

Available Media

In recent years, heap-based exploitation has become the most dominant attack against the Linux kernel. Securing the kernel heap is of vital importance for kernel protection. Though the Linux kernel allocator has some security designs in place to counter exploitation, our analytical experiments reveal that they can barely provide the expected results. This shortfall is rooted in the current strategy of designing secure kernel allocators which insists on protecting every object all the time. Such strategy inherently conflicts with the kernel nature. To this end, we advocate for rethinking the design of secure kernel allocator. In this work, we explore a new strategy which centers around the "atomic alleviation" concept, featuring flexibility and efficiency in design and deployment. Recent advancements in kernel design and research outcomes on exploitation techniques enable us to prototype this strategy in a tool named SeaK. We used real-world cases to thoroughly evaluate SeaK. The results validate that SeaK substantially strengthens heap security, outperforming all existing features, without incurring noticeable performance and memory cost. Besides, SeaK shows excellent scalability and stability in the production scenario.

Take a Step Further: Understanding Page Spray in Linux Kernel Exploitation

Ziyi Guo, Dang K Le, and Zhenpeng Lin, Northwestern University; Kyle Zeng, Ruoyu Wang, Tiffany Bao, Yan Shoshitaishvili, and Adam Doupé, Arizona State University; Xinyu Xing, Northwestern University

Available Media

Recently, a novel method known as Page Spray emerges, focusing on page-level exploitation for kernel vulnerabilities. Despite the advantages it offers in terms of exploitability, stability, and compatibility, comprehensive research on Page Spray remains scarce. Questions regarding its root causes, exploitation model, comparative benefits over other exploitation techniques, and possible mitigation strategies have largely remained unanswered. In this paper, we conduct a systematic investigation into Page Spray, providing an in-depth understanding of this exploitation technique. We introduce a comprehensive exploit model termed the DirtyPage model, elucidating its fundamental principles. Additionally, we conduct a thorough analysis of the root causes underlying Page Spray occurrences within the Linux Kernel. We design an analyzer based on the Page Spray analysis model to identify Page Spray callsites. Subsequently, we evaluate the stability, exploitability, and compatibility of Page Spray through meticulously designed experiments. Finally, we propose mitigation principles for addressing Page Spray and introduce our own lightweight mitigation approach. This research aims to assist security researchers and developers in gaining insights into Page Spray, ultimately enhancing our collective understanding of this emerging exploitation technique and making improvements to community.

SafeFetch: Practical Double-Fetch Protection with Kernel-Fetch Caching

Victor Duta, Mitchel Josephus Aloserij, and Cristiano Giuffrida, Vrije Universiteit Amsterdam

Distinguished Artifact Award Winner

Available Media

Double-fetch bugs (or vulnerabilities) stem from in-kernel system call execution fetching the same user data twice without proper data (re)sanitization, enabling TOCTTOU attacks and posing a major threat to operating systems security. Existing double-fetch protection systems rely on the MMU to trap on writes to syscall-accessed user pages and provide the kernel with a consistent snapshot of user memory. While this strategy can hinder attacks, it also introduces nontrivial runtime performance overhead due to the cost of trapping/remapping and the coarse (page-granular) write interposition mechanism.

In this paper, we propose SafeFetch, a practical solution to protect the kernel from double-fetch bugs. The key intuition is that most system calls fetch small amounts of user data (if at all), hence caching this data in the kernel can be done at a small performance cost. To this end, SafeFetch creates per-syscall caches to persist fetched user data and replay them when they are fetched again within the same syscall. This strategy neutralizes all double-fetch bugs, while eliminating trapping/remapping overheads and relying on efficient byte-granular interposition. Our Linux prototype evaluation shows SafeFetch can provide comprehensive protection with low performance overheads (e.g., 4.4% geomean on LMBench), significantly outperforming state-of-the-art solutions.

Track 4

Network Security II: Attacks

Session Chair: Hamed Okhravi

Salon F

LanDscAPe: Exploring LDAP Weaknesses and Data Leaks at Internet Scale

Jonas Kaspereit and Gurur Öndarö, Münster University of Applied Sciences; Gustavo Luvizotto Cesar, University of Twente; Simon Ebbers, Münster University of Applied Sciences; Fabian Ising, Fraunhofer SIT and National Research Center for Applied Cybersecurity ATHENE; Christoph Saatjohann, Münster University of Applied Sciences, Fraunhofer SIT, and National Research Center for Applied Cybersecurity ATHENE; Mattijs Jonker, University of Twente; Ralph Holz, University of Twente and University of Münster; Sebastian Schinzel, Münster University of Applied Sciences, Fraunhofer SIT, and National Research Center for Applied Cybersecurity ATHENE

Available Media

The Lightweight Directory Access Protocol (LDAP) is the standard technology to query information stored in directories. These directories can contain sensitive personal data such as usernames, email addresses, and passwords. LDAP is also used as a central, organization-wide storage of configuration data for other services. Hence, it is important to the security posture of many organizations, not least because it is also at the core of Microsoft's Active Directory, and other identity management and authentication services.

We report on a large-scale security analysis of deployed LDAP servers on the Internet. We developed LanDscAPe, a scanning tool that analyzes security-relevant misconfigurations of LDAP servers and the security of their TLS configurations. Our Internet-wide analysis revealed more than 10k servers that appear susceptible to a range of threats, including insecure configurations, deprecated software with known vulnerabilities, and insecure TLS setups. 4.9k LDAP servers host personal data, and 1.8k even leak passwords. We document, classify, and discuss these and briefly describe our notification campaign to address these concerning issues.

FakeBehalf: Imperceptible Email Spoofing Attacks against the Delegation Mechanism in Email Systems

Jinrui Ma, Lutong Chen, and Kaiping Xue, University of Science and Technology of China; Bo Luo, The University of Kansas; Xuanbo Huang, Mingrui Ai, and Huanjie Zhang, University of Science and Technology of China; David S.L. Wei, Fordham University; Yan Zhuang, University of Science and Technology of China

Available Media

Email has become an essential service for global communication.In email protocols, a Delegation Mechanism allows emails to be sent by other entities on behalf of the email author. Specifically, the Sender field indicates the agent for email delivery (i.e., the Delegate). Despite well-implemented security extensions (e.g., DKIM, DMARC) that validate the authenticity of email authors, vulnerabilities in the Delegation Mechanism can still be exploited to bypass these security measures with well-crafted spoofing emails.

This paper systematically analyzes the security vulnerabilities within the Delegation Mechanism. Due to the absence of validation for the Sender field, adversaries can arbitrarily fabricate this field, thus spoofing the Delegate presented to email recipients. Our observations reveal that emails with a spoofed Sender field can pass authentications and reach the inboxes of all target providers. We also conduct a user study with 50 participants to assess the recipients' comprehension of spoofed Delegates, finding that 50% are susceptible to deceiving Delegate information. Furthermore, we propose novel email spoofing attacks where adversaries can impersonate arbitrary entities as email authors to craft highly deceptive emails while passing security extensions. We assess their impact across 16 service providers and 20 clients, observing that half of the providers and all clients are vulnerable to the discovered attacks. To mitigate the threats within the Delegation Mechanism, we propose a validation scheme to verify the authenticity of the Sender field, along with design suggestions to enhance the security of email clients.

Rethinking the Security Threats of Stale DNS Glue Records

Yunyi Zhang, National University of Defense Technology and Tsinghua University; Baojun Liu, Tsinghua University; Haixin Duan, Tsinghua University, Zhongguancun Laboratory, and Quan Cheng Laboratory; Min Zhang, National University of Defense Technology; Xiang Li, Tsinghua University; Fan Shi and Chengxi Xu, National University of Defense Technology; Eihal Alowaisheq, King Saud University

Available Media

The Domain Name System (DNS) fundamentally relies on glue records to provide authoritative nameserver IP addresses, enabling essential in-domain delegation. While previous studies have identified potential security risks associated with glue records, the exploitation of these records, especially in the context of out-domain delegation, remains unclear due to their inherently low trust level and the diverse ways in which resolvers handle them. This paper undertakes the first systematic exploration of the potential threats posed by DNS glue records, uncovering significant real-world security risks. We empirically identify that 23.18% of glue records across 1,096 TLDs are outdated yet still served in practice. More concerningly, through reverse engineering 9 mainstream DNS implementations (e.g., BIND 9 and Microsoft DNS), we reveal manipulable behaviors associated with glue records. The convergence of these systemic issues allows us to propose the novel threat model that could enable large-scale domain hijacking and denial-of-service attacks. Furthermore, our analysis determines over 193,558 exploitable records exist, placing more than 6 million domains at risk. Additional measurement studies on global open resolvers demonstrate that 90% of them use unvalidated and outdated glue records, including OpenDNS and AliDNS. Our responsible disclosure has already prompted mitigation efforts by affected stakeholders. Microsoft DNS, PowerDNS, OpenDNS, and Alibaba Cloud DNS have acknowledged our reported vulnerability. In summary, this work highlights that glue records constitute a forgotten foundation of DNS architecture requiring renewed security prioritization

EVOKE: Efficient Revocation of Verifiable Credentials in IoT Networks

Carlo Mazzocca, University of Bologna; Abbas Acar and Selcuk Uluagac, Cyber-Physical Systems Security Lab, Florida International University; Rebecca Montanari, University of Bologna

Available Media

The lack of trust is one of the major factors that hinder collaboration among Internet of Things (IoT) devices and harness the usage of the vast amount of data generated. Traditional methods rely on Public Key Infrastructure (PKI), managed by centralized certification authorities (CAs), which suffer from scalability issues, single points of failure, and limited interoperability. To address these concerns, Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs) have been proposed by the World Wide Web Consortium (W3C) and the European Union as viable solutions for promoting decentralization and "electronic IDentification, Authentication, and trust Services" (eIDAS). Nevertheless, at the state-of-the-art, there are no efficient revocation mechanisms for VCs specifically tailored for IoT devices, which are characterized by limited connectivity, storage, and computational power.

This paper presents EVOKE, an efficient revocation mechanism of VCs in IoT networks. EVOKE leverages an ECC-based accumulator to manage VCs with minimal computing and storage overhead while offering additional features like mass and offline revocation. We designed, implemented, and evaluated a prototype of EVOKE across various deployment scenarios. Our experiments on commodity IoT devices demonstrate that each device only requires minimal storage (i.e., approximately 1.5 KB) to maintain verification information, and most notably half the storage required by the most efficient PKI certificates. Moreover, our experiments on hybrid networks, representing typical IoT protocols (e.g., Zigbee), also show minimal latency in the order of milliseconds. Finally, our large-scale analysis demonstrates that even when 50% of devices missed updates, approximately 96% of devices in the entire network were updated within the first hour, proving the scalability of EVOKE in offline updates.

Track 5

ML II: Fault Injection and Robustness

Session Chair: Mulong Luo

Salon G

DNN-GP: Diagnosing and Mitigating Model's Faults Using Latent Concepts

Shuo Wang, Shanghai Jiao Tong University; Hongsheng Hu, CSIRO's Data61; Jiamin Chang, University of New South Wales and CSIRO's Data61; Benjamin Zi Hao Zhao, Macquarie University; Qi Alfred Chen, University of California, Irvine; Minhui Xue, CSIRO's Data61

Available Media

Despite the impressive capabilities of Deep Neural Networks (DNN), these systems remain fault-prone due to unresolved issues of robustness to perturbations and concept drift. Existing approaches to interpreting faults often provide only low-level abstractions, while struggling to extract meaningful concepts to understand the root cause. Furthermore, these prior methods lack integration and generalization across multiple types of faults. To address these limitations, we present a fault diagnosis tool (akin to a General Practitioner) DNN-GP, an integrated interpreter designed to diagnose various types of model faults through the interpretation of latent concepts. DNN-GP incorporates probing samples derived from adversarial attacks, semantic attacks, and samples exhibiting drifting issues to provide a comprehensible interpretation of a model's erroneous decisions. Armed with an awareness of the faults, DNN-GP derives countermeasures from the concept space to bolster the model's resilience. DNN-GP is trained once on a dataset and can be transferred to provide versatile, unsupervised diagnoses for other models, and is sufficiently general to effectively mitigate unseen attacks. DNN-GP is evaluated on three real-world datasets covering both attack and drift scenarios to demonstrate state-to the-art detection accuracy (near 100%) with low false positive rates (<5%).

Yes, One-Bit-Flip Matters! Universal DNN Model Inference Depletion with Runtime Code Fault Injection

Shaofeng Li, Peng Cheng Laboratory; Xinyu Wang, Shanghai Jiao Tong University; Minhui Xue, CSIRO's Data61; Haojin Zhu, Shanghai Jiao Tong University; Zhi Zhang, University of Western Australia; Yansong Gao, CSIRO's Data61; Wen Wu, Peng Cheng Laboratory; Xuemin (Sherman) Shen, University of Waterloo

Distinguished Paper Award Winner

Available Media

We propose, FrameFlip, a novel attack for depleting DNN model inference with runtime code fault injections. Notably, Frameflip operates independently of the DNN models deployed and succeeds with only a single bit-flip injection. This fundamentally distinguishes it from the existing DNN inference depletion paradigm that requires injecting tens of deterministic faults concurrently. Since our attack performs at the universal code or library level, the mandatory code snippet can be perversely called by all mainstream machine learning frameworks, such as PyTorch and TensorFlow, dependent on the library code. Using DRAM Rowhammer to facilitate end-to-end fault injection, we implement Frameflip across diverse model architectures (LeNet, VGG-16, ResNet-34 and ResNet-50) with different datasets (FMNIST, CIFAR-10, GTSRB, and ImageNet). With a single bit fault injection, Frameflip achieves high depletion efficacy that consistently renders the model inference utility as no better than guessing. We also experimentally verify that identified vulnerable bits are almost equally effective at depleting different deployed models. In contrast, transferability is unattainable for all existing state-of-the-art model inference depletion attacks. Frameflip is shown to be evasive against all known defenses, generally due to the nature of current defenses operating at the model level (which is model-dependent) in lieu of the underlying code level.

Tossing in the Dark: Practical Bit-Flipping on Gray-box Deep Neural Networks for Runtime Trojan Injection

Zihao Wang, Di Tang, and XiaoFeng Wang, Indiana University Bloomington; Wei He, Zhaoyang Geng, and Wenhao Wang, SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences

Available Media

Although Trojan attacks on deep neural networks (DNNs) have been extensively studied, the threat of run-time Trojan injection has only recently been brought to attention. Unlike data poisoning attacks that target the training stage of a DNN model, a run-time attack executes an exploit such as Rowhammer on memory to flip the bits of the target model and thereby implant a Trojan. This threat is stealthier but more challenging, as it requires flipping a set of bits in the target model to introduce an effective Trojan without noticeably downgrading the model's accuracy. This has been achieved only under the less realistic assumption that the target model is fully shared with the adversary through memory, thus enabling them to flip bits across all model layers, including the last few layers.

For the first time, we have investigated run-time Trojan Injection under a more realistic gray-box scenario. In this scenario, a model is perceived in an encoder-decoder manner: the encoder is public and shared through memory, while the decoder is private and so considered to be black-box and inaccessible to unauthorized parties. To address the unique challenge posed by the black-box decoder to Trojan injection in this scenario, we developed a suite of innovative techniques. Using these techniques, we constructed our gray-box attack, Groan, which stands out as both effective and stealthy. Our experiments show that Groan is capable of injecting a highly effective Trojan into the target model, while also largely preserving its performance, even in the presence of state-of-theart memory protection.

Forget and Rewire: Enhancing the Resilience of Transformer-based Models against Bit-Flip Attacks

Najmeh Nazari, Hosein Mohammadi Makrani, and Chongzhou Fang, University of California, Davis; Hossein Sayadi, California State University, Long Beach; Setareh Rafatirad, University of California, Davis; Khaled N. Khasawneh, George Mason University; Houman Homayoun, University of California, Davis

Available Media

Bit-Flip Attacks (BFAs) involve adversaries manipulating a model's parameter bits to undermine its accuracy significantly. They typically target the most vulnerable parameters, causing maximal damage with minimal bit-flips. While BFAs' impact on Deep Neural Networks (DNNs) is well-studied, their effects on Large Language Models (LLMs) and Vision Transformers (ViTs) have not received the same attention. Inspired by "brain rewiring," we explore enhancing Transformers' resilience against such attacks. This potential lies in the unique architecture of transformer-based models, particularly their Linear layers. Our novel approach, called Forget and Rewire (FaR), strategically applies rewiring to Linear layers to obfuscate neuron connections. By redistributing tasks from critical to non-essential neurons, we reduce the model's sensitivity to specific parameters while preserving its core functionality. This strategy thwarts adversaries' attempts to identify and target crucial parameters using gradient-based algorithms. Our approach conceals pivotal parameters and enhances robustness against random attacks. Comprehensive evaluations across widely used datasets and Transformer frameworks show that the FaR mechanism significantly reduces BFA success rates by 1.4 to 4.2 times with minimal accuracy loss (less than 2%).

Track 6

Security Analysis II: Program Analysis

Session Chair: Mathias Payer

Salon H

What IF Is Not Enough? Fixing Null Pointer Dereference With Contextual Check

Yunlong Xing, Shu Wang, Shiyu Sun, Xu He, and Kun Sun, George Mason University; Qi Li, Tsinghua University

Available Media

Null pointer dereference (NPD) errors pose the risk of unexpected behavior and system instability, potentially leading to abrupt program termination due to exceptions or segmentation faults. When generating NPD fixes, all existing solutions are confined to the function level fixes and ignore the valuable intraprocedural and interprocedural contextual information, potentially resulting in incorrect patches. In this paper, we introduce CONCH, a novel approach that addresses the challenges of generating correct fixes for NPD issues by incorporating contextual checks. Our method first constructs an NPD context graph to maintain the semantics related to patch generation. Then we summarize distinct fixing position selection policies based on the distribution of the error positions, ensuring the resolution of bugs without introducing duplicate code. Next, the intraprocedural state retrogression builds the if condition, retrogresses the local resources, and constructs return statements as an initial patch. Finally, we conduct interprocedural state propagation to assess the correctness of the initial patch in the entire call chain. We evaluate the effectiveness of CONCH over two real-world datasets. The experimental results demonstrate that CONCH outperforms the SOTA methods and yields over 85% accurate patches.

Unleashing the Power of Type-Based Call Graph Construction by Using Regional Pointer Information

Yuandao Cai, Yibo Jin, and Charles Zhang, The Hong Kong University of Science and Technology

Available Media

When dealing with millions of lines of C code, we still cannot have the cake and eat it: type analysis for call graph construction is scalable yet highly imprecise. We address this precision issue through a practical observation: many function pointers are simple; they are not referenced by other pointers, nor do they derive their values by dereferencing other pointers. As a result, simple function pointers can be resolved with precise and affordable pointer aliasing information. In this work, we advocate Kelp with two concerted stages. First, instead of directly using type analysis, Kelp performs regional pointer analysis along def-use chains to early and precisely resolve the indirect calls through simple function pointers. Second, Kelp then leverages type analysis to handle the remaining indirect calls. The ﬁrst stage is efﬁcient as Kelp selectively reasons about simple function pointers, thereby avoiding prohibitive performance penalties. The second stage is precise as the candidate address-taken functions for checking type compatibility are largely reduced thanks to the ﬁrst stage. Our experiments on twenty large-scale and popular software programs show that, on average, Kelp can reduce spurious callees by 54.2% with only a negligible additional time cost of 8.5% (equivalent to 6.3 seconds) compared to the previous approach. More excitingly, when evaluating the call graphs through the lens of three various downstream clients (i.e., thread-sharing analysis, value-ﬂow bug detection, and directed grey-box fuzzing), Kelp can signiﬁcantly enhance their effectiveness for better vulnerability understanding, hunting, and reproduction.

Practical Data-Only Attack Generation

Brian Johannesmeyer, Asia Slowinska, Herbert Bos, and Cristiano Giuffrida, Vrije Universiteit Amsterdam

Available Media

As control-flow hijacking is getting harder due to increasingly sophisticated CFI solutions, recent work has instead focused on automatically building data-only attacks, typically using symbolic execution, simplifying assumptions that do not always match the attacker's goals, manual gadget chaining, or all of the above. As a result, the practical adoption of such methods is minimal. In this work, we abstract away unnecessary complexities and instead use a lightweight approach that targets the vulnerabilities that are both the most tractable for analysis, and the most promising for an attacker.

In particular, we present Einstein, a data-only attack exploitation pipeline that uses dynamic taint analysis policies to: (i) scan for chains of vulnerable system calls (e.g., to execute code or corrupt the filesystem), and (ii) generate exploits for those that take unmodified attacker data as input. Einstein discovers thousands of vulnerable syscalls in common server applications—well beyond the reach of existing approaches. Moreover, using nginx as a case study, we use Einstein to generate 944 exploits, and we discuss two such exploits that bypass state-of-the-art mitigations.

Don't Waste My Efforts: Pruning Redundant Sanitizer Checks by Developer-Implemented Type Checks

Yizhuo Zhai, Zhiyun Qian, Chengyu Song, Manu Sridharan, and Trent Jaeger, University of California, Riverside; Paul Yu, U.S. Army Research Laboratory; Srikanth V. Krishnamurthy, University of California, Riverside

Available Media

Type confusion occurs when C or C++ code accesses an object after casting it to an incompatible type. The security impacts of type confusion vulnerabilities are significant, potentially leading to system crashes or even arbitrary code execution. To mitigate these security threats, both static and dynamic approaches have been developed to detect type confusion bugs. However, static approaches can suffer from excessive false positives, while existing dynamic approaches track type information for each object to enable safety checking at each cast, introducing a high runtime overhead.

In this paper, we present a novel tool T-PRUNIFY to reduce the overhead of dynamic type confusion sanitizers. We observe that in large complex C++ projects, to prevent type confusion bugs, developers often add their own encoding of runtime type information (RTTI) into classes, to enable efficient runtime type checks before casts. T-PRUNIFY works by first identifying these custom RTTI in classes, automatically determining the relationship between field and method return values and the concrete types of corresponding objects. Based on these custom RTTI, T-PRUNIFY can identify cases where a cast is protected by developer-written type checks that guarantee the safety of the cast. Consequently, it can safely remove sanitizer instrumentation for such casts, reducing performance overhead. We evaluate T-PRUNIFY based on HexType, a state-of-the-art type confusion sanitizer that supports extensive C++ projects such as Google Chrome. Our findings demonstrate that our method significantly lowers HexType's average overhead by 25% to 75% in large C++ programs, marking a substantial enhancement in performance.

Track 7

Zero-Knowledge Proof I

Session Chair: Sarah Meiklejohn

Salon IJK

Two Shuffles Make a RAM: Improved Constant Overhead Zero Knowledge RAM

Yibin Yang, Georgia Institute of Technology; David Heath, University of Illinois Urbana-Champaign

Available Media

We optimize Zero Knowledge (ZK) proofs of statements expressed as RAM programs over arithmetic values. Our arithmetic-circuit-based read/write memory uses only 4 input gates and 6 multiplication gates per memory access. This is an almost 3× total gate improvement over prior state of the art (Delpech de Saint Guilhem et al., SCN'22).

We implemented our memory in the context of ZK proofs based on vector oblivious linear evaluation (VOLE), and we further optimized based on techniques available in the VOLE setting. Our experiments show that (1) our total runtime improves over that of the prior best VOLE-ZK RAM (Franzese et al., CCS'21) by 2-20× and (2) on a typical hardware setup, we can achieve ≈ 600K RAM accesses per second.

We also develop improved read-only memory and set ZK data structures. These are used internally in our read/write memory and improve over prior work.

Notus: Dynamic Proofs of Liabilities from Zero-knowledge RSA Accumulators

Jiajun Xin, Arman Haghighi, Xiangan Tian, and Dimitrios Papadopoulos, The Hong Kong University of Science and Technology

Available Media

Proofs of Liabilities (PoL) allow an untrusted prover to commit to its liabilities towards a set of users and then prove independent users' amounts or the total sum of liabilities, upon queries by users or third-party auditors. This application setting is highly dynamic. User liabilities may increase/decrease arbitrarily and the prover needs to update proofs in epoch increments (e.g., once a day for a crypto-asset exchange platform). However, prior works mostly focus on the static case and trivial extensions to the dynamic setting open the system to windows of opportunity for the prover to under-report its liabilities and rectify its books in time for the next check, unless all users check their liabilities at all epochs. In this work, we develop Notus, the first dynamic PoL system for general liability updates that avoids this issue. Moreover, it achieves O(1) query proof size, verification time, and auditor overhead-per-epoch. The core building blocks underlying Notus are a novel zero-knowledge (and SNARK-friendly) RSA accumulator and a corresponding zero-knowledge MultiSwap protocol, which may be of independent interest. We then propose optimizations to reduce the prover's update overhead and make Notus scale to large numbers of users (10^6 in our experiments). Our results are very encouraging, e.g., it takes less than 2ms to verify a user's liability and the proof size is 256 Bytes. On the prover side, deploying Notus on a cloud-based testbed with 256 cores and exploiting parallelism, it takes about 3 minutes to perform the complete epoch update, after which all proofs have already been computed.

Practical Security Analysis of Zero-Knowledge Proof Circuits

Hongbo Wen, University of California, Santa Barbara; Jon Stephens, The University of Texas at Austin and Veridise; Yanju Chen, University of California, Santa Barbara; Kostas Ferles, Veridise; Shankara Pailoor, The University of Texas at Austin and Veridise; Kyle Charbonnet, Ethereum Foundation; Isil Dillig, The University of Texas at Austin and Veridise; Yu Feng, University of California, Santa Barbara, and Veridise

Available Media

As privacy-sensitive applications based on zero-knowledge proofs (ZKPs) gain increasing traction, there is a pressing need to detect vulnerabilities in ZKP circuits. This paper studies common vulnerabilities in Circom (the most popular domain-specific language for ZKP circuits) and describes a static analysis framework for detecting these vulnerabilities. Our technique operates over an abstraction called the circuit dependence graph (CDG) that captures key properties of the circuit and allows expressing semantic vulnerability patterns as queries over the CDG abstraction. We have implemented 9 different detectors using this framework and performed an experimental evaluation on over 258 circuits from popular Circom projects on GitHub. According to our evaluation, these detectors can identify vulnerabilities, including previously unknown ones, with high precision and recall.

Formalizing Soundness Proofs of Linear PCP SNARKs

Bolton Bailey and Andrew Miller, University of Illinois at Urbana-Champaign

Available Media

Succinct Non-interactive Arguments of Knowledge (SNARKs) have seen interest and development from the cryptographic community over recent years, and there are now constructions with very small proof size designed to work well in practice. A SNARK protocol can only be widely accepted as secure, however, if a rigorous proof of its security properties has been vetted by the community. Even then, it is sometimes the case that these security proofs are flawed, and it is then necessary for further research to identify these flaws and correct the record.

To increase the rigor of these proofs, we create a formal framework in the Lean theorem prover for representing a widespread subclass of SNARKs based on linear PCPs. We then describe a decision procedure for checking the soundness of SNARKs in this class. We program this procedure and use it to formalize the soundness proof of several different SNARK constructions, including the well-known Groth '16.

4:15 pm–4:30 pm

Short Break

Grand Ballroom Foyer

4:30 pm–5:30 pm

Track 1

Measurement I: Fraud and Malware and Spam

Session Chair: Guillermo Suarez-Tangil

Salon AB

Guardians of the Galaxy: Content Moderation in the InterPlanetary File System

Saidu Sokoto, City, University of London; Leonhard Balduf, TU Darmstadt; Dennis Trautwein, University of Göttingen; Yiluo Wei and Gareth Tyson, Hong Kong Univ. of Science & Technology (GZ); Ignacio Castro, Queen Mary, University of London; Onur Ascigil, Lancaster University; George Pavlou, University College London; Maciej Korczyński, Univ. Grenoble Alpes; Björn Scheuermann, TU Darmstadt; Michał Król, City, University of London

Available Media

The InterPlanetary File System (IPFS) is one of the largest platforms in the growing "Decentralized Web". The increasing popularity of IPFS has attracted large volumes of users and content. Unfortunately, some of this content could be considered "problematic". Content moderation is always hard. With a completely decentralized infrastructure and administration, content moderation in IPFS is even more difficult. In this paper, we examine this challenge. We identify, characterize, and measure the presence of problematic content in IPFS (e.g. subject to takedown notices). Our analysis covers 368,762 files. We analyze the complete content moderation process including how these files are flagged, who hosts and retrieves them. We also measure the efficacy of the process. We analyze content submitted to denylist, showing that notable volumes of problematic content are served, and the lack of a centralized approach facilitates its spread. While we identify fast reactions to takedown requests, we also test the resilience of multiple gateways and show that existing means to filter problematic content can be circumvented. We end by proposing improvements to content moderation that result in 227% increase in the detection of phishing content and reduce the average time to filter such content by 43%.

True Attacks, Attack Attempts, or Benign Triggers? An Empirical Measurement of Network Alerts in a Security Operations Center

Limin Yang, Zhi Chen, Chenkai Wang, Zhenning Zhang, and Sushruth Booma, University of Illinois at Urbana-Champaign; Phuong Cao, NCSA; Constantin Adam, IBM Research; Alexander Withers, NCSA; Zbigniew Kalbarczyk, Ravishankar K. Iyer, and Gang Wang, University of Illinois at Urbana-Champaign

Available Media

Security Operations Centers (SOCs) face the key challenge of handling excessive security alerts. While existing works have studied this problem qualitatively via user studies, there is still a lack of quantitative understanding of the impact of excessive alerts and their effectiveness and limitations in capturing true attacks.

In this paper, we fill the gap by working with a real-world SOC and collecting and analyzing their network alert logs over 4 years (115 million alerts, from 2018 to 2022). To further understand how alerts are associated with true attacks, we also obtain the ground truth of 227 successful attacks in the past 20 years (11 during the overlapping period). Through analysis, we observe that SOC analysts are facing excessive alerts (24K–134K per day), but only a small percentage of the alerts (0.01%) are associated with true attacks. While the majority of true attacks can be detected within the same day, the post-attack investigation takes much longer time (53 days on average). Furthermore, we observe a significant portion of the alerts are related to "attack attempts'' (attacks that did not lead to true compromises, 27%), and "benign triggers'' (correctly matched security events but had business-justified explanations, 49%). Empirically, we show there are opportunities to use rare/abnormal alert patterns to help isolate signals related to true attacks. Given that enterprise SOCs rarely disclose internal data, this paper helps contextualize SOCs' pain points and refine existing problem definitions.

DARKFLEECE: Probing the Dark Side of Android Subscription Apps

Chang Yue, Institute of Information Engineering, Chinese Academy of Sciences, China; School of Cyber Security, University of Chinese Academy of Sciences, China; Chen Zhong, University of Tampa, USA; Kai Chen and Zhiyu Zhang, Institute of Information Engineering, Chinese Academy of Sciences, China; School of Cyber Security, University of Chinese Academy of Sciences, China; Yeonjoon Lee, Hanyang University, Ansan, Republic of Korea

Available Media

Fleeceware, a novel category of malicious subscription apps, is increasingly tricking users into expensive subscriptions, leading to substantial financial consequences. These apps' ambiguous nature, closely resembling legitimate subscription apps, complicates their detection in app markets. To address this, our study aims to devise an automated method, named DARKFLEECE, to identify fleeceware through their prevalent use of dark patterns. By recruiting domain experts, we curated the first-ever fleeceware feature library, based on dark patterns extracted from user interfaces (UI). A unique extraction method, which integrates UI elements, layout, and multifaceted extraction rules, has been developed. DARKFLEECE boasts a detection accuracy of 93.43% on our dataset and utilizes Explainable Artificial Intelligence (XAI) to present user-friendly alerts about potential fleeceware risks. When deployed to assess Google Play's app landscape, DARKFLEECE examined 13,597 apps and identified an alarming 75.21% of 589 subscription apps that displayed different levels of fleeceware, totaling around 5 billion downloads. Our results are consistent with user reviews on Google Play. Our detailed exploration into the implications of our results for ethical app developers, app users, and app market regulators provides crucial insights for different stakeholders. This underscores the need for proactive measures against the rise of fleeceware.

Into the Dark: Unveiling Internal Site Search Abused for Black Hat SEO

Yunyi Zhang, National University of Defense Technology; Tsinghua University; Mingxuan Liu, Zhongguancun Laboratory; Baojun Liu, Tsinghua University; Zhongguancun Laboratory; Yiming Zhang, Tsinghua University; Haixin Duan, Tsinghua University; Zhongguancun Laboratory; Min Zhang, National University of Defense Technology; Hui Jiang, Tsinghua University; Baidu Inc; Yanzhe Li, Baidu Inc; Fan Shi, National University of Defense Technology

Available Media

Internal site Search Abuse Promotion (ISAP) is a prevalent Black Hat Search Engine Optimization (SEO) technique, which exploits the reputation of abused internal search websites with minimal effort. However, ISAP is underappreciated and not systematically understood by the security community. To shed light on ISAP risks, we established a collaboration with Baidu, a leading search engine in China. The key challenge of efficiently detecting ISAP risks stems from the sheer volume of daily search traffic, which involves billions of URLs. To address these efficiency bottlenecks, we introduced a first-of-its-kind lightweight detector utilizing a funnel-like approach, tailored to the unique characteristics of ISAP. This approach allows us to single out 3,222,864 ISAP URLs from 10,209 abused websites from Baidu's traffic data. We found that the businesses most likely to fall prey to this practice are porn and gambling, with two emerging areas: self-promotion for SEO and promotion for anonymous servers. By analyzing Baidu's search logs, we discovered that these malicious websites had reached millions of users in just 4 days. We further evaluated this threat on Google and Bing, thereby confirming the widespread presence of ISAP across various search engines. Moreover, we responsibly disclosed the issue to affected search engines and websites, and actively helped them fix it. In summary, our findings highlight the widespread impact and prevalence of ISAP, emphasizing the urgent need for the security community to prioritize and address such risks.

Track 2

Side Channel II: RowHammer

Session Chair: Aurélien Francillon

Salon CD

ABACuS: All-Bank Activation Counters for Scalable and Low Overhead RowHammer Mitigation

Ataberk Olgun, Yahya Can Tugrul, Nisa Bostanci, Ismail Emir Yuksel, Haocong Luo, Steve Rhyner, Abdullah Giray Yaglikci, Geraldo F. Oliveira, and Onur Mutlu, ETH Zurich

Available Media

We introduce ABACuS, a new low-cost hardware-counterbased RowHammer mitigation technique that performance-, energy-, and area-efficiently scales with worsening RowHammer vulnerability. We observe that both benign workloads and RowHammer attacks tend to access DRAM rows with the same row address in multiple DRAM banks at around the same time. Based on this observation, ABACuS's key idea is to use a single shared row activation counter to track activations to the rows with the same row address in all DRAM banks. Unlike state-of-the-art RowHammer mitigation mechanisms that implement a separate row activation counter for each DRAM bank, ABACuS implements fewer counters (e.g., only one) to track an equal number of aggressor rows.

Our comprehensive evaluations show that ABACuS securely prevents RowHammer bitflips at low performance/energy overhead and low area cost. We compare ABACuS to four state-of-the-art mitigation mechanisms. At a nearfuture RowHammer threshold of 1000, ABACuS incurs only 0.58% (0.77%) performance and 1.66% (2.12%) DRAM energy overheads, averaged across 62 single-core (8-core) workloads, requiring only 9.47 KiB of storage per DRAM rank. At the RowHammer threshold of 1000, the best prior lowarea-cost mitigation mechanism incurs 1.80% higher average performance overhead than ABACuS, while ABACuS requires 2.50× smaller chip area to implement. At a future RowHammer threshold of 125, ABACuS performs very similarly to (within 0.38% of the performance of) the best prior performance- and energy-efficient RowHammer mitigation mechanism while requiring 22.72× smaller chip area. We show that ABACuS's performance scales well with the number of DRAM banks. At the RowHammer threshold of 125, ABACuS incurs 1.58%, 1.50%, and 2.60% performance overheads for 16-, 32-, and 64-bank systems across all single-core workloads, respectively. ABACuS is freely and openly available at https://github.com/CMU-SAFARI/ABACuS.

SledgeHammer: Amplifying Rowhammer via Bank-level Parallelism

Ingab Kang, University of Michigan; Walter Wang and Jason Kim, Georgia Tech; Stephan van Schaik and Youssef Tobah, University of Michigan; Daniel Genkin, Georgia Tech; Andrew Kwong, UNC Chapel Hill; Yuval Yarom, Ruhr University Bochum

Available Media

Rowhammer is a hardware vulnerability in DDR memory by which attackers can perform specific access patterns in their own memory to flip bits in adjacent, uncontrolled rows with- out accessing them. Since its discovery by Kim et. al. (ISCA 2014), Rowhammer attacks have emerged as an alarming threat to numerous security mechanisms.

In this paper, we show that Rowhammer attacks can in fact be more effective when combined with bank-level parallelism, a technique in which the attacker hammers multiple memory banks simultaneously. This allows us to increase the amount of Rowhammer-induced flips 7-fold and significantly speed up prior Rowhammer attacks relying on native code execution.

Furthermore, we tackle the task of mounting browser-based Rowhammer attacks. Here, we develop a self-evicting ver- sion of multi-bank hammering, allowing us to replace clflush instructions with cache evictions. We then develop a novel method for detecting contiguous physical addresses using memory access timings, thereby obviating the need for trans- parent huge pages. Finally, by combining both techniques, we are the first, to our knowledge, to obtain Rowhammer bit flips on DDR4 memory from the Chrome and Firefox browsers running on default Linux configurations, without enabling transparent huge pages.

ZenHammer: Rowhammer Attacks on AMD Zen-based Platforms

Patrick Jattke, Max Wipfli, Flavien Solt, Michele Marazzi, Matej Bölcskei, and Kaveh Razavi, ETH Zurich

Available Media

AMD has gained a significant market share in recent years with the introduction of the Zen microarchitecture. While there are many recent Rowhammer attacks launched from Intel CPUs, they are completely absent on these newer AMD CPUs due to three non-trivial challenges: 1) reverse engineering the unknown DRAM addressing functions, 2) synchronizing with refresh commands for evading in-DRAM mitigations, and 3) achieving a sufficient row activation throughput. We address these challenges in the design of ZenHammer, the first Rowhammer attack on recent AMD CPUs. ZenHammer reverse engineers DRAM addressing functions despite their non-linear nature, uses specially crafted access patterns for proper synchronization, and carefully schedules flush and fence instructions within a pattern to increase the activation throughput while preserving the access order necessary to bypass in-DRAM mitigations. Our evaluation with ten DDR4 devices shows that ZenHammer finds bit flips on seven and six devices on AMD Zen 2 and Zen 3, respectively, enabling Rowhammer exploitation on current AMD platforms. Furthermore, ZenHammer triggers Rowhammer bit flips on a DDR5 device for the first time.

Go Go Gadget Hammer: Flipping Nested Pointers for Arbitrary Data Leakage

Youssef Tobah, University of Michigan; Andrew Kwong, UNC Chapel Hill; Ingab Kang, University of Michigan; Daniel Genkin, Georgia Tech; Kang G. Shin, University of Michigan

Available Media

Rowhammer is an increasingly threatening vulnerability that grants an attacker the ability to flip bits in memory without directly accessing them. Despite efforts to mitigate Rowhammer via software and defenses built directly into DRAM modules, more recent generations of DRAM are actually more susceptible to malicious bit-flips than their predecessors. This phenomenon has spawned numerous exploits, showing how Rowhammer acts as the basis for various vulnerabilities that target sensitive structures, such as Page Table Entries (PTEs) or opcodes, to grant control over a victim machine.

However, in this paper, we consider Rowhammer as a more general vulnerability, presenting a novel exploit vector for Rowhammer that targets particular code patterns. We show that if victim code is designed to return benign data to an unprivileged user, and uses nested pointer dereferences, Rowhammer can flip these pointers to gain arbitrary read access in the victim's address space. Furthermore, we identify gadgets present in the Linux kernel, and demonstrate an end-to-end attack that precisely flips a targeted pointer. To do so we developed a number of improved Rowhammer primitives, including kernel memory massaging, Rowhammer synchronization, and testing for kernel flips, which may be of broader interest to the Rowhammer community. Compared to prior works' leakage rate of .3 bits/s, we show that such gadgets can be used to read out kernel data at a rate of 82.6 bits/s.

By targeting code gadgets, this work expands the scope and attack surface exposed by Rowhammer. It is no longer sufficient for software defenses to selectively pad previously exploited memory structures in flip-safe memory, as any victim code that follows the pattern in question must be protected.

Track 3

Forensics

Session Chair: Adam Doupé

Salon E

00SEVen – Re-enabling Virtual Machine Forensics: Introspecting Confidential VMs Using Privileged in-VM Agents

Fabian Schwarz and Christian Rossow, CISPA Helmholtz Center for Information Security

Available Media

The security guarantees of confidential VMs (e.g., AMD's SEV) are a double-edged sword: Their protection against undesired VM inspection by malicious or compromised cloud operators inherently renders existing VM introspection (VMI) services infeasible. However, considering that these VMs particularly target sensitive workloads (e.g., finance), their customers demand secure forensic capabilities.

In this paper, we enable VM owners to remotely inspect their confidential VMs without weakening the VMs' protection against the cloud platform. In contrast to naïve in-VM memory aggregation tools, our approach (dubbed 00SEVen) is isolated from strong in-VM attackers and thus resistant against kernel-level attacks, and it provides VMI features beyond memory access. 00SEVen leverages the recent intra-VM privilege domains of AMD SEV-SNP—called VMPLs—and extends the QEMU/KVM hypervisor to provide VMPL-aware network I/O and VMI-assisting hypercalls. That way, we can serve VM owners with a protected in-VM forensic agent. The agent provides VM owners with attested remote memory and VM register introspection, secure pausing of the analysis target, and page access traps and function traps, all isolated from the cloud platform (incl. hypervisor) and in-VM rootkits.

WEBRR: A Forensic System for Replaying and Investigating Web-Based Attacks in The Modern Web

Joey Allen, Palo Alto Networks; Zheng Yang, Feng Xiao, and Matthew Landen, Georgia Institute of Technology; Roberto Perdisci, Georgia Institute of Technology and University of Georgia; Wenke Lee, Georgia Institute of Technology

Available Media

After a sophisticated attack or data breach occurs at an organization, a postmortem forensic analysis must be conducted to reconstruct and understand the root causes of the attack. Unfortunately, the majority of proposed forensic analysis systems rely on system-level auditing, making it difficult to reconstruct and investigate web-based attacks, due to the semantic-gap between system- and web-level semantics. This limited visibility into web-based attacks has recently become increasingly concerning because web-based attacks are commonly employed by nation-state adversaries to penetrate and achieve the initial compromise of an enterprise network. To enable forensic analysts to replay and investigate web-based attacks, we propose WebRR, a novel OS- and device- independent record and replay (RR) forensic auditing system for Chromium-based web browsers. While there exist prior works that focus on web-based auditing, current systems are either record-only or suffer from critical limitations that prevent them from deterministically replaying attacks. WebRR addresses these limitation by introducing a novel design that allows it to record and deterministically replay modern web applications by leveraging JavaScript Execution Unit Partitioning.

Our evaluation demonstrates that WebRR is capable of replaying web-based attacks that fail to replay on prior state-of-the-art systems. Furthermore, we demonstrate that WebRR can replay highly-dynamic modern websites in a deterministic fashion with an average runtime overhead of only 3.44%

AI Psychiatry: Forensic Investigation of Deep Learning Networks in Memory Images

David Oygenblik, Georgia Institute of Technology; Carter Yagemann, Ohio State University; Joseph Zhang, University of Pennsylvania; Arianna Mastali, Georgia Institute of Technology; Jeman Park, Kyung Hee University; Brendan Saltaformaggio, Georgia Institute of Technology

Available Media

Online learning is widely used in production to refine model parameters after initial deployment. This opens several vectors for covertly launching attacks against deployed models. To detect these attacks, prior work developed black-box and white-box testing methods. However, this has left prohibitive open challenge: how the investigator is supposed to recover the model (uniquely refined on an in-the-field device) for testing in the first place. We propose a novel memory forensic technique, named AiP, which automatically recovers the unique deployment model and rehosts it in a lab environment for investigation. AiP navigates through both main memory and GPU memory spaces to recover complex ML data structures, using recovered Python objects to guide the recovery of lower-level C objects, ultimately leading to the recovery of the uniquely refined model. AiP then rehosts the model within the investigator's device, where the investigator can apply various white-box testing methodologies. We have evaluated AiP using three versions of TensorFlow and PyTorch with the CIFAR-10, LISA, and IMDB datasets. AiP recovered 30 models from main memory and GPU memory with 100% accuracy and rehosted them into a live process successfully.

Cost-effective Attack Forensics by Recording and Correlating File System Changes

Le Yu, Yapeng Ye, Zhuo Zhang, and Xiangyu Zhang, Purdue University

Available Media

Attack forensics is particularly challenging for systems with restrictive resource constraints, such as IoT systems, because most existing methods entail logging high frequency events in the temporal dimension, which is costly. We propose a novel and cost-effective forensics technique that records information in the spatial dimension. It takes regular file-system snapshots that only record deltas between two timestamps. It infers causality by analyzing and correlating file changes (e.g., through methods similar to information retrieval). We show that in practice the resulting provenance graphs are as informative as the traditional attack provenance graphs based on temporal event logging. In the context of IoT attacks, they are better than those by existing techniques. In addition, our runtime and space overheads are only 8.08% and 5.13% of those for the state-of-the-arts, respectively.

Track 4

ML for Security

Session Chair: Mulong Luo

Salon F

Automated Large-Scale Analysis of Cookie Notice Compliance

Ahmed Bouhoula, Karel Kubicek, Amit Zac, Carlos Cotrini, and David Basin, ETH Zurich

Available Media

Privacy regulations such as the General Data Protection Regulation (GDPR) require websites to inform EU-based users about non-essential data collection and to request their consent to this practice. Previous studies have documented widespread violations of these regulations. However, these studies provide a limited view of the general compliance picture: they are either restricted to a subset of notice types, detect only simple violations using prescribed patterns, or analyze notices manually. Thus, they are restricted both in their scope and in their ability to analyze violations at scale.

We present the first general, automated, large-scale analysis of cookie notice compliance. Our method interacts with cookie notices, e.g., by navigating through their settings. It observes declared processing purposes and available consent options using Natural Language Processing and compares them to the actual use of cookies. By virtue of the generality and scale of our analysis, we correct for the selection bias present in previous studies focusing on specific Consent Management Platforms (CMP). We also provide a more general view of the overall compliance picture using a set of 97k websites popular in the EU. We report, in particular, that 65.4% of websites offering a cookie rejection option likely collect user data despite explicit negative consent.

Detecting and Mitigating Sampling Bias in Cybersecurity with Unlabeled Data

Saravanan Thirumuruganathan, Independent Researcher; Fatih Deniz, Issa Khalil, and Ting Yu, Qatar Computing Research Institute, HBKU; Mohamed Nabeel, Palo Alto Networks; Mourad Ouzzani, Qatar Computing Research Institute, HBKU

Available Media

Machine Learning (ML) based systems have demonstrated remarkable success in addressing various challenges within the ever-evolving cybersecurity landscape, particularly in the domain of malware detection/classification. However, a notable performance gap becomes evident when such classifiers are deployed in production. This discrepancy, often observed between accuracy scores reported in research papers and their real-world deployments, can be largely attributed to sampling bias. Intuitively, the data distribution in the production differs from that of training resulting in reduced performance of the classifier. How to deal with such sampling bias is an important problem in cybersecurity practice. In this paper, we propose principled approaches to detect and mitigate the adverse effects of sampling bias. First, we propose two simple and intuitive algorithms based on domain discrimination and distribution of k-th nearest neighbor distance to detect discrepancies between training and production data distributions. Second, we propose two algorithms based on the self-training paradigm to alleviate the impact of sampling bias. Our approaches are inspired by domain adaptation and judiciously harness the unlabeled data for enhancing the generalizability of ML classifiers. Critically, our approach does not require any modifications to the classifiers themselves, thus ensuring seamless integration into existing deployments. We conducted extensive experiments on four diverse datasets from malware, web domains, and intrusion detection. In an adversarial setting with large sampling bias, our proposed algorithms can improve the F-score by as much as 10-16 percentage points. Concretely, the F-score of a malware classifier on AndroZoo dataset increases from 0.83 to 0.937.

Code is not Natural Language: Unlock the Power of Semantics-Oriented Graph Representation for Binary Code Similarity Detection

Haojie He, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University; Xingwei Lin, Ant Group; Ziang Weng and Ruijie Zhao, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University; Shuitao Gan, Laboratory for Advanced Computing and Intelligence Engineering; Libo Chen, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University; Yuede Ji, University of North Texas; Jiashui Wang, Ant Group; Zhi Xue, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University

Available Media

Binary code similarity detection (BCSD) has garnered significant attention in recent years due to its crucial role in various binary code-related tasks, such as vulnerability search and software plagiarism detection. Currently, BCSD systems are typically based on either instruction streams or control flow graphs (CFGs). However, these approaches have limitations. Instruction stream-based approaches treat binary code as natural languages, overlooking well-defined semantic structures. CFG-based approaches exploit only the control flow structures, neglecting other essential aspects of code. Our key insight is that unlike natural languages, binary code has well-defined semantic structures, including intra-instruction structures, inter-instruction relations (e.g., def-use, branches), and implicit conventions (e.g. calling conventions). Motivated by that, we carefully examine the necessary relations and structures required to express the full semantics and expose them directly to the deep neural network through a novel semantics-oriented graph representation. Furthermore, we propose a lightweight multi-head softmax aggregator to effectively and efficiently fuse multiple aspects of the binary code. Extensive experiments show that our method significantly outperforms the state-of-the-art (e.g., in the x64-XC retrieval experiment with a pool size of 10000, our method achieves a recall score of 184%, 220%, and 153% over Trex, GMN, and jTrans, respectively).

VulSim: Leveraging Similarity of Multi-Dimensional Neighbor Embeddings for Vulnerability Detection

Samiha Shimmi, Ashiqur Rahman, and Mohan Gadde, Northern Illinois University; Hamed Okhravi, MIT Lincoln Laboratory; Mona Rahimi, Northern Illinois University

Available Media

Despite decades of research in vulnerability detection, vulnerabilities in source code remain a growing problem, and more effective techniques are needed in this domain. To enhance software vulnerability detection, in this paper, we first show that various vulnerability classes in the C programming language share common characteristics, encompassing semantic, contextual, and syntactic properties. We then leverage this knowledge to enhance the learning process of Deep Learning (DL) models for vulnerability detection when only sparse data is available. To achieve this, we extract multiple dimensions of information from the available, albeit limited, data. We then consolidate this information into a unified space, allowing for the identification of similarities among vulnerabilities through nearest-neighbor embeddings. The combination of these steps allows us to improve the effectiveness and efficiency of vulnerability detection using DL models. Evaluation results demonstrate that our approach surpasses existing State-of-the-art (SOTA) models and exhibits strong performance on unseen data, thereby enhancing generalizability.

Track 5

LLM I: Attack and Defense

Session Chair: Shuang Hao

Salon G

An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection

Shenao Yan, University of Connecticut; Shen Wang and Yue Duan, Singapore Management University; Hanbin Hong, University of Connecticut; Kiho Lee and Doowon Kim, University of Tennessee, Knoxville; Yuan Hong, University of Connecticut

Available Media

Large Language Models (LLMs) have transformed code completion tasks, providing context-based suggestions to boost developer productivity in software engineering. As users often fine-tune these models for specific applications, poisoning and backdoor attacks can covertly alter the model outputs. To address this critical security challenge, we introduce CodeBreaker, a pioneering LLM-assisted backdoor attack framework on code completion models. Unlike recent attacks that embed malicious payloads in detectable or irrelevant sections of the code (e.g., comments), CodeBreaker leverages LLMs (e.g., GPT-4) for sophisticated payload transformation (without affecting functionalities), ensuring that both the poisoned data for fine-tuning and generated code can evade strong vulnerability detection. CodeBreaker stands out with its comprehensive coverage of vulnerabilities, making it the first to provide such an extensive set for evaluation. Our extensive experimental evaluations and user studies underline the strong attack performance of CodeBreaker across various settings, validating its superiority over existing approaches. By integrating malicious payloads directly into the source code with minimal transformation, CodeBreaker challenges current security measures, underscoring the critical need for more robust defenses for code completion.

REMARK-LLM: A Robust and Efficient Watermarking Framework for Generative Large Language Models

Ruisi Zhang, Shehzeen Samarah Hussain, Paarth Neekhara, and Farinaz Koushanfar, University of California, San Diego

Available Media

We present REMARK-LLM, a novel efficient, and robust watermarking framework designed for texts generated by large language models (LLMs). Synthesizing human-like content using LLMs necessitates vast computational resources and extensive datasets, encapsulating critical intellectual property (IP). However, the generated content is prone to malicious exploitation, including spamming and plagiarism. To address the challenges, REMARK-LLM proposes three new components: (i) a learning-based message encoding module to infuse binary signatures into LLM-generated texts; (ii) a reparameterization module to transform the dense distributions from the message encoding to the sparse distribution of the watermarked textual tokens; (iii) a decoding module dedicated for signature extraction; Besides, we introduce an optimized beam search algorithm to generate content with coherence and consistency. REMARK-LLM is rigorously trained to encourage the preservation of semantic integrity in watermarked content, while ensuring effective watermark retrieval. Extensive evaluations on multiple unseen datasets highlight REMARK-LLM's proficiency and transferability in inserting 2× more signature bits into the same texts when compared to prior art, all while maintaining semantic integrity. Furthermore, REMARK-LLM exhibits better resilience against a spectrum of watermark detection and removal attacks.

Formalizing and Benchmarking Prompt Injection Attacks and Defenses

Yupei Liu, The Pennsylvania State University; Yuqi Jia, Duke University; Runpeng Geng and Jinyuan Jia, The Pennsylvania State University; Neil Zhenqiang Gong, Duke University

Available Media

A prompt injection attack aims to inject malicious instruction/data into the input of an LLM-Integrated Application such that it produces results as an attacker desires. Existing works are limited to case studies. As a result, the literature lacks a systematic understanding of prompt injection attacks and their defenses. We aim to bridge the gap in this work. In particular, we propose a framework to formalize prompt injection attacks. Existing attacks are special cases in our framework. Moreover, based on our framework, we design a new attack by combining existing ones. Using our framework, we conduct a systematic evaluation on 5 prompt injection attacks and 10 defenses with 10 LLMs and 7 tasks. Our work provides a common benchmark for quantitatively evaluating future prompt injection attacks and defenses. To facilitate research on this topic, we make our platform public at https://github.com/liu00222/Open-Prompt-Injection.

Instruction Backdoor Attacks Against Customized LLMs

Rui Zhang and Hongwei Li, University of Electronic Science and Technology of China; Rui Wen, CISPA Helmholtz Center for Information Security; Wenbo Jiang and Yuan Zhang, University of Electronic Science and Technology of China; Michael Backes, CISPA Helmholtz Center for Information Security; Yun Shen, NetApp; Yang Zhang, CISPA Helmholtz Center for Information Security

Available Media

The increasing demand for customized Large Language Models (LLMs) has led to the development of solutions like GPTs. These solutions facilitate tailored LLM creation via natural language prompts without coding. However, the trustworthiness of third-party custom versions of LLMs remains an essential concern. In this paper, we propose the first instruction backdoor attacks against applications integrated with untrusted customized LLMs (e.g., GPTs). Specifically, these attacks embed the backdoor into the custom version of LLMs by designing prompts with backdoor instructions, outputting the attacker's desired result when inputs contain the predefined triggers. Our attack includes 3 levels of attacks: word-level, syntax-level, and semantic-level, which adopt different types of triggers with progressive stealthiness. We stress that our attacks do not require fine-tuning or any modification to the backend LLMs, adhering strictly to GPTs development guidelines. We conduct extensive experiments on 6 prominent LLMs and 5 benchmark text classification datasets. The results show that our instruction backdoor attacks achieve the desired attack performance without compromising utility. Additionally, we propose two defense strategies and demonstrate their effectiveness in reducing such attacks. Our findings highlight the vulnerability and the potential risks of LLM customization such as GPTs.

Track 6

Software Vulnerability Detection

Session Chair: Fish Wang

Salon H

FIRE: Combining Multi-Stage Filtering with Taint Analysis for Scalable Recurring Vulnerability Detection

Siyue Feng, National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Hubei Key Laboratory of Distributed System Security, Hubei Engineering Research Center on Big Data Security, Cluster and Grid Computing Lab; School of Cyber Science and Engineering, Huazhong University of Science and Technology; Yueming Wu, Nanyang Technological University; Wenjie Xue and Sikui Pan, National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Hubei Key Laboratory of Distributed System Security, Hubei Engineering Research Center on Big Data Security, Cluster and Grid Computing Lab; School of Cyber Science and Engineering, Huazhong University of Science and Technology; Deqing Zou, National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Hubei Key Laboratory of Distributed System Security, Hubei Engineering Research Center on Big Data Security, Cluster and Grid Computing Lab; School of Cyber Science and Engineering, Huazhong University of Science and Technology; Jinyinhu Laboratory; Yang Liu, Nanyang Technological University; Hai Jin, National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Hubei Key Laboratory of Distributed System Security, Hubei Engineering Research Center on Big Data Security, Cluster and Grid Computing Lab; School of Computer Science and Technology, Huazhong University of Science and Technology

Available Media

With the continuous development of software open-sourcing, the reuse of open-source software has led to a significant increase in the occurrence of recurring vulnerabilities. These vulnerabilities often arise through the practice of copying and pasting existing vulnerabilities. Many methods have been proposed for detecting recurring vulnerabilities, but they often struggle to ensure both high efficiency and consideration of semantic information about vulnerabilities and patches. In this paper, we introduce FIRE, a scalable method for largescale recurring vulnerability detection. It utilizes multi-stage f iltering and differential taint paths to achieve precise clone vulnerability scanning at an extensive scale. In our evaluation across ten open-source software projects, FIRE demonstrates a precision of 90.0% in detecting 298 recurring vulnerabilities out of 385 ground truth instance. This surpasses the performance of existing advanced recurring vulnerability detection tools, detecting 31.4% more vulnerabilities than VUDDY and 47.0% more than MOVERY. When detecting vulnerabilities in large-scale software, FIRE outperforms MOVERY by saving about twice the time, enabling the scanning of recurring vulnerabilities on an ultra-large scale.

Inference of Error Specifications and Bug Detection Using Structural Similarities

Niels Dossche and Bart Coppens, Ghent University

Available Media

Error-handling code is a crucial part of software to ensure stability and security. Failing to handle errors correctly can lead to security vulnerabilities such as DoS, privilege escalation, and data corruption. We propose a novel approach to automatically infer error specifications for system software without a priori domain knowledge, while still achieving a high recall and precision. The key insight behind our approach is that we can identify error-handling paths automatically based on structural similarities between error-handling code. We use the inferred error specification to detect three kinds of bugs: missing error checks, incorrect error checks, and error propagation bugs. Our technique uses a combination of path-sensitive, flow-sensitive and both intra-procedural and inter-procedural data-flow analysis to achieve high accuracy and great scalability. We implemented our technique in a tool called ESSS to demonstrate the effectiveness and efficiency of our approach on 7 well-tested, widely-used open-source software projects: OpenSSL, OpenSSH, PHP, zlib, libpng, freetype2, and libwebp. Our tool reported 827 potential bugs in total for all 7 projects combined. We manually categorised these 827 issues into 279 false positives and 541 true positives. Out of these 541 true positives, we sent bug reports and corresponding patches for 46 of them. All the patches were accepted and applied.

A Binary-level Thread Sanitizer or Why Sanitizing on the Binary Level is Hard

Joschua Schilling, CISPA Helmholtz Center for Information Security; Andreas Wendler, Friedrich-Alexander-Universität Erlangen-Nürnberg; Philipp Görz, Nils Bars, Moritz Schloegel, and Thorsten Holz, CISPA Helmholtz Center for Information Security

Available Media

Dynamic software testing methods, such as fuzzing, have become a popular and effective method for detecting many types of faults in programs. While most research focuses on targets for which source code is available, much of the software used in practice is only available as closed source. Testing software without having access to source code forces a user to resort to binary-only testing methods, which are typically slower and lack support for crucial features, such as advanced bug oracles in the form of sanitizers, i.e., dynamic methods to detect faults based on undefined or suspicious behavior. Almost all existing sanitizers work by injecting instrumentation at compile time, requiring access to the target's source code. In this paper, we systematically identify the key challenges of applying sanitizers to binary-only targets. As a result of our analysis, we present the design and implementation of BINTSAN, an approach to realize the data race detector TSAN targeting binary-only Linux x86-64 targets. We systematically evaluate BINTSAN for correctness, effectiveness, and performance. We find that our approach has a runtime overhead of only 15% compared to source-based TSAN. Compared to existing binary solutions, our approach has better performance (up to 5.0× performance improvement) and precision, while preserving compatibility with the compiler-based TSAN.

ORANalyst: Systematic Testing Framework for Open RAN Implementations

Tianchang Yang, Syed Md Mukit Rashid, Ali Ranjbar, Gang Tan, and Syed Rafiul Hussain, The Pennsylvania State University

Available Media

We develop ORANalyst, the first systematic testing framework tailored for analyzing the robustness and operational integrity of Open RAN (O-RAN) implementations. O-RAN systems are composed of numerous microservice-based components. ORANalyst initially gains insights into these complex component dependencies by combining efficient static analysis with dynamic tracing. Applying these insights, ORANalyst crafts test inputs that effectively navigate these dependencies and thoroughly test each target component. We evaluate ORANalyst on two O-RAN implementations, O-RAN-SC and SD-RAN, and identify 19 previously undiscovered vulnerabilities. If exploited, these vulnerabilities could lead to various denial-of-service attacks, resulting from component crashes and disruptions in communication channels.

Track 7

Cryptographic Protocols I: Multi-Party Computation

Session Chair: Esha Ghosh

Salon IJK

Scalable Multi-Party Computation Protocols for Machine Learning in the Honest-Majority Setting

Fengrun Liu, University of Science and Technology of China & Shanghai Qi Zhi Institute; Xiang Xie, Shanghai Qi Zhi Institute & PADO Labs; Yu Yu, Shanghai Jiao Tong University & State Key Laboratory of Cryptology

Available Media

In this paper, we present a novel and scalable multi-party computation (MPC) protocol tailored for privacy-preserving machine learning (PPML) with semi-honest security in the honest-majority setting. Our protocol utilizes the Damgaard-Nielsen (Crypto '07) protocol with Mersenne prime fields. By leveraging the special properties of Mersenne primes, we are able to design highly efficient protocols for securely computing operations such as truncation and comparison. Additionally, we extend the two-layer multiplication protocol in ATLAS (Crypto '21) to further reduce the round complexity of operations commonly used in neural networks.

Our protocol is very scalable in terms of the number of parties involved. For instance, our protocol completes the online oblivious inference of a 4-layer convolutional neural network with 63 parties in 0.1 seconds and 4.6 seconds in the LAN and WAN settings, respectively. To the best of our knowledge, this is the first fully implemented protocol in the field of PPML that can successfully run with such a large number of parties. Notably, even in the three-party case, the online phase of our protocol is more than 1.4x faster than the Falcon (PETS '21) protocol.

Lightweight Authentication of Web Data via Garble-Then-Prove

Xiang Xie, PADO Labs; Kang Yang, State Key Laboratory of Cryptology; Xiao Wang, Northwestern University; Yu Yu, Shanghai Jiao Tong University and Shanghai Qi Zhi Institute

Available Media

Transport Layer Security (TLS) establishes an authenticated and confidential channel to deliver data for almost all Internet applications. A recent work (Zhang et al., CCS'20) proposed a protocol to prove the TLS payload to a third party, without any modification of TLS servers, while ensuring the privacy and originality of the data in the presence of malicious adversaries. However, it required maliciously secure Two-Party Computation (2PC) for generic circuits, leading to significant computational and communication overhead.

This paper proposes the garble-then-prove technique to achieve the same security requirement without using any heavy mechanism like generic malicious 2PC. Our end-to-end implementation shows 14x improvement in communication and an order of magnitude improvement in computation over the state-of-the-art protocol. We also show worldwide performance when using our protocol to authenticate payload data from Coinbase and Twitter APIs. Finally, we propose an efficient gadget to privately convert the above authenticated TLS payload to additively homomorphic commitments so that the properties of the payload can be proven efficiently using zkSNARKs.

Holding Secrets Accountable: Auditing Privacy-Preserving Machine Learning

Hidde Lycklama, ETH Zurich; Alexander Viand, Intel Labs; Nicolas Küchler, ETH Zurich; Christian Knabenhans, EPFL; Anwar Hithnawi, ETH Zurich

Available Media

Recent advancements in privacy-preserving machine learning are paving the way to extend the benefits of ML to highly sensitive data that, until now, has been hard to utilize due to privacy concerns and regulatory constraints. Simultaneously, there is a growing emphasis on enhancing the transparency and accountability of ML, including the ability to audit deployments for aspects such as fairness, accuracy and compliance. Although ML auditing and privacy-preserving machine learning have been extensively researched, they have largely been studied in isolation. However, the integration of these two areas is becoming increasingly important. In this work, we introduce Arc, an MPC framework designed for auditing privacy-preserving machine learning. Arc cryptographically ties together the training, inference, and auditing phases to allow robust and private auditing. At the core of our framework is a new protocol for efficiently verifying inputs against succinct commitments. We evaluate the performance of our framework when instantiated with our consistency protocol and compare it to hashing-based and homomorphic-commitment-based approaches, demonstrating that it is up to 10^4× faster and up to 10^6× more concise.

Secure Account Recovery for a Privacy-Preserving Web Service

Ryan Little, Boston University; Lucy Qin, Georgetown University; Mayank Varia, Boston University

Available Media

If a web service is so secure that it does not even know—and does not want to know—the identity and contact info of its users, can it still offer account recovery if a user forgets their password? This paper is the culmination of the authors' work to design a cryptographic protocol for account recovery for use by a prominent secure matching system: a web-based service that allows survivors of sexual misconduct to become aware of other survivors harmed by the same perpetrator. In such a system, the list of account-holders must be safeguarded, even against the service provider itself.

In this work, we design an account recovery system that, on the surface, appears to follow the typical workflow: the user types in their email address, receives an email containing a one-time link, and answers some security questions. Behind the scenes, the defining feature of our recovery system is that the service provider can perform email-based account validation without knowing, or being able to learn, a list of users' email addresses. Our construction uses standardized cryptography for most components, and it has been deployed in production at the secure matching system.

As a building block toward our main construction, we design a new cryptographic primitive that may be of independent interest: an oblivious pseudorandom function that can either have a fully-private input or a partially-public input, and that reaches the same output either way. This primitive allows us to perform online rate limiting for account recovery attempts, without imposing a bound on the creation of new accounts. We provide an open-source implementation of this primitive and provide evaluation results showing that the end-to-end interaction time takes 8.4-60.4 ms in fully-private input mode and 3.1-41.2 ms in partially-public input mode.

6:00 pm–7:30 pm

Symposium Reception

Franklin Hall

Thursday, August 15

8:00 am–9:00 am

Continental Breakfast

Grand Ballroom Foyer

9:00 am–10:15 am

Track 1

User Studies II: At-Risk Users

Session Chair: Daniel Zappala

Salon AB

Navigating Traumatic Stress Reactions During Computer Security Interventions

Lana Ramjit, Cornell Tech; Natalie Dolci, UW-Safe Campus; Francesca Rossi, Thriving Through; Ryan Garcia, UW-Safe Campus; Thomas Ristenpart, Cornell Tech; Dana Cuomo, Lafayette College

Available Media

At-risk populations need direct support from computer security and privacy consultants, what we refer to as a security intervention. However, at-risk populations often face security threats while experiencing traumatic events and ensuing traumatic stress reactions. While existing security interventions follow broad principles for trauma-informed care, no prior work has studied the domain-specific effects of trauma on intervention efficacy, nor how to improve the ability of tech abuse specialists to navigate them.

We perform a multi-part study into traumatic stress in the context of digital security interventions. We first interview technology consultants from three computer security clinics that help intimate partner violence survivors with technology abuse. We identify four challenges reported by consultants emanating out of traumatic stress, some of which appear to be unique to the digital security context. To better understand these challenges, we analyze transcripts of sessions at one of the clinics, extracting five patterns of how stress reactions affect consultations. We use our findings to develop new recommended best practices, including a new intervention protocol design to help guide security interventions.

Exploring digital security and privacy in relative poverty in Germany through qualitative interviews

Anastassija Kostan and Sara Olschar, Paderborn University; Lucy Simko, The George Washington University; Yasemin Acar, Paderborn University & The George Washington University

Available Media

When developing security and privacy policy, technical solutions, and research for end users, assumptions about end users' financial means and technology use situations often fail to take users' income status into account. This means that the status quo may marginalize those affected by poverty in security and privacy, and exacerbate inequalities. To enable more equitable security and privacy for all, it is crucial to understand the overall situation of low income users, their security and privacy concerns, perceptions, behaviors, and challenges. In this paper, we report on a semi-structured, in-depth interview study with low income users living in Germany (n=28) which we understand as a case study for the growing number of low income users in global north countries. We find that low income end users may be literate regarding technology use and possess solid basic knowledge about security and privacy, and generally show awareness of security and privacy threats and risks. Despite these resources, we also find that low income users are driven to poor security and privacy practices like using an untrusted cloud due to little storage space, and relying on old, broken, or used hardware. Additionally we find the mindset of a—potentially false—sense of security and privacy because through attacking them, there is "not much to get". Based on our findings, we discuss how the security and privacy community can expand comprehension about diverse end users, increase awareness and design for the specific situation of low income users, and should take more vulnerable groups into account.

"But they have overlooked a few things in Afghanistan:" An Analysis of the Integration of Biometric Voter Verification in the 2019 Afghan Presidential Elections

Kabir Panahi and Shawn Robertson, University of Kansas; Yasemin Acar, Paderborn University; Alexandru G. Bardas, University of Kansas; Tadayoshi Kohno, University of Washington; Lucy Simko, The George Washington University

Available Media

Afghanistan deployed biometric voter verification (BVV) machines nationally for the first time in the critical 2019 presidential election. Through the leading authors' unique backgrounds and involvement in this election, which facilitated interviews with 18 Afghan nationals and international participants who had an active role in this Afghan election, we explore the gap between the expected outcomes of the electoral system, centered around BVVs, and the reality on election day and beyond. We find that BVVs supported and violated the electoral goals of voter enfranchisement, fraud prevention, enabling public trust, and created threats for voters, staff, and officials. We identify technical, usability, and bureaucratic underlying causes for these mismatches and discuss several vital factors that are part of an election.

Understanding How to Inform Blind and Low-Vision Users about Data Privacy through Privacy Question Answering Assistants

Yuanyuan Feng, University of Vermont; Abhilasha Ravichander, Allen Institute for Artificial Intelligence; Yaxing Yao, Virginia Tech; Shikun Zhang and Rex Chen, Carnegie Mellon University; Shomir Wilson, Pennsylvania State University; Norman Sadeh, Carnegie Mellon University

Available Media

Understanding and managing data privacy in the digital world can be challenging for sighted users, let alone blind and low-vision (BLV) users. There is limited research on how BLV users, who have special accessibility needs, navigate data privacy, and how potential privacy tools could assist them. We conducted an in-depth qualitative study with 21 US BLV participants to understand their data privacy risk perception and mitigation, as well as their information behaviors related to data privacy. We also explored BLV users' attitudes towards potential privacy question answering (Q&A) assistants that enable them to better navigate data privacy information. We found that BLV users face heightened security and privacy risks, but their risk mitigation is often insufficient. They do not necessarily seek data privacy information but clearly recognize the benefits of a potential privacy Q&A assistant. They also expect privacy Q&A assistants to possess cross-platform compatibility, support multi-modality, and demonstrate robust functionality. Our study sheds light on BLV users' expectations when it comes to usability, accessibility, trust and equity issues regarding digital data privacy.

Assessing Suspicious Emails with Banner Warnings Among Blind and Low-Vision Users in Realistic Settings

Filipo Sharevski, DePaul University; Aziz Zeidieh, University of Illinois at Urbana-Champaign

Available Media

Warning users about suspicious emails usually happens through visual interventions such as banners. Evidence from laboratory experiments shows that email banner warnings are unsuitable for blind and low-vision (BLV) users as they tend to miss or make no use of them. However, the laboratory settings preclude a full understanding of how BLV users would realistically behave around these banner warnings because the experiments don't use the individuals' own email addresses, devices, or emails of their choice. To address this limitation, we devised a study with n=21 BLV email users in realistic settings. Our findings indicate that this user population misses or makes no use of Gmail and Outlook banner warnings because these are implemented in a "narrow" sense, that is, (i) they allow access to the warning text without providing context relevant to the risk of associated email, and (ii) the formatting, together with the possible actions, is confusing as to how a user should deal with the email in question. To address these barriers, our participants proposed designs to accommodate the accessibility preferences and usability habits of individuals with visual disabilities according to their capabilities to engage with email banner warnings.

Track 2

Side Channel III

Session Chair: Yuval Yarom

Salon CD

Invalidate+Compare: A Timer-Free GPU Cache Attack Primitive

Zhenkai Zhang, Clemson University; Kunbei Cai, University of Central Florida; Yanan Guo, University of Rochester; Fan Yao, University of Central Florida; Xing Gao, University of Delaware

Available Media

While extensive research has been conducted on CPU cache side-channel attacks, the landscape of similar studies on modern GPUs remains largely uncharted. In this paper, we investigate potential information leakage threats posed by the caches in GPUs of NVIDIA's latest Ampere and Ada Lovelace generations. We first exploit a GPU cache maintenance instruction to reverse engineer certain key properties of the cache hierarchy in these GPUs, and then we introduce a novel GPU cache side-channel attack primitive named Invalidate+Compare that is designed to spy on the GPU cache activities of a victim in a timer-free manner. We further showcase the use of this primitive with two case studies. The first one is a website fingerprinting attack that can accurately identify the web pages visited by a user, while the second one uncovers keystroke data entered via a virtual keyboard. To our knowledge, these stand as the first demonstrations of timer-free cache side-channel attacks on GPUs.

Peep With A Mirror: Breaking The Integrity of Android App Sandboxing via Unprivileged Cache Side Channel

Yan Lin, Jinan University; Joshua Wong, Singapore Management University; Xiang Li and Haoyu Ma, Zhejiang Lab; Debin Gao, Singapore Management University

Available Media

Application sandboxing is a well-established security principle employed in the Android platform to safeguard sensitive information. However, hardware resources, specifically the CPU caches, are beyond the protection of this software-based mechanism, leaving room for potential side-channel attacks. Existing attacks against this particular weakness of app sandboxing mainly target shared components among apps, hence can only observe system-level program dynamics (such as UI tracing). In this work, we advance cache side-channel attacks by demonstrating the viability of non-intrusive and fine-grained probing across different app sandboxes, which have the potential to uncover app-specific and private program behaviors, thereby highlighting the importance of further research in this area.

In contrast to conventional attack schemes, our proposal leverages a user-level attack surface within the Android platform, namely the dynamic inter-app component sharing with package context (also known as DICI), to fully map the code of targeted victim apps into the memory space of the attacker's sandbox. Building upon this concept, we have developed a proof-of-concept attack demo called ANDROSCOPE and demonstrated its effectiveness with empirical evaluations where the attack app was shown to be able to successfully infer private information pertaining to individual apps, such as driving routes and keystroke dynamics with considerable accuracy.

Indirector: High-Precision Branch Target Injection Attacks Exploiting the Indirect Branch Predictor

Luyi Li, Hosein Yavarzadeh, and Dean Tullsen, UC San Diego

Distinguished Paper Award Winner

Available Media

This paper introduces novel high-precision Branch Target Injection (BTI) attacks, leveraging the intricate structures of the Indirect Branch Predictor (IBP) and the Branch Target Buffer (BTB) in high-end Intel CPUs. It presents, for the first time, a comprehensive picture of the IBP and the BTB within the most recent Intel processors, revealing their size, structure, and the precise functions governing index and tag hashing. Additionally, this study reveals new details into the inner workings of Intel's hardware defenses, such as IBPB, IBRS, and STIBP, including previously unknown holes in their coverage. Leveraging insights from reverse engineering efforts, this research develops highly precise Branch Target Injection (BTI) attacks to breach security boundaries across diverse scenarios, including cross-process and cross-privilege scenarios and uses the IBP and the BTB to break Address Space Layout Randomization (ASLR).

Intellectual Property Exposure: Subverting and Securing Intellectual Property Encapsulation in Texas Instruments Microcontrollers

Marton Bognar, Cas Magnus, Frank Piessens, and Jo Van Bulck, DistriNet, KU Leuven

Available Media

In contrast to high-end computing platforms, specialized memory protection features in low-end embedded devices remain relatively unexplored despite the ubiquity of these devices. Hence, we perform an in-depth security evaluation of the state-of-the-art Intellectual Property Encapsulation (IPE) technology found in widely used off-the-shelf, Texas Instruments MSP430 microcontrollers. While we find IPE to be promising, bearing remarkable similarities with trusted execution environments (TEEs) from research and industry, we reveal several fundamental protection shortcomings in current IPE hardware. We show that many software-level attack techniques from the academic TEE literature apply to this platform, and we discover a novel attack primitive, dubbed controlled call corruption, exploiting a vulnerability in the IPE access control mechanism. Our practical, end-to-end attack scenarios demonstrate a complete bypass of confidentiality and integrity guarantees of IPE-protected programs.

Informed by our systematic attack study on IPE and root-cause analysis, also considering related research prototypes, we propose lightweight hardware changes to secure IPE. Furthermore, we develop a prototype framework that transparently implements software responsibilities to reduce information leakage and repurposes the onboard memory protection unit to reinstate IPE security guarantees on currently vulnerable devices with low performance overheads.

Track 3

ML III: Secure ML

Session Chair: Yuan Tian

Salon E

AutoFHE: Automated Adaption of CNNs for Efficient Evaluation over FHE

Wei Ao and Vishnu Naresh Boddeti, Michigan State University

Available Media

Secure inference of deep convolutional neural networks (CNNs) under RNS-CKKS involves polynomial approximation of unsupported non-linear activation functions. However, existing approaches have three main limitations: 1) Inflexibility: The polynomial approximation and associated homomorphic evaluation architecture are customized manually for each CNN architecture and do not generalize to other networks. 2) Suboptimal Approximation: Each activation function is approximated instead of the function represented by the CNN. 3) Restricted Design: Either high-degree or low-degree polynomial approximations are used. The former retains high accuracy but slows down inference due to bootstrapping operations, while the latter accelerates ciphertext inference but compromises accuracy. To address these limitations, we present AutoFHE, which automatically adapts standard CNNs for secure inference under RNS-CKKS. The key idea is to adopt layerwise mixed-degree polynomial activation functions, which are optimized jointly with the homomorphic evaluation architecture in terms of the placement of bootstrapping operations. The problem is modeled within a multi-objective optimization framework to maximize accuracy and minimize the number of bootstrapping operations. AutoFHE can be applied flexibly on any CNN architecture, and it provides diverse solutions that span the trade-off between accuracy and latency. Experimental evaluation over RNS-CKKS encrypted CIFAR datasets shows that AutoFHE accelerates secure inference by 1.32x to 1.8x compared to methods employing high-degree polynomials. It also improves accuracy by up to 2.56% compared to methods using low-degree polynomials. Lastly, AutoFHE accelerates inference and improves accuracy by 103x and 3.46%, respectively, compared to CNNs under TFHE.

Fast and Private Inference of Deep Neural Networks by Co-designing Activation Functions

Abdulrahman Diaa, Lucas Fenaux, Thomas Humphries, Marian Dietz, Faezeh Ebrahimianghazani, Bailey Kacsmar, Xinda Li, Nils Lukas, Rasoul Akhavan Mahdavi, and Simon Oya, University of Waterloo; Ehsan Amjadian, University of Waterloo and Royal Bank of Canada; Florian Kerschbaum, University of Waterloo

Available Media

Machine Learning as a Service (MLaaS) is an increasingly popular design where a company with abundant computing resources trains a deep neural network and offers query access for tasks like image classification. The challenge with this design is that MLaaS requires the client to reveal their potentially sensitive queries to the company hosting the model. Multi-party computation (MPC) protects the client's data by allowing encrypted inferences. However, current approaches suffer from prohibitively large inference times. The inference time bottleneck in MPC is the evaluation of non-linear layers such as ReLU activation functions. Motivated by the success of previous work co-designing machine learning and MPC, we develop an activation function co-design. We replace all ReLUs with a polynomial approximation and evaluate them with single-round MPC protocols, which give state-of-theart inference times in wide-area networks. Furthermore, to address the accuracy issues previously encountered with polynomial activations, we propose a novel training algorithm that gives accuracy competitive with plaintext models. Our evaluation shows between 3 and 110× speedups in inference time on large models with up to 23 million parameters while maintaining competitive inference accuracy.

OblivGNN: Oblivious Inference on Transductive and Inductive Graph Neural Network

Zhibo Xu, Monash University and CSIRO's Data61; Shangqi Lai, CSIRO's Data61; Xiaoning Liu, RMIT University; Alsharif Abuadbba, CSIRO's Data61; Xingliang Yuan, The University of Melbourne; Xun Yi, RMIT University

Available Media

Graph Neural Networks (GNNs) have emerged as a powerful tool for analysing graph-structured data across various domains, including social networks, banking, and bioinformatics. In the meantime, graph data contains sensitive information, such as social relations, financial transactions, and chemical structures, and GNN models are IPs of the model owner. Thus, deploying GNNs in cloud-based Machine Learning as a Service (MLaaS) raises significant privacy concerns.

In this paper, we present a comprehensive solution to enable secure GNN inference in MLaaS, named OblivGNN. OblivGNN is designed to support both transductive (static graph) and inductive (dynamic graph) inference services without revealing either graph data or GNN models. In particular, we adopt a lightweight cryptographic primitive, i.e., function secret sharing, to achieve low communication and computation overhead during inference. Furthermore, we are the first to propose a secure update protocol for the inductive setting, which can obliviously update the graph without revealing which parts of the graph are updated. Particularly, our results with three widely-used graph datasets (Cora, Citeseer, and Pubmed) show that OblivGNN can achieve comparable accuracy to an Additive Secret Sharing-based baseline. Nonetheless, our design reduces the runtime cost by up to 38% and the communication cost by 10x to 151x, highlighting its practicality when processing large graphs with GNN models.

MD-ML: Super Fast Privacy-Preserving Machine Learning for Malicious Security with a Dishonest Majority

Boshi Yuan, Shixuan Yang, and Yongxiang Zhang, Shanghai Jiao Tong University, China; Ning Ding, Dawu Gu, and Shi-Feng Sun, Shanghai Jiao Tong University, China; Shanghai Jiao Tong University (Wuxi) Blockchain Advanced Research Center

Available Media

Privacy-preserving machine learning (PPML) enables the training and inference of models on private data, addressing security concerns in machine learning. PPML based on secure multi-party computation (MPC) has garnered significant attention from both the academic and industrial communities. Nevertheless, only a few PPML works provide malicious security with a dishonest majority. The state of the art by Damgård et al. (SP'19) fails to meet the demand for large models in practice, due to insufficient efficiency. In this work, we propose MD-ML, a framework for Maliciously secure Dishonest majority PPML, with a focus on boosting online efficiency.

MD-ML works for n parties, tolerating corruption of up to n-1 parties. We construct our novel protocols for PPML, including truncation, dot product, matrix multiplication, and comparison. The online communication of our dot product protocol is one single element per party, independent of input length. In addition, the online cost of our multiply-then-truncate protocol is identical to multiplication, which means truncation incurs no additional online cost. These features are achieved for the first time in the literature concerning maliciously secure dishonest majority PPML.

Benchmarking of MD-ML is conducted for SVM and NN including LeNet, AlexNet, and ResNet-18. For NN inference, compared to the state of the art (Damgård et al., SP'19), we are about 3.4—11.0x (LAN) and 9.7—157.7x (WAN) faster in online execution time.

Accelerating Secure Collaborative Machine Learning with Protocol-Aware RDMA

Zhenghang Ren, Mingxuan Fan, Zilong Wang, Junxue Zhang, and Chaoliang Zeng, iSING Lab@The Hong Kong University of Science and Technology; Zhicong Huang and Cheng Hong, Ant Group; Kai Chen, iSING Lab@The Hong Kong University of Science and Technology and University of Science and Technology of China

Available Media

Secure Collaborative Machine Learning (SCML) suffers from high communication cost caused by secure computation protocols. While modern datacenters offer high-bandwidth and low-latency networks with Remote Direct Memory Access (RDMA) capability, existing SCML implementation remains to use TCP sockets, leading to inefficiency. We present CORA1 to implement SCML over RDMA. By using a protocol-aware design, CORA identifies the protocol used by the SCML program and sends messages directly to the remote party's protocol buffer, improving the efficiency of message exchange. CORA exploits the chance that the SCML task is determined before execution and the pattern is largely input-irrelevant, so that CORA can plan message destinations on remote hosts at compile time. CORA can be readily deployed with existing SCML frameworks such as Piranha with its socket-like interface. We evaluate CORA in SCML training tasks, and our results show that CORA can reduce communication cost by up to 11x and achieve 1.2x - 4.2x end-to-end speedup over TCP in SCML training.

Track 4

Measurement II: Network

Session Chair: Rob Jansen

Salon F

CalcuLatency: Leveraging Cross-Layer Network Latency Measurements to Detect Proxy-Enabled Abuse

Reethika Ramesh, University of Michigan; Philipp Winter, Independent; Sam Korman and Roya Ensafi, University of Michigan

Available Media

Efforts from emerging technology companies aim to democratize the ad delivery ecosystem and build systems that are privacy-centric and even share ad revenue benefits with their users. Other providers offer remuneration for users on their platform for interacting with and making use of services. But these efforts may suffer from coordinated abuse efforts aiming to defraud them. Attackers can use VPNs and proxies to fabricate their geolocation and earn disproportionate rewards. Balancing proxy-enabled abuse-prevention techniques with a privacy-focused business model is a hard challenge. Can service providers use minimal connection features to infer proxy use without jeopardizing user privacy?

In this paper, we build and evaluate a solution, CalcuLatency, that incorporates various network latency measurement techniques and leverage the application-layer and network-layer differences in roundtrip-times when a user connects to the service using a proxy. We evaluate our four measurement techniques individually, and as an integrated system using a two-pronged evaluation. CalcuLatency is an easy-to-deploy, open-source solution that can serve as an inexpensive first-step to label proxies.

6Sense: Internet-Wide IPv6 Scanning and its Security Applications

Grant Williams, Mert Erdemir, Amanda Hsu, Shraddha Bhat, Abhishek Bhaskar, Frank Li, and Paul Pearce, Georgia Institute of Technology

Available Media

Internet-wide scanning is a critical tool for security researchers and practitioners alike. By exhaustively exploring the entire IPv4 address space, Internet scanning has driven the development of new security protocols, found and tracked vulnerabilities, improved DDoS defenses, and illuminated global censorship. Unfortunately, the vast scale of the IPv6 address space—340 trillion trillion trillion addresses—precludes exhaustive scanning, necessitating entirely new IPv6-specific scanning methods. As IPv6 adoption continues to grow, developing IPv6 scanning methods is vital for maintaining our capability to comprehensively investigate Internet security.

We present 6SENSE, an end-to-end Internet-wide IPv6 scanning system. 6SENSE utilizes reinforcement learning coupled with an online scanner to iteratively reduce the space of possible IPv6 addresses into a tractable scannable subspace, thus discovering new IPv6 Internet hosts. 6SENSE is driven by a set of metrics we identify and define as key for evaluating the generality, diversity, and correctness of IPv6 scanning. We evaluate 6SENSE and prior generative IPv6 discovery methods across these metrics, showing that 6SENSE is able to identify tens of millions of IPv6 hosts, which compared to prior approaches, is up to 3.6x more hosts and 4x more end-site assignments, across a more diverse set of networks. From our analysis, we identify limitations in prior generative approaches that preclude their use for Internet-scale security scans. We also conduct the first Internet-wide scanning-driven security analysis of IPv6 hosts, focusing on TLS certificates unique to IPv6, surveying open ports and security-sensitive services, and identifying potential CVEs.

A Flushing Attack on the DNS Cache

Yehuda Afek and Anat Bremler-Barr, Tel-Aviv University; Shoham Danino, Reichman University; Yuval Shavitt, Tel-Aviv University

Available Media

A severe vulnerability in the DNS resolver's cache is exposed here, introducing a new type of attack, termed DNS CacheFlush. This attack poses a significant threat as it can easily disrupt a resolver's ability to provide service to its clients.

DNS resolver software incorporates various mechanisms to safeguard its cache. However, we have identified a tricky path to bypass these safeguards, allowing a high-rate flood of malicious but seemingly existent domain name resolutions to thrash the benign DNS cache. The resulting attack has a high amplification factor, where with a low rate attack it produces a continuous high rate resource records insertions into the resolver cache. This prevents benign request resolutions from surviving in the DNS LRU cache long enough for subsequent requests to be resolved directly from the cache. Thus leading to repeated cache misses for most benign domains, resulting in a substantial delay in the DNS service. The attack rate amplification factor is high enough to even flush out popular benign domains that are requested at a high frequency (∼ 100/1sec). Moreover, the attack packets introduce additional processing overhead and all together the attack easily denies service from the resolver's legitimate clients.

In our experiments we observed 95.7% cache miss rate for a domain queried once per second under 8,000 qps attack on a resolver with 100MB cache. Even on a resolver with 2GB cache size we observed a drop of 88.3% in the resolver benign traffic throughput.

A result of this study is a recommendation to deny and drop any authoritative replies that contain many server names, e.g., a long referral response, or a long CNAME chain, before the resolver starts any processing of such a response.

SnailLoad: Exploiting Remote Network Latency Measurements without JavaScript

Stefan Gast, Roland Czerny, Jonas Juffinger, Fabian Rauscher, Simone Franza, and Daniel Gruss, Graz University of Technology

Available Media

Inferring user activities on a computer from network traffic is a well-studied attack vector. Previous work has shown that they can infer websites visited, videos watched, and even user actions within specific applications. However, all of these attacks require a scenario where the attacker can observe the (possibly encrypted) network traffic, e.g., through a person-in-the-middle (PITM) attack or sitting in physical proximity to monitor WiFi packets.

In this paper, we present SnailLoad, a new side-channel attack where the victim loads an asset, e.g., a file or an image, from an attacker-controlled server, exploiting the victim's network latency as a side channel tied to activities on the victim system, e.g., watching videos or websites. SnailLoad requires no JavaScript, no form of code execution on the victim system, and no user interaction but only a constant exchange of network packets, e.g., a network connection in the background. SnailLoad measures the latency to the victim system and infers the network activity on the victim system from the latency variations. We demonstrate SnailLoad in a non-PITM video-fingerprinting attack, where we use a single SnailLoad trace to infer what video a victim user is watching momentarily. For our evaluation, we focused on a set of 10 YouTube videos the victim watches, and show that SnailLoad reaches classification F₁ scores of up to 98%. We also evaluated SnailLoad in an open-world top 100 website fingerprinting attack, resulting in an F₁ score of 62.8%. This shows that numerous prior works, based on network traffic observations in PITM attack scenarios, could potentially be lifted to non-PITM remote attack scenarios.

An Interview Study on Third-Party Cyber Threat Hunting Processes in the U.S. Department of Homeland Security

William P. Maxam III, US Coast Guard Academy; James C. Davis, Purdue University

Available Media

Cybersecurity is a major challenge for large organizations. Traditional cybersecurity defense is reactive. Cybersecurity operations centers keep out adversaries and incident response teams clean up after break-ins. Recently a proactive stage has been introduced: Cyber Threat Hunting (TH) looks for potential compromises missed by other cyber defenses. TH is mandated for federal executive agencies and government contractors. As threat hunting is a new cybersecurity discipline, most TH teams operate without a defined process. The practices and challenges of TH have not yet been documented.

To address this gap, this paper describes the first interview study of threat hunt practitioners. We obtained access and interviewed 11 threat hunters associated with the U.S. government's Department of Homeland Security. Hour-long interviews were conducted. We analyzed the transcripts with process and thematic coding. We describe the diversity among their processes, show that their processes differ from the TH processes reported in the literature, and unify our subjects' descriptions into a single TH process. We enumerate common TH challenges and solutions according to the subjects. The two most common challenges were difficulty in assessing a Threat Hunter's expertise, and developing and maintaining automation. We conclude with recommendations for TH teams (improve planning, focus on automation, and apprentice new members) and highlight directions for future work (finding a TH process that balances flexibility and formalism, and identifying assessments for TH team performance).

Track 5

ML IV: Privacy Inference I

Session Chair: Brendan Saltaformaggio

Salon G

A Linear Reconstruction Approach for Attribute Inference Attacks against Synthetic Data

Meenatchi Sundaram Muthu Selva Annamalai, University College London; Andrea Gadotti and Luc Rocher, University of Oxford

Available Media

Recent advances in synthetic data generation (SDG) have been hailed as a solution to the difficult problem of sharing sensitive data while protecting privacy. SDG aims to learn statistical properties of real data in order to generate "artificial" data that are structurally and statistically similar to sensitive data. However, prior research suggests that inference attacks on synthetic data can undermine privacy, but only for specific outlier records.

In this work, we introduce a new attribute inference attack against synthetic data. The attack is based on linear reconstruction methods for aggregate statistics, which target all records in the dataset, not only outliers. We evaluate our attack on state-of-the-art SDG algorithms, including Probabilistic Graphical Models, Generative Adversarial Networks, and recent differentially private SDG mechanisms. By defining a formal privacy game, we show that our attack can be highly accurate even on arbitrary records, and that this is the result of individual information leakage (as opposed to population-level inference).

We then systematically evaluate the tradeoff between protecting privacy and preserving statistical utility. Our findings suggest that current SDG methods cannot consistently provide sufficient privacy protection against inference attacks while retaining reasonable utility. The best method evaluated, a differentially private SDG mechanism, can provide both protection against inference attacks and reasonable utility, but only in very specific settings. Lastly, we show that releasing a larger number of synthetic records can improve utility but at the cost of making attacks far more effective.

Did the Neurons Read your Book? Document-level Membership Inference for Large Language Models

Matthieu Meeus, Imperial College London; Shubham Jain, Sense Street; Marek Rei and Yves-Alexandre de Montjoye, Imperial College London

Available Media

With large language models (LLMs) poised to become embedded in our daily lives, questions are starting to be raised about the data they learned from. These questions range from potential bias or misinformation LLMs could retain from their training data to questions of copyright and fair use of human-generated text. However, while these questions emerge, developers of the recent state-of-the-art LLMs become increasingly reluctant to disclose details on their training corpus. We here introduce the task of document-level membership inference for real-world LLMs, i.e. inferring whether the LLM has seen a given document during training or not. First, we propose a procedure for the development and evaluation of document-level membership inference for LLMs by leveraging commonly used data sources for training and the model release date. We then propose a practical, black-box method to predict document-level membership and instantiate it on OpenLLaMA-7B with both books and academic papers. We show our methodology to perform very well, reaching an AUC of 0.856 for books and 0.678 for papers. We then show our approach to outperform the sentence-level membership inference attacks used in the privacy literature for the document-level membership task. We further evaluate whether smaller models might be less sensitive to document-level inference and show OpenLLaMA-3B to be approximately as sensitive as OpenLLaMA-7B to our approach. Finally, we consider two mitigation strategies and find the AUC to slowly decrease when only partial documents are considered but to remain fairly high when the model precision is reduced. Taken together, our results show that accurate document-level membership can be inferred for LLMs, increasing the transparency of technology poised to change our lives.

MIST: Defending Against Membership Inference Attacks Through Membership-Invariant Subspace Training

Jiacheng Li, Ninghui Li, and Bruno Ribeiro, Purdue University

Available Media

In Member Inference (MI) attacks, the adversary try to determine whether an instance is used to train a machine learning (ML) model. MI attacks are a major privacy concern when using private data to train ML models. Most MI attacks in the literature take advantage of the fact that ML models are trained to fit the training data well, and thus have very low loss on training instances. Most defenses against MI attacks therefore try to make the model fit the training data less well. Doing so, however, generally results in lower accuracy.

We observe that training instances have different degrees of vulnerability to MI attacks. Most instances will have low loss even when not included in training. For these instances, the model can fit them well without concerns of MI attacks. An effective defense only needs to (possibly implicitly) identify instances that are vulnerable to MI attacks and avoids overfitting them. A major challenge is how to achieve such an effect in an efficient training process.

Leveraging two distinct recent advancements in representation learning: counterfactually-invariant representations and subspace learning methods, we introduce a novel Membership-Invariant Subspace Training (MIST) method to defend against MI attacks. MIST avoids overfitting the vulnerable instances without significant impact on other instances. We have conducted extensive experimental studies, comparing MIST with various other state-of-the-art (SOTA) MI defenses against several SOTA MI attacks. We find that MIST outperforms other defenses while resulting in minimal reduction in testing accuracy.

Inf2Guard: An Information-Theoretic Framework for Learning Privacy-Preserving Representations against Inference Attacks

Sayedeh Leila Noorbakhsh and Binghui Zhang, Illinois Institute of Technology; Yuan Hong, University of Connecticut; Binghui Wang, Illinois Institute of Technology

Available Media

Machine learning (ML) is vulnerable to inference (e.g., membership inference, property inference, and data reconstruction) attacks that aim to infer the private information of training data or dataset. Existing defenses are only designed for one specific type of attack and sacrifice significant utility or are soon broken by adaptive attacks. We address these limitations by proposing an information-theoretic defense framework, called Inf2Guard, against the three major types of inference attacks. Our framework, inspired by the success of representation learning, posits that learning shared representations not only saves time/costs but also benefits numerous downstream tasks. Generally, Inf2Guard involves two mutual information objectives, for privacy protection and utility preservation, respectively. Inf2Guard exhibits many merits: it facilitates the design of customized objectives against the specific inference attack; it provides a general defense framework which can treat certain existing defenses as special cases; and importantly, it aids in deriving theoretical results, e.g., inherent utility-privacy tradeoff and guaranteed privacy leakage. Extensive evaluations validate the effectiveness of Inf2Guard for learning privacy-preserving representations against inference attacks and demonstrate the superiority over the baselines.

Property Existence Inference against Generative Models

Lijin Wang, Jingjing Wang, Jie Wan, and Lin Long, Zhejiang University; Ziqi Yang and Zhan Qin, Zhejiang University, ZJU-Hangzhou Global Scientific and Technological Innovation Center

Available Media

Generative models have served as the backbone of versatile tools with a wide range of applications across various fields in recent years. However, it has been demonstrated that privacy concerns, such as membership information leakage of the training dataset, exist for generative models. In this paper, we perform property existence inference against generative models as a new type of information leakage, which aims to infer whether any samples with a given property are contained in the training set. For example, to infer if any images (i.e., samples) of a specific brand of cars (i.e., property) are used to train the target model. We focus on the leakage of existence information of properties with very low proportions in the training set, which has been overlooked in previous works. We leverage the feature-level consistency of the generated data with the training data to launch inferences and validate the property existence information leakage across diverse architectures of generative models. We have examined various factors influencing the property existence inference and investigated how generated samples leak property existence information. In our conclusion, most generative models are vulnerable to property existence inferences. Additionally, we have validated our attack in Stable Diffusion which is a large-scale open-source generative model in real-world scenarios, and demonstrated its risk of property existence information leakage. The source code is available at https://github.com/wljLlla/PEI_Code.

Track 6

Fuzzing II: Method

Session Chair: Mathias Payer

Salon H

SDFuzz: Target States Driven Directed Fuzzing

Penghui Li, The Chinese University of Hong Kong and Zhongguancun Laboratory; Wei Meng, The Chinese University of Hong Kong; Chao Zhang, Tsinghua University and Zhongguancun Laboratory

Available Media

Directed fuzzers often unnecessarily explore program code and paths that cannot trigger the target vulnerabilities. We observe that the major application scenarios of directed fuzzing provide detailed vulnerability descriptions, from which highly-valuable program states (i.e., target states) can be derived, e.g., call traces when a vulnerability gets triggered. By driving to expose such target states, directed fuzzers can exclude massive unnecessary exploration.

Inspired by the observation, we present SDFuzz, an efficient directed fuzzing tool driven by target states. SDFuzz first automatically extracts target states in vulnerability reports and static analysis results. SDFuzz employs a selective instrumentation technique to reduce the fuzzing scope to the required code for reaching target states. SDFuzz then early terminates the execution of a test case once SDFuzz probes that the remaining execution cannot reach the target states. It further uses a new target state feedback and refines prior imprecise distance metric into a two-dimensional feedback mechanism to proactively drive the exploration towards the target states.

We thoroughly evaluated SDFuzz on known vulnerabilities and compared it to related works. The results show that SDFuzz could improve vulnerability exposure capability with more vulnerability triggered and less time used, outperforming the state-of-the-art solutions. SDFuzz could significantly improve the fuzzing throughput. Our application of SDFuzz to automatically validate the static analysis results successfully discovered four new vulnerabilities in well-tested applications. Three of them have been acknowledged by developers.

Critical Code Guided Directed Greybox Fuzzing for Commits

Yi Xiang, Zhejiang University NGICS Platform; Xuhong Zhang, Zhejiang University and Jianghuai Advance Technology Center; Peiyu Liu, Zhejiang University NGICS Platform; Shouling Ji, Xiao Xiao, Hong Liang, and Jiacheng Xu, Zhejiang University; Wenhai Wang, Zhejiang University NGICS Platform

Available Media

Newly submitted commits are prone to introducing vulnerabilities into programs. As a promising countermeasure, directed greybox fuzzers can be employed to test commit changes by designating the commit change sites as targets. However, existing directed fuzzers primarily focus on reaching a single target and neglect the diverse exploration of the additional affected code. As a result, they may overlook bugs that crash at a distant site from the change site and lack directness in multi-target scenarios, which are both very common in the context of commit testing.

In this paper, we propose WAFLGO, a direct greybox fuzzer, to effectively discover vulnerabilities introduced by commits. WAFLGO employs a novel critical code guided input generation strategy to thoroughly explore the affected code. Specifically, we identify two types of critical code: pathprefix code and data-suffix code. The critical code first guides the input generation to gradually and incrementally reach the change sites. Then while maintaining the reachability of the critical code, the input generation strategy further encourages the diversity of the generated inputs in exploring the affected code. Additionally, WAFLGO introduces a lightweight multitarget distance metric for directness and thorough examination of all change sites. We implement WAFLGO and evaluate it with 30 real-world bugs introduced by commits. Compared to eight state-of-the-art tools, WAFLGO achieves an average speedup of 10.3×. Furthermore, WAFLGO discovers seven new vulnerabilities including four CVEs while testing the most recent 50 commits of real-world software, including libtiff, fig2dev, and libming, etc.

Toward Unbiased Multiple-Target Fuzzing with Path Diversity

Huanyao Rong, Indiana University Bloomington; Wei You, Renmin University of China; XiaoFeng Wang and Tianhao Mao, Indiana University Bloomington

Available Media

Directed fuzzing is an advanced software testing approach that systematically guides the fuzzing campaign toward user-defined target sites, enabling efficient discovery of vulnerabilities related to these sites. However, we have observed that some complex vulnerabilities remain undetected by directed fuzzers even when the flawed target sites are frequently tested by the generated test cases, because triggering these bugs often requires the execution of additional code in related program locations. Furthermore, when fuzzing multiple targets, the existing energy assignment in directed fuzzing lacks precision and does not ensure the fairness across targets, which leads to insufficient fuzzing effort spent on some deeper targets.

In this paper, we propose a novel directed fuzzing solution named AFLRUN, which features target path-diversity metric and unbiased energy assignment. Firstly, we develop a new coverage metric by maintaining extra virgin map for each covered target to track the coverage status of seeds that hit the target. This approach enables the storage of waypoints that hit a target through interesting path into the corpus, thus enriching the path diversity for each target. Additionally, we propose a corpus-level energy assignment strategy that ensures fairness for each target. AFLRUN starts with uniform target weight and propagates this weight to seeds to get a desired seed weight distribution. By assigning energy to each seed in the corpus according to such desired distribution, a precise and unbiased energy assignment can be achieved.

We built a prototype system and assessed its performance using a standard benchmark and several extensively fuzzed real-world applications. The evaluation results demonstrate that AFLRUN outperforms state-of-the-art fuzzers in terms of vulnerability detection, both in quantity and speed. Moreover, AFLRUN uncovers 29 previously unidentified vulnerabilities, including 8 CVEs, across four distinct programs.

SymBisect: Accurate Bisection for Fuzzer-Exposed Vulnerabilities

Zheng Zhang and Yu Hao, UC Riverside; Weiteng Chen, Microsoft Research; Xiaochen Zou, Xingyu Li, Haonan Li, Yizhuo Zhai, and Zhiyun Qian, UC Riverside; Billy Lau, Google

Available Media

The popularity of fuzzing has led to its tight integration into the software development process as a routine part of the build and test, i.e., continuous fuzzing. This has resulted in a substantial increase in the reporting of bugs in open-source software, including the Linux kernel. To keep up with the volume of bugs, it is crucial to automatically analyze the bugs to assist developers and maintainers. Bug bisection, i.e., locating the commit that introduced a vulnerability, is one such analysis that can reveal the range of affected software versions and help bug prioritization and patching. However, existing automated solutions fall short in a number of ways: most of them either (1) directly run the same PoC on older software versions without adapting to changes in bug-triggering conditions and are prone to broken dynamic environments or (2) require patches that may not be available when the bug is discovered. In this work, we take a different approach to looking for evidence of fuzzer-exposed vulnerabilities by looking for the underlying bug logic. In this way, we can perform bug bisection much more precisely and accurately. Specifically, we apply underconstrained symbolic execution with several principled guiding techniques to search for the presence of the bug logic efficiently. We show that our approach achieves significantly better accuracy than the state-of-the-art solution by 16% (from 74.7% to 90.7%).

Data Coverage for Guided Fuzzing

Mingzhe Wang, Jie Liang, Chijin Zhou, Zhiyong Wu, Jingzhou Fu, and Zhuo Su, Tsinghua University; Qing Liao, Harbin Institute of Technology; Bin Gu, Beijing Institute of Control Engineering; Bodong Wu, Huawei Technologies Co., Ltd; Yu Jiang, Tsinghua University

Distinguished Paper Award Winner

Available Media

Code coverage is crucial for fuzzing. It helps fuzzers identify areas of a program that have not been explored, which are often the most likely to contain bugs. However, code coverage only reflects a small part of a program's structure. Many crucial program constructs, such as constraints, automata, and Turing-complete domain-specific languages, are embedded in a program as constant data. Since this data cannot be effectively reflected by code coverage, it remains a major challenge for modern fuzzing practices.

To address this challenge, we propose data coverage for guided fuzzing. The idea is to detect novel constant data references and maximize their coverage. However, the widespread use of constant data can significantly impact fuzzing throughput if not handled carefully. To overcome this issue, we optimize for real-world fuzzing practices by classifying data access according to semantics and designing customized collection strategies. We also develop novel storage and utilization techniques for improved fuzzing efficiency. Finally, we enhance libFuzzer with data coverage and submit it to Google's FuzzBench for evaluation. Our approach outperforms many state-of-the-art fuzzers and achieves the best coverage score in the experiment. Furthermore, we have discovered 28 previously-unknown bugs on OSS-Fuzz projects that were well-fuzzed using code coverage.

Track 7

Crypto II: Searchable Encryption

Session Chair: Daniel S. Roche

Salon IJK

I/O-Efficient Dynamic Searchable Encryption meets Forward & Backward Privacy

Priyanka Mondal, University of California, Santa Cruz; Javad Ghareh Chamani, HKUST; Ioannis Demertzis, University of California, Santa Cruz; Dimitrios Papadopoulos, HKUST

Available Media

We focus on the problem of I/O-efficient Dynamic Searchable Encryption (DSE), i.e., schemes that perform well when executed with the dataset on-disk. Towards this direction, for HDDs, schemes have been proposed with good locality (i.e., low number of performed non-continuous memory reads) and read efficiency (the number of additional memory locations read per result item). Similarly, for SSDs, schemes with good page efficiency (reading as few pages as possible) have been proposed. However, the vast majority of these works are limited to the static case (i.e. no dataset modifications) and the only dynamic scheme fails to achieve forward and backward privacy, the de-facto leakage standard in the literature. In fact, prior related works (Bost [CCS'16] and Minaud and Reichle[CRYPTO'22]) claim that I/O-efficiency and forward-privacy are two irreconcilable notions. Contrary to that, in this work, we "reconcile" for the first time forward and backward privacy with I/O-efficiency for DSE both for HDDs and SSDs. We propose two families of DSE constructions which also improve the state-of-the-art (non I/O-efficient) both asymptotically and experimentally. Indeed, some of our schemes improve the in-memory performance of prior works. At a technical level, we revisit and enhance the lazy de-amortization DSE construction by Demertzis et al. [NDSS'20], transforming it into an I/O-preserving one. Importantly, we introduce an oblivious-merge protocol that merges two equal-sized databases without revealing any information, effectively replacing the costly oblivious data structures with more lightweight computations.

FEASE: Fast and Expressive Asymmetric Searchable Encryption

Long Meng, Liqun Chen, and Yangguang Tian, University of Surrey; Mark Manulis, Universität der Bundeswehr München; Suhui Liu, Southeast University

Available Media

Asymmetric Searchable Encryption (ASE) is a promising cryptographic mechanism that enables a semi-trusted cloud server to perform keyword searches over encrypted data for users. To be useful, an ASE scheme must support expressive search queries, which are expressed as conjunction, disjunction, or any Boolean formulas. In this paper, we propose a fast and expressive ASE scheme that is adaptively secure, called FEASE. It requires only 3 pairing operations for searching any conjunctive set of keywords independent of the set size and has linear complexity for encryption and trapdoor algorithms in the number of keywords. FEASE is based on a new fast Anonymous Key-Policy Attribute-Based Encryption (A-KP-ABE) scheme as our first proposal, which is of independent interest. To address optional protection against keyword guessing attacks, we extend FEASE into the first expressive Public-Key Authenticated Encryption with Keyword Search (PAEKS) scheme. We provide implementations and evaluate the performance of all three schemes, while also comparing them with the state of the art. We observe that FEASE outperforms all existing expressive ASE constructions and that our A-KP-ABE scheme offers anonymity with efficiency comparable to the currently fastest yet non-anonymous KP-ABE schemes FAME (ACM CCS 2017) and FABEO (ACM CCS 2022).

d-DSE: Distinct Dynamic Searchable Encryption Resisting Volume Leakage in Encrypted Databases

Dongli Liu and Wei Wang, Huazhong University of Science and Technology; Peng Xu, Huazhong University of Science and Technology, Hubei Key Laboratory of Distributed System Security, School of Cyber Science and Engineering, JinYinHu Laboratory, and State Key Laboratory of Cryptology; Laurence T. Yang, Huazhong University of Science and Technology and St. Francis Xavier University; Bo Luo, The University of Kansas; Kaitai Liang, Delft University of Technology

Available Media

Dynamic Searchable Encryption (DSE) has emerged as a solution to efficiently handle and protect large-scale data storage in encrypted databases (EDBs). Volume leakage poses a significant threat, as it enables adversaries to reconstruct search queries and potentially compromise the security and privacy of data. Padding strategies are common countermeasures for the leakage, but they significantly increase storage and communication costs. In this work, we develop a new perspective on handling volume leakage. We start with distinct search and further explore a new concept called distinct DSE (d-DSE).

We also define new security notions, in particular Distinct with Volume-Hiding security, as well as forward and backward privacy, for the new concept. Based on d-DSE, we construct the d-DSE designed EDB with related constructions for distinct keyword (d-KW-dDSE), keyword (KW-dDSE), and join queries (JOIN-dDSE) and update queries in encrypted databases. We instantiate a concrete scheme BF-SRE, employing Symmetric Revocable Encryption. We conduct extensive experiments on real-world datasets, such as Crime, Wikipedia, and Enron, for performance evaluation. The results demonstrate that our scheme is practical in data search and with comparable computational performance to the SOTA DSE scheme (MITRA*, AURA) and padding strategies (SEAL, ShieldDB). Furthermore, our proposal sharply reduces the communication cost as compared to padding strategies, with roughly 6.36 to 53.14x advantage for search queries.

MUSES: Efficient Multi-User Searchable Encrypted Database

Tung Le, Virginia Tech; Rouzbeh Behnia, University of South Florida; Jorge Guajardo, Robert Bosch Research and Technology Center; Thang Hoang, Virginia Tech

Available Media

Searchable encrypted systems enable privacy-preserving keyword search on encrypted data. Symmetric systems achieve high efficiency (e.g., sublinear search), but they mostly support single-user search. Although systems based on public-key or hybrid models support multi-user search, they incur inherent security weaknesses (e.g., keyword-guessing vulnerabilities) and scalability limitations due to costly public-key operations (e.g., pairing). More importantly, most encrypted search designs leak statistical information (e.g., search, result, and volume patterns) and thus are vulnerable to devastating leakage-abuse attacks. Some pattern-hiding schemes were proposed. However, they incur significant user bandwidth/computation costs, and thus are not desirable for large-scale outsourced databases with resource-constrained users.

In this paper, we propose MUSES, a new multi-user encrypted search platform that addresses the functionality, security, and performance limitations in the existing encrypted search designs. Specifically, MUSES permits multi-user functionalities (reader/writer separation, permission revocation) and hides all statistical information (including search, result, and volume patterns) while featuring minimal user overhead. In MUSES, we demonstrate a unique incorporation of various emerging distributed cryptographic protocols including Distributed Point Function, Distributed PRF, and Oblivious Linear Group Action. We also introduce novel distributed protocols for oblivious counting and shuffling on arithmetic shares for the general multi-party setting with a dishonest majority, which can be found useful in other applications. Our experimental results showed that the keyword search by MUSES is two orders of magnitude faster with up to 12× lower user bandwidth cost than the state-of-the-art.

Query Recovery from Easy to Hard: Jigsaw Attack against SSE

Hao Nie and Wei Wang, Huazhong University of Science and Technology; Peng Xu, Huazhong University of Science and Technology, Hubei Key Laboratory of Distributed System Security, School of Cyber Science and Engineering, JinYinHu Laboratory, and State Key Laboratory of Cryptology; Xianglong Zhang, Huazhong University of Science and Technology; Laurence T. Yang, Huazhong University of Science and Technology and St. Francis Xavier University; Kaitai Liang, Delft University of Technology

Available Media

Searchable symmetric encryption schemes often unintentionally disclose certain sensitive information, such as access, volume, and search patterns. Attackers can exploit such leakages and other available knowledge related to the user's database to recover queries. We find that the effectiveness of query recovery attacks depends on the volume/frequency distribution of keywords. Queries containing keywords with high volumes/frequencies are more susceptible to recovery, even when countermeasures are implemented. Attackers can also effectively leverage these "special" queries to recover all others.

By exploiting the above finding, we propose a Jigsaw attack that begins by accurately identifying and recovering those distinctive queries. Leveraging the volume, frequency, and cooccurrence information, our attack achieves 90% accuracy in three tested datasets, which is comparable to previous attacks (Oya et al., USENIX' 22 and Damie et al., USENIX' 21). With the same runtime, our attack demonstrates an advantage over the attack proposed by Oya et al (approximately 15% more accuracy when the keyword universe size is 15k). Furthermore, our proposed attack outperforms existing attacks against widely studied countermeasures, achieving roughly 60% and 85% accuracy against the padding and the obfuscation, respectively. In this context, with a large keyword universe (≥3k), it surpasses current state-of-the-art attacks by more than 20%.

10:15 am–10:45 am

Coffee and Tea Break

Grand Ballroom Foyer

10:45 am–12:00 pm

Track 1

Social Issues II: Surveillance and Censorship

Session Chair: Nguyen Phong Hoang

Salon AB

GFWeb: Measuring the Great Firewall's Web Censorship at Scale

Nguyen Phong Hoang, University of British Columbia and University of Chicago; Jakub Dalek and Masashi Crete-Nishihata, Citizen Lab - University of Toronto; Nicolas Christin, Carnegie Mellon University; Vinod Yegneswaran, SRI International; Michalis Polychronakis, Stony Brook University; Nick Feamster, University of Chicago

Available Media

Censorship systems such as the Great Firewall (GFW) have been continuously refined to enhance their filtering capabilities. However, most prior studies, and in particular the GFW, have been limited in scope and conducted over short time periods, leading to gaps in our understanding of the GFW's evolving Web censorship mechanisms over time. We introduce GFWeb, a novel system designed to discover domain blocklists used by the GFW for censoring Web access. GFWeb exploits GFW's bidirectional and loss-tolerant blocking behavior to enable testing hundreds of millions of domains on a monthly basis, thereby facilitating large-scale longitudinal measurement of HTTP and HTTPS blocking mechanisms.

Over the course of 20 months, GFWeb has tested a total of 1.02 billion domains, and detected 943K and 55K pay-level domains censored by the GFW's HTTP and HTTPS filters, respectively. To the best of our knowledge, our study represents the most extensive set of domains censored by the GFW ever discovered to date, many of which have never been detected by prior systems. Analyzing the longitudinal dataset collected by GFWeb, we observe that the GFW has been upgraded to mitigate several issues previously identified by the research community, including overblocking and failure in reassembling fragmented packets. More importantly, we discover that the GFW's bidirectional blocking is not symmetric as previously thought, i.e., it can only be triggered by certain domains when probed from inside the country. We discuss the implications of our work on existing censorship measurement and circumvention efforts. We hope insights gained from our study can help inform future research, especially in monitoring censorship and developing new evasion tools.

Snowflake, a censorship circumvention system using temporary WebRTC proxies

Cecylia Bocovich, Tor Project; Arlo Breault, Wikimedia Foundation; David Fifield and Serene, unaffiliated; Xiaokang Wang, Tor Project

Available Media

Snowflake is a system for circumventing Internet censorship. Its blocking resistance comes from the use of numerous, ultra-light, temporary proxies ("snowflakes"), which accept traffic from censored clients using peer-to-peer WebRTC protocols and forward it to a centralized bridge. The temporary proxies are simple enough to be implemented in JavaScript, in a web page or browser extension, making them much cheaper to run than a traditional proxy or VPN server. The large and changing pool of proxy addresses resists enumeration and blocking by a censor. The system is designed with the assumption that proxies may appear or disappear at any time. Clients discover proxies dynamically using a secure rendezvous protocol. When an in-use proxy goes offline, its client switches to another on the fly, invisibly to upper network layers.

Snowflake has been deployed with success in Tor Browser and Orbot for several years. It has been a significant circumvention tool during high-profile network disruptions, including in Russia in 2021 and Iran in 2022. In this paper, we explain the composition of Snowflake's many parts, give a history of deployment and blocking attempts, and reflect on implications for circumvention generally.

SpotProxy: Rediscovering the Cloud for Censorship Circumvention

Patrick Tser Jern Kon, University of Michigan; Sina Kamali, University of Waterloo; Jinyu Pei, Rice University; Diogo Barradas, University of Waterloo; Ang Chen, University of Michigan; Micah Sherr, Georgetown University; Moti Yung, Google and Columbia University

Available Media

Censorship circumvention is often fueled by supporters out of goodwill. However, hosting circumvention proxies can be costly, especially when they are placed in the cloud. We argue for re-examining cloud features and leveraging them to achieve novel circumvention benefits, even though these features are not explicitly engineered for censorship circumvention. SpotProxy is inspired by Spot VMs—cloud instances backed with excess resources, sold at a fraction of the cost of regular instances, that can be taken away at a moment's notice if higher-paying requests arrive. We observe that for circumvention proxies, Spot VMs not only translate to cost savings, but also create a high churn rate since proxies are constantly re-spawned at different IP addresses—making them more difficult for a censor to enumerate and block. SpotProxy pushes this observation to the extreme and designs a circumvention infrastructure that constantly searches for cheaper VMs and refreshes the fleet for anti-blocking, for spot and regular VMs alike. We adapt Wireguard and Snowflake for use with SpotProxy, and demonstrate that our active migration mechanism allows clients to seamlessly move between proxies without degrading their performance or disrupting existing connections. We show that SpotProxy leads to significant cost savings, and that SpotProxy's rejuvenation mechanism enables proxies to be replenished frequently with new addresses.

Bridging Barriers: A Survey of Challenges and Priorities in the Censorship Circumvention Landscape

Diwen Xue, Anna Ablove, and Reethika Ramesh, University of Michigan; Grace Kwak Danciu, Independent; Roya Ensafi, University of Michigan

Available Media

The ecosystem of censorship circumvention tools (CTs) remains one of the most opaque and least understood, overshadowed by the precarious legal status around their usage and operation, and the risks facing those directly involved. Used by hundreds of millions of users across the most restricted networks, these tools circulate not through advertisements but word-of-mouth, distributed not through appstores but underground networks, and adopted not out of trust but from the sheer necessity for information access.

This paper aims to elucidate the dynamics and challenges of the CT ecosystem, and the needs and priorities of its stakeholders. We perform the first multi-perspective study, surveying 12 leading CT providers that service upwards of 100 million users, combined with experiences from CT users in Russia and China. Beyond the commonly cited technical challenges and disruptions from censors, our study also highlights funding constraints, usability issues, misconceptions, and misbehaving players, all of which similarly plague the CT ecosystem. Having the unique opportunity to survey these at-risk CT stakeholders, we outline key future priorities for those involved. We hope our work encourages further research to advance our understanding of this complex and uniquely challenged ecosystem.

Fingerprinting Obfuscated Proxy Traffic with Encapsulated TLS Handshakes

Diwen Xue, University of Michigan; Michalis Kallitsis, Merit Network, Inc.; Amir Houmansadr, UMass Amherst; Roya Ensafi, University of Michigan

Available Media

The global escalation of Internet censorship by nation-state actors has led to an ongoing arms race between censors and obfuscated circumvention proxies. Research over the past decade has extensively examined various fingerprinting attacks against individual proxy protocols and their respective countermeasures. In this paper, however, we demonstrate the feasibility of a protocol-agnostic approach to proxy detection, enabled by the shared characteristic of nested protocol stacks inherent to all forms of proxying and tunneling activities. We showcase the practicality of such approach by identifying one specific fingerprint--encapsulated TLS handshakes--that results from nested protocol stacks, and building similarity-based classifiers to isolate this unique fingerprint within encrypted traffic streams.

Assuming the role of a censor, we build a detection framework and deploy it within a mid-size ISP serving upwards of one million users. Our evaluation demonstrates that the traffic of obfuscated proxies, even with random padding and multiple layers of encapsulations, can be reliably detected with minimal collateral damage by fingerprinting encapsulated TLS handshakes. While stream multiplexing shows promise as a viable countermeasure, we caution that existing obfuscations based on multiplexing and random padding alone are inherently limited, due to their inability to reduce the size of traffic bursts or the number of round trips within a connection. Proxy developers should be aware of these limitations, anticipate the potential exploitation of encapsulated TLS handshakes by the censors, and equip their tools with proactive countermeasures.

Track 2

AR and VR

Session Chair: Habiba Farrukh

Salon CD

When the User Is Inside the User Interface: An Empirical Study of UI Security Properties in Augmented Reality

Kaiming Cheng, Arkaprabha Bhattacharya, Michelle Lin, Jaewook Lee, Aroosh Kumar, Jeffery F. Tian, Tadayoshi Kohno, and Franziska Roesner, University of Washington

Available Media

Augmented reality (AR) experiences place users inside the user interface (UI), where they can see and interact with three-dimensional virtual content. This paper explores UI security for AR platforms, for which we identify three UI security-related properties: Same Space (how does the platform handle virtual content placed at the same coordinates?), Invisibility (how does the platform handle invisible virtual content?), and Synthetic Input (how does the platform handle simulated user input?). We demonstrate the security implications of different instantiations of these properties through five proof-of-concept attacks between distrusting AR application components (i.e., a main app and an included library) — including a clickjacking attack and an object erasure attack. We then empirically investigate these UI security properties on five current AR platforms: ARCore (Google), ARKit (Apple), Hololens (Microsoft), Oculus (Meta), and WebXR (browser). We find that all platforms enable at least three of our proof-of-concept attacks to succeed. We discuss potential future defenses, including applying lessons from 2D UI security and identifying new directions for AR UI security.

Can Virtual Reality Protect Users from Keystroke Inference Attacks?

Zhuolin Yang, Zain Sarwar, Iris Hwang, Ronik Bhaskar, Ben Y. Zhao, and Haitao Zheng, University of Chicago

Available Media

Virtual Reality (VR) has gained popularity by providing immersive and interactive experiences without geographical limitations. It also provides a sense of personal privacy through physical separation. In this paper, we show that despite assumptions of enhanced privacy, VR is unable to shield its users from side-channel attacks that steal private information. Ironically, this vulnerability arises from VR's greatest strength, its immersive and interactive nature. We demonstrate this by designing and implementing a new set of keystroke inference attacks in shared virtual environments, where an attacker (VR user) can recover the content typed by another VR user by observing their avatar. While the avatar displays noisy telemetry of the user's hand motion, an intelligent attacker can use that data to recognize typed keys and reconstruct typed content, without knowing the keyboard layout or gathering labeled data. We evaluate the proposed attacks using IRB-approved user studies across multiple VR scenarios. For 13 out of 15 tested users, our attacks accurately recognize 86%-98% of typed keys, and the recovered content retains up to 98% of the meaning of the original typed content. We also discuss potential defenses.

Remote Keylogging Attacks in Multi-user VR Applications

Zihao Su, University of California, Santa Barbara; Kunlin Cai, University of California, Los Angeles; Reuben Beeler, Lukas Dresel, Allan Garcia, and Ilya Grishchenko, University of California, Santa Barbara; Yuan Tian, University of California, Los Angeles; Christopher Kruegel and Giovanni Vigna, University of California, Santa Barbara

Available Media

As Virtual Reality (VR) applications grow in popularity, they have bridged distances and brought users closer together. However, with this growth, there have been increasing concerns about security and privacy, especially related to the motion data used to create immersive experiences. In this study, we highlight a significant security threat in multi-user VR applications, which are applications that allow multiple users to interact with each other in the same virtual space. Specifically, we propose a remote attack that utilizes the avatar rendering information collected from an adversary's game clients to extract user-typed secrets like credit card information, passwords, or private conversations. We do this by (1) extracting motion data from network packets, and (2) mapping motion data to keystroke entries. We conducted a user study to verify the attack's effectiveness, in which our attack successfully inferred 97.62% of the keystrokes. Besides, we performed an additional experiment to underline that our attack is practical, confirming its effectiveness even when (1) there are multiple users in a room, and (2) the attacker cannot see the victims. Moreover, we replicated our proposed attack on four applications to demonstrate the generalizability of the attack. These results underscore the severity of the vulnerability and its potential impact on millions of VR social platform users.

That Doesn't Go There: Attacks on Shared State in Multi-User Augmented Reality Applications

Carter Slocum, Yicheng Zhang, Erfan Shayegani, Pedram Zaree, and Nael Abu-Ghazaleh, University of California, Riverside; Jiasi Chen, University of Michigan

Available Media

Augmented Reality (AR) can enable shared virtual experiences between multiple users. In order to do so, it is crucial for multi-user AR applications to establish a consensus on the "shared state" of the virtual world and its augmentations through which users interact. Current methods to create and access shared state collect sensor data from devices (e.g., camera images), process them, and integrate them into the shared state. However, this process introduces new vulnerabilities and opportunities for attacks. Maliciously writing false data to "poison" the shared state is a major concern for the security of the downstream victims that depend on it. Another type of vulnerability arises when reading the shared state: by providing false inputs, an attacker can view hologram augmentations at locations they are not allowed to access. In this work, we demonstrate a series of novel attacks on multiple AR frameworks with shared states, focusing on three publicly accessible frameworks. We show that these frameworks, while using different underlying implementations, scopes, and mechanisms to read from and write to the shared state, have shared vulnerability to a unified threat model. Our evaluations of these state-of-the-art AR frameworks demonstrate reliable attacks both on updating and accessing the shared state across different systems. To defend against such threats, we discuss a number of potential mitigation strategies that can help enhance the security of multi-user AR applications and implement an initial prototype.

Penetration Vision through Virtual Reality Headsets: Identifying 360-degree Videos from Head Movements

Anh Nguyen, Xiaokuan Zhang, and Zhisheng Yan, George Mason University

Available Media

In this paper, we present the first contactless side-channel attack for identifying 360° videos being viewed in a Virtual Reality (VR) Head Mounted Display (HMD). Although the video content is displayed inside the HMD without any external exposure, we observe that user head movements are driven by the video content, which creates a unique side channel that does not exist in traditional 2D videos. By recording the user whose vision is blocked by the HMD via a malicious camera, an attacker can analyze the correlation between the user's head movements and the victim video to infer the video title.

To exploit this new vulnerability, we present INTRUDE, a system for identifying 360° videos from recordings of user head movements. INTRUDE is empowered by an HMD-based head movement estimation scheme to extract a head movement trace from the recording and a video saliency-based trace-fingerprint matching framework to infer the video title. Evaluation results show that INTRUDE achieves over 96% of accuracy for video identification and is robust under different recording environments. Moreover, INTRUDE maintains its effectiveness in the open-world identification scenario.

Track 3

User Studies III: Privacy I

Session Chair: Mainack Mondal

Salon E

"I'm not convinced that they don't collect more than is necessary": User-Controlled Data Minimization Design in Search Engines

Tanusree Sharma, University of Illinois at Urbana-Champaign; Lin Kyi, Max Planck Institute for Security and Privacy; Yang Wang, University of Illinois at Urbana-Champaign; Asia J. Biega, Max Planck Institute for Security and Privacy

Available Media

Data minimization is a legal and privacy-by-design principle mandating that online services collect only data that is necessary for pre-specified purposes. While the principle has thus far mostly been interpreted from a system-centered perspective, there is a lack of understanding about how data minimization could be designed from a user-centered perspective, and in particular, what factors might influence user decision-making with regard to the necessity of data for different processing purposes. To address this gap, in this paper, we gain a deeper understanding of users' design expectations and decision-making processes related to data minimization, focusing on a case study of search engines. We also elicit expert evaluations of the feasibility of user-generated design ideas. We conducted interviews with 25 end users and 10 experts from the EU and UK to provide concrete design recommendations for data minimization that incorporate user needs, concerns, and preferences. Our study (i) surfaces how users reason about the necessity of data in the context of search result quality, and (ii) examines the impact of several factors on user decision-making about data processing, including specific types of search data, or the volume and recency of data. Most participants emphasized the particular importance of data minimization in the context of sensitive searches, such as political, financial, or health-related search queries. In a think-aloud conceptual design session, participants recommended search profile customization as a solution for retaining data they considered necessary, as well as alert systems that would inform users to minimize data in instances of excessive collection. We propose actionable design features that could provide users with greater agency over their data through user-controlled data minimization, combined with relevant implementation insights from experts.

The Effect of Design Patterns on (Present and Future) Cookie Consent Decisions

Nataliia Bielova, Inria research centre at Université Côte d'Azur; Laura Litvine and Anysia Nguyen, Behavioural Insights Team (BIT); Mariam Chammat, Interministerial Directorate for Public Transformation (DITP); Vincent Toubiana, Commission Nationale de l'Informatique et des Libertés (CNIL); Estelle Hary, RMIT University

Available Media

Today most websites in the EU present users with a consent banner asking about the use of cookies or other tracking technologies. Data Protection Authorities (DPAs) need to ensure that users can express their true preferences when faced with these banners, while simultaneously satisfying the EU GDPR requirements. To address the needs of the French DPA, we conducted an online experiment among 3,947 participants in France exploring the impact of six different consent banner designs on the outcome of users' consent decision. We also assessed participants' knowledge and privacy preferences, as well as satisfaction with the banners. In contrast with previous results, we found that a "bright pattern" that highlights the decline option has a substantial effect on users' decisions. We also find that two new designs based on behavioral levers have the strongest effect on the outcome of the consent decision, and participants' satisfaction with the banners. Finally, our study provides novel evidence that the effect of design persists in a short time frame: designs can significantly affect users' future choices, even when faced with neutral banners.

Unpacking Privacy Labels: A Measurement and Developer Perspective on Google's Data Safety Section

Rishabh Khandelwal, Asmit Nayak, Paul Chung, and Kassem Fawaz, University of Wisconsin-Madison

Available Media

Google has mandated developers to use Data Safety Sections (DSS) to increase transparency in data collection and sharing practices. In this paper, we present a comprehensive analysis of Google's Data Safety Section (DSS) using both quantitative and qualitative methods. We conduct the first large-scale measurement study of DSS using apps from Android Play store (n=1.1M). We find that there are internal inconsistencies within the reported practices. We also find trends of both over and under-reporting practices in the DSSs. Finally, we conduct a longitudinal study of DSS to explore how the reported practices evolve over time, and find that the developers are still adjusting their practices. To contextualize these findings, we conduct a developer study, uncovering the process that app developers undergo when working with DSS. We highlight the challenges faced and strategies employed by developers for DSS submission, and the factors contributing to changes in the DSS. Our research contributes valuable insights into the complexities of implementing and maintaining privacy labels, underlining the need for better resources, tools, and guidelines to aid developers. This understanding is crucial as the accuracy and reliability of privacy labels directly impact their effectiveness.

Dissecting Privacy Perspectives of Websites Around the World: "Aceptar Todo, Alle Akzeptieren, Accept All..."

Aysun Ogut, Berke Turanlioglu, Doruk Can Metiner, Albert Levi, Cemal Yilmaz, and Orcun Cetin, Sabanci University, Tuzla, Istanbul, Turkiye; Selcuk Uluagac, Cyber-Physical Systems Security Lab, Florida International University, Miami, Florida, USA

Available Media

Privacy has become a significant concern as the processing, storage, and sharing of collected data expands. In order to take precautions against this increasing issue, countries and different government entities have enacted laws for the protection of privacy, and articles regarding acquiring consent from the user to collect data (i.e., via cookies) have been regulated such as the right of one to be informed and to manage their preferences. Even though there are many regulations, still many websites do not transparently provide their users with their privacy practices and cookie consent notices, and restrict one's rights or make it difficult to set/choose their privacy preferences. The main objective of this study is to analyze whether websites from around the world inform their users about the collection of their data and to identify how easy or difficult for users to set their privacy preferences in practice. While observing the differences between countries, we also aim to examine whether there is an effect of geographical location on privacy approaches and whether the applications and interpretations of countries that follow and comply with the same laws are similar. For this purpose, we have developed an automated tool to scan the privacy notices on the 500 most popular websites in different countries around the world. Our extensive analysis indicates that in some countries users are rarely informed and even in countries with high cookie consent notifications, offering the option to refuse is still very low despite the fact that it is part of their regulations. The highest rate of reject buttons on cookie banners in the countries studied is 35%. Overall, although the law gives the user the right to refuse consent and be informed, we have concluded that this does not apply in practice in most countries. Moreover, in many cases, the implementations are convoluted and not user-friendly at all.

Data Subjects' Reactions to Exercising Their Right of Access

Arthur Borem, Elleen Pan, Olufunmilola Obielodan, Aurelie Roubinowitz, and Luca Dovichi, University of Chicago; Michelle L. Mazurek, University of Maryland; Blase Ur, University of Chicago

Available Media

Recent privacy laws have strengthened data subjects' right to access personal data collected by companies. Prior work has found that data exports companies provide consumers in response to Data Subject Access Requests (DSARs) can be overwhelming and hard to understand. To identify directions for improving the user experience of data exports, we conducted an online study in which 33 participants explored their own data from Amazon, Facebook, Google, Spotify, or Uber. Participants articulated questions they hoped to answer using the exports. They also annotated parts of the export they found confusing, creepy, interesting, or surprising. While participants hoped to learn either about their own usage of the platform or how the company collects and uses their personal data, these questions were often left unanswered. Participants' annotations documented their excitement at finding data records that triggered nostalgia, but also shock and anger about the privacy implications of other data they saw. Having examining their data, many participants hoped to request the company erase some, but not all, of the data. We discuss opportunities for future transparency-enhancing tools and enhanced laws.

Track 4

ML V: Backdoor Defense

Session Chair: Songze Li

Salon F

Neural Network Semantic Backdoor Detection and Mitigation: A Causality-Based Approach

Bing Sun, Jun Sun, and Wayne Koh, Singapore Management University; Jie Shi, Huawei Singapore

Available Media

Different from ordinary backdoors in neural networks which are introduced with artificial triggers (e.g., certain specific patch) and/or by tampering the samples, semantic backdoors are introduced by simply manipulating the semantic, e.g., by labeling green cars as frogs in the training set. By focusing on samples with rare semantic features (such as green cars), the accuracy of the model is often minimally affected. Since the attacker is not required to modify the input sample during training nor inference time, semantic backdoors are challenging to detect and remove. Existing backdoor detection and mitigation techniques are shown to be ineffective with respect to semantic backdoors. In this work, we propose a method to systematically detect and remove semantic backdoors. Specifically we propose SODA (Semantic BackdOor Detection and MitigAtion) with the key idea of conducting lightweight causality analysis to identify potential semantic backdoor based on how hidden neurons contribute to the predictions and to remove the backdoor by adjusting the responsible neurons' contribution towards the correct predictions through optimization. SODA is evaluated with 21 neural networks trained on 6 benchmark datasets and 2 kinds of semantic backdoor attacks for each dataset. The results show that it effectively detects and removes semantic backdoors and preserves the accuracy of the neural networks.

On the Difficulty of Defending Contrastive Learning against Backdoor Attacks

Changjiang Li, Stony Brook University; Ren Pang, Bochuan Cao, Zhaohan Xi, and Jinghui Chen, Pennsylvania State University; Shouling Ji, Zhejiang University; Ting Wang, Stony Brook University

Available Media

Recent studies have shown that contrastive learning, like supervised learning, is highly vulnerable to backdoor attacks wherein malicious functions are injected into target models, only to be activated by specific triggers. However, thus far it remains under-explored how contrastive backdoor attacks fundamentally differ from their supervised counterparts, which impedes the development of effective defenses against the emerging threat.

This work represents a solid step toward answering this critical question. Specifically, we define TRL, a unified framework that encompasses both supervised and contrastive backdoor attacks. Through the lens of TRL, we uncover that the two types of attacks operate through distinctive mechanisms: in supervised attacks, the learning of benign and backdoor tasks tends to occur independently, while in contrastive attacks, the two tasks are deeply intertwined both in their representations and throughout their learning processes. This distinction leads to the disparate learning dynamics and feature distributions of supervised and contrastive attacks. More importantly, we reveal that the specificities of contrastive backdoor attacks entail important implications from a defense perspective: existing defenses for supervised attacks are often inadequate and not easily retrofitted to contrastive attacks. We also explore several promising alternative defenses and discuss their potential challenges. Our findings highlight the need for defenses tailored to the specificities of contrastive backdoor attacks, pointing to promising directions for future research.

Mudjacking: Patching Backdoor Vulnerabilities in Foundation Models

Hongbin Liu, Michael K. Reiter, and Neil Zhenqiang Gong, Duke University

Available Media

Foundation model has become the backbone of the AI ecosystem. In particular, a foundation model can be used as a general-purpose feature extractor to build various downstream classifiers. However, foundation models are vulnerable to backdoor attacks and a backdoored foundation model is a single-point-of-failure of the AI ecosystem, e.g., multiple downstream classifiers inherit the backdoor vulnerabilities simultaneously. In this work, we propose Mudjacking, the first method to patch foundation models to remove backdoors. Specifically, given a misclassified trigger-embedded input detected after a backdoored foundation model is deployed, Mudjacking adjusts the parameters of the foundation model to remove the backdoor. We formulate patching a foundation model as an optimization problem and propose a gradient descent based method to solve it. We evaluate Mudjacking on both vision and language foundation models, eleven benchmark datasets, five existing backdoor attacks, and thirteen adaptive backdoor attacks. Our results show that Mudjacking can remove backdoor from a foundation model while maintaining its utility.

Xplain: Analyzing Invisible Correlations in Model Explanation

Kavita Kumari and Alessandro Pegoraro, Technical University of Darmstadt; Hossein Fereidooni, Kobil; Ahmad-Reza Sadeghi, Technical University of Darmstadt

Available Media

Explanation methods analyze the features in backdoored input data that contribute to model misclassification. However, current methods like path techniques struggle to detect backdoor patterns in adversarial situations. They fail to grasp the hidden associations of backdoor features with other input features, leading to misclassification. Additionally, they suffer from irrelevant data attribution, imprecise feature connections, baseline dependence, and vulnerability to the "saturation effect".

To address these limitations, we propose Xplain. Our method aims to uncover hidden backdoor trigger patterns and the subtle relationships between backdoor features and other input objects, which are the main causes of model misclassification. Our algorithm improves existing path techniques by integrating an additional baseline into the Integrated Gradients (IG) formulation. This ensures that features selected in the baseline persist along the integration path, guaranteeing baseline independence. Additionally, we introduce quantitative noise to interpolate samples along the integration path, which reduces feature dependency and captures non-linear interactions. This approach effectively identifies the relevant features that significantly influence model predictions.

Furthermore, Xplain proposes sensitivity analysis to enhance AI system resilience against backdoor attacks. This uncovers clear connections between the backdoor and other input data features, thus shedding light on relevant interactions. We thoroughly test the effectiveness of Xplain on the Imagenet and the multimodal domain of the Visual Question Answering dataset, showing its superiority over current path methods such as Integrated Gradient (IG), left-IG, Guided IG, and Adversarial Gradient Integration (AGI) techniques.

Verify your Labels! Trustworthy Predictions and Datasets via Confidence Scores

Torsten Krauß, Jasper Stang, and Alexandra Dmitrienko, University of Würzburg

Available Media

Machine learning is a rapidly evolving technology with manifold benefits. At its core lies the mapping between samples and corresponding target labels (SL-Mappings). Such mappings can originate from labeled dataset samples or from prediction generated during model inference. The correctness of SL-Mappings is crucial, both during training and for model predictions, especially when considering poisoning attacks.

Existing standalone works from the dataset cleaning and prediction confidence scoring domains lack a dual-use tool offering an SL-Mappings score, which is impractical. Moreover, these works have drawbacks, e.g., dependence on specific model architectures and reliance on large datasets, which may not be accessible, or lack a meaningful confidence score.

In this paper, we introduce LabelTrust, a versatile tool designed to generate confidence scores for SL-Mappings. We propose pipelines facilitating dataset cleaning and confidence scoring, mitigating the limitations of existing standalone approaches from each domain. Thereby, LabelTrust leverages a Siamese network trained via few-shot learning, requiring minimal clean samples and is agnostic to datasets and model architectures. We demonstrate LabelTrust's efficacy in detecting poisoning attacks within samples and predictions alike, with a modest one-time training overhead of 34.56 seconds and an evaluation time of less than 1 second per SL-Mapping.

Track 5

ML VI: Digital Adversarial Attacks

Session Chair: Yingying Chen

Salon G

More Simplicity for Trainers, More Opportunity for Attackers: Black-Box Attacks on Speaker Recognition Systems by Inferring Feature Extractor

Yunjie Ge, Pinji Chen, Qian Wang, Lingchen Zhao, and Ningping Mou, Wuhan University; Peipei Jiang, Wuhan University; City University of Hong Kong; Cong Wang, City University of Hong Kong; Qi Li, Tsinghua University; Chao Shen, Xi'an Jiaotong University

Available Media

Recent studies have revealed that deep learning-based speaker recognition systems (SRSs) are vulnerable to adversarial examples (AEs). However, the practicality of existing black-box AE attacks is restricted by the requirement for extensive querying of the target system or the limited attack success rates (ASR). In this paper, we introduce VoxCloak, a new targeted AE attack with superior performance in both these aspects. Distinct from existing methods that optimize AEs by querying the target model, VoxCloak initially employs a small number of queries (e.g., a few hundred) to infer the feature extractor used by the target system. It then utilizes this feature extractor to generate any number of AEs locally without the need for further queries. We evaluate VoxCloak on four commercial speaker recognition (SR) APIs and seven voice assistants. On the SR APIs, VoxCloak surpasses the existing transfer-based attacks, improving ASR by 76.25% and signal-to-noise ratio (SNR) by 13.46 dB, as well as the decision-based attacks, requiring 33 times fewer queries and improving SNR by 7.87 dB while achieving comparable ASRs. On the voice assistants, VoxCloak outperforms the existing methods with a 49.40% improvement in ASR and a 15.79 dB improvement in SNR.

Transferability of White-box Perturbations: Query-Efficient Adversarial Attacks against Commercial DNN Services

Meng Shen and Changyue Li, School of Cyberspace Science and Technology, Beijing Institute of Technology, China; Qi Li, Institute for Network Sciences and Cyberspace, Tsinghua University, China; Hao Lu, School of Computer Science and Technology, Beijing Institute of Technology, China; Liehuang Zhu, School of Cyberspace Science and Technology, Beijing Institute of Technology, China; Ke Xu, Department of Computer Science, Tsinghua University, China

Available Media

Deep Neural Networks (DNNs) have been proven to be vulnerable to adversarial attacks. Existing decision-based adversarial attacks require large numbers of queries to find an effective adversarial example, resulting in a heavy query cost and also performance degradation under defenses. In this paper, we propose the Dispersed Sampling Attack (DSA), which is a query-efficient decision-based adversarial attack by exploiting the transferability of white-box perturbations. DSA can generate diverse examples with different locations in the embedding space, which provides more information about the adversarial region of substitute models and allows us to search for transferable perturbations. Specifically, DSA samples in a hypersphere centered on an original image, and progressively constrains the perturbation. Extensive experiments are conducted on public datasets to evaluate the performance of DSA in closed-set and open-set scenarios. DSA outperforms the state-of-the-art attacks in terms of both attack success rate (ASR) and average number of queries (AvgQ). Specifically, DSA achieves an ASR of about 90% with an AvgQ of 200 on 4 well-known commercial DNN services.

Adversarial Illusions in Multi-Modal Embeddings

Tingwei Zhang and Rishi Jha, Cornell University; Eugene Bagdasaryan, University of Massachusetts Amherst; Vitaly Shmatikov, Cornell Tech

Distinguished Paper Award Winner

Available Media

Multi-modal embeddings encode texts, images, thermal images, sounds, and videos into a single embedding space, aligning representations across different modalities (e.g., associate an image of a dog with a barking sound). In this paper, we show that multi-modal embeddings can be vulnerable to an attack we call "adversarial illusions." Given an image or a sound, an adversary can perturb it to make its embedding close to an arbitrary, adversary-chosen input in another modality.

These attacks are cross-modal and targeted: the adversary can align any image or sound with any target of his choice. Adversarial illusions exploit proximity in the embedding space and are thus agnostic to downstream tasks and modalities, enabling a wholesale compromise of current and future tasks, as well as modalities not available to the adversary. Using ImageBind and AudioCLIP embeddings, we demonstrate how adversarially aligned inputs, generated without knowledge of specific downstream tasks, mislead image generation, text generation, zero-shot classification, and audio retrieval.

We investigate transferability of illusions across different embeddings and develop a black-box version of our method that we use to demonstrate the first adversarial alignment attack on Amazon's commercial, proprietary Titan embedding. Finally, we analyze countermeasures and evasion attacks.

It Doesn't Look Like Anything to Me: Using Diffusion Model to Subvert Visual Phishing Detectors

Qingying Hao and Nirav Diwan, University of Illinois at Urbana-Champaign; Ying Yuan, University of Padua; Giovanni Apruzzese, University of Liechtenstein; Mauro Conti, University of Padua; Gang Wang, University of Illinois at Urbana-Champaign

Available Media

Visual phishing detectors rely on website logos as the invariant identity indicator to detect phishing websites that mimic a target brand's website. Despite their promising performance, the robustness of these detectors is not yet well understood. In this paper, we challenge the invariant assumption of these detectors and propose new attack tactics, LogoMorph, with the ultimate purpose of enhancing these systems. LogoMorph is rooted in a key insight: users can neglect large visual perturbations on the logo as long as the perturbation preserves the original logo's semantics. We devise a range of attack methods to create semantic-preserving adversarial logos, yielding phishing webpages that bypass state-of-the-art detectors. For text-based logos, we find that using alternative fonts can help to achieve the attack goal. For image-based logos, we find that an adversarial diffusion model can effectively capture the style of the logo while generating new variants with large visual differences. Practically, we evaluate LogoMorph with white-box and black-box experiments and test the resulting adversarial webpages against various visual phishing detectors end-to-end. User studies (n = 150) confirm the effectiveness of our adversarial phishing webpages on end users (with a detection rate of 0.59, barely better than a coin toss). We also propose and evaluate countermeasures, and share our code.

Invisibility Cloak: Proactive Defense Against Visual Game Cheating

Chenxin Sun, Kai Ye, Liangcai Su, Jiayi Zhang, and Chenxiong Qian, The University of Hong Kong

Available Media

The gaming industry has experienced remarkable innovation and rapid growth in recent years. However, this progress has been accompanied by a concerning increase in First-person Shooter game cheating, with aimbots being the most prevalent and harmful tool. Visual aimbots, in particular, utilize game visuals and integrated visual models to extract game information, providing cheaters with automatic shooting abilities. Unfortunately, existing anti-cheating methods have proven ineffective against visual aimbots. To combat visual aimbots, we introduce the first proactive defense framework against visual game cheating, called Invisibility Cloak. Our approach adds imperceptible perturbations to game visuals, making them unrecognizable to AI models. We conducted extensive experiments on popular games CrossFire (CF) and Counter-Strike 2 (CS2), and our results demonstrate that Invisibility Cloak achieves real-time re-rendering of high-quality game visuals while effectively impeding various mainstream visual cheating models. By deploying Invisibility Cloak online in both CF and CS2, we successfully eliminated almost all aiming and shooting behaviors associated with aimbots, significantly enhancing the gaming experience for legitimate players.

Track 6

Security Analysis III: Protocol

Session Chair: Catherine Meadows

Salon H

Logic Gone Astray: A Security Analysis Framework for the Control Plane Protocols of 5G Basebands

Kai Tu, Abdullah Al Ishtiaq, Syed Md Mukit Rashid, Yilu Dong, Weixuan Wang, Tianwei Wu, and Syed Rafiul Hussain, Pennsylvania State University

Distinguished Paper Award Winner

Available Media

We develop 5GBaseChecker— an efficient, scalable, and dynamic security analysis framework based on differential testing for analyzing 5G basebands' control plane protocol interactions. 5GBaseChecker first captures basebands' protocol behaviors as a finite state machine (FSM) through black-box automata learning. To facilitate efficient learning and improve scalability, 5GBaseChecker introduces novel hybrid and collaborative learning techniques. 5GBaseChecker then identifies input sequences for which the extracted FSMs provide deviating outputs. Finally, 5GBaseChecker leverages these deviations to efficiently identify the security properties from specifications and use those to triage if the deviations found in 5G basebands violate any properties. We evaluated 5GBaseChecker with 17 commercial 5G basebands and 2 open-source UE implementations and uncovered 22 implementation-level issues, including 13 exploitable vulnerabilities and 2 interoperability issues.

SPF Beyond the Standard: Management and Operational Challenges in Practice and Practical Recommendations

Md. Ishtiaq Ashiq and Weitong Li, Virginia Tech; Tobias Fiebig, Max-Planck-Institut für Informatik; Taejoong Chung, Virginia Tech

Available Media

Since its inception in the 1970s, email has emerged as an irreplaceable medium for global communication. Despite its ubiquity, the system is plagued by security vulnerabilities, such as email spoofing. Among the various countermeasures, the Sender Policy Framework (SPF) remains a seminal and commonly deployed solution, working by specifying a list of authorized IP addresses for sending email.

While SPF might seem simple on the surface, the practical management of its records proves to be challenging; for example, although syntactical errors are uncommon (0.4%), evaluation-phase challenges are prevalent (7.7%), leading to potential disruptions in email delivery.

In our paper, we conduct a comprehensive study on the SPF extension, drawing from 17 months of weekly data snapshots that span 176 million domains across four top-level domains; we delve into the reasons behind such prevalent evaluation errors. Simultaneously, we undertake an ethical methodology to explore how SMTP servers validate SPF records and evaluate the effectiveness of widely-used software implementations. Our study unveils potential attack vectors that could be exploited for DNS amplification attacks or disrupt mail distribution; for instance, we demonstrate how an attacker could temporarily impede email reception by exploiting flaws in SPF validation mechanisms. We also conduct a qualitative study among email administrators to gain insights into the practical implementation and usage of SPF and SPF validators. Based on our findings, we provide recommendations designed to reconcile these discrepancies and bolster the SPF ecosystem's overall security.

A Formal Analysis of SCTP: Attack Synthesis and Patch Verification

Jacob Ginesin, Max von Hippel, Evan Defloor, and Cristina Nita-Rotaru, Northeastern University; Michael Tüxen, FH Münster

Available Media

SCTP is a transport protocol offering features such as multi-homing, multi-streaming, and message-oriented delivery. Its two main implementations were subjected to conformance tests using the PacketDrill tool. Conformance testing is not exhaustive and a recent vulnerability (CVE-2021-3772) showed SCTP is not immune to attacks. Changes addressing the vulnerability were implemented, but the question remains whether other flaws might persist in the protocol design.

We study the security of the SCTP design, taking a rigorous approach rooted in formal methods. We create a formal Promela model of SCTP, and define 10 properties capturing the essential protocol functionality based on its RFC specification and consultation with the lead RFC author. Then we show using the SPIN model checker that our model satisfies these properties. We next define 4 representative attacker models – Off-Path, where the attacker is an outsider that can spoof the port and IP of a peer; Evil-Server, where the attacker is a malicious peer; Replay, where an attacker can capture and replay, but not modify, packets; and On-Path, where the attacker controls the channel between peers. SCTP was designed to be secure against Off-Path attackers, and we study the additional models in order to understand how its security degrades for successively more powerful attacker types. We modify an attack synthesis tool designed for transport protocols, KORG, to support our SCTP model and 4 attacker models.

We synthesize the vulnerability reported in CVE-2021- 3772 in the Off-Path attacker model, when the patch is disabled, and we show that when enabled, the patch eliminates the vulnerability. We also manually identify two ambiguities in the RFC, and using KORG, we show that each, if misinterpreted, opens the protocol to a new Off-Path attack. We show that SCTP is vulnerable to a variety of attacks when it is misused in the Evil-Server, Replay, or On-Path attacker models (for which it was not designed). We discuss these and, when possible, mitigations thereof. Finally, we propose two RFC errata – one to eliminate each ambiguity – of which so far, the SCTP RFC committee has accepted one.

Athena: Analyzing and Quantifying Side Channels of Transport Layer Protocols

Feiyang Yu, Duke University; Quan Zhou and Syed Rafiul Hussain, Pennsylvania State University; Danfeng Zhang, Duke University

Available Media

Recent research has shown a growing number of side-channel vulnerabilities in transport layer protocols, such as TCP and UDP. Those side channels can be exploited by adversaries to launch nefarious attacks. In this paper, we present Athena, an automated tool for detecting, quantifying and explaining side-channel vulnerabilities in vanilla implementations of transport layer protocols. Unlike prior tools, Athena adopts a novel graph-based analysis, making it scalable enough to be the first side-channel analysis tool that can comprehensively analyze the TCP and UDP implementations in several operating systems with significantly higher coverage than the state-of-the-art. Moreover, Athena uses an entropy-based algorithm to identify the most important vulnerabilities. Evaluation on several benchmarks including Linux, FreeBSD, OpenBSD and two open-source IPv4 implementations suggests that Athena can narrow down critical side channels to a single digit (among over 1000 candidates) with a low false positive rate. Besides covering known side channels, Athena also discovers 30 new potential attack surfaces.

Shaken, not Stirred - Automated Discovery of Subtle Attacks on Protocols using Mix-Nets

Jannik Dreier, Université de Lorraine, CNRS, Inria, LORIA; Pascal Lafourcade and Dhekra Mahmoud, Université de Clermont Auvergne, LIMOS

Available Media

Mix-Nets are used to provide anonymity by passing a list of inputs through a collection of mix servers. Each server mixes the entries to create a new anonymized list, so that the correspondence between the output and the input is hidden. These Mix-Nets are used in numerous protocols in which the anonymity of participants is required, for example voting or electronic exam protocols. Some of these protocols have been proven secure using automated tools such as the cryptographic protocol verifier ProVerif, although they use the Mix-Net incorrectly. We propose a more detailed formal model of exponentiation and re-encryption Mix-Nets in the applied Π-Calculus, the language used by ProVerif, and show that using this model we can automatically discover attacks based on the incorrect use of the Mix-Net. In particular, we (re-)discover attacks on four cryptographic protocols using ProVerif: we show that an electronic exam protocol, two electronic voting protocols, and the "Crypto Santa" protocol do not satisfy the desired privacy properties. We then fix the vulnerable protocols by adding missing zero-knowledge proofs and analyze the resulting protocols using ProVerif. Again, in addition to the common abstract modeling of Zero Knowledge Proofs (ZKP), we also use a special model corresponding to weak (malleable) ZKPs. We show that in this case all attacks persist, and that we can again (re)discover these attacks automatically.

Track 7

Cryptographic Protocols II

Session Chair: Wouter Lueks

Salon IJK

Rabbit-Mix: Robust Algebraic Anonymous Broadcast from Additive Bases

Chongwon Cho and Samuel Dittmer, Stealth Software Technologies Inc.; Yuval Ishai, Technion; Steve Lu, Stealth Software Technologies Inc.; Rafail Ostrovsky, UCLA

Available Media

We present Rabbit-Mix, a robust algebraic mixing-based anonymous broadcast protocol in the client-server model. Rabbit-Mix is the first practical sender-anonymous broadcast protocol satisfying both robustness and 100% message delivery assuming a (strong) honest majority of servers. It presents roughly 3x improvement in comparison to Blinder (CCS 2020), a previous anonymous broadcast protocol in the same model, in terms of the number of algebraic operations and communication, while at the same time eliminating the non-negligible failure probability of Blinder. To obtain these improvements, we combine the use of Newton's identities for mixing with a novel way of exploiting an algebraic structure in the powers of field elements, based on an {\em additive 2-basis}, to compactly encode and decode client messages. We also introduce a simple and efficient distributed protocol to verify the well-formedness of client input encodings, which should consist of shares of multiple arithmetic progressions tied together.

PerfOMR: Oblivious Message Retrieval with Reduced Communication and Computation

Zeyu Liu, Yale University; Eran Tromer, Boston University; Yunhao Wang, Yale University

Available Media

Anonymous message delivery, as in privacy-preserving blockchain and private messaging applications, needs to protect recipient metadata: eavesdroppers should not be able to link messages to their recipients. This raises the question: how can untrusted servers assist in delivering the pertinent messages to each recipient, without learning which messages are addressed to whom?

Recent work constructed Oblivious Message Retrieval (OMR) protocols that outsource the message detection and retrieval in a privacy-preserving way, using homomorphic encryption. Their construction exhibits significant costs in computation per message scanned (∼0.1 second), as well as in the size of the associated messages (∼1kB overhead) and public keys (∼132kB).

This work constructs more efficient OMR schemes, by replacing the LWE-based clue encryption of prior works with a Ring-LWE variant, and utilizing the resulting flexibility to improve several components of the scheme. We thus devise, analyze, and benchmark two protocols:

The first protocol focuses on improving the detector runtime, using a new retrieval circuit that can be homomorphically evaluated 15x faster than the prior work.

The second protocol focuses on reducing the communication costs, by designing a different homomorphic decryption circuit that allows the parameter of the Ring-LWE encryption to be set such that the public key size is about 235x smaller than the prior work, and the message size is roughly 1.6x smaller. The runtime of this second construction is ∼40.0ms per message, still more than 2.5x faster than prior works.

Fast RS-IOP Multivariate Polynomial Commitments and Verifiable Secret Sharing

Zongyang Zhang, Weihan Li, Yanpei Guo, and Kexin Shi, Beihang University; Sherman S. M. Chow, The Chinese University of Hong Kong; Ximeng Liu, Fuzhou University; Jin Dong, Beijing Academy of Blockchain and Edge Computing

Available Media

Supporting proofs of evaluations, polynomial commitment schemes (PCS) are crucial in secure distributed systems. Schemes based on fast Reed–Solomon interactive oracle proofs (RS-IOP) of proximity have recently emerged, offering transparent setup, plausible post-quantum security, efficient operations, and, notably, sublinear proof size and verification. Manifesting a new paradigm, PCS with one-to-many proof can enhance the performance of (asynchronous) verifiable secret sharing ((A)VSS), a cornerstone in distributed computing, for proving multiple evaluations to multiple verifiers. Current RS-IOP-based multivariate PCS, including HyperPlonk (Eurocrypt '23) and Virgo (S&P '20), however, only offer quasi-linear prover complexity in the polynomial size.

We propose PolyFRIM, a fast RS-IOP-based multivariate PCS with optimal linear prover complexity, 5-25× faster than prior arts while ensuring competent proof size and verification. Heeding the challenging absence of FFT circuits for multivariate evaluation, PolyFRIM surpasses Zhang et al.'s (Usenix Sec. '22) one-to-many univariate PCS, accelerating proving by 4-7× and verification by 2-4× with 25% shorter proof. Leveraging PolyFRIM, we propose an AVSS scheme FRISS with a better efficiency tradeoff than prior arts from multivariate PCS, including Bingo (Crypto '23) and Haven (FC '21).

Abuse Reporting for Metadata-Hiding Communication Based on Secret Sharing

Saba Eskandarian, University of North Carolina at Chapel Hill

Available Media

As interest in metadata-hiding communication grows in both research and practice, a need exists for stronger abuse reporting features on metadata-hiding platforms. While message franking has been deployed on major end-to-end encrypted platforms as a lightweight and effective abuse reporting feature, there is no comparable technique for metadata-hiding platforms. Existing efforts to support abuse reporting in this setting, such as asymmetric message franking or the Hecate scheme, require order of magnitude increases in client and server computation or fundamental changes to the architecture of messaging systems. As a result, while metadata-hiding communication inches closer to practice, critical content moderation concerns remain unaddressed.

This paper demonstrates that, for broad classes of metadata-hiding schemes, lightweight abuse reporting can be deployed with minimal changes to the overall architecture of the system. Our insight is that much of the structure needed to support abuse reporting already exists in these schemes. By taking a non-generic approach, we can reuse this structure to achieve abuse reporting with minimal overhead. In particular, we show how to modify schemes based on secret sharing user inputs to support a message franking-style protocol. Compared to prior work, our shared franking technique more than halves the time to prepare a franked message and gives order of magnitude reductions in server-side message processing times, as well as in the time to decrypt a message and verify a report.

SOAP: A Social Authentication Protocol

Felix Linker and David Basin, Department of Computer Science, ETH Zurich

Available Media

Social authentication has been suggested as a usable authentication ceremony to replace manual key authentication in messaging applications. Using social authentication, chat partners authenticate their peers using digital identities managed by identity providers. In this paper, we formally define social authentication, present a protocol called SOAP that largely automates social authentication, formally prove SOAP's security, and demonstrate SOAP's practicality in two prototypes. One prototype is web-based, and the other is implemented in the open-source Signal messaging application.

Using SOAP, users can significantly raise the bar for compromising their messaging accounts. In contrast to the default security provided by messaging applications such as Signal and WhatsApp, attackers must compromise both the messaging account and all identity provider-managed identities to attack a victim. In addition to its security and automation, SOAP is straightforward to adopt as it is built on top of the well-established OpenID Connect protocol.

12:00 pm–1:30 pm

Symposium Luncheon and Test of Time Award Presentation

Franklin Hall

1:30 pm–2:45 pm

Track 1

User Studies IV: Policies and Best Practices I

Session Chair: Daniel Votipka

Salon AB

How WEIRD is Usable Privacy and Security Research?

Ayako A. Hasegawa and Daisuke Inoue, NICT; Mitsuaki Akiyama, NTT

Available Media

In human factor fields such as human-computer interaction (HCI) and psychology, researchers have been concerned that participants mostly come from WEIRD (Western, Educated, Industrialized, Rich, and Democratic) countries. This WEIRD skew may hinder understanding of diverse populations and their cultural differences. The usable privacy and security (UPS) field has inherited many research methodologies from research on human factor fields. We conducted a literature review to understand the extent to which participant samples in UPS papers were from WEIRD countries and the characteristics of the methodologies and research topics in each user study recruiting Western or non-Western participants. We found that the skew toward WEIRD countries in UPS is greater than that in HCI. Geographic and linguistic barriers in the study methods and recruitment methods may cause researchers to conduct user studies locally. In addition, many papers did not report participant demographics, which could hinder the replication of the reported studies, leading to low reproducibility. To improve geographic diversity, we provide the suggestions including facilitate replication studies, address geographic and linguistic issues of study/recruitment methods, and facilitate research on the topics for non-WEIRD populations.

Security and Privacy Software Creators' Perspectives on Unintended Consequences

Harshini Sri Ramulu, Paderborn University & The George Washington University; Helen Schmitt, Paderborn University; Dominik Wermke, North Carolina State University; Yasemin Acar, Paderborn University & The George Washington University

Available Media

Security & Privacy (S&P) software is created to have positive impacts on people: to protect them from surveillance and attacks, enhance their privacy, and keep them safe. Despite these positive intentions, S&P software can have unintended consequences, such as enabling and protecting criminals, misleading people into using the software with a false sense of security, and being inaccessible to users without strong technical backgrounds or with specific accessibility needs. In this study, through 14 semi-structured expert interviews with S&P software creators, we explore whether and how S&P software creators foresee and mitigate unintended consequences. We find that unintended consequences are often overlooked and ignored. When addressed, they are done in unstructured ways—often ad hoc and just based on user feedback—thereby shifting the burden to users. To reduce this burden on users and more effectively create positive change, we recommend S&P software creators to proactively consider and mitigate unintended consequences through increasing awareness and education, promoting accountability at the organizational level to mitigate issues, and using systematic toolkits for anticipating impacts.

Engaging Company Developers in Security Research Studies: A Comprehensive Literature Review and Quantitative Survey

Raphael Serafini, Stefan Albert Horstmann, and Alena Naiakshina, Ruhr University Bochum

Available Media

Previous research demonstrated that company developers excel compared to freelancers and computer science students, with the corporate environment significantly influencing security and privacy behavior. Still, the challenge of recruiting a substantial number of company developers persists, primarily due to a lack of knowledge on how to motivate their participation in empirical research studies. To bridge this gap, we performed a literature review and identified a conspicuous absence of information regarding compensation and study length in the domain of security developer studies. To support researchers struggling with the recruitment of company developers, we conducted an extensive quantitative survey with 340 professionals. Our study revealed that 62.5% of developers prioritize security tasks over software engineering tasks, and 96.5% are willing to participate in security studies. Developers consistently ranked security higher than other barriers and motivators. However, repeat participants perceived security tasks as more challenging than first-time participants despite having 40% more general experience and 50% more security-related experience. Further, we discuss Qualtrics as a potential recruitment channel for engaging company developers, acknowledging various challenges. Based on our findings, we provide recommendations for recruiting a high number of company developers.

"What Keeps People Secure is That They Met The Security Team": Deconstructing Drivers And Goals of Organizational Security Awareness

Jonas Hielscher, Ruhr University Bochum; Simon Parkin, Delft University of Technology

Available Media

Security awareness campaigns in organizations now collectively cost billions of dollars annually. There is increasing focus on ensuring certain security behaviors among employees. On the surface, this would imply a user-centered view of security in organizations. Despite this, the basis of what security awareness managers do and what decides this are unclear. We conducted n=15 semi-structured interviews with full-time security awareness managers, with experience across various national and international companies in European countries, with thousands of employees. Through thematic analysis, we identify that success in awareness management is fragile while having the potential to improve; there are a range of restrictions, and mismatched drivers and goals for security awareness, affecting how it is structured, delivered, measured, and improved. We find that security awareness as a practice is underspecified, and split between messaging around secure behaviors and connecting to employees, with a lack of recognition for the measures that awareness managers regard as important. We discuss ways forward, including alternative indicators of success, and security usability advocacy for employees.

Unveiling the Hunter-Gatherers: Exploring Threat Hunting Practices and Challenges in Cyber Defense

Priyanka Badva, Kopo M. Ramokapane, Eleonora Pantano, and Awais Rashid, University of Bristol

Available Media

The dynamic landscape of cyber threats constantly adapts its attack patterns, successfully evading traditional defense mechanisms and operating undetected until its objectives are fulfilled. In response to these elusive threats, threat hunting has become a crucial advanced defense technique against sophisticated and concealed cyber adversaries. However, despite its significance, there remains a lack of deep understanding of the best practices and challenges associated with effective threat hunting. To address this gap, we conducted semi-structured interviews with 22 experienced threat hunters to gain deeper insights into their daily practices, challenges, and strategies to overcome them. Our findings show that threat hunters deploy various approaches, often mixing them. They argue that flexibility in their approach helps them identify subtle threat indicators that might otherwise go undetected if using only one method. Their everyday challenges range from technical challenges to people and organizational culture challenges. Based on these findings, we provide empirical insights for improving threat-hunting best practices.

Track 2

Side Channel IV

Session Chair: Shuo Wang

Salon CD

Pixel Thief: Exploiting SVG Filter Leakage in Firefox and Chrome

Sioli O'Connell, The University of Adelaide; Lishay Aben Sour and Ron Magen, Ben Gurion University of the Negev; Daniel Genkin, Georgia Institute of Technology; Yossi Oren, Ben-Gurion University of the Negev and Intel Corporation; Hovav Shacham, UT Austin; Yuval Yarom, Ruhr University Bochum

Available Media

Web privacy is challenged by pixel-stealing attacks, which allow attackers to extract content from embedded iframes and to detect visited links. To protect against multiple pixelstealing attacks that exploited timing variations in SVG filters, browser vendors repeatedly adapted their implementations to eliminate timing variations. In this work we demonstrate that past efforts are still not sufficient.

We show how web-based attackers can mount cache-based side-channel attacks to monitor data-dependent memory accesses in filter rendering functions. We identify conditions under which browsers elect the non-default CPU implementation of SVG filters, and develop techniques for achieving access to the high-resolution timers required for cache attacks. We then develop efficient techniques to use the pixel-stealing attack for text recovery from embedded pages and to achieve high-speed history sniffing. To the best of our knowledge, our attack is the first to leak multiple bits per screen refresh, achieving an overall rate of 267 bits per second.

Sync+Sync: A Covert Channel Built on fsync with Storage

Qisheng Jiang and Chundong Wang, ShanghaiTech University

Available Media

Scientists have built a variety of covert channels for secretive information transmission with CPU cache and main memory. In this paper, we turn to a lower level in the memory hierarchy, i.e., persistent storage. Most programs store intermediate or eventual results in the form of files and some of them call fsync to synchronously persist a file with storage device for orderly persistence. Our quantitative study shows that one program would undergo significantly longer response time for fsync call if the other program is concurrently calling fsync, although they do not share any data. We further find that, concurrent fsync calls contend at multiple levels of storage stack due to sharing software structures (e.g., Ext4's journal) and hardware resources (e.g., disk's I/O dispatch queue).

We accordingly build a covert channel named Sync+Sync. Sync+Sync delivers a transmission bandwidth of 20,000 bits per second at an error rate of about 0.40% with an ordinary solid-state drive. Sync+Sync can be conducted in cross-disk partition, cross-file system, cross-container, cross-virtual machine, and even cross-disk drive fashions, without sharing data between programs. Next, we launch side-channel attacks with Sync+Sync and manage to precisely detect operations of a victim database (e.g., insert/update and B-Tree node split). We also leverage Sync+Sync to distinguish applications and websites with high accuracy by detecting and analyzing their fsync frequencies and flushed data volumes. These attacks are useful to support further fine-grained information leakage.

What Was Your Prompt? A Remote Keylogging Attack on AI Assistants

Roy Weiss, Daniel Ayzenshteyn, Guy Amit, and Yisroel Mirsky, Ben Gurion University of the Negev

Available Media

AI assistants are becoming an integral part of society, used for asking advice or help in personal and confidential issues. In this paper, we unveil a novel side-channel that can be used to read encrypted responses from AI Assistants over the web: the token-length side-channel. The side-channel reveals the character-lengths of a response's tokens (akin to word lengths). We found that many vendors, including OpenAI and Microsoft, had this side-channel prior to our disclosure.

However, inferring a response's content with this side-channel is challenging. This is because, even with knowledge of token-lengths, a response can have hundreds of words resulting in millions of grammatically correct sentences. In this paper, we show how this can be overcome by (1) utilizing the power of a large language model (LLM) to translate these token-length sequences, (2) providing the LLM with inter-sentence context to narrow the search space and (3) performing a known-plaintext attack by fine-tuning the model on the target model's writing style.

Using these methods, we were able to accurately reconstruct 27% of an AI assistant's responses and successfully infer the topic from 53% of them. To demonstrate the threat, we performed the attack on OpenAI's ChatGPT-4 and Microsoft's Copilot on both browser and API traffic.

NetShaper: A Differentially Private Network Side-Channel Mitigation System

Amir Sabzi, Rut Vora, Swati Goswami, Margo Seltzer, Mathias Lécuyer, and Aastha Mehta, University of British Columbia

Available Media

The widespread adoption of encryption in network protocols has significantly improved the overall security of many Internet applications. However, these protocols cannot prevent network side-channel leaks—leaks of sensitive information through the sizes and timing of network packets. We present NetShaper, a system that mitigates such leaks based on the principle of traffic shaping. NetShaper's traffic shaping provides differential privacy guarantees while adapting to the prevailing workload and congestion condition, and allows configuring a tradeoff between privacy guarantees, bandwidth and latency overheads. Furthermore, NetShaper provides a modular and portable tunnel endpoint design that can support diverse applications. We present a middlebox-based implementation of NetShaper and demonstrate its applicability in a video streaming and a web service application.

SoK: Neural Network Extraction Through Physical Side Channels

Péter Horváth, Dirk Lauret, Zhuoran Liu, and Lejla Batina, Radboud University

Available Media

Deep Neural Networks (DNNs) are widely used in various applications and are typically deployed on hardware accelerators. Physical Side-Channel Analysis (SCA) on DNN implementations is getting more attention from both industry and academia because of the potential to severely jeopardize the confidentiality of DNN Intellectual Property (IP) and the data privacy of end users. Current physical SCA attacks on DNNs are highly platform dependent and employ distinct threat models for different attack objectives and analysis tools, necessitating a general revision of attack methodology and assumptions. To this end, we provide a taxonomy of previous physical SCA attacks on DNNs and systematize findings toward model extraction and input recovery. Specifically, we discuss the dependencies of threat models on attack objectives and analysis methods, for which we present a novel systematic attack framework composed of fundamental stages derived from various attacks. Following the framework, we provide an in-depth analysis of common SCA attacks for each attack objective and reveal practical limitations, validated by experiments on a state-of-the-art commercial DNN accelerator. Based on our findings, we identify challenges and suggest future directions.

Track 3

Cloud Security

Session Chair: Morley Mao

Salon E

ACAI: Protecting Accelerator Execution with Arm Confidential Computing Architecture

Supraja Sridhara, Andrin Bertschi, Benedict Schlüter, Mark Kuhne, Fabio Aliberti, and Shweta Shinde, ETH Zurich

Available Media

Trusted execution environments in several existing and upcoming CPUs demonstrate the success of confidential computing, with the caveat that tenants cannot securely use accelerators such as GPUs and FPGAs. In this paper, we reconsider the Arm Confidential Computing Architecture (CCA) design, an upcoming TEE feature in Armv9-A, to address this gap. We observe that CCA offers the right abstraction and mechanisms to allow confidential VMs to use accelerators as a first-class abstraction. We build ACAI, a CCA-based solution, with a principled approach of extending CCA security invariants to device-side access to address several critical security gaps. Our experimental results on GPU and FPGA demonstrate the feasibility of ACAI while maintaining security guarantees.

ChainPatrol: Balancing Attack Detection and Classification with Performance Overhead for Service Function Chains Using Virtual Trailers

Momen Oqaily and Hinddeep Purohit, CIISE, Concordia University; Yosr Jarraya, Ericsson Security Research; Lingyu Wang, CIISE, Concordia University; Boubakr Nour and Makan Pourzandi, Ericsson Security Research; Mourad Debbabi, CIISE, Concordia University

Available Media

Network functions virtualization enables tenants to outsource their service function chains (SFCs) to third-party clouds for better agility and cost-effectiveness. However, outsourcing may limit tenants' ability to directly inspect cloud-level deployments to detect attacks on SFC forwarding paths, such as network function bypass or traffic injection. Existing solutions requiring direct cloud access are unsuitable for outsourcing, and adding a cryptographic trailer to every packet may incur significant performance overhead over large flows. In this paper, we propose ChainPatrol, a lightweight solution for tenants to continuously detect and classify cloud-level attacks on SFCs. Our main idea is to "virtualize'' cryptographic trailers by encoding them as side-channel watermarks, such that they can be transmitted without adding extra bits to packets. We tackle several key challenges like encoding virtual trailers within the limited side channel capacity, minimizing packet delay, and tolerating unexpected network jitters. We implement our solution on Amazon EC2, and our experiments with real-life data and applications demonstrate that ChainPatrol can achieve a better balance between security (e.g., 100% detection accuracy and 70% classification accuracy) and overhead (e.g., almost zero increased traffic and negligible end-to-end delay) than existing works (e.g., up to 45% overhead reduction compared to a state-of-the-art solution).

HECKLER: Breaking Confidential VMs with Malicious Interrupts

Benedict Schlüter, Supraja Sridhara, Mark Kuhne, Andrin Bertschi, and Shweta Shinde, ETH Zurich

Available Media

Hardware-based Trusted execution environments (TEEs) offer an isolation granularity of virtual machine abstraction. They provide confidential VMs (CVMs) that host security-sensitive code and data. AMD SEV-SNP and Intel TDX enable CVMs and are now available on popular cloud platforms. The untrusted hypervisor in these settings is in control of several resource management and configuration tasks, including interrupts. We present HECKLER, a new attack wherein the hypervisor injects malicious non-timer interrupts to break the confidentiality and integrity of CVMs. Our insight is to use the interrupt handlers that have global effects, such that we can manipulate a CVM's register states to change the data and control flow. With AMD SEV-SNP and Intel TDX, we demonstrate HECKLER on OpenSSH and sudo to bypass authentication. On AMD SEV-SNP we break execution integrity of C, Java, and Julia applications that perform statistical and text analysis. We explain the gaps in current defenses and outline guidelines for future defenses.

Stateful Least Privilege Authorization for the Cloud

Leo Cao, Luoxi Meng, Deian Stefan, and Earlence Fernandes, UC San Diego

Available Media

Architecting an authorization protocol that enforces least privilege in the cloud is challenging. For example, when Zoom integrates with Google Calendar, Zoom obtains a bearer token—a credential that grants broad access to user data on the server. Widely-used authorization protocols like OAuth create overprivileged credentials because they do not provide developers of client apps and servers the tools to request and enforce minimal access. In the status quo, these overprivileged credentials are vulnerable to abuse when stolen or leaked. We introduce an authorization framework that enables creating and using bearer tokens that are least privileged. Our core insight is that the client app developer always knows their minimum privilege requirements when requesting access to user resources on a server. Our framework allows client app developers to write small programs in WebAssembly that customize and attenuate the privilege of OAuth-like bearer tokens. The server executes these programs to enforce that requests are least privileged. Building on this primary mechanism, we introduce a new class of stateful least privilege policies—authorization rules that can depend on a log of actions a client has taken on a server. We instantiate our authorization model for the popular OAuth protocol. Using open source client apps, we show how they can reduce their privilege using a variety of stateful policies enabled by our work.

GraphGuard: Private Time-Constrained Pattern Detection Over Streaming Graphs in the Cloud

Songlei Wang and Yifeng Zheng, Harbin Institute of Technology; Xiaohua Jia, Harbin Institute of Technology and City University of Hong Kong

Available Media

Streaming graphs have seen wide adoption in diverse scenarios due to their superior ability to capture temporal interactions among entities. With the proliferation of cloud computing, it has become increasingly common to utilize the cloud for storing and querying streaming graphs. Among others, streaming graphs-based time-constrained pattern detection, which aims to continuously detect subgraphs matching a given query pattern within a sliding time window, benefits various applications such as credit card fraud detection and cyber-attack detection. Deploying such services on the cloud, however, entails severe security and privacy risks. This paper presents GraphGuard, the first system for privacy-preserving outsourcing of time-constrained pattern detection over streaming graphs. GraphGuard is constructed from a customized synergy of insights on graph modeling, lightweight secret sharing, edge differential privacy, and data encoding and padding, safeguarding the confidentiality of edge/vertex labels and the connections between vertices in the streaming graph and query patterns. We implement and evaluate GraphGuard on several real-world graph datasets. The evaluation results show that GraphGuard takes only a few seconds to securely process an encrypted query pattern over an encrypted snapshot of streaming graphs within a time window of size 50,000. Compared to a baseline built on generic secure multiparty computation, GraphGuard achieves up to 60× improvement in query latency and up to 98% savings in communication.

Track 4

Blockchain I

Session Chair: Fan Zhang

Salon F

Mempool Privacy via Batched Threshold Encryption: Attacks and Defenses

Arka Rai Choudhuri, NTT Research; Sanjam Garg, Julien Piet, and Guru-Vamsi Policharla, University of California, Berkeley

Available Media

With the rising popularity of DeFi applications it is important to implement protections for regular users of these DeFi platforms against large parties with massive amounts of resources allowing them to engage in market manipulation strategies such as frontrunning/backrunning. Moreover, there are many situations (such as recovery of funds from vulnerable smart contracts) where a user may not want to reveal their transaction until it has been executed. As such, it is clear that preserving the privacy of transactions in the mempool is an important goal.

In this work we focus on achieving mempool transaction privacy through a new primitive that we term batched-threshold encryption, which is a variant of threshold encryption with strict efficiency requirements to better model the needs of resource constrained environments such as blockchains. Unlike the naive use of threshold encryption, which requires communication proportional to O(nB) to decrypt B transactions with a committee of n parties, our batched-threshold encryption scheme only needs O(n) communication. We additionally discuss pitfalls in prior approaches that use (vanilla) threshold encryption for mempool privacy.

To show that our scheme is concretely efficient, we implement our scheme and find that transactions can be encrypted in under 6 ms, independent of committee size, and the communication required to decrypt an entire batch of B transactions is 80 bytes per party, independent of the number of transactions B, making it an attractive choice when communication is very expensive. If deployed on Ethereum, which processes close to 500 transaction per block, it takes close to 2.8 s for each committee member to compute a partial decryption and under 3.5 s to decrypt all transactions for a block in single-threaded mode.

Speculative Denial-of-Service Attacks In Ethereum

Aviv Yaish, The Hebrew University; Kaihua Qin and Liyi Zhou, Imperial College London, UC Berkeley RDI; Aviv Zohar, The Hebrew University; Arthur Gervais, University College London, UC Berkeley RDI

Available Media

Transaction fees compensate actors for resources expended on transactions and can only be charged from transactions included in blocks. But, the expressiveness of Turing-complete contracts implies that verifying if transactions can be included requires executing them on the current blockchain state.

In this work, we show that adversaries can craft malicious transactions that decouple the work imposed on blockchain actors from the compensation offered in return. We introduce three attacks: (i) ConditionalExhaust, a conditional resource-exhaustion attack against blockchain actors. (ii) MemPurge, an attack for evicting transactions from actors' mempools. (iii) GhostTX, an attack on the reputation system used in Ethereum's proposer-builder separation ecosystem.

We evaluate our attacks on an Ethereum testnet and find that by combining ConditionalExhaust and MemPurge, adversaries can simultaneously burden victims' computational resources and clog their mempools to the point where victims are unable to include transactions in blocks. Thus, victims create empty blocks, thereby hurting the system's liveness. The attack's expected cost is $376, but becomes cheaper if adversaries are validators. For other attackers, costs decrease if censorship is prevalent in the network.

ConditionalExhaust and MemPurge are made possible by inherent features of Turing-complete blockchains, and potential mitigations may result in reducing a ledger's scalability.

GuideEnricher: Protecting the Anonymity of Ethereum Mixing Service Users with Deep Reinforcement Learning

Ravindu De Silva, Wenbo Guo, Nicola Ruaro, Ilya Grishchenko, Christopher Kruegel, and Giovanni Vigna, University of California, Santa Barbara

Available Media

Mixing services are widely employed to enhance anonymity on public blockchains. However, recent research has shown that user identities and transaction associations can be derived even with mixing services. This is mainly due to the lack of guidelines for properly using these services. In fact, mixing service developers often provide guidebooks with lists of actions that might break anonymity, and hence, should be avoided. However, such guidebooks remain incomplete, leaving users unaware of potential actions that might compromise their anonymity. This highlights the necessity for providing users with a more comprehensive guidebook. Unfortunately, existing methods for compiling anonymity compromising patterns rely on postmortem analyses, and they cannot proactively discover patterns before the mixing service is deployed.

We introduce GuideEnricher, a proactive approach for extending user guidebooks with limited human intervention. Our key novelty is a deep reinforcement learning (DRL) agent, which automatically explores patterns for transferring tokens via a mixing service. We introduce two customized designs to better guide the agent in discovering yet-unknown anonymity-compromising patterns: design proper tasks for the agent that possibly lead to compromised anonymity, and include a rule-based detector to detect the known patterns. We train the agent to finish the task while evading the detector. Using a trained agent, we conduct a second analysis step, employing clustering methods and manual inspection, to extract yet unknown patterns from the agent's actions. Through extensive evaluation, we demonstrate that GuideEnricher can train effective agents under multiple mixing services. We show that our agents facilitate the discovery of yet-unknown anonymity-compromising patterns. Furthermore, we demonstrate that GuideEnricher can continuously enrich the guidebook via an iterative update of the detector and our DRL agents.

All Your Tokens are Belong to Us: Demystifying Address Verification Vulnerabilities in Solidity Smart Contracts

Tianle Sun, Huazhong University of Science and Technology; Ningyu He, Peking University; Jiang Xiao, Huazhong University of Science and Technology; Yinliang Yue, Zhongguancun Laboratory; Xiapu Luo, The Hong Kong Polytechnic University; Haoyu Wang, Huazhong University of Science and Technology

Available Media

In Ethereum, the practice of verifying the validity of the passed addresses is a common practice, which is a crucial step to ensure the secure execution of smart contracts. Vulnerabilities in the process of address verification can lead to great security issues, and anecdotal evidence has been reported by our community. However, this type of vulnerability has not been well studied. To fill the void, in this paper, we aim to characterize and detect this kind of emerging vulnerability. We design and implement AVVERIFIER, a lightweight taint analyzer based on static EVM opcode simulation. Its three-phase detector can progressively rule out false positives and false negatives based on the intrinsic characteristics. Upon a well-established and unbiased benchmark, AVVERIFIER can improve efficiency 2 to 5 times than the SOTA while maintaining a 94.3% precision and 100% recall. After a large-scale evaluation of over 5 million Ethereum smart contracts, we have identified 812 vulnerable smart contracts that were undisclosed by our community before this work, and 348 open source smart contracts were further verified, whose largest total value locked is over $11.2 billion. We further deploy AVVERIFIER as a real-time detector on Ethereum and Binance Smart Chain, and the results suggest that AVVERIFIER can raise timely warnings once contracts are deployed.

Using My Functions Should Follow My Checks: Understanding and Detecting Insecure OpenZeppelin Code in Smart Contracts

Han Liu, East China Normal University, Shanghai Key Laboratory of Trustworthy Computing; Daoyuan Wu, The Hong Kong University of Science and Technology; Yuqiang Sun, Nanyang Technological University; Haijun Wang, Xi'an Jiaotong University; Kaixuan Li, East China Normal University, Shanghai Key Laboratory of Trustworthy Computing; Yang Liu, Nanyang Technological University; Yixiang Chen, East China Normal University, Shanghai Key Laboratory of Trustworthy Computing

Available Media

OpenZeppelin is a popular framework for building smart contracts. It provides common libraries (e.g., SafeMath), implementations of Ethereum standards (e.g., ERC20), and reusable components for access control and upgradability. However, unlike traditional software libraries, which are typically imported as static linking libraries or dynamic loading libraries, OpenZeppelin is utilized by Solidity contracts in the form of source code. As a result, developers often make custom modifications to their copies of OpenZeppelin code, which may lead to unintended security consequences.

In this paper, we conduct the first systematic study on the security of OpenZeppelin code used in real-world contracts. Specifically, we focus on the security checks in the official OpenZeppelin library and examine whether they are faithfully enforced in the relevant OpenZeppelin functions of real contracts. To this end, we propose a novel tool named ZepScope that comprises two components: MINER and CHECKER. First, MINER analyzes the official OpenZeppelin functions to extract the facts of explicit checks (i.e., the checks defined within the functions) and implicit checks (i.e., the conditions of calling the functions). Second, based on the facts extracted by MINER, CHECKER examines real contracts to identify their OpenZeppelin functions, match their checks with those in the facts, and validate the consequences for those inconsistent checks. By overcoming multiple challenges in developing ZepScope, we obtain not only the first taxonomy of OpenZeppelin checks but also the comprehensive results of checking the top 35,882 contracts from three mainstream blockchains.

Track 5

ML VII: Adversarial Attack Defense

Session Chair: Lujo Bauer

Salon G

Correction-based Defense Against Adversarial Video Attacks via Discretization-Enhanced Video Compressive Sensing

Wei Song, Cong Cong, Haonan Zhong, and Jingling Xue, UNSW Sydney

Available Media

We introduce SECVID, a correction-based framework that defends video recognition systems against adversarial attacks without prior adversarial knowledge. It uses discretization-enhanced video compressive sensing in a black-box preprocessing module, transforming videos into a sparse domain to disperse and neutralize perturbations. While SECVID's discretized compression disrupts perturbation continuity, its reconstruction process minimizes adversarial elements, causing only minor distortions to the original videos. Though not completely restoring adversarial videos, SECVID significantly enhances their quality, enabling accurate classification by SECVID-enhanced video classifiers and preventing adversarial attacks. Tested on C3D and I3D with the UCF-101 and HMDB-51 datasets against five types of advanced video attacks, SECVID outperforms existing defenses, improving detection accuracy by 38.5% to 866.2%. Specifically designed for high-risk environments, SECVID addresses trade-offs like minor accuracy reduction, additional pre-processing training, and longer inference times, with potential optimization through selective security impacting strategies.

Rethinking the Invisible Protection against Unauthorized Image Usage in Stable Diffusion

Shengwei An, Lu Yan, Siyuan Cheng, Guangyu Shen, Kaiyuan Zhang, Qiuling Xu, Guanhong Tao, and Xiangyu Zhang, Purdue University

Available Media

Advancements in generative AI models like Stable Diffusion, DALL·E 2, and Midjourney have revolutionized digital creativity, enabling the generation of authentic-looking images from text and altering existing images with ease. Yet, their capacity poses significant ethical challenges, including replicating an artist's style without consent, the creation of counterfeit images, and potential reputational damage through manipulated content. Protection techniques have emerged to combat misuse by injecting imperceptible noises into images. This paper introduces Insight, a novel approach that challenges the robustness of these protections by aligning protected image features with human visual perception. By using a photo as a reference, approximating the human eye's perspective, Insight effectively neutralizes protective perturbations, enabling the generative model to recapture authentic features. Our extensive evaluation across 3 datasets and 10 protection techniques demonstrates its superiority over existing methods in overcoming protective measures, emphasizing the need for stronger safeguards in digital content generation.

Splitting the Difference on Adversarial Training

Matan Levi and Aryeh Kontorovich, Ben-Gurion University of the Negev

Available Media

The existence of adversarial examples points to a basic weakness of deep neural networks. One of the most effective defenses against such examples, adversarial training, entails training models with some degree of robustness, usually at the expense of a degraded natural accuracy. Most adversarial training methods aim to learn a model that finds, for each class, a common decision boundary encompassing both the clean and perturbed examples. In this work, we take a fundamentally different approach by treating the perturbed examples of each class as a separate class to be learned, effectively splitting each class into two classes: "clean" and "adversarial." This split doubles the number of classes to be learned, but at the same time considerably simplifies the decision boundaries. We provide a theoretical plausibility argument that sheds some light on the conditions under which our approach can be expected to be beneficial. Likewise, we empirically demonstrate that our method learns robust models while attaining optimal or near-optimal natural accuracy, e.g., on CIFAR-10 we obtain near-optimal natural accuracy of 95.01% alongside significant robustness across multiple tasks. The ability to achieve such near-optimal natural accuracy, while maintaining a significant level of robustness, makes our method applicable to real-world applications where natural accuracy is at a premium. As a whole, our main contribution is a general method that confers a significant level of robustness upon classifiers with only minor or negligible degradation of their natural accuracy.

Machine Learning needs Better Randomness Standards: Randomised Smoothing and PRNG-based attacks

Pranav Dahiya, University of Cambridge; Ilia Shumailov, University of Oxford; Ross Anderson, University of Cambridge & University of Edinburgh

Available Media

Randomness supports many critical functions in the field of machine learning (ML) including optimisation, data selection, privacy, and security. ML systems outsource the task of generating or harvesting randomness to the compiler, the cloud service provider or elsewhere in the toolchain. Yet there is a long history of attackers exploiting poor randomness, or even creating it—as when the NSA put backdoors in random number generators to break cryptography. In this paper we consider whether attackers can compromise an ML system using only the randomness on which they commonly rely. We focus our effort on Randomised Smoothing, a popular approach to train certifiably robust models, and to certify specific input datapoints of an arbitrary model. We choose Randomised Smoothing since it is used for both security and safety—to counteract adversarial examples and quantify uncertainty respectively. Under the hood, it relies on sampling Gaussian noise to explore the volume around a data point to certify that a model is not vulnerable to adversarial examples. We demonstrate an entirely novel attack, where an attacker backdoors the supplied randomness to falsely certify either an overestimate or an underestimate of robustness for up to 81 times. We demonstrate that such attacks are possible, that they require very small changes to randomness to succeed, and that they are hard to detect. As an example, we hide an attack in the random number generator and show that the randomness tests suggested by NIST fail to detect it. We advocate updating the NIST guidelines on random number testing to make them more appropriate for safety-critical and security-critical machine-learning applications.

PatchCURE: Improving Certifiable Robustness, Model Utility, and Computation Efficiency of Adversarial Patch Defenses

Chong Xiang, Tong Wu, and Sihui Dai, Princeton University; Jonathan Petit, Qualcomm Technologies, Inc.; Suman Jana, Columbia University; Prateek Mittal, Princeton University

Available Media

State-of-the-art defenses against adversarial patch attacks can now achieve strong certifiable robustness with a marginal drop in model utility. However, this impressive performance typically comes at the cost of 10-100x more inference-time computation compared to undefended models — the research community has witnessed an intense three-way trade-off between certifiable robustness, model utility, and computation efficiency. In this paper, we propose a defense framework named PatchCURE to approach this trade-off problem. PatchCURE provides sufficient "knobs" for tuning defense performance and allows us to build a family of defenses: the most robust PatchCURE instance can match the performance of any existing state-of-the-art defense (without efficiency considerations); the most efficient PatchCURE instance has similar inference efficiency as undefended models. Notably, PatchCURE achieves state-of-the-art robustness and utility performance across all different efficiency levels, e.g., 16-23% absolute clean accuracy and certified robust accuracy advantages over prior defenses when requiring computation efficiency to be close to undefended models. The family of PatchCURE defenses enables us to flexibly choose appropriate defenses to satisfy given computation and/or utility constraints in practice.

Track 6

Language-Based Security

Session Chair: Nathan Burow

Salon H

GHunter: Universal Prototype Pollution Gadgets in JavaScript Runtimes

Eric Cornelissen, Mikhail Shcherbakov, and Musard Balliu, KTH Royal Institute of Technology

Available Media

Prototype pollution is a recent vulnerability that affects JavaScript code, leading to high impact attacks such as arbitrary code execution and privilege escalation. The vulnerability is rooted in JavaScript's prototype-based inheritance, enabling attackers to inject arbitrary properties into an object's prototype at runtime. The impact of prototype pollution depends on the existence of otherwise benign pieces of code (gadgets), which inadvertently read from these attacker-controlled properties to execute security-sensitive operations. While prior works primarily study gadgets in third-party libraries and client-side applications, gadgets in JavaScript runtime environments are arguably more impactful as they affect any application that executes on these runtimes.

In this paper we design, implement, and evaluate a pipeline, GHunter, to systematically detect gadgets in V8-based JavaScript runtimes with prime focus on Node.js and Deno. GHunter supports a lightweight dynamic taint analysis to automatically identify gadget candidates which we validate manually to derive proof-of-concept exploits. We implement GHunter by modifying the V8 engine and the targeted runtimes along with features for facilitating manual validation. Driven by the comprehensive test suites of Node.js and Deno, we use GHunter in a systematic study of gadgets in these runtimes. We identified a total of 56 new gadgets in Node.js and 67 gadgets in Deno, pertaining to vulnerabilities such as arbitrary code execution (19), privilege escalation (31), path traversal (13), and more. Moreover, we systematize, for the first time, existing mitigations for prototype pollution and gadgets in terms of development guidelines. We collect a list of vulnerable applications and revisit the fixes through the lens of our guidelines. Through this exercise, we also identified one high-severity CVE leading to remote code execution, which was due to incorrectly fixing a gadget.

MetaSafe: Compiling for Protecting Smart Pointer Metadata to Ensure Safe Rust Integrity

Martin Kayondo and Inyoung Bang, Seoul National University; Yeongjun Kwak and Hyungon Moon, UNIST; Yunheung Paek, Seoul National University

Available Media

Rust is a programming language designed with a focus on memory safety. It introduces new concepts such as ownership and performs static bounds checks at compile time to ensure spatial and temporal memory safety. For memory operations or data types whose safety the compiler cannot prove at compile time, Rust either explicitly excludes such portions of the program, termed unsafe Rust, from static analysis, or it relies on runtime enforcement using smart pointers. Existing studies have shown that potential memory safety bugs in such unsafe Rust can bring down the entire program, proposing in-process isolation or compartmentalization as a remedy. However, in this study, we show that the safe Rust remains susceptible to memory safety bugs even with the proposed isolation applied. The smart pointers upon which safe Rust's memory safety is built rely on metadata often stored alongside program data, possibly within reach of attackers. Manipulating this metadata, an attacker can nullify safe Rust's memory safety checks dependent on it, causing memory access bugs and exploitation. In response to this issue, we propose MetaSafe, a mechanism that safeguards smart pointer metadata from such attacks. MetaSafe stores smart pointer metadata in a gated memory region where only a predefined set of metadata management functions can write, ensuring that each smart pointer update does not cause safe Rust's memory safety violation. We have implemented MetaSafe by extending the official Rust compiler and evaluated it with a variety of micro- and application benchmarks. The overhead of MetaSafe is found to be low; it incurs a 3.5% average overhead on the execution time of a web browser benchmarks.

RustSan: Retrofitting AddressSanitizer for Efficient Sanitization of Rust

Kyuwon Cho, Jongyoon Kim, Kha Dinh Duy, Hajeong Lim, and Hojoon Lee, Sungkyunkwan University

Available Media

Rust is gaining traction as a safe systems programming language with its strong type and memory safety guarantees. However, Rust's guarantees are not infallible. The use of unsafe Rust, a subvariant of Rust, allows the programmer to temporarily escape the strict Rust language semantics to trade security for flexibility. Memory errors within unsafe blocks in Rust have far-reaching ramifications for the program's safety. As a result, the conventional dynamic memory error detection (e.g., fuzzing) has been adapted as a common practice for Rust and proved its effectiveness through a trophy case of discovered CVEs.

RUSTSAN is a retrofitted design of AddressSanitizer (ASan) for efficient dynamic memory error detection of Rust programs. Our observation is that a significant portion of instrumented memory access sites in a Rust program compiled with ASan is redundant, as the Rust security guarantees can still be valid at the site. RUSTSAN identifies and instruments the sites that definitely or may undermine Rust security guarantees while lifting instrumentation on safe sites. To this end, RUSTSAN employs a cross-IR program analysis for accurate tracking of unsafe sites and also extends ASan's shadow memory scheme for checking non-uniform memory access validation necessary for Rust. We conduct a comprehensive evaluation of RUSTSAN in terms of detection capability and performance using 57 Rust crates. RUSTSAN successfully detected all 31 tested cases of CVE-issued memory errors. Also, RUSTSAN shows an average of 62.3% performance increase against ASan in general benchmarks that involved 20 Rust crates. In the fuzzing experiment with 6 crates, RUSTSAN marked an average of 23.52%, and up to 57.08% of performance improvement.

FV8: A Forced Execution JavaScript Engine for Detecting Evasive Techniques

Nikolaos Pantelaios and Alexandros Kapravelos, North Carolina State University

Available Media

Evasion techniques allow malicious code to never be observed. This impacts significantly the detection capabilities of tools that rely on either dynamic or static analysis, as they never get to process the malicious code. The dynamic nature of JavaScript, where code is often injected dynamically, makes evasions particularly effective. Yet, we lack tools that can detect evasive techniques in a challenging environment such as JavaScript.

In this paper, we present FV8, a modified V8 JavaScript engine designed to identify evasion techniques in JavaScript code. FV8 selectively enforces code execution on APIs that conditionally inject dynamic code, thus enhancing code coverage and consequently improving visibility into malicious code. We integrate our tool in both the Node.js engine and the Chromium browser, compelling code execution in npm packages and Chrome browser extensions. Our tool increases code coverage by 11% compared to default V8 and detects 28 unique evasion categories, including five previously unreported techniques. In data confirmed as malicious from both ecosystems, our tool identifies 1,443 (14.6%) npm packages and 164 (82%) extensions containing at least one type of evasion. In previously unexamined extensions (39,592), our tool discovered 16,471 injected third-party scripts, and a total of 8,732,120 lines of code executed due to our forced execution instrumentation. Furthermore, it tagged a total of 423 extensions as both evasive and malicious and we manually verify 110 extensions (26%) to actually be malicious, impacting two million users. Our tool is open-source and serves both as an in-browser and standalone dynamic analysis tool, capable of detecting evasive code, bypassing obfuscation in certain cases, offering improved access to malicious code, and supporting recursive analysis of dynamic code injections.

DONAPI: Malicious NPM Packages Detector using Behavior Sequence Knowledge Mapping

Cheng Huang, Nannan Wang, Ziyan Wang, Siqi Sun, Lingzi Li, Junren Chen, Qianchong Zhao, Jiaxuan Han, and Zhen Yang, Sichuan University; Lei Shi, Huawei Technologies

Available Media

With the growing popularity of modularity in software development comes the rise of package managers and language ecosystems. Among them, npm stands out as the most extensive package manager, hosting more than 2 million third-party open-source packages that greatly simplify the process of building code. However, this openness also brings security risks, as evidenced by numerous package poisoning incidents.

In this paper, we synchronize a local package cache containing more than 3.4 million packages in near real-time to give us access to more package code details. Further, we perform manual inspection and API call sequence analysis on packages collected from public datasets and security reports to build a hierarchical classification framework and behavioral knowledge base covering different sensitive behaviors. In addition, we propose the DONAPI, an automatic malicious npm packages detector that combines static and dynamic analysis. It makes preliminary judgments on the degree of maliciousness of packages by code reconstruction techniques and static analysis, extracts dynamic API call sequences to confirm and identify obfuscated content that static analysis can not handle alone, and finally tags malicious software packages based on the constructed behavior knowledge base. To date, we have identified and manually confirmed 325 malicious samples and discovered 2 unusual API calls and 246 API call sequences that have not appeared in known samples.

Track 7

Zero-Knowledge Proof II

Session Chair: Aysajan Abidin

Salon IJK

Election Eligibility with OpenID: Turning Authentication into Transferable Proof of Eligibility

Véronique Cortier, Alexandre Debant, Anselme Goetschmann, and Lucca Hirschi, Université de Lorraine, CNRS, Inria, LORIA, France

Available Media

Eligibility checks are often abstracted away or omitted in voting protocols, leading to situations where the voting server can easily stuff the ballot box. One reason for this is the difficulty of bootstraping the authentication material for voters without relying on trusting the voting server.

In this paper, we propose a new protocol that solves this problem by building on OpenID, a widely deployed authentication protocol. Instead of using it as a standard authentication means, we turn it into a mechanism that delivers transferable proofs of eligibility. Using zk-SNARK proofs, we show that this can be done without revealing any compromising information, in particular, protecting everlasting privacy. Our approach remains efficient and can easily be integrated into existing protocols, as we have done for the Belenios voting protocol. We provide a full-fledged proof of concept along with benchmarks showing our protocol could be realistically used in large-scale elections.

Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs

Sebastian Angel, Eleftherios Ioannidis, and Elizabeth Margolin, University of Pennsylvania; Srinath Setty, Microsoft Research; Jess Woods, University of Pennsylvania

Available Media

This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).

Scalable Zero-knowledge Proofs for Non-linear Functions in Machine Learning

Meng Hao, Hanxiao Chen, and Hongwei Li, School of Computer Science and Engineering, University of Electronic Science and Technology of China; Chenkai Weng, Northwestern University; Yuan Zhang and Haomiao Yang, School of Computer Science and Engineering, University of Electronic Science and Technology of China; Tianwei Zhang, Nanyang Technological University

Available Media

Zero-knowledge (ZK) proofs have been recently explored for the integrity of machine learning (ML) inference. However, these protocols suffer from high computational overhead, with the primary bottleneck stemming from the evaluation of non-linear functions. In this paper, we propose the first systematic ZK proof framework for non-linear mathematical functions in ML using the perspective of table lookup. The key challenge is that table lookup cannot be directly applied to non-linear functions in ML since it would suffer from inefficiencies due to the intolerably large table. Therefore, we carefully design several important building blocks, including digital decomposition, comparison, and truncation, such that they can effectively utilize table lookup with a quite small table size while ensuring the soundness of proofs. Based on these building blocks, we implement complex mathematical operations and further construct ZK proofs for current mainstream non-linear functions in ML such as ReLU, sigmoid, and normalization. The extensive experimental evaluation shows that our framework achieves 50∼179× runtime improvement compared to the state-of-the-art work, while maintaining a similar level of communication efficiency.

ZKSMT: A VM for Proving SMT Theorems in Zero Knowledge

Daniel Luick, John C. Kolesar, and Timos Antonopoulos, Yale University; William R. Harris and James Parker, Galois, Inc.; Ruzica Piskac, Yale University; Eran Tromer, Boston University; Xiao Wang and Ning Luo, Northwestern University

Available Media

Verification of program safety is often reducible to proving the unsatisfiability (i.e., validity) of a formula in Satisfiability Modulo Theories (SMT): Boolean logic combined with theories that formalize arbitrary first-order fragments. Zero-knowledge (ZK) proofs allow SMT formulas to be validated without revealing the underlying formulas or their proofs to other parties, which is a crucial building block for proving the safety of proprietary programs. Recently, Luo et al. (CCS 2022) studied the simpler problem of proving the unsatisfiability of pure Boolean formulas but does not support proofs generated by SMT solvers. This work presents ZKSMT, a novel framework for proving the validity of SMT formulas in ZK. We design a virtual machine (VM) tailored to efficiently represent the verification process of SMT validity proofs in ZK. Our VM can support the vast majority of popular theories when proving program safety while being complete and sound. To demonstrate this, we instantiate the commonly used theories of equality and linear integer arithmetic in our VM with theory-specific optimizations for proving them in ZK. ZKSMT achieves high practicality even when running on realistic SMT formulas generated by Boogie, a common tool for software verification. It achieves a three-order-of-magnitude improvement compared to a baseline that executes the proof verification code in a general ZK system.

SoK: What Don't We Know? Understanding Security Vulnerabilities in SNARKs

Stefanos Chaliasos, Imperial College London; Jens Ernstberger, Technical University of Munich; David Theodore, Ethereum Foundation; David Wong, zkSecurity; Mohammad Jahanara, Scroll Foundation; Benjamin Livshits, Imperial College London & Matter Labs

Available Media

Zero-knowledge proofs (ZKPs) have evolved from being a theoretical concept providing privacy and verifiability to having practical, real-world implementations, with SNARKs (Succinct Non-Interactive Argument of Knowledge) emerging as one of the most significant innovations. Prior work has mainly focused on designing more efficient SNARK systems and providing security proofs for them. Many think of SNARKs as "just math," implying that what is proven to be correct and secure is correct in practice. In contrast, this paper focuses on assessing end-to-end security properties of real-life SNARK implementations. We start by building foundations with a system model and by establishing threat models and defining adversarial roles for systems that use SNARKs. Our study encompasses an extensive analysis of 141 actual vulnerabilities in SNARK implementations, providing a detailed taxonomy to aid developers and security researchers in understanding the security threats in systems employing SNARKs. Finally, we evaluate existing defense mechanisms and offer recommendations for enhancing the security of SNARK-based systems, paving the way for more robust and reliable implementations in the future.

2:45 pm–3:15 pm

Coffee and Tea Break

Grand Ballroom Foyer

3:15 pm–4:15 pm

Track 1

Measurement III: Auditing and Best Practices I

Session Chair: Blase Ur

Salon AB

SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models

Boyang Zhang, Zheng Li, Ziqing Yang, Xinlei He, Michael Backes, Mario Fritz, and Yang Zhang, CISPA Helmholtz Center for Information Security

Available Media

While advanced machine learning (ML) models are deployed in numerous real-world applications, previous works demonstrate these models have security and privacy vulnerabilities. Various empirical research has been done in this field. However, most of the experiments are performed on target ML models trained by the security researchers themselves. Due to the high computational resource requirement for training advanced models with complex architectures, researchers generally choose to train a few target models using relatively simple architectures on typical experiment datasets. We argue that to understand ML models' vulnerabilities comprehensively, experiments should be performed on a large set of models trained with various purposes (not just the purpose of evaluating ML attacks and defenses). To this end, we propose using publicly available models with weights from the Internet (public models) for evaluating attacks and defenses on ML models. We establish a database, namely SecurityNet, containing 910 annotated image classification models. We then analyze the effectiveness of several representative attacks/defenses, including model stealing attacks, membership inference attacks, and backdoor detection on these public models. Our evaluation empirically shows the performance of these attacks/defenses can vary significantly on public models compared to self-trained models. We share SecurityNet with the research community and advocate researchers to perform experiments on public models to better demonstrate their proposed methods' effectiveness in the future.

How does Endpoint Detection use the MITRE ATT&CK Framework?

Apurva Virkud, Muhammad Adil Inam, Andy Riddle, Jason Liu, Gang Wang, and Adam Bates, University of Illinois Urbana-Champaign

Available Media

MITRE ATT&CK is an open-source taxonomy of adversary tactics, techniques, and procedures based on real-world observations. Increasingly, organizations leverage ATT&CK technique "coverage" as the basis for evaluating their security posture, while Endpoint Detection and Response (EDR) and Security Indicator and Event Management (SIEM) products integrate ATT&CK into their design as well as marketing. However, the extent to which ATT&CK coverage is suitable to serve as a security metric remains unclear— Does ATT&CK coverage vary meaningfully across different products? Is it possible to achieve total coverage of ATT&CK? Do endpoint products that detect the same attack behaviors even claim to cover the same ATT&CK techniques?

In this work, we attempt to answer these questions by conducting a comprehensive (and, to our knowledge, the first) analysis of endpoint detection products' use of MITRE ATT&CK. We begin by evaluating 3 ATT&CK-annotated detection rulesets from major commercial providers (Carbon Black, Splunk, Elastic) and a crowdsourced ruleset (Sigma) to identify commonalities and underutilized regions of the ATT&CK matrix. We continue by performing a qualitative analysis of unimplemented ATT&CK techniques to determine their feasibility as detection rules. Finally, we perform a consistency analysis of ATT&CK labeling by examining 37 specific threat entities for which at least 2 products include specific detection rules. Combined, our findings highlight the limitations of overdepending on ATT&CK coverage when evaluating security posture; most notably, many techniques are unrealizable as detection rules, and coverage of an ATT&CK technique does not consistently imply coverage of the same real-world threats.

Digital Discrimination of Users in Sanctioned States: The Case of the Cuba Embargo

Anna Ablove, Shreyas Chandrashekaran, Hieu Le, Ram Sundara Raman, and Reethika Ramesh, University of Michigan; Harry Oppenheimer, Georgia Institute of Technology; Roya Ensafi, University of Michigan

Distinguished Paper Award Winner

Available Media

We present one of the first in-depth and systematic end-user centered investigations into the effects of sanctions on geoblocking, specifically in the case of Cuba. We conduct network measurements on the Tranco Top 10K domains and complement our findings with a small-scale user study with a questionnaire. We identify 546 domains subject to geoblocking across all layers of the network stack, ranging from DNS failures to HTTP(S) response pages with a variety of status codes. Through this work, we discover a lack of user-facing transparency; we find 88% of geoblocked domains do not serve informative notice of why they are blocked. Further, we highlight a lack of measurement-level transparency, even among HTTP(S) blockpage responses. Notably, we identify 32 instances of blockpage responses served with 200 OK status codes, despite not returning the requested content. Finally, we note the inefficacy of current improvement strategies and make recommendations to both service providers and policymakers to reduce Internet fragmentation.

A Broad Comparative Evaluation of Software Debloating Tools

Michael D. Brown and Adam Meily, Trail of Bits; Brian Fairservice, Akshay Sood, and Jonathan Dorn, GrammaTech; Eric Kilmer and Ronald Eytchison, Trail of Bits

Available Media

Software debloating tools seek to improve program security and performance by removing unnecessary code, called bloat. While many techniques have been proposed, several barriers to their adoption have emerged. Namely, debloating tools are highly specialized, making it difficult for adopters to find the right type of tool for their needs. This is further hindered by a lack of established metrics and comparative evaluations between tools. To close this information gap, we surveyed 10 years of debloating literature and several tools currently under commercial development to taxonomize knowledge about the debloating ecosystem. We then conducted a broad comparative evaluation of 10 debloating tools to determine their relative strengths and weaknesses. Our evaluation, conducted on a diverse set of 20 benchmark programs, measures tools across 12 performance, security, and correctness metrics.

Our evaluation surfaces several concerning findings that contradict the prevailing narrative in the debloating literature. First, debloating tools lack the maturity required to be used on real-world software, evidenced by a slim 22% overall success rate for creating passable debloated versions of medium- and high-complexity benchmarks. Second, debloating tools struggle to produce sound and robust programs. Using our novel differential fuzzing tool, DIFFER, we discovered that only 13% of our debloating attempts produced a sound and robust debloated program. Finally, our results indicate that debloating tools typically do not improve the performance or security posture of debloated programs by a significant degree according to our evaluation metrics. We believe that our contributions in this paper will help potential adopters better understand the landscape of tools and will motivate future research and development of more capable debloating tools. To this end, we have made our benchmark set, data, and custom tools publicly available.

Track 2

Hardware Security III: Signals

Session Chair: Aurélien Francillon

Salon CD

LaserAdv: Laser Adversarial Attacks on Speech Recognition Systems

Guoming Zhang, Xiaohui Ma, Huiting Zhang, and Zhijie Xiang, Shandong University; Xiaoyu Ji, Zhejiang University; Yanni Yang, Xiuzhen Cheng, and Pengfei Hu, Shandong University

Available Media

Audio adversarial perturbations are imperceptible to humans but can mislead machine learning models, posing a security threat to automatic speech recognition (ASR) systems. Existing methods aim to minimize perturbation values, use acoustic masking, or mimic environmental sounds to render them undetectable. However, these perturbations, being audible frequency range sounds, are still audibly detectable. The slow propagation and rapid attenuation of sound limit their temporal sensitivity and attack range. In this study, we propose LaserAdv, a method that employs lasers to launch adversarial attacks, thereby overcoming the aforementioned challenges due to the superior properties of lasers. In the presence of victim speech, laser adversarial perturbations are superimposed on the speech rather than simply drowning it out, so LaserAdv has higher attack efficiency and longer attack range than LightCommands. LaserAdv introduces a selective amplitude enhancement method based on time-frequency interconversion (SAE-TFI) to deal with distortion. Meanwhile, to simultaneously achieve inaudible, targeted, universal, synchronization-free (over 0.5 s), long-range, and black-box attacks in the physical world, we introduced a series of strategies into the objective function. Our experimental results show that a single perturbation can cause DeepSpeech, Whisper and iFlytek, to misinterpret any of the 12,260 voice commands as the target command with accuracy of up to 100%, 92% and 88%, respectively. The attack distance can be up to 120 m.

MicGuard: A Comprehensive Detection System against Out-of-band Injection Attacks for Different Level Microphone-based Devices

Tiantian Liu, Feng Lin, Zhongjie Ba, Li Lu, Zhan Qin, and Kui Ren, Zhejiang University and Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security

Available Media

The integration of microphones into sensors and systems, serving as input interfaces to intelligent applications and industrial manufacture, has raised growing public concerns regarding their input perception. Studies have uncovered the threat of out-of-band injection attacks on microphones, encompassing ultrasound, laser, and electromagnetic attacks, injecting commands or interferences for malicious intentions. However, existing efforts are limited to defense against ultrasound injections, overlooking the risks posed by other out-of-band injections. To address this gap, this paper proposes MicGuard, a comprehensive passive detection system against out-of-band attacks. Without relying on prior information from attacking and victim devices, the key insight of MicGuard is to utilize carrier traces and spectral chaos observed by remaining injection phenomena across different levels of devices. The carrier traces are used in a prejudgment to fast reject partial injected signals, and the following memory-based detection model to distinguish anomaly based on the quantified chaotic entropy extracted from publicly available audio datasets. MicGuard is evaluated on a wide range of microphone-based devices including sensors, recorders, smartphones, and tablets, achieving an average AUC of 98% with high robustness and universality.

VoltSchemer: Use Voltage Noise to Manipulate Your Wireless Charger

Zihao Zhan and Yirui Yang, University of Florida; Haoqi Shan, University of Florida, CertiK; Hanqiu Wang, Yier Jin, and Shuo Wang, University of Florida

Available Media

Wireless charging is becoming an increasingly popular charging solution in portable electronic products for a more convenient and safer charging experience than conventional wired charging. However, our research identified new vulnerabilities in wireless charging systems, making them susceptible to intentional electromagnetic interference. These vulnerabilities facilitate a set of novel attack vectors, enabling adversaries to manipulate the charger and perform a series of attacks.

In this paper, we propose VoltSchemer, a set of innovative attacks that grant attackers control over commercial-off-the-shelf wireless chargers merely by modulating the voltage from the power supply. These attacks represent the first of its kind, exploiting voltage noises from the power supply to manipulate wireless chargers without necessitating any malicious modifications to the chargers themselves. The significant threats imposed by VoltSchemer are substantiated by three practical attacks, where a charger can be manipulated to: control voice assistants via inaudible voice commands, damage devices being charged through overcharging or overheating, and bypass Qi-standard specified foreign-object-detection mechanism to damage valuable items exposed to intense magnetic fields.

We demonstrate the effectiveness and practicality of the VoltSchemer attacks with successful attacks on 9 top-selling COTS wireless chargers. Furthermore, we discuss the security implications of our findings and suggest possible countermeasures to mitigate potential threats.

VibSpeech: Exploring Practical Wideband Eavesdropping via Bandlimited Signal of Vibration-based Side Channel

Chao Wang, Feng Lin, Hao Yan, and Tong Wu, Zhejiang University; Wenyao Xu, University at Buffalo, the State University of New York; Kui Ren, Zhejiang University

Available Media

Vibration-based side channel is an ever-present threat to speech privacy. However, due to the target's frequency response with a rapid decay or limited sampling rate of malicious sensors, the acquired vibration signals are often distorted and narrowband, which fails an intelligible speech recovery. This paper tries to answer that when the side-channel data has only a very limited bandwidth (<500Hz), is it feasible to achieve a wideband eavesdropping based on a practical assumption? Our answer is YES based on the assumption that a short utterance (2s-4s) of the victim is exposed to the attacker. What is most surprising is that the attack can recover speech with a bandwidth of up to 8kHz. This covers almost all phonemes (voiced and unvoiced) in human speech and causes practical threat. The core idea of the attack is using vocal-tract features extracted from the victim's utterance to compensate for the side-channel data. To demonstrate the threat, we proposed a vocal-guided attack scheme called VibSpeech and built a prototype based on a mmWave sensor to penetrate soundproof walls for vibration sensing. We solved challenges of vibration artifact suppression and a generalized scheme free of any target's training data. We evaluated VibSpeech with extensive experiments and validated it on the IMU-based method. The results indicated that VibSpeech can recover intelligible speech with an average MCD/SNR of 3.9/5.4dB.

Track 3

System Security III: Memory I

Session Chair: Manuel Egele

Salon E

CAMP: Compiler and Allocator-based Heap Memory Protection

Zhenpeng Lin, Zheng Yu, Ziyi Guo, Simone Campanoni, Peter Dinda, and Xinyu Xing, Northwestern University

Available Media

The heap is a critical and widely used component of many applications. Due to its dynamic nature, combined with the complexity of heap management algorithms, it is also a frequent target for security exploits. To enhance the heap's security, various heap protection techniques have been introduced, but they either introduce significant runtime overhead or have limited protection. We present CAMP, a new sanitizer for detecting and capturing heap memory corruption. CAMP leverages a compiler and a customized memory allocator. The compiler adds boundary-checking and escape-tracking instructions to the target program, while the memory allocator tracks memory ranges, coordinates with the instrumentation, and neutralizes dangling pointers. With the novel error detection scheme, CAMP enables various compiler optimization strategies and thus eliminates redundant and unnecessary check instrumentation. This design minimizes runtime overhead without sacrificing security guarantees. Our evaluation and comparison of CAMP with existing tools, using both real-world applications and SPEC CPU benchmarks, show that it provides even better heap corruption detection capability with lower runtime overhead.

GPU Memory Exploitation for Fun and Profit

Yanan Guo, University of Rochester; Zhenkai Zhang, Clemson University; Jun Yang, University of Pittsburgh

Available Media

As modern applications increasingly rely on GPUs to accelerate the computation, it has become very critical to study and understand the security implications of GPUs. In this work, we conduct a thorough examination of buffer overflows on modern GPUs. Specifically, we demonstrate that, due to GPU's unique memory system, GPU programs suffer from different and more complex buffer overflow vulnerabilities compared to CPU programs, contradicting the conclusions of prior studies. In addition, despite the critical role GPUs play in modern computing, GPU systems are missing essential memory protection mechanisms. Consequently, when buffer overflow vulnerabilities are exploited by an attacker, they can lead to both code injection attacks and code reuse attacks, including return-oriented programming (ROP). Our results show that these attacks pose a significant security risk to modern GPU applications.

SLUBStick: Arbitrary Memory Writes through Practical Software Cross-Cache Attacks within the Linux Kernel

Lukas Maar, Stefan Gast, Martin Unterguggenberger, Mathias Oberhuber, and Stefan Mangard, Graz University of Technology

Available Media

While the number of vulnerabilities in the Linux kernel has increased significantly in recent years, most have limited capabilities, such as corrupting a few bytes in restricted allocator caches. To elevate their capabilities, security researchers have proposed software cross-cache attacks, exploiting the memory reuse of the kernel allocator. However, such cross-cache attacks are impractical due to their low success rate of only 40 %, with failure scenarios often resulting in a system crash.

In this paper, we present SLUBStick, a novel kernel exploitation technique elevating a limited heap vulnerability to an arbitrary memory read-and-write primitive. SLUBStick operates in multiple stages: Initially, it exploits a timing side channel of the allocator to perform a cross-cache attack reliably. Concretely, exploiting the side-channel leakage pushes the success rate to above 99 % for frequently used generic caches. SLUBStick then exploits code patterns prevalent in the Linux kernel to convert a limited heap vulnerability into a page table manipulation, thereby granting the capability to read and write memory arbitrarily. We demonstrate the applicability of SLUBStick by systematically analyzing two Linux kernel versions, v5.19 and v6.2. Lastly, we evaluate SLUBStick with a synthetic vulnerability and 9 real-world CVEs, showcasing privilege escalation and container escape in the Linux kernel with state-of-the-art kernel defenses enabled.

Detecting Kernel Memory Bugs through Inconsistent Memory Management Intention Inferences

Dinghao Liu, Zhipeng Lu, and Shouling Ji, Zhejiang University; Kangjie Lu, University of Minnesota; Jianhai Chen and Zhenguang Liu, Zhejiang University; Dexin Liu, Peking University; Renyi Cai, Alibaba Cloud Computing Co., Ltd; Qinming He, Zhejiang University

Available Media

Modern operating system kernels, typically written in low-level languages such as C and C++, are tasked with managing extensive memory resources. Memory-related errors, such as memory leak and memory corruption, are common occurrences and constantly introduced. Traditional detection methods often rely on taint analysis, which suffers from scalability issue (i.e., path explosion) when applied to complex OS kernels. Recent research has pivoted towards leveraging techniques like function pairing or similarity analysis to overcome this challenge. These approaches identify memory errors by referencing code that is either frequently used or semantically similar. However, these techniques have limitations when applied to customized code, which may lack a sufficient corpus of code snippets to facilitate effective function pairing or similarity analysis. This deficiency hinders their applicability in kernel analysis where unique or proprietary code is prevalent.

In this paper, we propose a novel methodology for detecting memory bugs based on inconsistent memory management intentions (IMMI). Our insight is that many memory bugs, despite their varied manifestations, stem from a common underlying issue: the ambiguity in ownership and lifecycle management of memory objects, especially when these objects are passed across various functions. Memory bugs emerge when the mem- ory management strategies of the caller and callee functions misalign for a given memory object. IMMI aims to model and clarify these inconsistent intentions, thereby mitigating the prevalence of such bugs. Our methodology offers two primary advantages over existing techniques: (1) It utilizes a fine-grained memory management model that obviates the need for extensive data-flow tracking, and (2) it does not rely on similarity analysis or the identification of function pairs, making it highly effective in the context of customized code. To enhance the capabilities of IMMI, we have integrated a large language model (LLM) to assist in the interpretation of implicit kernel resource management mechanisms. We have implemented IMMI and evaluated it against the Linux kernel. IMMI effectively found 80 new memory bugs (including 23 memory corruptions and 57 memory leaks) with 35% false positive rate. Most of them are missed by the state-of-the-art memory bug detection tools.

Track 4

Web Security II: Privacy

Session Chair: Ben Stock

Salon F

Near-Optimal Constrained Padding for Object Retrievals with Dependencies

Pranay Jain, Duke University; Andrew C. Reed, United States Military Academy; Michael K. Reiter, Duke University

Available Media

The sizes of objects retrieved over the network are powerful indicators of the objects retrieved and are ingredients in numerous types of traffic analysis, such as webpage fingerprinting. We present an algorithm by which a benevolent object store computes a memoryless padding scheme to pad objects before sending them, in a way that bounds the information gain that the padded sizes provide to the network observer about the objects being retrieved. Moreover, our algorithm innovates over previous works in two critical ways. First, the computed padding scheme satisfies constraints on the padding overhead: no object is padded to more than c x its original size, for a tunable factor c > 1. Second, the privacy guarantees of the padding scheme allow for object retrievals that are not independent, as could be caused by hyperlinking. We show in empirical tests that our padding schemes improve dramatically over previous schemes for padding dependent object retrievals, providing better privacy at substantially lower padding overhead, and over known techniques for padding independent object retrievals subject to padding overhead constraints.

PURL: Safe and Effective Sanitization of Link Decoration

Shaoor Munir and Patrick Lee, University of California, Davis; Umar Iqbal, Washington University in St. Louis; Zubair Shafiq, University of California, Davis; Sandra Siby, Imperial College London

Available Media

While privacy-focused browsers have taken steps to block third-party cookies and mitigate browser fingerprinting, novel tracking techniques that can bypass existing countermeasures continue to emerge. Since trackers need to share information from the client-side to the server-side through link decoration regardless of the tracking technique they employ, a promising orthogonal approach is to detect and sanitize tracking information in decorated links. To this end, we present PURL (pronounced purel-l), a machine learning approach that leverages a cross-layer graph representation of webpage execution to safely and effectively sanitize link decoration. Our evaluation shows that PURL significantly outperforms existing countermeasures in terms of accuracy and reducing website breakage while being robust to common evasion techniques. PURL's deployment on a sample of top-million websites shows that link decoration is abused for tracking on nearly three-quarters of the websites, often to share cookies, email addresses, and fingerprinting information.

Fledging Will Continue Until Privacy Improves: Empirical Analysis of Google's Privacy-Preserving Targeted Advertising

Giuseppe Calderonio, Mir Masood Ali, and Jason Polakis, University of Illinois Chicago

Available Media

Google recently announced plans to phase out third-party cookies and is currently in the process of rolling out the Chrome Privacy Sandbox, a collection of APIs and web standards that offer privacy-preserving alternatives to existing technologies, particularly for the digital advertising ecosystem. This includes FLEDGE, also referred to as the Protected Audience, which provides the necessary mechanisms for effectively conducting real-time bidding and ad auctions directly within users' browsers. FLEDGE is designed to eliminate the invasive data collection and pervasive tracking practices used for remarketing and targeted advertising. In this paper, we provide a study of the FLEDGE ecosystem both before and after its official deployment in Chrome. We find that even though multiple prominent ad platforms have entered the space, Google ran 99.8% of the auctions we observed, highlighting its dominant role. Subsequently, we provide the first in-depth empirical analysis of FLEDGE, and uncover a series of severe design and implementation flaws. We leverage those for conducting 12 novel attacks, including tracking, cross-site leakage, service disruption, and pollution attacks. While FLEDGE aims to enhance user privacy, our research demonstrates that it is currently exposing users to significant risks, and we outline mitigations for addressing the issues that we have uncovered. We have also responsibly disclosed our findings to Google so as to kickstart remediation efforts. We believe that our research highlights the dire need for more in-depth investigations of the entire Privacy Sandbox, due to the massive impact it will have on user privacy.

Stop, Don't Click Here Anymore: Boosting Website Fingerprinting By Considering Sets of Subpages

Asya Mitseva and Andriy Panchenko, Brandenburg University of Technology (BTU Cottbus, Germany)

Available Media

A type of traffic analysis, website fingerprinting (WFP), aims to reveal the website a user visits over an encrypted and anonymized connection by observing and analyzing data flow patterns. Its efficiency against anonymization networks such as Tor has been widely studied, resulting in methods that have steadily increased in both complexity and power. While modern WFP attacks have proven to be highly accurate in laboratory settings, their real-world feasibility is highly debated. These attacks also exclude valuable information by ignoring typical user browsing behavior: users often visit multiple pages of a single website sequentially, e.g., by following links.

In this paper, we aim to provide a more realistic assessment of the degree to which Tor users are exposed to WFP. We propose both a novel WFP attack and efficient strategies for adapting existing methods to account for sequential visits of pages within a website. While existing WFP attacks fail to detect almost any website in real-world settings, our novel methods achieve F1-scores of 1.0 for more than half of the target websites. Our attacks remain robust against state-of-the-art WFP defenses, achieving 2.5 to 5 times the accuracy of prior work, and in some cases even rendering the defenses useless. Our methods enable to estimate and to communicate to the user the risk of successive page visits within a website (even in the presence of noise pages) to stop before the WFP attack reaches a critical level of confidence.

Track 5

ML VIII: Backdoors and Federated Learning

Session Chair: Xiaofeng Wang

Salon G

Lurking in the shadows: Unveiling Stealthy Backdoor Attacks against Personalized Federated Learning

Xiaoting Lyu, Beijing Jiaotong University; Yufei Han, INRIA; Wei Wang, Jingkai Liu, and Yongsheng Zhu, Beijing Jiaotong University; Guangquan Xu, Tianjin University; Jiqiang Liu, Beijing Jiaotong University; Xiangliang Zhang, University of Notre Dame

Available Media

Federated Learning (FL) is a collaborative machine learning technique where multiple clients work together with a central server to train a global model without sharing their private data. However, the distribution shift across non-IID datasets of clients poses a challenge to this one-model-fits-all method hindering the ability of the global model to effectively adapt to each client's unique local data. To echo this challenge, personalized FL (PFL) is designed to allow each client to create personalized local models tailored to their private data.

While extensive research has scrutinized backdoor risks in FL, it has remained underexplored in PFL applications. In this study, we delve deep into the vulnerabilities of PFL to backdoor attacks. Our analysis showcases a tale of two cities. On the one hand, the personalization process in PFL can dilute the backdoor poisoning effects injected into the personalized local models. Furthermore, PFL systems can also deploy both server-end and client-end defense mechanisms to strengthen the barrier against backdoor attacks. On the other hand, our study shows that PFL fortified with these defense methods may offer a false sense of security. We propose PFedBA, a stealthy and effective backdoor attack strategy applicable to PFL systems. PFedBA ingeniously aligns the backdoor learning task with the main learning task of PFL by optimizing the trigger generation process. Our comprehensive experiments demonstrate the effectiveness of PFedBA in seamlessly embedding triggers into personalized local models. PFedBA yields outstanding attack performance across 10 state-of-the-art PFL algorithms, defeating the existing 6 defense mechanisms. Our study sheds light on the subtle yet potent backdoor threats to PFL systems, urging the community to bolster defenses against emerging backdoor challenges.

ACE: A Model Poisoning Attack on Contribution Evaluation Methods in Federated Learning

Zhangchen Xu, Fengqing Jiang, and Luyao Niu, University of Washington; Jinyuan Jia, Pennsylvania State University; Bo Li, University of Chicago; Radha Poovendran, University of Washington

Available Media

In Federated Learning (FL), a set of clients collaboratively train a machine learning model (called global model) without sharing their local training data. The local training data of clients is typically non-i.i.d. and heterogeneous, resulting in varying contributions from individual clients to the final performance of the global model. In response, many contribution evaluation methods were proposed, where the server could evaluate the contribution made by each client and incentivize the high-contributing clients to sustain their long-term participation in FL. Existing studies mainly focus on developing new metrics or algorithms to better measure the contribution of each client. However, the security of contribution evaluation methods of FL operating in adversarial environments is largely unexplored. In this paper, we propose the first model poisoning attack on contribution evaluation methods in FL, termed ACE. Specifically, we show that any malicious client utilizing ACE could manipulate the parameters of its local model such that it is evaluated to have a high contribution by the server, even when its local training data is indeed of low quality. We perform both theoretical analysis and empirical evaluations of ACE. Theoretically, we show our design of ACE can effectively boost the malicious client's perceived contribution when the server employs the widely-used cosine distance metric to measure contribution. Empirically, our results show ACE effectively and efficiently deceive five state-of-the-art contribution evaluation methods. In addition, ACE preserves the accuracy of the final global models on testing inputs. We also explore six countermeasures to defend ACE. Our results show they are inadequate to thwart ACE, highlighting the urgent need for new defenses to safeguard the contribution evaluation methods in FL.

BackdoorIndicator: Leveraging OOD Data for Proactive Backdoor Detection in Federated Learning

Songze Li, Southeast University; Yanbo Dai, HKUST(GZ)

Available Media

In a federated learning (FL) system, decentralized data owners (clients) could upload their locally trained models to a central server, to jointly train a global model. Malicious clients may plant backdoors into the global model through uploading poisoned local models, causing misclassification to a target class when encountering attacker-defined triggers. Existing backdoor defenses show inconsistent performance under different system and adversarial settings, especially when the malicious updates are made statistically close to the benign ones. In this paper, we first reveal the fact that planting subsequent backdoors with the same target label could significantly help to maintain the accuracy of previously planted backdoors, and then propose a novel proactive backdoor detection mechanism for FL named BackdoorIndicator, which has the server inject indicator tasks into the global model leveraging out-of-distribution (OOD) data, and then utilizing the fact that any backdoor samples are OOD samples with respect to benign samples, the server, who is completely agnostic of the potential backdoor types and target labels, can accurately detect the presence of backdoors in uploaded models, via evaluating the indicator tasks. We perform systematic and extensive empirical studies to demonstrate the consistently superior performance and practicality of BackdoorIndicator over baseline defenses, across a wide range of system and adversarial settings.

UBA-Inf: Unlearning Activated Backdoor Attack with Influence-Driven Camouflage

Zirui Huang, Yunlong Mao, and Sheng Zhong, Nanjing University

Available Media

Machine-Learning-as-a-Service (MLaaS) is an emerging product to meet the market demand. However, end users are required to upload data to the remote server when using MLaaS, raising privacy concerns. Since the right to be forgotten came into effect, data unlearning has been widely supported in on-cloud products for removing users' private data from remote datasets and machine learning models. Plenty of machine unlearning methods have been proposed recently to erase the influence of forgotten data. Unfortunately, we find that machine unlearning makes the on-cloud model highly vulnerable to backdoor attacks. In this paper, we report a new threat against models with unlearning enabled and implement an Unlearning Activated Backdoor Attack with Influence-driven camouflage (UBA-Inf). Unlike conventional backdoor attacks, UBA-Inf provides a new backdoor approach for effectiveness and stealthiness by activating the camouflaged backdoor through machine unlearning. The proposed approach can be implemented using off-the-shelf backdoor generating algorithms. Moreover, UBA-Inf is an "on-demand" attack, offering fine-grained control of backdoor activation through unlearning requests, overcoming backdoor vanishing and exposure problems. By extensively evaluating UBA-Inf, we conclude that UBA-Inf is a powerful backdoor approach that improves stealthiness, robustness, and persistence.

Track 6

Software Security + ML 1

Session Chair: Tiffany Bao

Salon H

Racing on the Negative Force: Efficient Vulnerability Root-Cause Analysis through Reinforcement Learning on Counterexamples

Dandan Xu, SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences, China, and School of Cyber Security, University of Chinese Academy of Sciences, China; Di Tang, Yi Chen, and XiaoFeng Wang, Indiana University Bloomington; Kai Chen, SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences, China, and School of Cyber Security, University of Chinese Academy of Sciences, China; Haixu Tang, Indiana University Bloomington; Longxing Li, SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences, China, and School of Cyber Security, University of Chinese Academy of Sciences, China

Available Media

Root-Cause Analysis (RCA) is crucial for discovering security vulnerabilities from fuzzing outcomes. Automating this process through triaging the crashes observed during the fuzzing process, however, is considered to be challenging. Particularly, today's statistical RCA approaches are known to be exceedingly slow, often taking tens of hours or even a week to analyze a crash. This problem comes from the biased sampling such approaches perform. More specifically, given an input inducing a crash in a program, these approaches sample around the input by mutating it to generate new test cases; these cases are used to fuzz the program, in a hope that a set of program elements (blocks, instructions or predicates) on the execution path of the original input can be adequately sampled so their correlations with the crash can be determined. This process, however, tends to generate the input samples more likely causing the crash, with their execution paths involving a similar set of elements, which become less distinguishable until a large number of samples have been made. We found that this problem can be effectively addressed by sampling around "counterexamples'', the inputs causing a significant change to the current estimates of correlations. These inputs though still involving the elements often do not lead to the crash. They are found to be effective in differentiating program elements, thereby accelerating the RCA process. Based upon the understanding, we designed and implemented a reinforcement learning (RL) technique that rewards the operations involving counterexamples. By balancing random sampling with the exploitation on the counterexamples, our new approach, called RACING, is shown to substantially elevate the scalability and the accuracy of today's statistical RCA, outperforming the state-of-the-art by more than an order of magnitude.

Uncovering the Limits of Machine Learning for Automatic Vulnerability Detection

Niklas Risse and Marcel Böhme, MPI-SP, Germany

Available Media

Recent results of machine learning for automatic vulnerability detection (ML4VD) have been very promising. Given only the source code of a function f, ML4VD techniques can decide if f contains a security flaw with up to 70% accuracy. However, as evident in our own experiments, the same top-performing models are unable to distinguish between functions that contain a vulnerability and functions where the vulnerability is patched. So, how can we explain this contradiction and how can we improve the way we evaluate ML4VD techniques to get a better picture of their actual capabilities?

In this paper, we identify overfitting to unrelated features and out-of-distribution generalization as two problems, which are not captured by the traditional approach of evaluating ML4VD techniques. As a remedy, we propose a novel benchmarking methodology to help researchers better evaluate the true capabilities and limits of ML4VD techniques. Specifically, we propose (i) to augment the training and validation dataset according to our cross-validation algorithm, where a semantic preserving transformation is applied during the augmentation of either the training set or the testing set, and (ii) to augment the testing set with code snippets where the vulnerabilities are patched.

Using six ML4VD techniques and two datasets, we find (a) that state-of-the-art models severely overfit to unrelated features for predicting the vulnerabilities in the testing data, (b) that the performance gained by data augmentation does not generalize beyond the specific augmentations applied during training, and (c) that state-of-the-art ML4VD techniques are unable to distinguish vulnerable functions from their patches.

Improving ML-based Binary Function Similarity Detection by Assessing and Deprioritizing Control Flow Graph Features

Jialai Wang, Tsinghua University; Chao Zhang, Tsinghua University and Zhongguancun Laboratory; Longfei Chen and Yi Rong, Tsinghua University; Yuxiao Wu, Huazhong University of Science and Technology; Hao Wang, Wende Tan, and Qi Li, Tsinghua University; Zongpeng Li, Tsinghua University and Quancheng Labs

Available Media

Machine learning-based binary function similarity detection (ML-BFSD) has witnessed significant progress recently. They often choose control flow graph (CFG) as an important feature to learn out of functions, as CFGs characterize the control dependencies between basic code blocks. However, the exact role of CFGs in model decisions is not explored, and the extent to which CFGs might lead to model errors is unknown. This work takes a first step towards assessing the role of CFGs in ML-BFSD solutions both theoretically and practically, and promotes their performance accordingly. First, we adapt existing explanation methods to interpreting ML-BFSD solutions, and theoretically reveal that existing models heavily rely on CFG features. Then, we design a solution deltaCFG to manipulate CFGs and practically demonstrate the lack of robustness of existing models. We have extensively evaluated deltaCFG on 11 state-of-the-art (SOTA) ML-BFSD solutions, and find that the models' results would flip if we manipulate the query functions' CFGs but keep semantics, showing that most models have bias on CFG features. Our theoretic and practical assessment solutions can also serve as a robustness validator for the development of future ML-BFSD solutions. Lastly, we present a solution to utilize deltaCFG to augment training data, which helps deprioritize CFG features and enhance the performance of existing ML-BFSD solutions. Evaluation results show that, MRR, Recall@1, AUC and F1 score of existing models are improved by up to 10.1%, 12.7%, 5.1%, and 27.2% respectively, proving that reducing the models' bias on CFG features could improve their performance.

TYGR: Type Inference on Stripped Binaries using Graph Neural Networks

Chang Zhu, Arizona State University; Ziyang Li and Anton Xue, University of Pennsylvania; Ati Priya Bajaj, Wil Gibbs, and Yibo Liu, Arizona State University; Rajeev Alur, University of Pennsylvania; Tiffany Bao, Arizona State University; Hanjun Dai, Google; Adam Doupé, Arizona State University; Mayur Naik, University of Pennsylvania; Yan Shoshitaishvili and Ruoyu Wang, Arizona State University; Aravind Machiry, Purdue University

Available Media

Binary type inference is a core research challenge in binary program analysis and reverse engineering. It concerns identifying the data types of registers and memory values in a stripped executable (or object file), whose type information is discarded during compilation. Current methods rely on either manually crafted inference rules, which are brittle and demand significant effort to update, or machine learning-based approaches that suffer from low accuracy.

In this paper we propose TYGR, a graph neural network based solution that encodes data-flow information for inferring both basic and struct variable types in stripped binary programs. To support different architectures and compiler optimizations, TYGR was implemented on top of the ANGR binary analysis platform and uses an architecture-agnostic data-flow analysis to extract a graph-based intra-procedural representation of data-flow information.

We noticed a severe lack of diversity in existing binary executables datasets and created TyDa, a large dataset of diverse binary executables. The sole publicly available dataset, provided by STATEFORMER, contains only 1% of the total number of functions in TyDa. TYGR is trained and evaluated on a subset of TyDa and generalizes to the rest of the dataset. TYGR demonstrates an overall accuracy of 76.6% and struct type accuracy of 45.2% on the x64 dataset across four optimization levels (O0-O3). TYGR outperforms existing works by a minimum of 26.1% in overall accuracy and 10.2% in struct accuracy.

Track 7

Crypto III: Password and Secret Key

Session Chair: Kassem Fawaz

Salon IJK

MFKDF: Multiple Factors Knocked Down Flat

Matteo Scarlata and Matilda Backendal, ETH Zurich; Miro Haller, UC San Diego

Available Media

Nair and Song (USENIX 2023) introduce the concept of a Multi-Factor Key Derivation Function (MFKDF), along with constructions and a security analysis. MFKDF integrates dynamic authentication factors, such as HOTP and hardware tokens, into password-based key derivation. The aim is to improve the security of password-derived keys, which can then be used for encryption or as an alternative to multi-factor authentication. The authors claim an exponential security improvement compared to traditional password-based key derivation functions (PBKDF).

We show that the MFKDF constructions proposed by Nair and Song fall short of the stated security goals. Underspecified cryptographic primitives and the lack of integrity of the MFKDF state lead to several attacks, ranging from full key recovery when an HOTP factor is compromised, to bypassing factors entirely or severely reducing their entropy. We reflect on the different threat models of key-derivation and authentication, and conclude that MFKDF is always weaker than plain PBKDF and multi-factor authentication in each setting.

LaKey: Efficient Lattice-Based Distributed PRFs Enable Scalable Distributed Key Management

Matthias Geihs, Torus Labs; Hart Montgomery, Linux Foundation

Available Media

Distributed key management (DKM) services are multi-party services that allow their users to outsource the generation, storage, and usage of cryptographic private keys, while guaranteeing that none of the involved service providers learn the private keys in the clear. This is typically achieved through distributed key generation (DKG) protocols, where the service providers generate the keys on behalf of the users in an interactive protocol, and each of the servers stores a share of each key as the result. However, with traditional DKM systems, the key material stored by each server grows linearly with the number of users.

An alternative approach to DKM is via distributed key derivation (DKD) where the user key shares are derived on-demand from a constant-size (in the number of users) secret-shared master key and the corresponding user's identity, which is achieved by employing a suitable distributed pseudorandom function (dPRF). However, existing suitable dPRFs require on the order of 100 interaction rounds between the servers and are therefore insufficient for settings with high network latency and where users demand real-time interaction.

To resolve the situation, we initiate the study of lattice-based distributed PRFs, with a particular focus on their application to DKD. Concretely, we show that the LWE-based PRF presented by Boneh et al. at CRYPTO'13 can be turned into a distributed PRF suitable for DKD that runs in only 8 online rounds, which is an improvement over the start-of-the-art by an order of magnitude. We further present optimizations of this basic construction. We show a new construction with improved communication efficiency proven secure under the same "standard" assumptions. Then, we present even more efficient constructions, running in as low as 5 online rounds, from non-standard, new lattice-based assumptions. We support our findings by implementing and evaluating our protocol using the MP-SPDZ framework (Keller, CCS '20). Finally, we give a formal definition of our DKD in the UC framework and prove a generic construction (for which our construction qualifies) secure in this model.

Exploiting Leakage in Password Managers via Injection Attacks

Andrés Fábrega, Armin Namavari, and Rachit Agarwal, Cornell University; Ben Nassi, Cornell Tech, Technion - Israel Institute of Technology; Thomas Ristenpart, Cornell University, Cornell Tech

Available Media

This work explores injection attacks against password managers. In this setting, the adversary (only) controls their own application client, which they use to ''inject" chosen payloads to a victim's client via, for example, sharing credentials with them. The injections are interleaved with adversarial observations of some form of protected state (such as encrypted vault exports or the network traffic received by the application servers), from which the adversary backs out confidential information. We uncover a series of general design patterns in popular password managers that lead to vulnerabilities allowing an adversary to efficiently recover passwords, URLs, usernames, and attachments. We develop general attack templates to exploit these design patterns and experimentally showcase their practical efficacy via analysis of ten distinct password manager applications. We disclosed our findings to these vendors, many of which deployed mitigations.

OPTIKS: An Optimized Key Transparency System

Julia Len, Cornell Tech; Melissa Chase, Esha Ghosh, Kim Laine, and Radames Cruz Moreno, Microsoft Research

Available Media

Key Transparency (KT) refers to a public key distribution system with transparency mechanisms proving its correct operation, i.e., proving that it reports consistent values for each user's public key. While prior work on KT systems have offered new designs to tackle this problem, relatively little attention has been paid on the issue of scalability. Indeed, it is not straightforward to actually build a scalable and practical KT system from existing constructions, which may be too complex, inefficient, or non-resilient against machine failures.

In this paper, we present OPTIKS, a full featured and optimized KT system that focuses on scalability. Our system is simpler and more performant than prior work, supporting smaller storage overhead while still meeting strong notions of security and privacy. Our design also incorporates a crash-tolerant and scalable server architecture, which we demonstrate by presenting extensive benchmarks. Finally, we address several real-world problems in deploying KT systems that have received limited attention in prior work, including account decommissioning and user-to-device mapping.

4:15 pm–4:30 pm

Short Break

Grand Ballroom Foyer

4:30 pm–5:30 pm

Track 1

Social Issues III: Social Media Platform

Session Chair: Shuang Hao

Salon AB

Understanding the Security and Privacy Implications of Online Toxic Content on Refugees

Arjun Arunasalam, Purdue University; Habiba Farrukh, University of California, Irvine; Eliz Tekcan and Z. Berkay Celik, Purdue University

Available Media

Deteriorating conditions in regions facing social and political turmoil have resulted in the displacement of huge populations known as refugees. Technologies such as social media have helped refugees adapt to challenges in their new homes. While prior works have investigated refugees' computer security and privacy (S&P) concerns, refugees' increasing exposure to toxic content and its implications have remained largely unexplored. In this paper, we answer how toxic content can influence refugees' S&P actions, goals, and barriers, and how their experiences shape these factors. Through semi-structured interviews with refugee liaisons (n=12), focus groups (n=9, 27 participants), and an online survey (n=29) with refugees, we discover unique attack contexts (e.g., participants are targeted after responding to posts directed against refugees) and how intersecting identities (e.g., LGBTQ+, women) exacerbate attacks. In response to attacks, refugees take immediate actions (e.g., selective blocking) or long-term behavioral shifts (e.g., ensuring uploaded photos are void of landmarks) These measures minimize vulnerability and discourage attacks, among other goals, while participants acknowledge barriers to measures (e.g., anonymity impedes family reunification). Our findings highlight lessons in better equipping refugees to manage toxic content attacks.

Understanding Help-Seeking and Help-Giving on Social Media for Image-Based Sexual Abuse

Miranda Wei, University of Washington / Google; Sunny Consolvo and Patrick Gage Kelley, Google; Tadayoshi Kohno, University of Washington; Tara Matthews and Sarah Meiklejohn, Google; Franziska Roesner, University of Washington; Renee Shelby, Kurt Thomas, and Rebecca Umbach, Google

Available Media

Image-based sexual abuse (IBSA), like other forms of technology-facilitated abuse, is a growing threat to people's digital safety. Attacks include unwanted solicitations for sexually explicit images, extorting people under threat of leaking their images, or purposefully leaking images to enact revenge or exert control. In this paper, we explore how people seek and receive help for IBSA on social media. Specifically, we identify over 100,000 Reddit posts that engage relationship and advice communities for help related to IBSA. We draw on a stratified sample of 261 posts to qualitatively examine how various types of IBSA unfold, including the mapping of gender, relationship dynamics, and technology involvement to different types of IBSA. We also explore the support needs of victim-survivors experiencing IBSA and how communities help victim-survivors navigate their abuse through technical, emotional, and relationship advice. Finally, we highlight sociotechnical gaps in connecting victim-survivors with important care, regardless of whom they turn to for help.

Enabling Contextual Soft Moderation on Social Media through Contrastive Textual Deviation

Pujan Paudel, Mohammad Hammas Saeed, Rebecca Auger, Chris Wells, and Gianluca Stringhini, Boston University

Available Media

Automated soft moderation systems are unable to ascertain if a post supports or refutes a false claim, resulting in a large number of contextual false positives. This limits their effectiveness, for example undermining trust in health experts by adding warnings to their posts or resorting to vague warnings instead of granular fact-checks, which result in desensitizing users. In this paper, we propose to incorporate stance detection into existing automated soft-moderation pipelines, with the goal of ruling out contextual false positives and providing more precise recommendations for social media content that should receive warnings. We develop a textual deviation task called Contrastive Textual Deviation (CTD), and show that it outperforms existing stance detection approaches when applied to soft moderation. We then integrate CTD into the state-of-the-art system for automated soft moderation Lambretta, showing that our approach can reduce contextual false positives from 20% to 2.1%, providing another important building block towards deploying reliable automated soft moderation tools on social media.

The Imitation Game: Exploring Brand Impersonation Attacks on Social Media Platforms

Bhupendra Acharya, CISPA Helmholtz Center for Information Security; Dario Lazzaro, University of Genoa; Efrén López-Morales, Texas A&M University-Corpus Christi; Adam Oest and Muhammad Saad, PayPal Inc.; Antonio Emanuele Cinà, University of Genoa; Lea Schönherr and Thorsten Holz, CISPA Helmholtz Center for Information Security

Available Media

The rise of social media users has led to an increase in customer support services offered by brands on various platforms. Unfortunately, attackers also use this as an opportunity to trick victims through fake profiles that imitate official brand accounts. In this work, we provide a comprehensive overview of such brand impersonation attacks on social media.

We analyze the fake profile creation and user engagement processes on X, Instagram, Telegram, and YouTube and quantify their impact. Between May and October 2023, we collected 1.3 million user profiles, 33 million posts, and publicly available profile metadata, wherein we found 349,411 squatted accounts targeting 2,625 of 2,847 major international brands. Analyzing profile engagement and user creation techniques, we show that squatting profiles persistently perform various novel attacks in addition to classic abuse such as social engineering, phishing, and copyright infringement. By sharing our findings with the top 100 brands and collaborating with one of them, we further validate the real-world implications of such abuse. Our research highlights a weakness in the ability of social media platforms to protect brands and users from attacks based on username squatting. Alongside strategies such as customer education and clear indicators of trust, our detection model can be used by platforms as a countermeasure to proactively detect abusive accounts.

Track 2

Wireless Security I: Cellular and Bluetooth

Session Chair: Syed Rafiul Hussain

Salon CD

Hermes: Unlocking Security Analysis of Cellular Network Protocols by Synthesizing Finite State Machines from Natural Language Specifications

Abdullah Al Ishtiaq, Sarkar Snigdha Sarathi Das, Syed Md Mukit Rashid, Ali Ranjbar, Kai Tu, Tianwei Wu, Zhezheng Song, Weixuan Wang, Mujtahid Akon, Rui Zhang, and Syed Rafiul Hussain, Pennsylvania State University

Available Media

In this paper, we present Hermes, an end-to-end framework to automatically generate formal representations from natural language cellular specifications. We first develop a neural constituency parser, NEUTREX, to process transition-relevant texts and extract transition components (i.e., states, conditions, and actions). We also design a domain-specific language to translate these transition components to logical formulas by leveraging dependency parse trees. Finally, we compile these logical formulas to generate transitions and create the formal model as finite state machines. To demonstrate the effectiveness of Hermes, we evaluate it on 4G NAS, 5G NAS, and 5G RRC specifications and obtain an overall accuracy of 81-87%, which is a substantial improvement over the state-of-the-art. Our security analysis of the extracted models uncovers 3 new vulnerabilities and identifies 19 previous attacks in 4G and 5G specifications, and 7 deviations in commercial 4G basebands.

On the Criticality of Integrity Protection in 5G Fronthaul Networks

Jiarong Xing, Rice University; Sophia Yoo, Princeton University; Xenofon Foukas, Microsoft; Daehyeok Kim, The University of Texas at Austin; Michael K. Reiter, Duke University

Available Media

The modern 5G fronthaul, which connects the base stations to radio units in cellular networks, is designed to deliver microsecond-level performance guarantees using Ethernet-based protocols. Unfortunately, due to potential performance overheads, as well as misconceptions about the low risk and impact of possible attacks, integrity protection is not considered a mandatory feature in the 5G fronthaul standards. In this work, we show how vulnerabilities from the lack of protection can be exploited, making attacks easier and more powerful than ever. We present a novel class of powerful attacks and a set of traditional attacks, which can both be fully launched from software over open packet-based interfaces, to cause performance degradation or denial of service to users over large geographical regions. Our attacks do not require a physical radio presence or signal-based attack mechanisms, do not affect the network's operation (e.g., not crashing the radios), and are highly severe (e.g., impacting multiple cells). We demonstrate the impact of our attacks in an end-to-end manner on a commercial-grade, multi-cell 5G testbed, showing that adversaries can degrade performance of connected users by more than 80%, completely block a selected subset of users from ever attaching to the cell, or even generate signaling storm attacks of more than 2500 signaling messages per minute, with just two compromised cells and four mobile users. We also present an analysis of countermeasures that meet the strict performance requirements of the fronthaul.

SIMurai: Slicing Through the Complexity of SIM Card Security Research

Tomasz Piotr Lisowski, University of Birmingham; Merlin Chlosta, CISPA Helmholtz Center for Information Security; Jinjin Wang and Marius Muench, University of Birmingham

Available Media

SIM cards are widely regarded as trusted entities within mobile networks. But what if they were not trustworthy? In this paper, we argue that malicious SIM cards are a realistic threat, and demonstrate that they can launch impactful attacks against mobile devices and their basebands.

We design and implement SIMURAI, a software platform for security-focused SIM exploration and experimentation. At its core, SIMURAI features a flexible software implementation of a SIM. In contrast to existing SIM research tooling that typically involves physical SIM cards, SIMURAI adds flexibility by enabling deliberate violation of application-level and transmission-level behavior—a valuable asset for further exploration of SIM features and attack capabilities.

We integrate the platform into common cellular security test beds, demonstrating that smartphones can successfully connect to mobile networks using our software SIM. Additionally, we integrate SIMURAI with emulated baseband firmwares and carry out a fuzzing campaign that leads to the discovery of two high-severity vulnerabilities on recent flagship smartphones. We also demonstrate how rogue carriers and attackers with physical access can trigger these vulnerabilities with ease, emphasizing the need to recognize hostile SIMs in cellular security threat models.

Finding Traceability Attacks in the Bluetooth Low Energy Specification and Its Implementations

Jianliang Wu, Purdue University & Simon Fraser University; Patrick Traynor, University of Florida; Dongyan Xu, Dave (Jing) Tian, and Antonio Bianchi, Purdue University

Available Media

Bluetooth Low Energy (BLE) provides an efficient and convenient means for connecting a wide range of devices and peripherals. While its designers attempted to make tracking devices difficult through the use of MAC address randomization, a comprehensive analysis of the untraceability for the entire BLE protocol has not previously been conducted. In this paper, we create a formal model for BLE untraceability to reason about additional ways in which the specification allows for user tracking. Our model, implemented using ProVerif, transforms the untraceability problem into a reachability problem, and uncovers four previously unknown issues, namely IRK (Identity Resolving Key) reuse, BD_ADDR (MAC Address of Bluetooth Classic) reuse, CSRK (Connection Signature Resolving Key) reuse, and ID_ADDR (Identity Address) reuse, enabling eight passive or active tracking attacks against BLE. We then build another formal model using Diff-Equivalence (DE) as a comparison to our reachability model. Our evaluation of the two models demonstrates the soundness of our reachability model, whereas the DE model is neither sound nor complete. We further confirm these vulnerabilities in 13 different devices, ranging from embedded systems to laptop computers, with each device having at least 2 of the 4 issues. We finally provide mitigations for both developers and end users. In so doing, we demonstrate that BLE systems remain trackable under several common scenarios.

Track 3

Mobile Security II

Session Chair: Sven Bugiel

Salon E

Defects-in-Depth: Analyzing the Integration of Effective Defenses against One-Day Exploits in Android Kernels

Lukas Maar, Graz University of Technology; Florian Draschbacher, Graz University of Technology and A-SIT Austria, Graz; Lukas Lamster and Stefan Mangard, Graz University of Technology

Available Media

With the mobile phone market exceeding one billion units sold in 2023, ensuring the security of these devices is critical. However, recent research has revealed worrying delays in the deployment of security-critical kernel patches, leaving devices vulnerable to publicly known one-day exploits. While the mainline Android kernel has seen an increase in defense mechanisms, their integration and effectiveness in vendor-supplied kernels are unknown at a large scale.

In this paper, we systematically analyze publicly available one-day exploits targeting the Android kernel over the past three years. We identify multiple exploitation flows representing vulnerability-agnostic strategies to gain high privileges. We then demonstrate that integrating defense-in-depth mechanisms from the mainline Android kernel could mitigate 84.6 % of these exploitation flows. In a subsequent analysis of 994 devices, we reveal a widespread absence of effective defenses across vendors. Depending on the vendor, only 28.8 % to 54.6 % of exploitation flows are mitigated, indicating a 4.62 to 2.951 times worse scenario than the mainline kernel.

Further delving into defense mechanisms, we reveal weaknesses in vendor-specific defenses and advanced exploitation techniques bypassing defense implementations. As these developments pose additional threats, we discuss potential solutions. Lastly, we discuss factors contributing to the absence of effective defenses and offer improvement recommendations. We envision that our findings will guide the inclusion of effective defenses, ultimately enhancing Android security.

Exploring Covert Third-party Identifiers through External Storage in the Android New Era

Zikan Dong, Beijing University of Posts and Telecommunications; Tianming Liu, Monash University/Huazhong University of Science and Technology; Jiapeng Deng and Haoyu Wang, Huazhong University of Science and Technology; Li Li, Beihang University; Minghui Yang and Meng Wang, OPPO; Guosheng Xu, Beijing University of Posts and Telecommunications; Guoai Xu, Harbin Institute of Technology, Shenzhen

Available Media

Third-party tracking plays a vital role in the mobile app ecosystem, which relies on identifiers to gather user data across multiple apps. In the early days of Android, tracking SDKs could effortlessly access non-resettable hardware identifiers for third-party tracking. However, as privacy concerns mounted, Google has progressively restricted device identifier usage through Android system updates. In the new era, tracking SDKs are only allowed to employ user-resettable identifiers which users can also opt out of, prompting SDKs to seek alternative methods for reliable user identification across apps. In this paper, we systematically explore the practice of third-party tracking SDKs covertly storing their own generated identifiers on external storage, thereby circumventing Android's identifier usage restriction and posing a considerable threat to user privacy. We devise an analysis pipeline for an extensive large-scale investigation of this phenomenon, leveraging kernel-level instrumentation and UI testing techniques to automate the recording of app file operations at runtime. Applying our pipeline to 8,000 Android apps, we identified 17 third-party tracking SDKs that store identifiers on external storage. Our analysis reveals that these SDKs employ a range of storage techniques, including hidden files and attaching to existing media files, to make their identifiers more discreet and persistent. We also found that most SDKs lack adequate security measures, compromising the confidentiality and integrity of identifiers and enabling deliberate attacks. Furthermore, we examined the impact of Scoped Storage - Android's latest defense mechanism for external storage on these covert third-party identifiers, and proposed a viable exploit that breaches such a defense mechanism. Our work underscores the need for greater scrutiny of third-party tracking practices and better solutions to safeguard user privacy in the Android ecosystem.

PURE: Payments with UWB RElay-protection

Daniele Coppola, Giovanni Camurati, Claudio Anliker, Xenia Hofmeier, Patrick Schaller, David Basin, and Srdjan Capkun, ETH Zurich

Distinguished Paper Award Winner

Available Media

Contactless payments are now widely used and are expected to reach $10 trillion worth of transactions by 2027. Although convenient, contactless payments are vulnerable to relay attacks that enable attackers to execute fraudulent payments. A number of countermeasures have been proposed to address this issue, including Mastercard's relay protection mechanism. These countermeasures, although effective against some Commercial off-the-shelf (COTS) relays, fail to prevent physical-layer relay attacks.

In this work, we leverage the Ultra-Wide Band (UWB) radios incorporated in major smartphones, smartwatches, tags and accessories, and introduce PURE, the first UWB-based relay protection that integrates smoothly into existing contactless payment standards, and prevents even the most sophisticated physical layer attacks. PURE extends EMV payment protocols that are executed between cards and terminals, and does not require any modification to the backend of the issuer, acquirer, or payment network. PURE further tailors UWB ranging to the payment environment (i.e., wireless channels) to achieve both reliability and resistance to all known physical-layer distance reduction attacks against UWB 802.15.4z. We implement PURE within the EMV standard on modern smartphones, and evaluate its performance in a realistic deployment. Our experiments show that PURE provides a sub-meter relay protection with minimal execution overhead (41 ms). We formally verify the security of PURE's integration within Mastercard's EMV protocol using the Tamarin prover.

Do You See How I Pose? Using Poses as an Implicit Authentication Factor for QR Code Payment

Chuxiong Wu and Qiang Zeng, George Mason University

Available Media

QR code payment has gained enormous popularity in the realm of mobile transactions, but concerns regarding its security keep growing. To bolster the security of QR code payment, we propose pQRAuth, an innovative implicit second-factor authentication approach that exploits smartphone poses. In the proposed approach, when a consumer presents a payment QR code on her smartphone to a merchant's QR code scanner, the scanner's camera captures not only the QR code itself but also the smartphone's poses. By utilizing poses as an additional factor, in conjunction with QR code decoding, the scanner verifies the authenticity of the smartphone presenting the QR code. Our comprehensive evaluation demonstrates the effectiveness of pQRAuth, affirming its security, accuracy and robustness.

Track 4

Measurement IV: Web

Session Chair: Aurore Fass

Salon F

Simulated Stress: A Case Study of the Effects of a Simulated Phishing Campaign on Employees' Perception, Stress and Self-Efficacy

Markus Schöps, Marco Gutfleisch, Eric Wolter, and M. Angela Sasse, Ruhr University Bochum

Available Media

Many organizations are concerned about being attacked by phishing emails and buy Simulated Phishing Campaigns (SPC) to measure and reduce their employees' susceptibility to these attacks. Whilst some prior studies reported reduced click rates after SPCs, others have raised concerns that it may have undesirable side effects: causing some employees stress, and/or reducing their self-efficacy. This would be counterproductive, since stress and self-efficacy play a key role in learning and behavior change. We report the first study in which stress and self-efficacy were measured with n = 408 employees immediately after they clicked on or reported a simulated phishing email they received as part of an SPC in a large organization. To obtain richer data how employees experienced the SPC, we conducted semi-structured interviews with n = 21 employees. We find that participants who clicked on and reported simulated phishing emails generally perceived SPCs as positive and effective, even though recent research casts doubt on this effectiveness. We further find that participants who clicked on simulated phishing emails had significantly higher stress levels and significantly lower phishing self-efficacy than participants who reported them. We further discuss the impact of our findings and conclude that the effect of SPCs on the perceived stress of employees is an important relationship that needs to be investigated in future studies.

Arcanum: Detecting and Evaluating the Privacy Risks of Browser Extensions on Web Pages and Web Content

Qinge Xie, Manoj Vignesh Kasi Murali, Paul Pearce, and Frank Li, Georgia Institute of Technology

Available Media

Modern web browsers support rich extension ecosystems that provide users with customized and flexible browsing experiences. Unfortunately, the flexibility of extensions also introduces the potential for abuse, as an extension with sufficient permissions can access and surreptitiously leak sensitive and private browsing data to the extension's authors or third parties. Prior work has explored such extension behavior, but has been limited largely to meta-data about browsing rather than the contents of web pages, and is also based on older versions of browsers, web standards, and APIs, precluding its use for analysis in a modern setting.

In this work, we develop Arcanum, a dynamic taint tracking system for modern Chrome extensions designed to monitor the flow of user content from web pages. Arcanum defines a variety of taint sources and sinks, allowing researchers to taint specific parts of pages at runtime via JavaScript, and works on modern extension APIs, JavaScript APIs, and versions of Chromium. We deploy Arcanum to test all functional extensions currently in the Chrome Web Store for the automated exfiltration of user data across seven sensitive websites: Amazon, Facebook, Gmail, Instagram, LinkedIn, Outlook, and PayPal. We observe significant privacy risks across thousands of extensions, including hundreds of extensions automatically extracting user content from within web pages, impacting millions of users. Our findings demonstrate the importance of user content within web pages, and the need for stricter privacy controls on extensions.

Smudged Fingerprints: Characterizing and Improving the Performance of Web Application Fingerprinting

Brian Kondracki and Nick Nikiforakis, Stony Brook University

Distinguished Paper Award Winner

Available Media

Open-source web applications have given everyone the ability to deploy complex web applications on their site(s), ranging from blogs and personal clouds, to server administration tools and webmail clients. Given that there exists millions of deployments of this software in the wild, the ability to fingerprint a particular release of a web application residing at a web endpoint is of interest to both attackers and defenders alike.

In this work, we study modern web application fingerprinting techniques and identify their inherent strengths and weaknesses. We design WASABO, a web application testing framework and use it to measure the performance of six web application fingerprinting tools against 1,360 releases of popular web applications. While 94.8% of all web application releases were correctly labeled by at least one fingerprinting tool in ideal conditions, many tools are unable to produce a single version prediction for a particular release. This leads to instances where a release is labeled as multiple disparate versions, resulting in administrator confusion on the security posture of an unknown web application.

We also measure the accuracy of each tool against real-world deployments of the studied web applications, observing up to an 80% drop-off in performance compared to our offline results. To identify causes for this performance degradation, as well as to improve the robustness of these tools in the wild, we design a web-application-agnostic middleware which applies a series of transformations to the traffic of each fingerprinting tool. Overall, we are able to improve the performance of popular web application fingerprinting tools by up to 22.9%, without any modification to the evaluated tools.

Does Online Anonymous Market Vendor Reputation Matter?

Alejandro Cuevas and Nicolas Christin, Carnegie Mellon University

Available Media

Reputation is crucial for trust in underground markets such as online anonymous marketplaces (OAMs), where there is little recourse against unscrupulous vendors. These markets rely on eBay-like feedback scores and forum reviews as reputation signals to ensure market safety, driving away dishonest vendors and flagging low-quality or dangerous products. Despite their importance, there has been scant work exploring the correlation (or lack thereof) between reputation signals and vendor success. To fill this gap, we study vendor success from two angles: (i) longevity and (ii) future financial success, by studying eight OAMs from 2011 to 2023. We complement market data with social network features extracted from a OAM forum, and by qualitatively coding reputation signals from over 15,000 posts and comments across two subreddits. Using survival analysis techniques and simple Random Forest models, we show that feedback scores (including those imported from other markets) can explain vendors' longevity, but fail to predict vendor disappearance in the short term. Further, feedback scores are not the main predictors of future financial success. Rather, vendors who quickly generate revenue when they start on a market typically end up acquiring the most wealth overall. We show that our models generalize across different markets and time periods spanning over a decade. Our findings provide empirical insights into early identification of potential high-scale vendors, effectiveness of "reputation poisoning" strategies, and how reputation systems could contribute to harm reduction in OAMs. We find in particular that, despite their coarseness, existing reputation signals are useful to identify potentially dishonest sellers, and highlight some possible improvements.

Track 5

LLM II: Jailbreaking

Session Chair: Hongxin Hu

Salon G

LLM-Fuzzer: Scaling Assessment of Large Language Model Jailbreaks

Jiahao Yu, Northwestern University; Xingwei Lin, Ant Group; Zheng Yu and Xinyu Xing, Northwestern University

Available Media

The jailbreak threat poses a significant concern for Large Language Models (LLMs), primarily due to their potential to generate content at scale. If not properly controlled, LLMs can be exploited to produce undesirable outcomes, including the dissemination of misinformation, offensive content, and other forms of harmful or unethical behavior. To tackle this pressing issue, researchers and developers often rely on red-team efforts to manually create adversarial inputs and prompts designed to push LLMs into generating harmful, biased, or inappropriate content. However, this approach encounters serious scalability challenges.

To address these scalability issues, we introduce an automated solution for large-scale LLM jailbreak susceptibility assessment called LLM-Fuzzer. Inspired by fuzz testing, LLM-Fuzzer uses human-crafted jailbreak prompts as starting points. By employing carefully customized seed selection strategies and mutation mechanisms, LLM-Fuzzer generates additional jailbreak prompts tailored to specific LLMs. Our experiments show that LLM-Fuzzer-generated jailbreak prompts demonstrate significantly increased exploitability and transferability. This highlights that many open-source and commercial LLMs suffer from severe jailbreak issues, even after safety fine-tuning.

Don't Listen To Me: Understanding and Exploring Jailbreak Prompts of Large Language Models

Zhiyuan Yu, Washington University in St. Louis; Xiaogeng Liu, University of Wisconsin, Madison; Shunning Liang, Washington University in St. Louis; Zach Cameron, John Burroughs School; Chaowei Xiao, University of Wisconsin, Madison; Ning Zhang, Washington University in St. Louis

Distinguished Paper Award Winner

Available Media

Recent advancements in generative AI have enabled ubiquitous access to large language models (LLMs). Empowered by their exceptional capabilities to understand and generate human-like text, these models are being increasingly integrated into our society. At the same time, there are also concerns on the potential misuse of this powerful technology, prompting defensive measures from service providers. To overcome such protection, jailbreaking prompts have recently emerged as one of the most effective mechanisms to circumvent security restrictions and elicit harmful content originally designed to be prohibited.

Due to the rapid development of LLMs and their ease of access via natural languages, the frontline of jailbreak prompts is largely seen in online forums and among hobbyists. To gain a better understanding of the threat landscape of semantically meaningful jailbreak prompts, we systemized existing prompts and measured their jailbreak effectiveness empirically. Further, we conducted a user study involving 92 participants with diverse backgrounds to unveil the process of manually creating jailbreak prompts. We observed that users often succeeded in jailbreak prompts generation regardless of their expertise in LLMs. Building on the insights from the user study, we also developed a system using AI as the assistant to automate the process of jailbreak prompt generation.

Malla: Demystifying Real-world Large Language Model Integrated Malicious Services

Zilong Lin, Jian Cui, Xiaojing Liao, and XiaoFeng Wang, Indiana University Bloomington

Available Media

The underground exploitation of large language models (LLMs) for malicious services (i.e., Malla) is witnessing an uptick, amplifying the cyber threat landscape and posing questions about the trustworthiness of LLM technologies. However, there has been little effort to understand this new cybercrime, in terms of its magnitude, impact, and techniques. In this paper, we conduct the first systematic study on 212 real-world Mallas, uncovering their proliferation in underground marketplaces and exposing their operational modalities. Our study discloses the Malla ecosystem, revealing its significant growth and impact on today's public LLM services. Through examining 212 Mallas, we uncovered eight backend LLMs used by Mallas, along with 182 prompts that circumvent the protective measures of public LLM APIs. We further demystify the tactics employed by Mallas, including the abuse of uncensored LLMs and the exploitation of public LLM APIs through jailbreak prompts. Our findings enable a better understanding of the real-world exploitation of LLMs by cybercriminals, offering insights into strategies to counteract this cybercrime.

Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction

Tong Liu and Yingjie Zhang, Institute of Information Engineering, Chinese Academy of Sciences and School of Cyber Security, University of Chinese Academy of Sciences; Zhe Zhao, RealAI; Yinpeng Dong, RealAI and Tsinghua University; Guozhu Meng and Kai Chen, Institute of Information Engineering, Chinese Academy of Sciences and School of Cyber Security, University of Chinese Academy of Sciences

Available Media

In recent years, large language models (LLMs) have demonstrated notable success across various tasks, but the trustworthiness of LLMs is still an open problem. One specific threat is the potential to generate toxic or harmful responses. Attackers can craft adversarial prompts that induce harmful responses from LLMs. In this work, we pioneer a theoretical foundation in LLMs security by identifying bias vulnerabilities within the safety fine-tuning and design a black-box jailbreak method named DRA (Disguise and Reconstruction Attack), which conceals harmful instructions through disguise and prompts the model to reconstruct the original harmful instruction within its completion. We evaluate DRA across various open-source and closed-source models, showcasing state-of-the-art jailbreak success rates and attack efficiency. Notably, DRA boasts a 91.1% attack success rate on OpenAI GPT-4 chatbot.

Track 6

Fuzzing III: Network

Session Chair: Kevin Borgolte

Salon H

ResolverFuzz: Automated Discovery of DNS Resolver Vulnerabilities with Query-Response Fuzzing

Qifan Zhang and Xuesong Bai, University of California, Irvine; Xiang Li, Tsinghua University; Haixin Duan, Tsinghua University; Zhongguancun Laboratory; Quan Cheng Laboratory; Qi Li, Tsinghua University; Zhou Li, University of California, Irvine

Available Media

Domain Name System (DNS) is a critical component of the Internet. DNS resolvers, which act as the cache between DNS clients and DNS nameservers, are the central piece of the DNS infrastructure, essential to the scalability of DNS. However, finding the resolver vulnerabilities is non-trivial, and this problem is not well addressed by the existing tools. To list a few reasons, first, most of the known resolver vulnerabilities are non-crash bugs that cannot be directly detected by the existing oracles (or sanitizers). Second, there lacks rigorous specifications to be used as references to classify a test case as a resolver bug. Third, DNS resolvers are stateful, and stateful fuzzing is still challenging due to the large input space.

In this paper, we present a new fuzzing system termed ResolverFuzz to address the aforementioned challenges related to DNS resolvers, with a suite of new techniques being developed. First, ResolverFuzz performs constrained stateful fuzzing by focusing on the short query-response sequence, which has been demonstrated as the most effective way to find resolver bugs, based on our study of the published DNS CVEs. Second, to generate test cases that are more likely to trigger resolver bugs, we combine probabilistic context-free grammar (PCFG) based input generation with byte-level mutation for both queries and responses. Third, we leverage differential testing and clustering to identify non-crash bugs like cache poisoning bugs. We evaluated ResolverFuzz against 6 mainstream DNS software under 4 resolver modes. Overall, we identify 23 vulnerabilities that can result in cache poisoning, resource consumption, and crash attacks. After responsible disclosure, 19 of them have been confirmed or fixed, and 15 CVE numbers have been assigned.

Understanding Ethereum Mempool Security under Asymmetric DoS by Symbolized Stateful Fuzzing

Yibo Wang and Yuzhe Tang, Syracuse University; Kai Li, San Diego State University; Wanning Ding and Zhihua Yang, Syracuse University

Available Media

In blockchains, mempool controls transaction flow before consensus, denial of whose service hurts the health and security of blockchain networks. This paper presents MPFUZZ, the first mempool fuzzer to find asymmetric DoS bugs by exploring the space of symbolized mempool states and optimistically estimating the promisingness of an intermediate state in reaching bug oracles. Compared to the baseline blockchain fuzzers, MPFUZZ achieves a > 100× speedup in f inding known DETER exploits. Running MPFUZZ on major Ethereum clients leads to discovering new mempool vulnerabilities, which exhibit a wide variety of sophisticated patterns, including stealthy mempool eviction and mempool locking. Rule-based mitigation schemes are proposed against all newly discovered vulnerabilities.

Atropos: Effective Fuzzing of Web Applications for Server-Side Vulnerabilities

Emre Güler and Sergej Schumilo, Ruhr University Bochum; Moritz Schloegel, Nils Bars, Philipp Görz, and Xinyi Xu, CISPA Helmholtz Center for Information Security; Cemal Kaygusuz, Ruhr University Bochum; Thorsten Holz, CISPA Helmholtz Center for Information Security

Available Media

Server-side web applications are still predominantly implemented in the PHP programming language. Even nowadays, PHP-based web applications are plagued by many different types of security vulnerabilities, ranging from SQL injection to file inclusion and remote code execution. Automated security testing methods typically focus on static analysis and taint analysis. These methods are highly dependent on accurate modeling of the PHP language and often suffer from (potentially many) false positive alerts. Interestingly, dynamic testing techniques such as fuzzing have not gained acceptance in web applications testing, even though they avoid these common pitfalls and were rapidly adopted in other domains, e. g., for testing native applications written in C/C++.

In this paper, we present ATROPOS, a snapshot-based, feedback-driven fuzzing method tailored for PHP-based web applications. Our approach considers the challenges associated with web applications, such as maintaining session state and generating highly structured inputs. Moreover, we propose a feedback mechanism to automatically infer the key-value structure used by web applications. Combined with eight new bug oracles, each covering a common class of vulnerabilities in server-side web applications, ATROPOS is the first approach to fuzz web applications effectively and efficiently. Our evaluation shows that ATROPOS significantly outperforms the current state of the art in web application testing. In particular, it finds, on average, at least 32% more bugs, while not reporting a single false positive on different test suites. When analyzing real-world web applications, we identify seven previously unknown vulnerabilities that can be exploited even by unauthenticated users.

From One Thousand Pages of Specification to Unveiling Hidden Bugs: Large Language Model Assisted Fuzzing of Matter IoT Devices

Xiaoyue Ma, Lannan Luo, and Qiang Zeng, George Mason University

Available Media

Matter is an IoT connectivity standard backed by over two hundred companies. Since the release of its specification in October 2022, numerous IoT devices have become Matter-compatible. Identifying bugs and vulnerabilities in Matter devices is thus an emerging important problem. This paper introduces mGPTFuzz, the first Matter fuzzer in the literature. Our approach harnesses the extensive and detailed information within the Matter specification to guide the generation of test inputs. However, due to the sheer volume of the Matter specification, surpassing one thousand pages, manually converting human-readable content to machine-readable information is tedious, time-consuming and error-prone. To overcome this challenge, we leverage a large language model to successfully automate the conversion process. mGPTFuzz conducts stateful analysis, which generates message sequences to uncover bugs that would be challenging to discover otherwise. The evaluation involves 23 various Matter devices and discovers 147 new bugs, with three CVEs assigned. In comparison, a state-of-the-art IoT fuzzer finds zero bugs from these devices.

Track 7

Differential Privacy II

Session Chair: Ashraf Matrawy

Salon IJK

DAAP: Privacy-Preserving Model Accuracy Estimation on Unlabeled Datasets Through Distribution-Aware Adversarial Perturbation

Guodong Cao, Wuhan University; Zhibo Wang, Zhejiang University; Yunhe Feng, University of North Texas; Xiaowei Dong, Wuhan University

Available Media

In the dynamic field of deep learning, accurately estimating model performance while ensuring data privacy against diverse and unlabeled test datasets presents a critical challenge. This is primarily due to the significant distributional shifts between training and test datasets, which complicates model evaluation. Traditional methods for assessing model accuracy often require direct access to the entire test dataset, posing significant risks of data leakage and model theft. To address these issues, we propose a novel approach: Distribution-Aware Adversarial Perturbation (DAAP). This method is designed to estimate the accuracy of deep learning models on unlabeled test datasets without compromising privacy. Specifically, DAAP leverages a publicly available dataset as an intermediary to bridge the gap between the model and the test data, effectively circumventing direct interaction and mitigating privacy concerns. By strategically applying adversarial perturbations, DAAP minimizes the distributional discrepancies between datasets, enabling precise estimation of model performance on unseen test data. We present two specialized strategies for white-box and black-box model contexts: the former focuses on reducing output entropy disparities, while the latter manipulates distribution discriminators. Overall, the DAAP introduces a novel framework for privacy-preserving accuracy estimation in model evaluation. This novel approach not only addresses critical challenges related to data privacy and distributional shifts but also enhances the reliability and integrity of model performance assessments. Our extensive evaluation on the CIFAR-10-C, CIFAR-100-C, and CelebA datasets demonstrates the effectiveness of DAAP in accurately estimating performance while safeguarding both data and model privacy.

Closed-Form Bounds for DP-SGD against Record-level Inference Attacks

Giovanni Cherubin, Microsoft Security Response Center; Boris Köpf, Microsoft Azure Research; Andrew Paverd, Microsoft Security Response Center; Shruti Tople, Microsoft Azure Research; Lukas Wutschitz, Microsoft M365 Research; Santiago Zanella-Béguelin, Microsoft Azure Research

Available Media

Machine learning models trained with differentially-private (DP) algorithms such as DP-SGD enjoy resilience against a wide range of privacy attacks. Although it is possible to derive bounds for some attacks based solely on an (ε,δ)-DP guarantee, meaningful bounds require a small enough privacy budget (i.e., injecting a large amount of noise), which results in a large loss in utility. This paper presents a new approach to evaluate the privacy of machine learning models against specific record-level threats, such as membership and attribute inference, without the indirection through DP. We focus on the popular DP-SGD algorithm, and derive simple closed-form bounds. Our proofs model DP-SGD as an information theoretic channel whose inputs are the secrets that an attacker wants to infer (e.g., membership of a data record) and whose outputs are the intermediate model parameters produced by iterative optimization. We obtain bounds for membership inference that match state-of-the-art techniques, whilst being orders of magnitude faster to compute. Additionally, we present a novel data-dependent bound against attribute inference. Our results provide a direct, interpretable, and practical way to evaluate the privacy of trained models against specific inference threats without sacrificing utility.

PrivImage: Differentially Private Synthetic Image Generation using Diffusion Models with Semantic-Aware Pretraining

Kecen Li, Institute of Automation, Chinese Academy of Sciences and University of Chinese Academy of Sciences; Chen Gong, University of Virginia; Zhixiang Li, University of Bristol; Yuzhong Zhao, University of Chinese Academy of Sciences; Xinwen Hou, Institute of Automation, Chinese Academy of Sciences; Tianhao Wang, University of Virginia

Available Media

Differential Privacy (DP) image data synthesis, which leverages the DP technique to generate synthetic data to replace the sensitive data, allowing organizations to share and utilize synthetic images without privacy concerns. Previous methods incorporate the advanced techniques of generative models and pre-training on a public dataset to produce exceptional DP image data, but suffer from problems of unstable training and massive computational resource demands. This paper proposes a novel DP image synthesis method, termed PRIVIMAGE, which meticulously selects pre-training data, promoting the efficient creation of DP datasets with high fidelity and utility. PRIVIMAGE first establishes a semantic query function using a public dataset. Then, this function assists in querying the semantic distribution of the sensitive dataset, facilitating the selection of data from the public dataset with analogous semantics for pre-training. Finally, we pre-train an image generative model using the selected data and then fine-tune this model on the sensitive dataset using Differentially Private Stochastic Gradient Descent (DP-SGD). PRIVIMAGE allows us to train a lightly parameterized generative model, reducing the noise in the gradient during DP-SGD training and enhancing training stability. Extensive experiments demonstrate that PRIVIMAGE uses only 1% of the public dataset for pre-training and 7.6% of the parameters in the generative model compared to the state-of-the-art method, whereas achieves superior synthetic performance and conserves more computational resources. On average, PRIVIMAGE achieves 6.8% lower FID and 13.2% higher Classification Accuracy than the state-of-the-art method. The replication package and datasets can be accessed online.

"What do you want from theory alone?" Experimenting with Tight Auditing of Differentially Private Synthetic Data Generation

Meenatchi Sundaram Muthu Selva Annamalai, University College London; Georgi Ganev, University College London and Hazy; Emiliano De Cristofaro, University of California, Riverside

Available Media

Differentially private synthetic data generation (DP-SDG) algorithms are used to release datasets that are structurally and statistically similar to sensitive data while providing formal bounds on the information they leak. However, bugs in algorithms and implementations may cause the actual information leakage to be higher. This prompts the need to verify whether the theoretical guarantees of state-of-the-art DP-SDG implementations also hold in practice. We do so via a rigorous auditing process: we compute the information leakage via an adversary playing a distinguishing game and running membership inference attacks (MIAs). If the leakage observed empirically is higher than the theoretical bounds, we identify a DP violation; if it is non-negligibly lower, the audit is loose.

We audit six DP-SDG implementations using different datasets and threat models and find that black-box MIAs commonly used against DP-SDGs are severely limited in power, yielding remarkably loose empirical privacy estimates. We then consider MIAs in stronger threat models, i.e., passive and active white-box, using both existing and newly proposed attacks. Overall, we find that, currently, we do not only need white-box MIAs but also worst-case datasets to tightly estimate the privacy leakage from DP-SDGs. Finally, we show that our automated auditing procedure finds both known DP violations (in 4 out of the 6 implementations) as well as a new one in the DPWGAN implementation that was successfully submitted to the NIST DP Synthetic Data Challenge.

The source code needed to reproduce our experiments is available from https://github.com/spalabucr/synth-audit.

6:00 pm–7:30 pm

Poster Session and Happy Hour

Franklin Hall

Check out the cool new ideas and the latest preliminary work on display at the Poster Session and Happy Hour. Take advantage of the opportunity to mingle with colleagues who may share your area of interest while enjoying complimentary food and drinks. View the list of accepted posters.

Friday, August 16

8:00 am–9:00 am

Continental Breakfast

Grand Ballroom Foyer

9:00 am–10:15 am

Track 1

User Studies V: Policies and Best Practices II

Session Chair: Shaanan Cohney

Salon AB

"I Don't Know If We're Doing Good. I Don't Know If We're Doing Bad": Investigating How Practitioners Scope, Motivate, and Conduct Privacy Work When Developing AI Products

Hao-Ping (Hank) Lee, Carnegie Mellon University; Lan Gao and Stephanie Yang, Georgia Institute of Technology; Jodi Forlizzi and Sauvik Das, Carnegie Mellon University

Distinguished Paper Award Winner

Available Media

How do practitioners who develop consumer AI products scope, motivate, and conduct privacy work? Respecting privacy is a key principle for developing ethical, human-centered AI systems, but we cannot hope to better support practitioners without answers to that question. We interviewed 35 industry AI practitioners to bridge that gap. We found that practitioners viewed privacy as actions taken against pre-defined intrusions that can be exacerbated by the capabilities and requirements of AI, but few were aware of AI-specific privacy intrusions documented in prior literature. We found that their privacy work was rigidly defined and situated, guided by compliance with privacy regulations and policies, and generally demotivated beyond meeting minimum requirements. Finally, we found that the methods, tools, and resources they used in their privacy work generally did not help address the unique privacy risks introduced or exacerbated by their use of AI in their products. Collectively, these findings reveal the need and opportunity to create tools, resources, and support structures to improve practitioners' awareness of AI-specific privacy risks, motivations to do AI privacy work, and ability to address privacy harms introduced or exacerbated by their use of AI in consumer products.

Towards More Practical Threat Models in Artificial Intelligence Security

Kathrin Grosse, EPFL; Lukas Bieringer, QuantPi; Tarek R. Besold, TU Eindhoven; Alexandre M. Alahi, EPFL

Available Media

Recent works have identified a gap between research and practice in artificial intelligence security: threats studied in academia do not always reflect the practical use and security risks of AI. For example, while models are often studied in isolation, they form part of larger ML pipelines in practice. Recent works also brought forward that adversarial manipulations introduced by academic attacks are impractical. We take a first step towards describing the full extent of this disparity. To this end, we revisit the threat models of the six most studied attacks in AI security research and match them to AI usage in practice via a survey with 271 industrial practitioners. On the one hand, we find that all existing threat models are indeed applicable. On the other hand, there are significant mismatches: research is often too generous with the attacker, assuming access to information not frequently available in real-world settings. Our paper is thus a call for action to study more practical threat models in artificial intelligence security.

"There are rabbit holes I want to go down that I'm not allowed to go down": An Investigation of Security Expert Threat Modeling Practices for Medical Devices

Ronald Thompson, Madline McLaughlin, Carson Powers, and Daniel Votipka, Tufts University

Available Media

Threat modeling is considered an essential first step for "secure by design" development. Significant prior work and industry efforts have created novel methods for this type of threat modeling, and evaluated them in various simulated settings. Because threat modeling is context-specific, we focused on medical device security experts as regulators require it, and "secure by design" medical devices are seen as a critical step to securing healthcare. We conducted 12 semi-structured interviews with medical device security experts, having participants brainstorm threats and mitigations for two medical devices. We saw these experts do not sequentially work through a list of threats or mitigations according to the rigorous processes described in existing methods and, instead, regularly switch strategies. Our work consists of three major contributions. The first is a two-part process model that describes how security experts 1) determine threats and mitigations for a particular component and 2) move between components. Second, we observed participants leveraging use cases, a strategy not addressed in prior work for threat modeling. Third, we found that integrating safety into threat modeling is critical, albeit unclear. We also provide recommendations for future work.

"Belt and suspenders" or "just red tape"?: Investigating Early Artifacts and User Perceptions of IoT App Security Certification

Prianka Mandal, Amit Seal Ami, Victor Olaiya, Sayyed Hadi Razmjo, and Adwait Nadkarni, William & Mary

Available Media

As IoT security regulations and standards emerge, the industry has begun adopting the traditional enforcement model for software compliance to the IoT domain, wherein Commercially Licensed Evaluation Facilities (CLEFs) certify vendor products on behalf of regulators (and in turn consumers). Since IoT standards are in their formative stages, we investigate a simple but timely question: does the traditional model work for IoT security, and more importantly, does it work as well as consumers expect it to? This paper investigates the initial artifacts resultant from IoT compliance certiﬁcation, and user perceptions of compliance, in the context of certiﬁed mobile-IoT apps, i.e., critical companion and automation apps that expose an important IoT attack surface, with a focus on three key questions: (1) are certiﬁed IoT products vulnerable?, (2) are vulnerable-but-certiﬁed products non-compliant?, and ﬁnally, (3) how do consumers perceive compliance enforcement? Our systematic analysis of 11 mobile-IoT apps certiﬁed by IOXT, along with an analysis of 5 popular compliance standards, and a user study with 173 users, together yield 17 key ﬁndings. We ﬁnd signiﬁcant vulnerabilities that indicate gaps in certiﬁcation, but which do not violate the standards due to ambiguity and discretionary language. Further, these vulnerabilities contrast with the overwhelming trust that users place in compliance certiﬁcation and certiﬁed apps. We conclude with a discussion on future directions towards a "belt and suspenders" scenario of effective assurance that most users desire, from the status quo of "just red tape", through objective checks and balances that empower the regulators and consumers to reform compliance enforcement for IoT.

TAPFixer: Automatic Detection and Repair of Home Automation Vulnerabilities based on Negated-property Reasoning

Yinbo Yu, National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Cybersecurity, Northwestern Polytechnical University, China; Research & Development Institute of Northwestern Polytechnical University in Shenzhen, China; Yuanqi Xu, Kepu Huang, and Jiajia Liu, National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Cybersecurity, Northwestern Polytechnical University, China

Available Media

Trigger-Action Programming (TAP) is a popular end-user programming framework in the home automation (HA) system, which eases users to customize home automation and control devices as expected. However, its simplified syntax also introduces new safety threats to HA systems through vulnerable rule interactions. Accurately fixing these vulnerabilities by logically and physically eliminating their root causes is essential before rules are deployed. However, it has not been well studied. In this paper, we present TAPFixer, a novel framework to automatically detect and repair rule interaction vulnerabilities in HA systems. It extracts TAP rules from HA profiles, translates them into an automaton model with physical and latency features, and performs model checking with various correctness properties. It then uses a novel negated-property reasoning algorithm to automatically infer a patch via model abstraction and refinement and model checking based on negated-properties. We evaluate TAPFixer on market HA apps (1177 TAP rules and 53 properties) and find that it can achieve an 86.65% success rate in repairing rule interaction vulnerabilities. We additionally recruit 23 HA users to conduct a user study that demonstrates the usefulness of TAPFixer for vulnerability repair in practical HA scenarios.

Track 2

User Studies VI: Privacy II

Session Chair: Mainack Mondal

Salon CD

Co-Designing a Mobile App for Bystander Privacy Protection in Jordanian Smart Homes: A Step Towards Addressing a Complex Privacy Landscape

Wael Albayaydh and Ivan Flechais, University of Oxford

Available Media

The proliferation of smart devices fuels privacy concerns, particularly for bystanders—individuals impacted by smart devices beyond their control. Existing research primarily addresses these concerns in Western contexts, with limited focus on Muslim Arab Middle-Eastern (MAME) regions like Jordan. Additionally, there is a scarcity of proposed interventions or assessments for effectively addressing, communicating, negotiating, and remediating privacy issues in these contexts. This study aims to bridge this gap by investigating how a technology probe in the form of a privacy-focused mobile application can serve as an auxiliary tool to support the privacy protection of smart home bystanders in Jordan.

We initiated our research by collaboratively designing the app through four focus groups involving 24 stakeholders. Subsequently, we present and qualitatively evaluate the app's potential for privacy protection with a diverse group of 26 representative stakeholders. While the app is generally well-received, it encounters challenges rooted in broader contextual norms and practices. Our discussion delves into these challenges, offering recommendations to enhance bystander privacy protection in Jordanian smart homes.

"I really just leaned on my community for support": Barriers, Challenges, and Coping Mechanisms Used by Survivors of Technology-Facilitated Abuse to Seek Social Support

Naman Gupta, University of Wisconsin–Madison, Madison Tech Clinic; Kate Walsh, University of Wisconsin–Madison; Sanchari Das, University of Denver; Rahul Chatterjee, University of Wisconsin–Madison, Madison Tech Clinic

Available Media

Technology-facilitated abuse (TFA) from an intimate partner is a growing concern for survivors' safety and security. Prior research introduced tailored interventions to support the survivors of TFA. However, most survivors do not have access to these interventions or are unaware of the appropriate support networks and resources in their community. We conducted nine semi-structured interviews to examine survivors' support-seeking dynamics from their social networks and the effectiveness of social networks in addressing survivors' needs. Through survivors' lived experiences, we systematize socio-technical barriers that impede the participant from seeking support and challenges in seeking support. Further, we identify coping mechanisms used by the survivors despite those barriers and challenges. Through a participatory lens, we echo survivors' call for action to improve support networks and propose recommendations for technology design to promote safer support-seeking practices and resources, consciousness-raising awareness campaigns, and collaborations with the community. Finally, we call for a restorative-justice-oriented framework that recognizes TFA.

From the Childhood Past: Views of Young Adults on Parental Sharing of Children's Photos

Tania Ghafourian, Indiana University Bloomington; Nicholas Micallef, Swansea University; Sameer Patil, University of Utah

Available Media

Parents increasingly post content about their children on social media. While such sharing serves beneficial interactive purposes, it can create immediate and longitudinal privacy risks for the children. Studies on parental content sharing have investigated perceptions of parents and children, leaving out those of young adults between the ages of 18 and 30. We addressed this gap via a questionnaire asking young adults about their perspectives on parental sharing of children's photos on social media. We found that young adults who had content about them shared by their parents during childhood and those who were parents expressed greater acceptance of parental sharing practices in terms of motives, content, and audiences. Our findings indicate the need for system features, policies, and digital literacy campaigns to help parents balance the interactive benefits of sharing content about their children and protecting the children's online footprints.

ATTention Please! An Investigation of the App Tracking Transparency Permission

Reham Mohamed and Arjun Arunasalam, Purdue University; Habiba Farrukh, University of California, Irvine; Jason Tong, Antonio Bianchi, and Z. Berkay Celik, Purdue University

Available Media

Apple introduced the App Tracking Transparency (ATT) framework in iOS 14.5. The goal of this framework is to mitigate user concerns about how their privacy-sensitive data is used for targeted advertising. Through this framework, the OS generates an ATT alert to request user permission for tracking. While this alert includes developer-controlled alert text, Apple mandates this text adheres to specific guidelines to prevent users from being coerced into unwillingly granting the ATT permission for tracking. However, to improve apps' monetization, developers may incorporate dark patterns in the ATT alerts to deceive users into granting the permission.

To understand the prevalence and characteristics of such dark patterns, we first study Apple's alert guidelines and identify four patterns that violate standards. We then develop ATTCLS, an ATT alert classification framework that combines contrastive learning for language modeling with a fully connected neural network for multi-label alert pattern classification. Finally, by applying ATTCLS to 4,000 iOS apps, we reveal that 59% of the alerts use four dark patterns that either mislead users, incentivize tracking, include confusing terms, or omit the purpose of the ATT permission.

We then conduct a user study with 114 participants to examine users' understanding of ATT and how different alert patterns can influence their perception. This study reveals that ATT alerts used by current apps often deceive or confuse users. For instance, users can be misled into believing that granting the ATT permission guarantees better app features or that denying it protects all of their sensitive data. We envision that our developed tools and empirical results will aid mobile platforms to refine guidelines, introduce a strict vetting process, and better design privacy-related prompts for users.

Voice App Developer Experiences with Alexa and Google Assistant: Juggling Risks, Liability, and Security

William Seymour, King's College London; Noura Abdi, Liverpool Hope University; Kopo M. Ramokapane, University of Bristol; Jide Edu, University of Strathclyde; Guillermo Suarez-Tangil, IMDEA Networks Institute; Jose Such, King's College London & Universitat Politecnica de Valencia

Available Media

Voice applications (voice apps) are a key element in Voice Assistant ecosystems such as Amazon Alexa and Google Assistant, as they provide assistants with a wide range of capabilities that users can invoke with a voice command. Most voice apps, however, are developed by third parties—i.e., not by Amazon/Google—and they are included in the ecosystem through marketplaces akin to smartphone app stores but with crucial differences, e.g., the voice app code is not hosted by the marketplace and is not run on the local device. Previous research has studied the security and privacy issues of voice apps in the wild, finding evidence of bad practices by voice app developers. However, developers' perspectives are yet to be explored.

In this paper, we report a qualitative study of the experiences of voice app developers and the challenges they face. Our findings suggest that: 1) developers face several risks due to liability pushed on to them by the more powerful voice assistant platforms, which are linked to negative privacy and security outcomes on voice assistant platforms; and 2) there are key issues around monetization, privacy, design, and testing rooted in problems with the voice app certification process. We discuss the implications of our results for voice app developers, platforms, regulators, and research on voice app development and certification.

Track 3

Measurement V: App

Session Chair: Rahul Chatterjee

Salon E

Swipe Left for Identity Theft: An Analysis of User Data Privacy Risks on Location-based Dating Apps

Karel Dhondt, Victor Le Pochat, Yana Dimova, Wouter Joosen, and Stijn Volckaert, DistriNet, KU Leuven

Available Media

Location-based dating (LBD) apps enable users to meet new people nearby and online by browsing others' profiles, which often contain very personal and sensitive data. We systematically analyze 15 LBD apps on the prevalence of privacy risks that can result in abuse by adversarial users who want to stalk, harass, or harm others. Through a systematic manual analysis of these apps, we assess which personal and sensitive data is shared with other users, both as (intended) data exposure and as inadvertent yet powerful leaks in API traffic that is otherwise hidden from a user, violating their mental model of what they share on LBD apps. We also show that 6 apps allow for pinpointing a victim's exact location, enabling physical threats to users' personal safety. All these data exposures and leaks—supported by easy account creation—enable targeted or large-scale, long-term, and stealthy profiling and tracking of LBD app users. While privacy policies acknowledge personal data processing, and a tension exists between app functionality and user privacy, significant data privacy risks remain. We recommend user control, data minimization, and API hardening as countermeasures to protect users' privacy.

Spill the TeA: An Empirical Study of Trusted Application Rollback Prevention on Android Smartphones

Marcel Busch, Philipp Mao, and Mathias Payer, EPFL

Available Media

The number and complexity of Trusted Applications (TAs, applications running in Trusted Execution Environments—TEEs) deployed on mobile devices has exploded. A vulnerability in a single TA impacts the security of the entire device. Thus, vendors must rapidly fix such vulnerabilities and revoke vulnerable versions to prevent rollback attacks, i.e., loading an old version of the TA to exploit a known vulnerability.

In this paper, we assess the state of TA rollback prevention by conducting a large-scale cross-vendor study. First, we establish the largest TA dataset in existence, encompassing 35,541 TAs obtained from 1,330 firmware images deployed on mobile devices across the top five most common vendors. Second, we identify 37 TA vulnerabilities that we leverage to assess the state of industry-wide TA rollback effectiveness. Third, we make the counterintuitive discovery that the uncoordinated usage of rollback prevention correlates with the leakage of security-critical information and has far-reaching consequences potentially negatively impacting the whole mobile ecosystem. Fourth, we demonstrate the severity of ineffective TA rollback prevention by exploiting two different TEEs on fully-updated mobile devices. In summary, our results indicate severe deficiencies in TA rollback prevention across the mobile ecosystem.

A Decade of Privacy-Relevant Android App Reviews: Large Scale Trends

Omer Akgul, University of Maryland; Sai Teja Peddinti and Nina Taft, Google; Michelle L. Mazurek, University of Maryland; Hamza Harkous, Animesh Srivastava, and Benoit Seguin, Google

Available Media

We present an analysis of 12 million instances of privacy-relevant reviews publicly visible on the Google Play Store that span a 10 year period. By leveraging state of the art NLP techniques, we examine what users have been writing about privacy along multiple dimensions: time, countries, app types, diverse privacy topics, and even across a spectrum of emotions. We find consistent growth of privacy-relevant reviews, and explore topics that are trending (such as Data Deletion and Data Theft), as well as those on the decline (such as privacy-relevant reviews on sensitive permissions). We find that although privacy reviews come from more than 200 countries, 33 countries provide 90% of privacy reviews. We conduct a comparison across countries by examining the distribution of privacy topics a country's users write about, and find that geographic proximity is not a reliable indicator that nearby countries have similar privacy perspectives. We uncover some countries with unique patterns and explore those herein. Surprisingly, we uncover that it is not uncommon for reviews that discuss privacy to be positive (32%); many users express pleasure about privacy features within apps or privacy-focused apps. We also uncover some unexpected behaviors, such as the use of reviews to deliver privacy disclaimers to developers. Finally, we demonstrate the value of analyzing app reviews with our approach as a complement to existing methods for understanding users' perspectives about privacy.

Tickets or Privacy? Understand the Ecosystem of Chinese Ticket Grabbing Apps

Yijing Liu and Yiming Zhang, Tsinghua University; Baojun Liu, Tsinghua University; Zhongguancun Laboratory; Haixin Duan, Tsinghua University; Quancheng Laboratory; Qiang Li, Qihoo 360; Mingxuan Liu, Zhongguancun Laboratory; Ruixuan Li and Jia Yao, Tsinghua University

Available Media

Due to the prevalence of scalping and the promotion of real-name ticketing systems, user-oriented mobile ticket grabbing apps have become a popular pattern for scalpers. Compared with traditional scalper-oriented scalping, ticket grabbing apps pose security and privacy risks to users directly. In our study, we take the first step towards revealing the ticket grabbing app ecosystem from the perspectives of app developers, app users, and target platforms synthetically.

We built a large-scale dataset of ticket grabbing apps in the wild within China, containing 758 Chinese ticket grabbing apps with 3,121 versions. Based on the detailed analysis of these apps, we found that ticket grabbing has formed a mature industrial chain, with various specialized technical characteristics to enhance the success rate, such as the abuse of Android accessibility services. We also revealed the profit model of ticket grabbing apps, and disclosed severe security and privacy hazards they pose to end users, including the collection of sensitive information and continuous screenshots. We further conducted an online survey involving 184 participants to get users' usage and privacy concerns on ticket grabbing apps, and regrettably found that users prioritize "tickets" over "privacy". Finally, we proposed an "Indirect Combat" approach to assist in the defense mechanisms. In summary, our findings provide target platforms and users with a better understanding of the ticket grabbing app ecosystem in China, enabling them to better detect and combat these apps.

PIXELMOD: Improving Soft Moderation of Visual Misleading Information on Twitter

Pujan Paudel and Chen Ling, Boston University; Jeremy Blackburn, Binghamton University; Gianluca Stringhini, Boston University

Available Media

Images are a powerful and immediate vehicle to carry misleading or outright false messages, yet identifying image-based misinformation at scale poses unique challenges. In this paper, we present PIXELMOD, a system that leverages perceptual hashes, vector databases, and optical character recognition (OCR) to efficiently identify images that are candidates to receive soft moderation labels on Twitter. We show that PIXELMOD outperforms existing image similarity approaches when applied to soft moderation, with negligible performance overhead. We then test PIXELMOD on a dataset of tweets surrounding the 2020 US Presidential Election, and find that it is able to identify visually misleading images that are candidates for soft moderation with 0.99% false detection and 2.06% false negatives.

Track 4

Network Security III: Detection

Session Chair: Andriy Panchenko

Salon F

Learning with Semantics: Towards a Semantics-Aware Routing Anomaly Detection System

Yihao Chen, Department of Computer Science and Technology & BNRist, Tsinghua University; Qilei Yin, Zhongguancun Laboratory; Qi Li and Zhuotao Liu, Institute for Network Sciences and Cyberspace, Tsinghua University; Zhongguancun Laboratory; Ke Xu, Department of Computer Science and Technology, Tsinghua University; Zhongguancun Laboratory; Yi Xu and Mingwei Xu, Institute for Network Sciences and Cyberspace, Tsinghua University; Zhongguancun Laboratory; Ziqian Liu, China Telecom; Jianping Wu, Department of Computer Science and Technology, Tsinghua University; Zhongguancun Laboratory

Distinguished Paper Award Winner and Winner of the 2024 Internet Defense Prize

Available Media

BGP is the de facto inter-domain routing protocol to ensure global connectivity of the Internet. However, various reasons, such as deliberate attacks or misconfigurations, could cause BGP routing anomalies. Traditional methods for BGP routing anomaly detection require significant manual investigation of routes by network operators. Although machine learning has been applied to automate the process, prior arts typically impose significant training overhead (such as large-scale data labeling and feature crafting), and only produce uninterpretable results. To address these limitations, this paper presents a routing anomaly detection system centering around a novel network representation learning model named BEAM. The core design of BEAM is to accurately learn the unique properties (defined as routing role) of each Autonomous System (AS) in the Internet by incorporating BGP semantics. As a result, routing anomaly detection, given BEAM, is reduced to a matter of discovering unexpected routing role churns upon observing new route announcements. We implement a prototype of our routing anomaly detection system and extensively evaluate its performance. The experimental results, based on 18 real-world RouteViews datasets containing over 11 billion route announcement records, demonstrate that our system can detect all previously-confirmed routing anomalies, while only introducing at most five false alarms every 180 million route announcements. We also deploy our system at a large ISP to perform real-world detection for one month. During the course of deployment, our system detects 497 true anomalies in the wild with an average of only 1.65 false alarms per day.

Enhancing Network Attack Detection with Distributed and In-Network Data Collection System

Seyed Mohammad Mehdi Mirnajafizadeh and Ashwin Raam Sethuram, Wayne State University; David Mohaisen, University of Central Florida; DaeHun Nyang, Ewha Womans University; Rhongho Jang, Wayne State University

Available Media

The collection of network data poses a significant challenge for machine/deep learning-driven network defense systems. This paper proposes a new paradigm, namely In-network Serverless Data Collection (ISDC), to eliminate the bottleneck between network infrastructure (where data is generated) and security application servers (where data is consumed). Considering the extremely mismatched scale between traffic volume and in-network resources, we stress the need for prioritizing flows based on the application's interests, and a sublinear prediction algorithm is proposed to prioritize specific flows, for optimizing resource consumption effectively. Additionally, a negotiation-free task migration mechanism with task-data isolation is introduced to allocate tasks dynamically across the network, for enhancing resource efficiency. Furthermore, ISDC incorporates a serverless data migration and aggregation mechanism to ensure data integrity and serves as a reliable and distributed data source for network defense systems. We present two use cases to demonstrate the feasibility of ISDC, namely covert channel detection and DoS/DDoS attack detection. In both scenarios, ISDC achieves significantly higher flow coverage and feature accuracy than existing schemes, leading to improved attack detection accuracy. Remarkably, ISDC's data integrity addresses a model self-poisoning issue caused by duplicated and fragmented flow measurements generated during collaborative measurements.

You Cannot Escape Me: Detecting Evasions of SIEM Rules in Enterprise Networks

Rafael Uetz, Marco Herzog, and Louis Hackländer, Fraunhofer FKIE; Simon Schwarz, University of Göttingen; Martin Henze, RWTH Aachen University & Fraunhofer FKIE

Distinguished Artifact Award Winner

Available Media

Cyberattacks have grown into a major risk for organizations, with common consequences being data theft, sabotage, and extortion. Since preventive measures do not suffice to repel attacks, timely detection of successful intruders is crucial to stop them from reaching their final goals. For this purpose, many organizations utilize Security Information and Event Management (SIEM) systems to centrally collect security-related events and scan them for attack indicators using expert-written detection rules. However, as we show by analyzing a set of widespread SIEM detection rules, adversaries can evade almost half of them easily, allowing them to perform common malicious actions within an enterprise network without being detected. To remedy these critical detection blind spots, we propose the idea of adaptive misuse detection, which utilizes machine learning to compare incoming events to SIEM rules on the one hand and known-benign events on the other hand to discover successful evasions. Based on this idea, we present AMIDES, an open-source proof-of-concept adaptive misuse detection system. Using four weeks of SIEM events from a large enterprise network and more than 500 hand-crafted evasions, we show that AMIDES successfully detects a majority of these evasions without any false alerts. In addition, AMIDES eases alert analysis by assessing which rules were evaded. Its computational efficiency qualifies AMIDES for real-world operation and hence enables organizations to significantly reduce detection blind spots with moderate effort.

MAGIC: Detecting Advanced Persistent Threats via Masked Graph Representation Learning

Zian Jia and Yun Xiong, Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, China; Yuhong Nan, School of Software Engineering, Sun Yat-sen University, China; Yao Zhang, Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, China; Jinjing Zhao, National Key Laboratory of Science and Technology on Information System Security, China; Mi Wen, Shanghai University of Electric Power, China

Available Media

Advance Persistent Threats (APTs), adopted by most delicate attackers, are becoming increasing common and pose great threat to various enterprises and institutions. Data provenance analysis on provenance graphs has emerged as a common approach in APT detection. However, previous works have exhibited several shortcomings: (1) requiring attack-containing data and a priori knowledge of APTs, (2) failing in extracting the rich contextual information buried within provenance graphs and (3) becoming impracticable due to their prohibitive computation overhead and memory consumption.

In this paper, we introduce MAGIC, a novel and flexible self-supervised APT detection approach capable of performing multi-granularity detection under different level of supervision. MAGIC leverages masked graph representation learning to model benign system entities and behaviors, performing efficient deep feature extraction and structure abstraction on provenance graphs. By ferreting out anomalous system behaviors via outlier detection methods, MAGIC is able to perform both system entity level and batched log level APT detection. MAGIC is specially designed to handle concept drift with a model adaption mechanism and successfully applies to universal conditions and detection scenarios. We evaluate MAGIC on three widely-used datasets, including both real-world and simulated attacks. Evaluation results indicate that MAGIC achieves promising detection results in all scenarios and shows enormous advantage over state-of-the-art APT detection approaches in performance overhead.

CellularLint: A Systematic Approach to Identify Inconsistent Behavior in Cellular Network Specifications

Mirza Masfiqur Rahman, Imtiaz Karim, and Elisa Bertino, Purdue University

Available Media

In recent years, there has been a growing focus on scrutinizing the security of cellular networks, often attributing security vulnerabilities to issues in the underlying protocol design descriptions. These protocol design specifications, typically extensive documents that are thousands of pages long, can harbor inaccuracies, underspecifications, implicit assumptions, and internal inconsistencies. In light of the evolving landscape, we introduce CellularLint—a semi-automatic framework for inconsistency detection within the standards of 4G and 5G, capitalizing on a suite of natural language processing techniques. Our proposed method uses a revamped few-shot learning mechanism on domain-adapted large language models. Pre-trained on a vast corpus of cellular network protocols, this method enables CellularLint to simultaneously detect inconsistencies at various levels of semantics and practical use cases. In doing so, CellularLint significantly advances the automated analysis of protocol specifications in a scalable fashion. In our investigation, we focused on the Non-Access Stratum (NAS) and the security specifications of 4G and 5G networks, ultimately uncovering 157 inconsistencies with 82.67% accuracy. After verification of these inconsistencies on open-source implementations and 17 commercial devices, we confirm that they indeed have a substantial impact on design decisions, potentially leading to concerns related to privacy, integrity, availability, and interoperability.

Track 5

ML IX: Model Extraction and Watermark

Session Chair: Rainer Böhme

Salon G

SoK: All You Need to Know About On-Device ML Model Extraction - The Gap Between Research and Practice

Tushar Nayan, Qiming Guo, and Mohammed Al Duniawi, Florida International University; Marcus Botacin, Texas A&M University; Selcuk Uluagac and Ruimin Sun, Florida International University

Available Media

On-device ML is increasingly used in different applications. It brings convenience to offline tasks and avoids sending user-private data through the network. On-device ML models are valuable and may suffer from model extraction attacks from different categories. Existing studies lack a deep understanding of on-device ML model security, which creates a gap between research and practice. This paper provides a systematization approach to classify existing model extraction attacks and defenses based on different threat models. We evaluated well known research projects from existing work with real-world ML models, and discussed their reproducibility, computation complexity, and power consumption. We identified the challenges for research projects in wide adoption in practice. We also provided directions for future research in ML model extraction security.

Unveiling the Secrets without Data: Can Graph Neural Networks Be Exploited through Data-Free Model Extraction Attacks?

Yuanxin Zhuang, Chuan Shi, and Mengmei Zhang, Beijing University of Posts and Telecommunications; Jinghui Chen, The Pennsylvania State University; Lingjuan Lyu, SONY AI; Pan Zhou, Huazhong University of Science and Technology; Lichao Sun, Lehigh University

Available Media

Graph neural networks (GNNs) play a crucial role in various graph applications, such as social science, biology, and molecular chemistry. Despite their popularity, GNNs are still vulnerable to intellectual property threats. Previous studies have demonstrated the susceptibility of GNN models to model extraction attacks, where attackers steal the functionality of GNNs by sending queries and obtaining model responses. However, existing model extraction attacks often assume that the attacker has access to specific information about the victim model's training data, including node attributes, connections, and the shadow dataset. This assumption is impractical in real-world scenarios. To address this issue, we propose StealGNN, the first data-free model extraction attack framework against GNNs. StealGNN advances prior GNN extraction attacks in three key aspects: 1) It is completely data-free, as it does not require actual node features or graph structures to extract GNN models. 2) It constitutes a full-rank attack that can be applied to node classification and link prediction tasks, posing significant intellectual property threats across a wide range of graph applications. 3) It can handle the most challenging hard-label attack setting, where the attacker possesses no knowledge about the target GNN model and can only obtain predicted labels through querying the victim model. Our experimental results on four benchmark graph datasets demonstrate the effectiveness of StealGNN in attacking representative GNN models.

ClearStamp: A Human-Visible and Robust Model-Ownership Proof based on Transposed Model Training

Torsten Krauß, Jasper Stang, and Alexandra Dmitrienko, University of Würzburg

Available Media

Due to costly efforts during data acquisition and model training, Deep Neural Networks (DNNs) belong to the intellectual property of the model creator. Hence, unauthorized use, theft, or modification may lead to legal repercussions. Existing DNN watermarking methods for ownership proof are often non-intuitive, embed human-invisible marks, require trust in algorithmic assessment that lacks human-understandable attributes, and rely on rigid thresholds, making it susceptible to failure in cases of partial watermark erasure.

This paper introduces ClearStamp, the first DNN watermarking method designed for intuitive human assessment. ClearStamp embeds visible watermarks, enabling human decision-making without rigid value thresholds while allowing technology-assisted evaluations. ClearStamp defines a transposed model architecture allowing to use of the model in a backward fashion to interwove the watermark with the main task within all model parameters. Compared to existing watermarking methods, ClearStamp produces visual watermarks that are easy for humans to understand without requiring complex verification algorithms or strict thresholds. The watermark is embedded within all model parameters and entangled with the main task, exhibiting superior robustness. It shows an 8,544-bit watermark capacity comparable to the strongest existing work. Crucially, ClearStamp's effectiveness is model and dataset-agnostic, and resilient against adversarial model manipulations, as demonstrated in a comprehensive study performed with four datasets and seven architectures.

DeepEclipse: How to Break White-Box DNN-Watermarking Schemes

Alessandro Pegoraro, Carlotta Segna, Kavita Kumari, and Ahmad-Reza Sadeghi, Technical University of Darmstadt

Available Media

Deep Learning (DL) models have become crucial in digital transformation, thus raising concerns about their intellectual property rights. Different watermarking techniques have been developed to protect Deep Neural Networks (DNNs) from IP infringement, creating a competitive field for DNN watermarking and removal methods. The predominant watermarking schemes use white-box techniques, which involve modifying weights by adding a unique signature to specific DNN layers. On the other hand, existing attacks on white-box watermarking usually require knowledge of the specific deployed watermarking scheme or access to the underlying data for further training and fine-tuning. We propose DeepEclipse, a novel and unified framework designed to remove white-box watermarks. We present obfuscation techniques that significantly differ from the existing white-box watermarking removal schemes. DeepEclipse can evade watermark detection without prior knowledge of the underlying watermarking scheme, additional data, or training and fine-tuning. Our evaluation reveals that DeepEclipse excels in breaking multiple white-box watermarking schemes, reducing watermark detection to random guessing while maintaining a similar model accuracy as the original one. Our framework showcases a promising solution to address the ongoing DNN watermark protection and removal challenges.

ModelGuard: Information-Theoretic Defense Against Model Extraction Attacks

Minxue Tang and Anna Dai, Duke University; Louis DiValentin, Aolin Ding, and Amin Hass, Accenture; Neil Zhenqiang Gong, Yiran Chen, and Hai "Helen" Li, Duke University

Available Media

Malicious utilization of a query interface can compromise the confidentiality of ML-as-a-Service (MLaaS) systems via model extraction attacks. Previous studies have proposed to perturb the predictions of the MLaaS system as a defense against model extraction attacks. However, existing prediction perturbation methods suffer from a poor privacy-utility balance and cannot effectively defend against the latest adaptive model extraction attacks. In this paper, we propose a novel prediction perturbation defense named ModelGuard, which aims at defending against adaptive model extraction attacks while maintaining a high utility of the protected system. We develop a general optimization problem that considers different kinds of model extraction attacks, and ModelGuard provides an information-theoretic defense to efficiently solve the optimization problem and achieve resistance against adaptive attacks. Experiments show that ModelGuard attains significantly better defensive performance against adaptive attacks with less loss of utility compared to previous defenses.

Track 6

Fuzzing IV: Hardware and Firmware

Session Chair: Marius Muench

Salon H

SHiFT: Semi-hosted Fuzz Testing for Embedded Applications

Alejandro Mera and Changming Liu, Northeastern University; Ruimin Sun, Florida International University; Engin Kirda and Long Lu, Northeastern University

Available Media

Modern microcontrollers (MCU)s are ubiquitous on critical embedded applications in the IoT era. Therefore, securing MCU firmware is fundamental. To analyze MCU firmware security, existing works mostly adopt re-hosting based techniques. These techniques transplant firmware to an engineered platform and require tailored hardware or emulation of different parts of the MCU. As a result, security practitioners have observed low-fidelity, false positives, and reduced compatibility with real and complex hardware. This paper presents SHiFT, a framework that leverages the industry semihosting philosophy to provide a brandnew method that analyzes firmware natively in MCUs. This novel method provides high fidelity, reduces false positives, and grants compatibility with complex peripherals, asynchronous events, real-time operations, and direct memory access (DMA). We verified compatibility of SHiFT with thirteen popular embedded architectures, and fully evaluated prototypes for ARMv7-M, ARMv8-M and Xtensa architectures. Our evaluation shows that SHiFT can detect a wide range of firmware faults with instrumentation running natively in the MCU. In terms of performance, SHiFT is up to two orders of magnitude faster (i.e., ×100) than software-based emulation, and even comparable to fuzz testing native applications in a workstation. Thanks to SHiFT's unique characteristics, we discovered five previously unknown vulnerabilities, including a zero-day on the popular FreeRTOS kernel, with no false positives. Our prototypes and source code are publicly available at https://github.com/RiS3-Lab/SHiFT.

Cascade: CPU Fuzzing via Intricate Program Generation

Flavien Solt, Katharina Ceesay-Seitz, and Kaveh Razavi, ETH Zurich

Available Media

Generating interesting test cases for CPU fuzzing is akin to generating programs that exercise unusual states inside the CPU. The performance of CPU fuzzing is heavily influenced by the quality of these programs and by the overhead of bug detection. Our analysis of existing state-of-the-art CPU fuzzers shows that they generate programs that are either overly simple or execute a small fraction of their instructions due to invalid control flows. Combined with expensive instruction-granular bug detection mechanisms, this leads to inefficient fuzzing campaigns. We present Cascade, a new approach for generating valid RISC-V programs of arbitrary length with highly randomized and interdependent control and data flows. Cascade relies on a new technique called asymmetric ISA pre-simulation for entangling control flows with data flows when generating programs. This entanglement results in non-termination when a program triggers a bug in the target CPU, enabling Cascade to detect a CPU bug at program granularity without introducing any runtime overhead. Our evaluation shows that long Cascade programs are more effective in exercising the CPU's internal design. Cascade achieves 28.2x to 97x more coverage than the state-of-the-art CPU fuzzers and uncovers 37 new bugs (28 new CVEs) in 5 RISC-V CPUs with varying degrees of complexity. The programs that trigger these bugs are long and intricate, impeding triaging. To address this challenge, Cascade features an automated pruning method that reduces a program to a minimal number of instructions that trigger the bug.

MultiFuzz: A Multi-Stream Fuzzer For Testing Monolithic Firmware

Michael Chesser, The University of Adelaide and Data61 CSIRO, Cyber Security Cooperative Research Centre; Surya Nepal, Data61 CSIRO, Cyber Security Cooperative Research Centre; Damith C. Ranasinghe, The University of Adelaide

Available Media

Rapid embedded device proliferation is creating new targets and opportunities for adversaries. However, the complex interactions between firmware and hardware pose challenges to applying automated testing, such as fuzzing. State-of-the-art methods re-host firmware in emulators and facilitate complex interactions with hardware by provisioning for inputs from a diversity of methods (such as interrupts) from a plethora of devices (such as modems). We recognize a significant disconnect between how a fuzzer generates inputs (as a monolithic file) and how the inputs are consumed during re-hosted execution (as a stream, in slices, per peripheral). We demonstrate the disconnect to significantly impact a fuzzer's effectiveness at discovering inputs that explore deeper code and bugs.

We rethink the input generation process for fuzzing monolithic firmware and propose a new approach—multi-stream input generation and representation; inputs are now a collection of independent streams, one for each peripheral. We demonstrate the versatility and effectiveness of our approach by implementing: i) stream specific mutation strategies; ii) efficient methods for generating useful values for peripherals; iii) enhancing the use of information learned during fuzzing; and iv) improving a fuzzer's ability to handle roadblocks. We design and build a new fuzzer, MULTIFUZZ, for testing monolithic firmware and evaluate our approach on synthetic and real-world targets. MULTIFUZZ passes all 66 unit tests from a benchmark consisting of 46 synthetic binaries targeting a diverse set of microcontrollers. On an evaluation with 23 real-world firmware targets, MULTIFUZZ outperforms the state-of-the-art fuzzers Fuzzware and Ember-IO. MULTIFUZZ reaches significantly more code on 14 out of the 23 firmware targets and similar coverage on the remainder. Further, MULTIFUZZ discovered 18 new bugs on real-world targets, many thoroughly tested by previous fuzzers.

WhisperFuzz: White-Box Fuzzing for Detecting and Locating Timing Vulnerabilities in Processors

Pallavi Borkar, Indian Institute of Technology Madras; Chen Chen, Texas A&M University; Mohamadreza Rostami, Technische Universität Darmstadt; Nikhilesh Singh, Indian Institute of Technology Madras; Rahul Kande, Texas A&M University; Ahmad-Reza Sadeghi, Technische Universität Darmstadt; Chester Rebeiro, Indian Institute of Technology Madras; Jeyavijayan Rajendran, Texas A&M University

Distinguished Paper Award Winner

Available Media

Timing vulnerabilities in processors have emerged as a potent threat. As processors are the foundation of any computing system, identifying these flaws is imperative. Recently fuzzing techniques, traditionally used for detecting software vulnerabilities, have shown promising results for uncovering vulnerabilities in large-scale hardware designs, such as processors. Researchers have adapted black-box or grey-box fuzzing to detect timing vulnerabilities in processors. However, they cannot identify the locations or root causes of these timing vulnerabilities, nor do they provide coverage feedback to enable the designer's confidence in the processor's security.

To address the deficiencies of the existing fuzzers, we present WhisperFuzz—the first white-box fuzzer with static analysis—aiming to detect and locate timing vulnerabilities in processors and evaluate the coverage of microarchitectural timing behaviors. WhisperFuzz uses the fundamental nature of processors' timing behaviors, microarchitectural state transitions, to localize timing vulnerabilities. WhisperFuzz automatically extracts microarchitectural state transitions from a processor design at the register-transfer level (RTL) and instruments the design to monitor the state transitions as coverage. Moreover, WhisperFuzz measures the time a design-under-test (DUT) takes to process tests, identifying any minor, abnormal variations that may hint at a timing vulnerability. WhisperFuzz detects 12 new timing vulnerabilities across advanced open-sourced RISC-V processors: BOOM, Rocket Core, and CVA6. Eight of these violate the zero latency requirements of the Zkt extension and are considered serious security vulnerabilities. Moreover, WhisperFuzz also pinpoints the locations of the new and the existing vulnerabilities.

EL3XIR: Fuzzing COTS Secure Monitors

Christian Lindenmeier, FAU Erlangen-Nürnberg; Mathias Payer and Marcel Busch, EPFL

Available Media

ARM TrustZone forms the security backbone of mobile devices. TrustZone-based Trusted Execution Environments (TEEs) facilitate security-sensitive tasks like user authentication, disk encryption, and digital rights management (DRM). As such, bugs in the TEE software stack may compromise the entire system's integrity.

EL3XIR introduces a framework to effectively rehost and fuzz the secure monitor firmware layer of proprietary TrustZone-based TEEs. While other approaches have focused on naively rehosting or fuzzing Trusted Applications (EL0) or the TEE OS (EL1), EL3XIR targets the highly-privileged but unexplored secure monitor (EL3) and its unique challenges. Secure monitors expose complex functionality dependent on multiple peripherals through diverse secure monitor calls.

In our evaluation, we demonstrate that state-of-the-art fuzzing approaches are insufficient to effectively fuzz COTS secure monitors. While naive fuzzing appears to achieve reasonable coverage it fails to overcome coverage walls due to missing peripheral emulation and is limited in the capability to trigger bugs due to the large input space and low quality of inputs. We followed responsible disclosure procedures and reported a total of 34 bugs, out of which 17 were classified as security critical. Affected vendors confirmed 14 of these bugs, and as a result, EL3XIR was assigned six CVEs.

Track 7

Crypto IV: Position and Elections

Session Chair: Pascal Lafourcade

Salon IJK

GridSE: Towards Practical Secure Geographic Search via Prefix Symmetric Searchable Encryption

Ruoyang Guo, Jiarui Li, and Shucheng Yu, Stevens Institute of Technology

Available Media

The proliferation of location-based services and applications has brought significant attention to data and location privacy. While general secure computation and privacy-enhancing techniques can partially address this problem, one outstanding challenge is to provide near latency-free search and compatibility with mainstream geographic search techniques, especially the Discrete Global Grid Systems (DGGS). This paper proposes a new construction, namely GridSE, for efficient and DGGS-compatible Secure Geographic Search (SGS) with both backward and forward privacy. We first formulate the notion of a semantic-secure primitive called symmetric prefix predicate encryption (SP²E), for predicting whether or not a keyword contains a given prefix, and provide a construction. Then we extend SP²E for dynamic prefix symmetric searchable encryption (pSSE), namely GridSE, which supports both backward and forward privacy. GridSE only uses lightweight primitives including cryptographic hash and XOR operations and is extremely efficient. Furthermore, we provide a generic pSSE framework that enables prefix search for traditional dynamic SSE that supports only full keyword search. Experimental results over real-world geographic databases of sizes (by the number of entries) from 10^3 to 10^7 and mainstream DGGS techniques show that GridSE achieves a speedup of 150x - 5000x on search latency and a saving of 99% on communication overhead as compared to the state-of-the-art. Interestingly, even compared to plaintext search, GridSE introduces only 1.4x extra computational cost and 0.9x additional communication cost. Source code of our scheme is available at https://github.com/rykieguo1771/GridSE-RAM.

Experimental results over real-world geographic databases of sizes (by the number of entries) from 10^3 to 10^7 and mainstream DGGS techniques show that GridSE achieves a speedup of 150x - 5000x on search latency and a saving of 99% on communication overhead as compared to the state-of-the-art. Interestingly, even compared to plaintext search, GridSE introduces only 1.4x extra computational cost and 0.9x additional communication cost. Source code of our scheme is available at https://github.com/rykieguo1771/GridSE-RAM.

Abuse-Resistant Location Tracking: Balancing Privacy and Safety in the Offline Finding Ecosystem

Harry Eldridge, Gabrielle Beck, and Matthew Green, Johns Hopkins University; Nadia Heninger, University of California, San Diego; Abhishek Jain, Johns Hopkins University

Available Media

Location tracking accessories (or "tracking tags") such as those sold by Apple, Samsung, and Tile, allow owners to track the location of their property via offline finding networks. The tracking protocols were designed to ensure that no entity (including the vendor) can use a tag's broadcasts to surveil its owner. These privacy guarantees, however, seem to be at odds with the phenomenon of tracker-based stalking, where attackers use these very tags to monitor a target's movements. Numerous such criminal incidents have been reported, and in response, manufacturers have chosen to substantially weaken privacy guarantees in order to allow users to detect stalker tags. This compromise has been adopted in a recent IETF draft jointly proposed by Apple and Google.

We put forth the notion of abuse-resistant offline finding protocols that aim to achieve a better balance between user privacy and stalker detection. We present an efficient protocol that achieves stalker detection under realistic conditions without sacrificing honest user privacy. At the heart of our result, and of independent interest, is a new notion of multi-dealer secret sharing which strengthens standard secret sharing with novel privacy and correctness guarantees. We show that this primitive can be instantiated efficiently on edge devices using variants of Interleaved Reed-Solomon codes combined with new lattice-based decoding algorithms.

Security and Privacy Analysis of Samsung's Crowd-Sourced Bluetooth Location Tracking System

Tingfeng Yu, James Henderson, Alwen Tiu, and Thomas Haines, School of Computing, The Australian National University

Available Media

We present a detailed analysis of Samsung's Offline Finding (OF) protocol, which is part of Samsung's Find My Mobile system for locating Samsung mobile devices and Galaxy SmartTags. The OF protocol uses Bluetooth Low Energy (BLE) to broadcast a unique beacon for a lost device. This beacon is then picked up by nearby Samsung phones or tablets (the helper devices), which then forward the beacon and the location it was detected at, to a vendor server. The owner of a lost device can then query the server to locate their device. We examine several security and privacy related properties of the OF protocol and its implementation. These include: the feasibility of tracking an OF device through its BLE data, the feasibility of unwanted tracking of a person by exploiting the OF network, the feasibility for the vendor to de-anonymise location reports to determine the locations of the owner or the helper devices, and the feasibility for an attacker to compromise the integrity of the location reports. Our findings suggest that there are privacy risks on all accounts, arising from issues in the design and the implementation of the OF protocol.

The Decisive Power of Indecision: Low-Variance Risk-Limiting Audits and Election Contestation via Marginal Mark Recording

Benjamin Fuller, Rashmi Pai, and Alexander Russell, University of Connecticut - Voting Technology Research Center

Available Media

Risk-limiting audits (RLAs) are techniques for verifying the outcomes of large elections. While they provide rigorous guarantees of correctness, widespread adoption has been impeded by both efficiency concerns and the fact they offer statistical, rather than absolute, conclusions. We attend to both of these difficulties, defining new families of audits that improve efficiency and offer qualitative advances in statistical power.

Our new audits are enabled by revisiting the standard notion of a cast-vote record so that it can declare multiple possible mark interpretations rather than a single decision; this can reflect the presence of marginal marks, which appear regularly on hand-marked ballots. We show that this simple expedient can offer significant efficiency improvements with only minor changes to existing auditing infrastructure. We consider two ways of representing these marks, both yield risk-limiting comparison audits in the formal sense of Fuller, Harrison, and Russell (IEEE Security & Privacy 2023).

We then define a new type of post-election audit we call a contested audit. These permit each candidate to provide a cast-vote record table advancing their own claim to victory. We prove that these audits offer remarkable sample efficiency, yielding control of risk with a constant number of samples (that is independent of margin). This is a first for an audit with provable soundness. These results are formulated in a game-based security model that specify quantitative soundness and completeness guarantees. These audits provide a means to handle contestation of election results affirmed by conventional RLAs.

ElectionGuard: a Cryptographic Toolkit to Enable Verifiable Elections

Josh Benaloh and Michael Naehrig, Microsoft Research; Olivier Pereira, Microsoft Research and UCLouvain; Dan S. Wallach, Rice University

Available Media

ElectionGuard is a flexible set of open-source tools that—when used with traditional election systems—can produce end-to-end verifiable elections whose integrity can be verified by observers, candidates, media, and even voters themselves. ElectionGuard has been integrated into a variety of systems and used in actual public U.S. elections in Wisconsin, California, Idaho, Utah, and Maryland as well as in caucus elections in the U.S. Congress. It has also been used for civic voting in the Paris suburb of Neuilly-sur-Seine and for an online election by a Switzerland/Denmark-based organization.

The principal innovation of ElectionGuard is the separation of the cryptographic tools from the core mechanics and user interfaces of voting systems. This separation allows the cryptography to be designed and built by security experts without having to re-invent and replace the existing infrastructure. Indeed, in its preferred deployment, ElectionGuard does not replace the existing vote counting infrastructure but instead runs alongside and produces its own independently-verifiable tallies. Although much of the cryptography in ElectionGuard is, by design, not novel, some significant innovations are introduced which greatly simplify the process of verification.

This paper describes the design of ElectionGuard, its innovations, and many of the learnings from its implementation and growing number of real-world deployments.

10:15 am–10:45 am

Coffee and Tea Break

Grand Ballroom Foyer

10:45 am–11:45 am

Track 1

Measurement VI: Human Behavior and Security

Session Chair: Rachel Greenstadt

Salon AB

A High Coverage Cybersecurity Scale Predictive of User Behavior

Yukiko Sawaya, KDDI Research Inc.; Sarah Lu, Massachusetts Institute of Technology; Takamasa Isohara, KDDI Research Inc.; Mahmood Sharif, Tel Aviv University

Available Media

Psychometric security scales can enable various crucial tasks (e.g., measuring changes in user behavior over time), but, unfortunately, they often fail to accurately predict actual user behavior. We hypothesize that one can enhance prediction accuracy via more comprehensive scales measuring a wider range of security-related factors. To test this hypothesis, we ran a series of four online studies with a total of 1,471 participants. First, we developed the extended security behavior scale (ESBS), a high-coverage scale containing substantially more items than prior ones, and collected responses to characterize its underlying structure. Then, we conducted a follow-up study to confirm ESBS' structural validity and reliability. Finally, over the course of two studies, we elicited user responses to our scale and prior ones while measuring three security behaviors reflected by Internet browser data. Then, we constructed predictive machine-learning models and found that ESBS can predict these behaviors with statistically significantly higher accuracy than prior scales (6.17%–8.53% ROC AUC), thus supporting our hypothesis.

Biosignal Authentication Considered Harmful Today

Veena Krish, Stony Brook University; Nicola Paoletti and Milad Kazemi, King's College London; Scott Smolka and Amir Rahmati, Stony Brook University

Available Media

User authentication systems based on cardiovascular biosignals have gained prominence in recent years, as these signals are presumed difficult to forge. We challenge this assumption by showing that an observer who has access to one type of cardiac data – such as a user's pulse waveform, readily obtainable from video and commercial smartwatches – can design a spoofing attack strong enough to fool multiple authentication systems based on other cardiovascular biosignals. We present BioForge, an approach that leverages a cycle-consistent generative adversarial network to synthesize realistic physiological signals for a given user without relying on simultaneously collected supervision data. We evaluate BioForge on multiple open-access datasets and an array of verification systems – many of which can be fooled over 50% of the time in 10 or fewer attempts. Notably, we are able to fool systems that rely not just on heart rate and peak locations but also on the morphology of the waveforms. Our work conclusively demonstrates that authentication systems should not rely on the secrecy of cardiovascular biosignals.

cardiovascular biosignals.

GlobalConfusion: TrustZone Trusted Application 0-Days by Design

Marcel Busch, Philipp Mao, and Mathias Payer, EPFL

Available Media

Trusted Execution Environments form the backbone of mobile device security architectures. The GlobalPlatform Internal Core API is the de-facto standard that unites the fragmented landscape of real-world implementations, providing compatibility between different TEEs.

Unfortunately, our research reveals that this API standard is prone to a design weakness. Manifestations of this weakness result in critical type-confusion bugs in real-world user-space applications of the TEE, called Trusted Applications (TAs). At its core, the design weakness consists of a fail-open design leaving an optional type check for untrusted data to TA developers. The API does not mandate this easily forgettable check that in most cases results in arbitrary read-and-write exploitation primitives. To detect instances of these type-confusion bugs, we design and implement GPCheck, a static binary analysis system capable of vetting real-world TAs. We employ GPCheck to analyze 14,777 TAs deployed on widely used TEEs to investigate the prevalence of the issue. We reconfirm known bugs that fit this pattern and discover unknown instances of the issue in the wild. In total, we confirmed 9 known bugs, found 10 instances of silently-fixed bugs, and discovered a surprising amount of 14 critical 0-day vulnerabilities using our GPCheck prototype. Our findings affect mobile devices currently in use by billions of users. We responsibly disclosed these findings, already received 12,000 USD as bug bounty, and were assigned four CVEs. Ten of our 14 critical 0-day vulnerabilities are still in the responsible disclosure process. Finally, we propose an extension to the GP Internal Core API specification to enforce a fail-safe mechanism that removes the underlying design weakness. We implement and successfully demonstrate our mitigation on OPTEE, an open-source TEE implementation. We shared our findings with GlobalPlatform and suggested our mitigation as an extension to their specification to secure future TEE implementations.

PointerGuess: Targeted Password Guessing Model Using Pointer Mechanism

Kedong Xiu and Ding Wang, Nankai University

Available Media

Most existing targeted password guessing models view users' reuse behaviors as sequences of edit operations (e.g., insert and delete) performed on old passwords. These atomic edit operations are limited to modifying one character at a time and cannot fully cover users' complex password modification behaviors (e.g., modifying the password structure). This partially leads to a significant gap between the proportion of users' reused passwords and the success rates that existing targeted password models can achieve. To fill this gap, this paper models users' reuse behaviors by focusing on two key components: (1) What they want to copy/keep; (2) What they want to tweak. More specifically, we introduce the pointer mechanism and propose a new targeted guessing model, namely PointerGuess. By hierarchically redefining password reuse from both personal and population-wide perspectives, we can accurately and comprehensively characterize users' password reuse behaviors. Moreover, we propose MS-PointerGuess, which can employ the victim's multiple leaked passwords.

By employing 13 large-scale real-world password datasets, we demonstrate that PointerGuess is effective: (1) When the victim's password at site A (namely pwA) is known, within 100 guesses, the average success rate of PointerGuess in guessing her password at site B (namely pwB, pwA ≠ pwB) is 25.21% (for common users) and 12.34% (for security-savvy users), respectively, which is 21.23%~71.54% (38.37% on average) higher than its foremost counterparts; (2) When not excluding identical password pairs (i.e., pwA can equal pwB), within 100 guesses, the average success rate of PointerGuess is 48.30% (for common users) and 28.42% (for security-savvy users), respectively, which is 6.31%~15.92% higher than its foremost counterparts; (3) Within 100 guesses, the MS-PointerGuess further improves the cracking success rate by 31.21% compared to PointerGuess.

Track 2

Hardware Security IV: Firmware

Session Chair: Marius Muench

Salon CD

FFXE: Dynamic Control Flow Graph Recovery for Embedded Firmware Binaries

Ryan Tsang, Asmita, and Doreen Joseph, University of California, Davis; Soheil Salehi, University of Arizona; Prasant Mohapatra and Houman Homayoun, University of California, Davis

Available Media

Control Flow Graphs (CFG) play a significant role as an intermediary analysis in many advanced static and dynamic software analysis techniques. As firmware security and validation for embedded systems becomes a greater concern, accurate CFGs for embedded firmware binaries are crucial for adapting many valuable software analysis techniques to firmware, which can enable more thorough functionality and security analysis. In this work, we present a portable new dynamic CFG recovery technique based on dynamic forced execution that allows us to resolve indirect branches to registered callback functions, which are dependent on asynchronous changes to volatile memory. Our implementation, the Forced Firmware Execution Engine (FFXE), written in Python using the Unicorn emulation framework, is able to identify 100% of known callback functions in our test set of 36 firmware images, something none of the other techniques we tested against were able to do reliably. Using our results and observations, we compare our engine to 4 other CFG recovery techniques and provide both our thoughts on how this work might enhance other tools, and how it might be further developed. With our contributions, we hope to help enable the application of traditionally software-focused security analysis techniques to the hardware interactions that are integral to embedded system firmware.

CO3: Concolic Co-execution for Firmware

Changming Liu, Alejandro Mera, and Engin Kirda, Northeastern University; Meng Xu, University of Waterloo; Long Lu, Northeastern University

Available Media

Firmware running on resource-constrained embedded microcontrollers (MCUs) is critical in this IoT era, yet their security is under-analyzed. At the same time, concolic execution has proven to be a successful program analysis technique on conventional workstation platforms. However, porting it to the MCUs faces challenges, such as incomplete and inaccurate emulation of hardware peripherals, reliance on customized hardware, and low execution speed.

CO3 is a firmware-oriented concolic executor attempting to address these limitations. CO3 runs the firmware concretely on a real MCU to utilize its fidelity. Unlike previous designs, CO3 gets rid of the slow or proprietary debugging interfaces for synchronization between the MCU and workstation. Instead, CO3 instruments the firmware source code to strategically report runtime information via a basic serial port to a workstation where symbolic constraints are constructed and solved. We further combine CO3 with a semi-hosted firmware fuzzing framework to create a hybrid fuzzer (SHACO).

The evaluation shows that CO3 outperforms state-of-the-art (SoTA) firmware-oriented concolic executors by three orders of magnitude while incurring mild memory and runtime overheads. It is also faster than SymCC, a general concolic executor. When evaluated on the existing benchmark, SHACO finds all known bugs in a much shorter time. It also found seven bugs from three new firmware samples. All new bugs have been confirmed and patched responsibly.

Unveiling IoT Security in Reality: A Firmware-Centric Journey

Nicolas Nino, School of Computing, University of Georgia; Ruibo Lu and Wei Zhou, School of Cyber Science and Engineering, Huazhong University of Science and Technology; Kyu Hyung Lee, School of Computing, University of Georgia; Ziming Zhao, Khoury College of Computer Sciences, Northeastern University; Le Guan, School of Computing, University of Georgia

Available Media

To study the security properties of the Internet of Things (IoT), firmware analysis is crucial. In the past, many works have been focused on analyzing Linux-based firmware. Less known is the security landscape of MCU-based IoT devices, an essential portion of the IoT ecosystem. Existing works on MCU firmware analysis either leverage the companion mobile apps to infer the security properties of the firmware (thus unable to collect low-level properties) or rely on small-scale firmware datasets collected in ad-hoc ways (thus cannot be generalized). To fill this gap, we create a large dataset of MCU firmware for real IoT devices. Our approach statically analyzes how MCU firmware is distributed and then captures the firmware. To reliably recognize the firmware, we develop a firmware signature database, which can match the footprints left in the firmware compilation and packing process. In total, we obtained 8,432 confirmed firmware images (3,692 unique) covering at least 11 chip vendors across 7 known architectures and 2 proprietary architectures. We also conducted a series of static analyses to assess the security properties of this dataset. The result reveals three disconcerting facts: 1) the lack of firmware protection, 2) the existence of N-day vulnerabilities, and 3) the rare adoption of security mitigation.

Your Firmware Has Arrived: A Study of Firmware Update Vulnerabilities

Yuhao Wu, Jinwen Wang, Yujie Wang, Shixuan Zhai, and Zihan Li, Washington University in St. Louis; Yi He, Tsinghua University; Kun Sun, George Mason University; Qi Li, Tsinghua University; Ning Zhang, Washington University in St. Louis

Available Media

Embedded devices are increasingly ubiquitous in our society. Firmware updates are one of the primary mechanisms to mitigate vulnerabilities in embedded systems. However, the firmware update procedure also introduces new attack surfaces, particularly through vulnerable firmware verification procedures. Unlike memory corruption bugs, numerous vulnerabilities in firmware updates stem from incomplete or incorrect verification steps, to which existing firmware analysis methods are not applicable. To bridge this gap, we propose ChkUp, an approach to Check for firmware Update vulnerabilities. ChkUp can resolve the program execution paths during firmware updates using cross-language inter-process control flow analysis and program slicing. With these paths, ChkUp locates firmware verification procedures, examining and validating their vulnerabilities. We implemented ChkUp and conducted a comprehensive analysis on 12,000 firmware images. Then, we validated the alerts in 150 firmware images from 33 device families, leading to the discovery of both zero-day and n-day vulnerabilities. Our findings were disclosed responsibly, resulting in the assignment of 25 CVE IDs and one PSV ID at the time of writing.

Track 3

Mobile Privacy

Session Chair: Zubair Shafiq

Salon E

Abandon All Hope Ye Who Enter Here: A Dynamic, Longitudinal Investigation of Android's Data Safety Section

Ioannis Arkalakis, Michalis Diamantaris, Serafeim Moustakas, and Sotiris Ioannidis, Technical University of Crete; Jason Polakis, University of Illinois Chicago; Panagiotis Ilia, Cyprus University of Technology

Available Media

Users' growing concerns about online privacy have led to increased platform support for transparency and consent in the web and mobile ecosystems. To that end, Android recently mandated that developers must disclose what user data their applications collect and share, and that information is made available in Google Play's Data Safety section.

In this paper, we provide the first large-scale, in-depth investigation on the veracity of the Data Safety section and its use in the Android application ecosystem. We build an automated analysis framework that dynamically exercises and analyzes applications so as to uncover discrepancies between the applications' behavior and the data practices that have been reported in their Data Safety section. Our study on almost 5K applications uncovers a pervasive trend of incomplete disclosure, as 81% misrepresent their data collection and sharing practices in the Data Safety section. At the same time, 79.4% of the applications with incomplete disclosures do not ask the user to provide consent for the data they collect and share, and 78.6% of those that ask for consent disregard the users' choice. Moreover, while embedded third-party libraries are the most common offender, Data Safety discrepancies can be traced back to the application's core code in 41% of the cases. Crucially, Google's documentation contains various "loopholes" that facilitate incomplete disclosure of data practices. Overall, we find that in its current form, Android's Data Safety section does not effectively achieve its goal of increasing transparency and allowing users to provide informed consent. We argue that Android's Data Safety policies require considerable reform, and automated validation mechanisms like our framework are crucial for ensuring the correctness and completeness of applications' Data Safety disclosures.

iHunter: Hunting Privacy Violations at Scale in the Software Supply Chain on iOS

Dexin Liu, Peking University and Alibaba Group; Yue Xiao and Chaoqi Zhang, Indiana University Bloomington; Kaitao Xie and Xiaolong Bai, Alibaba Group; Shikun Zhang, Peking University; Luyi Xing, Indiana University Bloomington

Available Media

Privacy violations and compliance issues in mobile apps are serious concerns for users, developers, and regulators. With many off-the-shelf tools on Android, prior works extensively studied various privacy issues for Android apps. Privacy risks and compliance issues can be equally expected in iOS apps, but have been little studied. In particular, a prominent recent privacy concern was due to diverse third-party libraries widely integrated into mobile apps whose privacy practices are non-transparent. Such a critical supply chain problem, however, was never systematically studied for iOS apps, at least partially due to the lack of the necessary tools.

This paper presents the first large-scale study, based on our new taint analysis system named iHunter, to analyze privacy violations in the iOS software supply chain. iHunter performs static taint analysis on iOS SDKs to extract taint traces representing privacy data collection and leakage practices. It is characterized by an innovative iOS-oriented symbolic execution that tackles dynamic features of Objective-C and Swift and an NLP-powered generator for taint sources and taint rules. iHunter identified non-compliance in 2,585 SDKs (accounting for 40.4%) out of 6,401 iOS SDKs, signifying a substantial presence of SDKs that fail to adhere to compliance standards. We further found a high proportion (47.2% in 32,478) of popular iOS apps using these SDKs, with practical non-compliance risks violating Apple policies and major privacy laws. These results shed light on the pervasiveness and severity of privacy violations in iOS apps' supply chain. iHunter is thoroughly evaluated for its high effectiveness and efficiency. We are responsibly reporting the results to relevant stakeholders.

Is It a Trap? A Large-scale Empirical Study And Comprehensive Assessment of Online Automated Privacy Policy Generators for Mobile Apps

Shidong Pan and Dawen Zhang, CSIRO's Data61 & Australian National University; Mark Staples, CSIRO's Data61; Zhenchang Xing, CSIRO's Data61 & Australian National University; Jieshan Chen, Xiwei Xu, and Thong Hoang, CSIRO's Data61

Available Media

Privacy regulations protect and promote the privacy of individuals by requiring mobile apps to provide a privacy policy that explains what personal information is collected and how these apps process this information. However, developers often do not have sufficient legal knowledge to create such privacy policies. Online Automated Privacy Policy Generators (APPGs) can create privacy policies, but their quality and other characteristics can vary. In this paper, we conduct the first large-scale empirical study and comprehensive assessment of APPGs for mobile apps. Specifically, we scrutinize 10 APPGs on multiple dimensions. We further perform the market penetration analysis by collecting 46,472 Android app privacy policies from Google Play, discovering that nearly 20.1% of privacy policies could be generated by existing APPGs. Lastly, we point out that generated policies in our study do not fully comply with GDPR, CCPA, or LGPD. In summary, app developers must carefully select and use the appropriate APPGs with careful consideration to avoid potential pitfalls.

A NEW HOPE: Contextual Privacy Policies for Mobile Applications and An Approach Toward Automated Generation

Shidong Pan and Zhen Tao, CSIRO's Data61 and Australian National University; Thong Hoang, CSIRO's Data61; Dawen Zhang, CSIRO's Data61 and Australian National University; Tianshi Li, Northeastern University; Zhenchang Xing, CSIRO's Data61 and Australian National University; Xiwei Xu, Mark Staples, and Thierry Rakotoarivelo, CSIRO's Data61; David Lo, Singapore Management University

Available Media

Privacy policies have emerged as the predominant approach to conveying privacy notices to mobile application users. In an effort to enhance both readability and user engagement, the concept of contextual privacy policies (CPPs) has been proposed by researchers. The aim of CPPs is to fragment privacy policies into concise snippets, displaying them only within the corresponding contexts within the application's graphical user interfaces (GUIs). In this paper, we first formulate CPP in mobile application scenario, and then present a novel multimodal framework, named SeePrivacy, specifically designed to automatically generate CPPs for mobile applications. This method uniquely integrates vision-based GUI understanding with privacy policy analysis, achieving 0.88 precision and 0.90 recall to detect contexts, as well as 0.98 precision and 0.96 recall in extracting corresponding policy segments. A human evaluation shows that 77% of the extracted privacy policy segments were perceived as well-aligned with the detected contexts. These findings suggest that SeePrivacy could serve as a significant tool for bolstering user interaction with, and understanding of, privacy policies. Furthermore, our solution has the potential to make privacy notices more accessible and inclusive, thus appealing to a broader demographic. A demonstration of our work can be accessed at: https://cpp4app.github.io/SeePrivacy/

Track 4

Network Security IV: Infrastructure

Session Chair: Nguyen Phong Hoang

Salon F

CDN Cannon: Exploiting CDN Back-to-Origin Strategies for Amplification Attacks

Ziyu Lin, Fuzhou University and Tsinghua University; Zhiwei Lin, Sichuan University and Tsinghua University; Ximeng Liu, Fuzhou University; Jianjun Chen and Run Guo, Tsinghua University; Cheng Chen and Shaodong Xiao, Fuzhou University

Available Media

Content Delivery Networks (CDNs) provide high availability, speed up content delivery, and safeguard against DDoS attacks for their hosting websites. To achieve the aforementioned objectives, CDN designs several 'back-to-origin' strategies that proactively pre-pull resources and modify HTTP requests and responses. However, our research reveals that these 'back-to-origin' strategies prioritize performance over security, which can lead to excessive consumption of the website's bandwidth.

We have proposed a new class of amplification attacks called Back-to-Origin Amplification (BtOAmp) Attacks. These attacks allow malicious attackers to exploit the 'back-to-origin' strategies, triggering the CDN to greedily demand more-than-necessary resources from websites, which finally blows the websites. We evaluated the feasibility and real-world impacts of 'BtOAmp' attacks on fourteen popular CDNs. With real-world threat evaluation, our attack threatens all mainstream websites hosted on CDNs. We responsibly disclosed the details of our attack to the affected CDN vendors and proposed possible mitigation solutions.

You Can Obfuscate, but You Cannot Hide: CrossPoint Attacks against Network Topology Obfuscation

Xuanbo Huang, Kaiping Xue, Lutong Chen, and Mingrui Ai, University of Science and Technology of China; Huancheng Zhou, Texas A&M University; Bo Luo, The University of Kansas; Guofei Gu, Texas A&M University; Qibin Sun, University of Science and Technology of China

Available Media

Link-flooding attacks (LFAs) may disrupt Internet connections in targeted areas by flooding specific links. One effective mitigation strategy against these attacks is network topology obfuscation (NTO), which aims to obscure the network map and conceal critical links, preventing attackers from identifying bottleneck links.

However, we argue that the attackers can still discover critical links in the presence of NTO defenses. In this paper, we introduce the CrossPoint attacks to escape the security protections of state-of-the-art NTO defenses by exploiting two network traffic features: correlated congestion and statistical disparities. Although NTO defenses create a complex and seemingly robust virtual topology, distinct information is still discoverable due to conflicting design objectives and inherent features of the Internet, resulting in novel side channels. Through comprehensive experiments, including a measurement study on the Internet, we demonstrate CrossPoint attacks' high success rate (80%-95%), minor overhead (10%-20%), as well as attack stealthiness and feasibility.

Cross the Zone: Toward a Covert Domain Hijacking via Shared DNS Infrastructure

Yunyi Zhang, National University of Defense Technology; Tsinghua University; Mingming Zhang, Zhongguancun Laboratory; Baojun Liu, Tsinghua University; Zhongguancun Laboratory; Zhan Liu and Jia Zhang, Tsinghua University; Haixin Duan, Tsinghua University; Zhongguancun Laboratory; Min Zhang, Fan Shi, and Chengxi Xu, National University of Defense Technology

Available Media

Domain Name System (DNS) establishes clear responsibility boundaries among nameservers for managing DNS records via authoritative delegation. However, the rise of thirdparty public services has blurred this boundary. In this paper, we uncover a novel attack surface, named XDAuth, arising from public authoritative nameserver infrastructure's failure to isolate data across zones adequately. This flaw enables adversaries to inject arbitrary resource records across logical authority boundaries and covertly hijack domain names without authority. Unlike prior research on stale NS records, which concentrated on domain names delegated to expired nameservers or those of hosting service providers, XDAuth targets enterprises that maintain their authoritative domain names. We demonstrate that XDAuth is entirely feasible, and through comprehensive measurements, we identify 12 vulnerable providers (e.g., Amazon Route 53, NSONE, and DigiCert DNS), affecting 125,124 domains of notable enterprises, including the World Bank, and the BBC. Moreover, we responsibly disclose the issue to the affected vendors. Some DNS providers and enterprises (e.g., Amazon Route 53) have recognized the issue and are adopting mitigation measures based on our suggestions.

CAMP: Compositional Amplification Attacks against DNS

Huayi Duan, Marco Bearzi, Jodok Vieli, David Basin, Adrian Perrig, and Si Liu, ETH Zürich; Bernhard Tellenbach, Armasuisse

Available Media

While DNS is often exploited by reflective DoS attacks, it can also be weaponized as a powerful amplifier to overload itself, as evidenced by a stream of recently discovered application-layer amplification attacks. Given the importance of DNS, the question arises of what the fundamental traits are for such attacks. To answer this question, we perform a systematic investigation by establishing a taxonomy of amplification primitives intrinsic to DNS and a framework to analyze their composability. This approach leads to the discovery of a large family of compositional amplification (CAMP) vulnerabilities, which can produce multiplicative effects with message amplification factors of hundreds to thousands. Our measurements with popular DNS implementations and open resolvers indicate the ubiquity and severity of CAMP vulnerabilities and the serious threats they pose to the Internet's crucial naming infrastructure.

Track 5

LLM III: Abuse

Session Chair: Alvaro A. Cardenas

Salon G

Moderating Illicit Online Image Promotion for Unsafe User Generated Content Games Using Large Vision-Language Models

Keyan Guo, Ayush Utkarsh, Wenbo Ding, and Isabelle Ondracek, University at Buffalo; Ziming Zhao, Northeastern University; Guo Freeman, Clemson University; Nishant Vishwamitra, The University of Texas at San Antonio; Hongxin Hu, University at Buffalo

Available Media

Online user generated content games (UGCGs) are increasingly popular among children and adolescents for social interaction and more creative online entertainment. However, they pose a heightened risk of exposure to explicit content, raising growing concerns for the online safety of children and adolescents. Despite these concerns, few studies have addressed the issue of illicit image-based promotions of unsafe UGCGs on social media, which can inadvertently attract young users. This challenge arises from the difficulty of obtaining comprehensive training data for UGCG images and the unique nature of these images, which differ from traditional unsafe content. In this work, we take the first step towards studying the threat of illicit promotions of unsafe UGCGs. We collect a real-world dataset comprising 2,924 images that display diverse sexually explicit and violent content used to promote UGCGs by their game creators. Our in-depth studies reveal a new understanding of this problem and the urgent need for automatically flagging illicit UGCG promotions. We additionally create a cutting-edge system, UGCG-Guard, designed to aid social media platforms in effectively identifying images used for illicit UGCG promotions. This system leverages recently introduced large vision-language models~(VLMs) and employs a novel conditional prompting strategy for zero-shot domain adaptation, along with chain-of-thought (CoT) reasoning for contextual identification. UGCG-Guardachieves outstanding results, with an accuracy rate of 94% in detecting these images used for the illicit promotion of such games in real-world scenarios.

Deciphering Textual Authenticity: A Generalized Strategy through the Lens of Large Language Semantics for Detecting Human vs. Machine-Generated Text

Mazal Bethany, The University of Texas at San Antonio and Secure AI and Autonomy Lab; Brandon Wherry, The University of Texas at San Antonio, Secure AI and Autonomy Lab, and Peraton Labs; Emet Bethany, The University of Texas at San Antonio and Secure AI and Autonomy Lab; Nishant Vishwamitra and Anthony Rios, The University of Texas at San Antonio; Peyman Najafirad, The University of Texas at San Antonio and Secure AI and Autonomy Lab

Available Media

With the recent proliferation of Large Language Models (LLMs), there has been an increasing demand for tools to detect machine-generated text. The effective detection of machine-generated text face two pertinent problems: First, they are severely limited in generalizing against real-world scenarios, where machine-generated text is produced by a variety of generators and spans diverse domains. Second, existing detection methodologies treat texts produced by LLMs through a restrictive binary classification lens, neglecting the nuanced diversity of artifacts generated by different LLMs, each of which exhibits distinctive stylistic and structural elements. In this work, we undertake a systematic study on the detection of machine-generated text in real-world scenarios. We first study the effectiveness of state-of-the-art approaches and find that they are severely limited against text produced by diverse generators and domains in the real world. Furthermore, t-SNE visualizations of the embeddings from a pretrained LLM's encoder show that they cannot reliably distinguish between human and machine-generated text. Based on our findings, we introduce a novel system, T5LLMCipher, for detecting machine-generated text using a pretrained T5 encoder combined with LLM embedding sub-clustering to address the text produced by diverse generators and domains in the real world. We evaluate our approach across 9 machine-generated text systems and 9 domains and find that our approach provides state-of-the-art generalization ability, with an average increase in F1 score on machine-generated text of 11.9% on unseen generators and domains compared to the top performing supervised learning approaches and correctly attributes the generator of text with an accuracy of 93.6%. We make the code for our proposed approach publicly available at https: //github.com/SecureAIAutonomyLab/LLM-Cipher

Prompt Stealing Attacks Against Text-to-Image Generation Models

Xinyue Shen, Yiting Qu, Michael Backes, and Yang Zhang, CISPA Helmholtz Center for Information Security

Available Media

Text-to-Image generation models have revolutionized the artwork design process and enabled anyone to create high-quality images by entering text descriptions called prompts. Creating a high-quality prompt that consists of a subject and several modifiers can be time-consuming and costly. In consequence, a trend of trading high-quality prompts on specialized marketplaces has emerged. In this paper, we perform the first study on understanding the threat of a novel attack, namely prompt stealing attack, which aims to steal prompts from generated images by text-to-image generation models. Successful prompt stealing attacks directly violate the intellectual property of prompt engineers and jeopardize the business model of prompt marketplaces. We first perform a systematic analysis on a dataset collected by ourselves and show that a successful prompt stealing attack should consider a prompt's subject as well as its modifiers. Based on this observation, we propose a simple yet effective prompt stealing attack, PromptStealer. It consists of two modules: a subject generator trained to infer the subject and a modifier detector for identifying the modifiers within the generated image. Experimental results demonstrate that PromptStealer is superior over three baseline methods, both quantitatively and qualitatively. We also make some initial attempts to defend PromptStealer. In general, our study uncovers a new attack vector within the ecosystem established by the popular text-to-image generation models. We hope our results can contribute to understanding and mitigating this emerging threat.

Quantifying Privacy Risks of Prompts in Visual Prompt Learning

Yixin Wu, Rui Wen, and Michael Backes, CISPA Helmholtz Center for Information Security; Pascal Berrang, University of Birmingham; Mathias Humbert, University of Lausanne; Yun Shen, Netapp; Yang Zhang, CISPA Helmholtz Center for Information Security

Available Media

Large-scale pre-trained models are increasingly adapted to downstream tasks through a new paradigm called prompt learning. In contrast to fine-tuning, prompt learning does not update the pre-trained model's parameters. Instead, it only learns an input perturbation, namely prompt, to be added to the downstream task data for predictions. Given the fast development of prompt learning, a well-generalized prompt inevitably becomes a valuable asset as significant effort and proprietary data are used to create it. This naturally raises the question of whether a prompt may leak the proprietary information of its training data. In this paper, we perform the first comprehensive privacy assessment of prompts learned by visual prompt learning through the lens of property inference and membership inference attacks. Our empirical evaluation shows that the prompts are vulnerable to both attacks. We also demonstrate that the adversary can mount a successful property inference attack with limited cost. Moreover, we show that membership inference attacks against prompts can be successful with relaxed adversarial assumptions. We further make some initial investigations on the defenses and observe that our method can mitigate the membership inference attacks with a decent utility-defense trade-off but fails to defend against property inference attacks. We hope our results can shed light on the privacy risks of the popular prompt learning paradigm. To facilitate the research in this direction, we will share our code and models with the community.

Track 6

Security Analysis IV: OS

Session Chair: Hyungon Moon

Salon H

Pandawan: Quantifying Progress in Linux-based Firmware Rehosting

Ioannis Angelakopoulos, Gianluca Stringhini, and Manuel Egele, Boston University

Available Media

The Internet of Things (IoT) is frequently the epicenter of cyberattacks due to its weak security. Prior works introduce various techniques for analyzing the firmware of IoT devices for bugs and vulnerabilities, especially through firmware re-hosting. However, comparing the emulation outcomes of the different re-hosting approaches can be very challenging. In this paper, we present Firmware Initialization Completion Detection (FICD), a technique that enables the comparison of full-system re-hosting approaches across their re-hosting capabilities. In addition, prior works lack an important capability; they do not focus on both the user and privileged aspect of IoT firmware as a unit. Since prior work is not capable of holistically analyzing (both the user and privileged level) IoT firmware, we develop Pandawan, a framework that enables the holistic re-hosting and analysis of IoT firmware at scale. We use FICD to illustrate Pandawan's re-hosting improvements over the state-of-the-art, such as Firmadyne, FirmAE, and FirmSolo on a dataset of 1,520 firmware images. Our experiments show that Pandawan outperforms these systems, by executing up to 6% more user level programs and 21% more user code basic blocks, on average, than these systems. Furthermore, Pandawan loads 9% more IoT kernel modules and executes 26% more kernel module basic blocks on average than FirmSolo. We also use Pandawan to holistically analyze the firmware images by inspecting the interactions (through system calls) of user level code with kernel module code. Pandawan transforms the system call information into seeds for the TriforceAFL kernel fuzzer to analyze the kernel modules within the firmware images. The TriforceAFL experiment on 479 firmware images with seeds, discovered 16 bugs on 12 binary kernel modules, 6 of which are previously unknown bugs. The bugs affect 8 closed and 4 open source kernel modules.

DEEPTYPE: Refining Indirect Call Targets with Strong Multi-layer Type Analysis

Tianrou Xia, Hong Hu, and Dinghao Wu, The Pennsylvania State University

Available Media

Indirect calls, while facilitating dynamic execution characteristics in C and C++ programs, impose challenges on precise construction of the control-flow graphs (CFG). This hinders effective program analyses for bug detection (e.g., fuzzing) and program protection (e.g., control-flow integrity). Solutions using data-tracking and type-based analysis are proposed for identifying indirect call targets, but are either time-consuming or imprecise for obtaining the analysis results. Multi-layer type analysis (MLTA), as the state-of-the-art approach, upgrades type-based analysis by leveraging multi-layer type hierarchy, but their solution to dealing with the information flow between multi-layer types introduces false positives. In this paper, we propose strong multi-layer type analysis (SMLTA) and implement the prototype, DEEPTYPE, to further refine indirect call targets. It adopts a robust solution to record and retrieve type information, avoiding information loss and enhancing accuracy. We evaluate DEEPTYPE on Linux kernel, 5 web servers, and 14 user applications. Compared to TypeDive, the prototype of MLTA, DEEPTYPE is able to narrow down the scope of indirect call targets by 43.11% on average across most benchmarks and reduce runtime overhead by 5.45% to 72.95%, which demonstrates the effectiveness, efficiency and applicability of SMLTA.

Improving Indirect-Call Analysis in LLVM with Type and Data-Flow Co-Analysis

Dinghao Liu and Shouling Ji, Zhejiang University; Kangjie Lu, University of Minnesota; Qinming He, Zhejiang University

Available Media

Indirect function calls are widely used in building system software like OS kernels for their high flexibility and performance. Statically resolving indirect-call targets has been known to be a hard problem, which is a fundamental requirement for various program analysis and protection tasks. The state-of-the-art techniques, which use type analysis, are still imprecise. In this paper, we present a new approach, TFA, that precisely identifies indirect-call targets. The intuition behind TFA is that type-based analysis and data-flow analysis are inherently complementary in resolving indirect-call targets. TFA incorporates a co-analysis system that makes the best use of both type information and data-flow information. The co-analysis keeps refining the global call graph iteratively, allowing us to achieve an optimal indirect call analysis. We have implemented TFA in LLVM and evaluated it against five famous large-scale programs. The experimental results show that TFA eliminates additional 24% to 59% of indirect-call targets compared with the state-of-the-art approaches, without introducing new false negatives. With the precise indirect-call analysis, we further develop a strengthened fine-grained forward-edge control-flow integrity scheme and apply it to the Linux kernel. We have also used the refined indirect-call analysis results in bug detection, where we have found 8 deep bugs in the Linux kernel. As a generic technique, the precise indirect-call analysis of TFA can also benefit other applications such as compiler optimization and software debloating.

ChainReactor: Automated Privilege Escalation Chain Discovery via AI Planning

Giulio De Pasquale, King's College London and University College London; Ilya Grishchenko, University of California, Santa Barbara; Riccardo Iesari, Vrije Universiteit Amsterdam; Gabriel Pizarro, University of California, Santa Barbara; Lorenzo Cavallaro, University College London; Christopher Kruegel and Giovanni Vigna, University of California, Santa Barbara

Distinguished Artifact Award Winner

Available Media

Current academic vulnerability research predominantly focuses on identifying individual bugs and exploits in programs and systems. However, this goes against the growing trend of modern, advanced attacks that rely on a sequence of steps (i.e., a chain of exploits) to achieve their goals, often incorporating individually benign actions. This paper introduces a novel approach to the automated discovery of such exploitation chains using AI planning. In particular, we aim to discover privilege escalation chains, some of the most critical and pervasive security threats, which involve exploiting vulnerabilities to gain unauthorized access and control over systems. We implement our approach as a tool, ChainReactor, that models the problem as a sequence of actions to achieve privilege escalation from the initial access to a target system. ChainReactor extracts information about available executables, system configurations, and known vulnerabilities on the target and encodes this data into a Planning Domain Definition Language (PDDL) problem. Using a modern planner, ChainReactor can generate chains incorporating vulnerabilities and benign actions. We evaluated ChainReactor on 3 synthetic vulnerable VMs, 504 real-world Amazon EC2 and 177 Digital Ocean instances, demonstrating its capacity to rediscover known privilege escalation exploits and identify new chains previously unreported. Specifically, the evaluation showed that ChainReactor successfully rediscovered the exploit chains in the Capture the Flag (CTF) machines and identified zero-day chains on 16 Amazon EC2 and 4 Digital Ocean VMs.

Track 7

Crypto V: Private Information Retrieval

Session Chair: Wouter Lueks

Salon IJK

VeriSimplePIR: Verifiability in SimplePIR at No Online Cost for Honest Servers

Leo de Castro, MIT; Keewoo Lee, Seoul National University

Available Media

We present VeriSimplePIR, a verifiable version of the state-of-the-art semi-honest SimplePIR protocol. VeriSimplePIR is a stateful verifiable PIR scheme guaranteeing that all queries are consistent with a fixed, well-formed database. It is the first efficient verifiable PIR scheme to not rely on an honest digest to ensure security; any digest, even one produced by a malicious server, is sufficient to commit to some database. This is due to our extractable verification procedure, which can extract the entire database from the consistency proof checked against each response.

Furthermore, VeriSimplePIR ensures this strong security guarantee without compromising the performance of SimplePIR. The online communication overhead is roughly 1.1-1.5x SimplePIR, and the online computation time on the server is essentially the same. We achieve this low overhead via a novel one-time preprocessing protocol that generates a reusable proof that can verify any number of subsequent query-response pairs as long as no malicious behavior is detected. As soon as the verification procedure rejects a response from the server, the offline phase must be rerun to compute a new proof. VeriSimplePIR represents an approach to maliciously secure cryptography that is highly optimized for honest parties while maintaining security even in the presence of malicious adversaries.

Batch PIR and Labeled PSI with Oblivious Ciphertext Compression

Alexander Bienstock, New York University; Sarvar Patel and Joon Young Seo, Google; Kevin Yeo, Google and Columbia University

Available Media

In this paper, we study two problems: oblivious compression and decompression of ciphertexts. In oblivious compression, a server holds a set of ciphertexts with a subset of encryptions of zeroes whose positions are only known to the client. The goal is for the server to effectively compress the ciphertexts obliviously, while preserving the non-zero plaintexts and without learning the plaintext values. For oblivious decompression, the client, instead, succinctly encodes a sequence of plaintexts such that the server may decode encryptions of all plaintexts value, but the zeroes may be replaced with arbitrary values. We present solutions to both problems that construct lossless compressions as small as only 5% more than the optimal minimum using only additive homomorphism. The crux of both algorithms involve embedding ciphertexts as random linear systems that are efficiently solvable.

Using our compression schemes, we obtain state-of-the-art schemes for batch private information retrieval (PIR) where a client wishes to privately retrieve multiple entries from a server-held database in one query. We show that our compression schemes may be used to reduce communication by up to 30% for batch PIR in both the single and two-server settings.

Additionally, we study labeled private set intersection (PSI) in the unbalanced setting where one party's set is significantly smaller than the other party's set and each entry has associated data. By utilizing our novel compression algorithm, we present a protocol with 65-88% reduction in communication with comparable computation compared to prior works.

Single Pass Client-Preprocessing Private Information Retrieval

Arthur Lazzaretti and Charalampos Papamanthou, Yale University

Available Media

Recently, many works have considered Private Information Retrieval (PIR) with client-preprocessing: In this model a client and a server jointly run a preprocessing phase, after which client queries run in time sublinear in the database size. However, the preprocessing phase is expensive—proportional to λ N, where λ is the security parameter (e.g., λ=128).

In this paper we propose SinglePass, the first PIR protocol that is concretely optimal with respect to client-preprocessing, requiring exactly a single linear pass over the database. Our approach yields a preprocessing speedup ranging from 45× to 100× and a query speedup of up to 20× when compared to previous state-of-the-art schemes (e.g., Checklist, USENIX SECURITY 2021, making preprocessing PIR more attractive for a myriad of use cases that are "session-based".

In addition to practical preprocessing, SinglePass features constant-time updates (additions/edits). Previously, the best known approach for handling updates in client-preprocessing PIR had complexity OlogN, while also adding a logN factor to the bandwidth. We implement our update algorithm and show concrete speedups of about 20× over previous state-of-the-art updatable schemes (e.g., Checklist).

YPIR: High-Throughput Single-Server PIR with Silent Preprocessing

Samir Jordan Menon, Blyss; David J. Wu, UT Austin

Available Media

We introduce YPIR, a single-server private information retrieval (PIR) protocol that achieves high throughput (up to 83% of the memory bandwidth of the machine) without any offline communication. For retrieving a 1-bit (or 1-byte) record from a 32 GB database, YPIR achieves 12.1 GB/s/core server throughput and requires 2.5 MB of total communication. On the same setup, the state-of-the-art SimplePIR protocol achieves a 12.5 GB/s/core server throughput, requires 1.5 MB total communication, but additionally requires downloading a 724 MB hint in an offline phase. YPIR leverages a new lightweight technique to remove the hint from high-throughput single-server PIR schemes with small overhead. We also show how to reduce the server preprocessing time in the SimplePIR family of protocols by a factor of 10–15×.

By removing the need for offline communication, YPIR significantly reduces the server-side costs for private auditing of Certificate Transparency logs. Compared to the best previous PIR-based approach, YPIR reduces the server-side costs by a factor of 8×. Note that to reduce communication costs, the previous approach assumed that updates to the Certificate Transparency log servers occurred in weekly batches. Since there is no offline communication in YPIR, our approach allows clients to always audit the most recent Certificate Transparency logs (e.g., updating once a day). Supporting daily updates using the prior scheme would cost 48× more than YPIR (based on current AWS compute costs).

11:45 am–1:15 pm

Lunch (on your own)

1:15 pm–2:15 pm

Track 1

User Studies VII: Policies and Best Practices III

Session Chair: Lucy Qin

Salon AB

Trust Me If You Can – How Usable Is Trusted Types In Practice?

Sebastian Roth, TU Wien; Lea Gröber, CISPA Helmholtz Center for Information Security; Philipp Baus, Saarland University; Katharina Krombholz and Ben Stock, CISPA Helmholtz Center for Information Security

Available Media

Many online services deal with sensitive information such as credit card data, making those applications a prime target for adversaries, e.g., through Cross-Site Scripting (XSS) attacks. Moreover, Web applications nowadays deploy their functionality via client-side code to lower the server's load, require fewer page reloads, and allow Web applications to work even if the connection is interrupted. Given this paradigm shift of increasing complexity on the browser side, client-side security issues such as client-side XSS are getting more prominent these days. A solution already deployed in server-side applications of major companies like Google is to use type-safe data, where potentially attacker-controlled string data can never be output with sanitization. The newly introduced Trusted Types API offers an analogous solution for client-side XSS. With Trusted Types, the browser enforces that no input can be passed to an execution sink without being sanitized first. Thus, a developer's only remaining task – in theory – is to create a proper sanitizer. This study aims to uncover roadblocks that occur during the deployment of the mechanism and strategies on how developers can circumvent those problems by conducting a semi-structured interview, including a coding task with 13 real-world Web developers. Our work also identifies key weaknesses in the design and documentation of Trusted Types, which we urge the standard- ization body to incorporate before the Trusted Types becomes a standard.

"I just hated it and I want my money back": Data-driven Understanding of Mobile VPN Service Switching Preferences in The Wild

Rohit Raj, Mridul Newar, and Mainack Mondal, Indian Institute of Technology, Kharagpur

Available Media

Virtual Private Networks (VPNs) are a crucial PrivacyEnhancing Technology (PET) leveraged by millions of users and catered by multiple VPN providers worldwide; thus, understanding the user preferences for the choice of VPN apps should be of importance and interest to the security community. To that end, prior studies looked into the usage, awareness and adoption of VPN users and the perceptions of providers. However, no study so far has looked into the user preferences and underlying reasons for switching among VPN providers and identified features that presumably enhance users' VPN experience. This work aims to bridge this gap and shed light on the underlying factors that drive existing users when they switch from one VPN to another. In this work, we analyzed over 1.3 million reviews from 20 leading VPN apps, identifying 1,305 explicit mentions and intents to switch. Our NLP-based analysis unveiled distinct clusters of factors motivating users to switch. An examination of 376 blogs from six popular VPN recommendation sites revealed biases in the content, and we found ignorance towards user preferences. We conclude by identifying the key implications of our work for different stakeholders. The data and code for this work is available at https://github.com/Mainack/switch-vpn-datacode-sec24.

I Experienced More than 10 DeFi Scams: On DeFi Users' Perception of Security Breaches and Countermeasures

Mingyi Liu, Georgia Institute of Technology; Jun Ho Huh, Samsung Research; HyungSeok Han, Jaehyuk Lee, Jihae Ahn, and Frank Li, Georgia Institute of Technology; Hyoungshick Kim, Sungkyunkwan University; Taesoo Kim, Georgia Institute of Technology

Available Media

Decentralized Finance (DeFi) offers a whole new investment experience and has quickly emerged as an enticing alternative to Centralized Finance (CeFi). Rapidly growing market size and active users, however, have also made DeFi a lucrative target for scams and hacks, with 1.95 billion USD lost in 2023. Unfortunately, no prior research thoroughly investigates DeFi users' security risk awareness levels and the adequacy of their risk mitigation strategies.

Based on a semi-structured interview study (N = 14) and a follow-up survey (N = 493), this paper investigates DeFi users' security perceptions and commonly adopted practices, and how those affected by previous scams or hacks (DeFi victims) respond and try to recover their losses. Our analysis shows that users often prefer DeFi over CeFi due to their decentralized nature and strong profitability. Despite being aware that DeFi, compared to CeFi, is prone to more severe attacks, users are willing to take those risks to explore new investment opportunities. Worryingly, most victims do not learn from previous experiences; unlike victims studied through traditional systems, DeFi victims tend to find new services, without revising their security practices, to recover their losses quickly. The abundance of various DeFi services and opportunities allows victims to continuously explore new financial opportunities, and this reality seems to cloud their security priorities. Indeed, our results indicate that DeFi users' strong financial motivations outweigh their security concerns – much like those who are addicted to gambling. Our observations about victims' post-incident behaviors suggest that stronger control in the form of industry regulations would be necessary to protect DeFi users from future breaches.

Towards Privacy and Security in Private Clouds: A Representative Survey on the Prevalence of Private Hosting and Administrator Characteristics

Lea Gröber, CISPA Helmholtz Center for Information Security and Saarland University; Simon Lenau and Rebecca Weil, CISPA Helmholtz Center for Information Security; Elena Groben, Saarland University; Michael Schilling and Katharina Krombholz, CISPA Helmholtz Center for Information Security

Available Media

Instead of relying on Software-as-a-Service solutions, some people self-host services from within their homes. In doing so they enhance their privacy but also assume responsibility for the security of their operations. However, little is currently known about how widespread private self-hosting is, which use cases are prominent, and what characteristics set self-hosters apart from the general population. In this work, we present two large-scale surveys: (1) We estimate the prevalence of private self-hosting in the U.S. across five use cases (communication, file storage, synchronized password managing, websites, and smart home) based on a representative survey on Prolific (n = 1505). (2) We run a follow-up survey on Prolific (n = 589) to contrast individual characteristics of identified self-hosters to people of the same demographics who do not show the behavior.

We estimate an upper bound of 8.4% private self-hosters in the U.S. population. Websites are the most common use case for self-hosting, predominately running on home servers. All other use cases were equally frequent. Although past research identified privacy as a leading motivation for private self-hosting, we find that self-hosters are not more privacy-sensitive than the general population. Instead, we find that IT administration skills, IT background, affinity for technology interaction, and "maker" self-identity positively correlate with self-hosting behavior.

Track 2

Wireless Security II: Sky and Space

Session Chair: Nils Aschenbruck

Salon CD

Wireless Signal Injection Attacks on VSAT Satellite Modems

Robin Bisping, ETH Zurich; Johannes Willbold, Ruhr University Bochum; Martin Strohmeier and Vincent Lenders, Cyber-Defence Campus, armasuisse

Available Media

This work considers the threat model of wireless signal injection attacks on Very Small Aperture Terminals (VSAT) satellite modems. In particular, we investigate the feasibility to inject malicious wireless signals from a transmitter on the ground in order to compromise and manipulate the control of close-by satellite terminals. Based on a case study with a widely used commercial modem device, we find that VSATs are not designed to withstand simple signal injection attacks. The modems assume that any received signal comes from a legitimate satellite. We show that an attacker equipped with a low-cost software-defined radio (SDR) can inject arbitrary IP traffic into the internal network of the terminal. We explore different attacks that aim to deny service, manipulate the modem's firmware, or gain a remote admin shell. Further, we quantify their probability of success depending on the wireless channel conditions and the placement of the attacker versus the angle of arrival of the signal at the antenna dish of the receiver.

Orbital Trust and Privacy: SoK on PKI and Location Privacy Challenges in Space Networks

David Koisser, Sanctuary; Richard Mitev, Technische Universität Darmstadt; Nikita Yadav, Indian Institute of Science, Bangalore; Franziska Vollmer and Ahmad-Reza Sadeghi, Technische Universität Darmstadt

Available Media

The dynamic evolution of the space sector, often referred to as "New Space," has led to increased commercialization and innovation. This transformation is characterized by a surge in satellite numbers, the emergence of small, cost-effective satellites like CubeSats, and the development of space networks. As satellite networks play an increasingly vital role in providing essential services and supporting various activities, ensuring their security is crucial, especially concerning trust relationships among satellites and the protection of satellite service users.

Satellite networks possess unique characteristics, such as orbital dynamics, delays, and limited bandwidth, posing challenges to trust and privacy. While prior research has explored various aspects of space network security, this paper systematically investigates two crucial yet unexplored dimensions: (i) The integrity of PKI components directly impacts the security and privacy of satellite communications and data transmission, with orbital delays and disruptions potentially hindering timely certificate revocation checks. (ii) Conversely, transmitting user signals to satellites requires careful consideration to prevent location tracking and unauthorized surveillance. By drawing on insights from terrestrial studies, we aim to provide a comprehensive understanding of these intertwined security aspects, identify research gaps, and stimulate further exploration to tackle these research challenges in the evolving domain of space network security.

RECORD: A RECeption-Only Region Determination Attack on LEO Satellite Users

Eric Jedermann, RPTU Kaiserslautern-Landau; Martin Strohmeier and Vincent Lenders, armasuisse; Jens Schmitt, RPTU Kaiserslautern-Landau

Available Media

Low Earth orbit (LEO) satellite communication has recently experienced a dramatic increase of usage in diverse application sectors. Naturally, the aspect of location privacy is becoming crucial, most notably in security or military applications. In this paper, we present a novel passive attack called RECORD, which is solely based on the reception of messages to LEO satellite users on the ground, threatening their location privacy. In particular, we show that by observing only the downlink of "wandering" communication satellites over wide beams can be exploited at scale from passive attackers situated on Earth to estimate the region in which users are located. We build our own distributed satellite reception platform to implement the RECORD attack. We analyze the accuracy and limiting factors of this new attack using real-world measurements from our own Iridium satellite communication. Our experimental results reveal that by observing only 2.3 hours of traffic, it is possible to narrow down the position of an Iridium user to an area below 11 km of radius (compared to the satellite beam size of 4700 km diameter). We conduct additional extensive simulative evaluations, which suggest that it is feasible to narrow down the unknown location of a user even further, for instance, to below 5 km radius when the observation period is increased to more than 16 hours. We finally discuss the transferability of RECORD to different LEO constellations and highlight possible countermeasures.

On a Collision Course: Unveiling Wireless Attacks to the Aircraft Traffic Collision Avoidance System (TCAS)

Giacomo Longo, DIBRIS, University of Genova; Martin Strohmeier, Cyber-Defence Campus, armasuisse S + T; Enrico Russo, DIBRIS, University of Genova; Alessio Merlo, CASD, School of Advanced Defense Studies; Vincent Lenders, Cyber-Defence Campus, armasuisse S + T

Available Media

Collision avoidance systems have been a safety net of last resort in aviation since their introduction in the 1980s. Through constantly refined safety procedures and hard lessons learned from mid-air collisions, the TCAS II Version 7.1 has become the global standard, significantly improving safety in a fast-growing field.

Despite this safety record, TCAS was not designed with security in mind, even in its newest versions. With the rise of software-defined radios, security researchers have shown many wireless technologies in aviation and critical infrastructures to be insecure against radio frequency (RF) attacks. However, while similar attacks have been postulated for TCAS with its built-in distance measurement, all attempts to execute them have failed so far.

In this paper, we present the first working RF attacks on TCAS. We demonstrate how to take full control over the collision avoidance displays and create so-called RA of arbitrary aircraft on collision course. We build the necessary tooling using commercial off-the-shelf hardware, creating sufficient conditions for the attacker to spoof colliding aircraft from a distance of up to 4.2 km.

We evaluate this and further attacks extensively on a live, real-world, certified aircraft test system and discuss potential countermeasures and mitigations that should be considered by aircraft and system manufacturers in the future.

Track 3

System Security IV: Multithreading

Session Chair: Fish Wang

Salon E

LR-Miner: Static Race Detection in OS Kernels by Mining Locking Rules

Tuo Li, Tsinghua University; Jia-Ju Bai and Gui-Dong Han, Beihang University; Shi-Min Hu, Tsinghua University

Available Media

Data race is one of the most common concurrency issues in OS kernels, and it can cause severe problems like system crashes and privilege escalation. Therefore, detecting kernel races is important and necessary. A critical step of kernel race detection is to identify locking rules that which variable should be protected by which lock. However, due to insufficient documents of kernel concurrency, it is challenging to identify accurate locking rules, causing existing approaches to produce many false results in kernel race detection.

In this paper, we design a new static analysis approach named LR-Miner, to effectively detect data races in OS kernels by mining locking rules from kernel code. LR-Miner consists of three key techniques: (1) a field-aware mining method that constructs and statistically analyzes the structure field relation between locks and accessed variables, to mine accurate locking rules from kernel code; (2) an alias-aware checking method to detect data races that violate the mined locking rules; (3) a pattern-based estimation strategy to estimate the security impact of the found races and identify harmful ones. We have evaluated LR-Miner on two popular OS kernels including Linux and FreeBSD, and it finds 306 real races with a false positive rate of 19.9%. Among these found races, 200 are estimated to be harmful, and 61 of them have been confirmed by kernel developers. 10 harmful races have been assigned with CVE IDs.

When Threads Meet Interrupts: Effective Static Detection of Interrupt-Based Deadlocks in Linux

Chengfeng Ye, Yuandao Cai, and Charles Zhang, The Hong Kong University of Science and Technology

Available Media

Deadlocking is an unresponsive state of software that arises when threads hold locks while trying to acquire other locks that are already held by other threads, resulting in a circular lock dependency. Interrupt-based deadlocks, a specific and prevalent type of deadlocks that occur within the OS kernel due to interrupt preemption, pose significant risks to system functionality, performance, and security. However, existing static analysis tools focus on resource-based deadlocks without characterizing the interrupt preemption. In this paper, we introduce Archerfish, the first static analysis approach for effectively identifying interrupt-based deadlocks in the large-scale Linux kernel. At its core, Archerfish utilizes an Interrupt-Aware Lock Graph (ILG) to capture both regular and interrupt-related lock dependencies, reducing the deadlock detection problem to graph cycle discovery and refinement. Furthermore, Archerfish incorporates four effective analysis components to construct ILG and refine the deadlock cycles, addressing three core challenges, including the extensive interrupt-involving concurrency space, identifying potential interrupt handlers, and validating the feasibility of deadlock cycles. Our experimental results show that Archerfish can precisely analyze the Linux kernel (19.8 MLoC) in approximately one hour. At the time of writing, we have discovered 76 previously unknown deadlocks, with 53 bugs confirmed, 46 bugs already fixed by the Linux community, and 2 CVE IDs assigned. Notably, those found deadlocks are long-latent, hiding for an average of 9.9 years.

GhostRace: Exploiting and Mitigating Speculative Race Conditions

Hany Ragab, Vrije Universiteit Amsterdam; Andrea Mambretti and Anil Kurmus, IBM Research Europe - Zurich; Cristiano Giuffrida, Vrije Universiteit Amsterdam

Available Media

Race conditions arise when multiple threads attempt to access a shared resource without proper synchronization, often leading to vulnerabilities such as concurrent use-after-free. To mitigate their occurrence, operating systems rely on synchronization primitives such as mutexes, spinlocks, etc.

In this paper, we present GhostRace, the first security analysis of these primitives on speculatively executed code paths. Our key finding is that all the common synchronization primitives can be microarchitecturally bypassed on speculative paths, turning all architecturally race-free critical regions into Speculative Race Conditions (SRCs). To study the severity of SRCs, we focus on Speculative Concurrent Use-After-Free (SCUAF) and uncover 1,283 potentially exploitable gadgets in the Linux kernel. Moreover, we demonstrate that SCUAF information disclosure attacks against the kernel are not only practical, but that their reliability can closely match that of traditional Spectre attacks, with our proof of concept leaking kernel memory at 12 KB/s. Crucially, we develop a new technique to create an unbounded race window, accommodating an arbitrary number of SCUAF invocations required by an end-to-end attack in a single race window. To address the new attack surface, we also propose a generic SRC mitigation to harden all the affected synchronization primitives on Linux. Our mitigation requires minimal kernel changes and incurs only ≈5% geomean performance overhead on LMBench.

"There's security, and then there's just being ridiculous." – Linus Torvalds, on Speculative Race Conditions

CARDSHARK: Understanding and Stablizing Linux Kernel Concurrency Bugs Against the Odds

Tianshuo Han, Xiaorui Gong, and Jian Liu, {CAS-KLONAT, BKLONSPT}, Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences

Available Media

Concurrency bugs in the Linux kernel are notoriously difficult to reproduce and debug due to their non-deterministic nature. While they bring constant headaches to Linux kernel developers, the reasons behind the non-determinism and how to improve the efficiency in triggering concurrency bugs to ease the debugging process still need to be studied.

This work aims to fill the gap. We comprehensively study the concurrency bug stability problem in the Linux kernel, dissect the factors behind the non-determinism, and systematize the insights into a model to explain the non-deterministic nature of concurrency bugs.

Based on insights derived from the model, we identify an under-studied factor, named misalignment, which plays a vital role in triggering concurrency bugs. By controlling this factor, we significantly reduce the randomness in the concurrency bug-triggering process.

Inspired by this insight, we design a novel technique, named CARDSHARK, that can significantly improve the efficiency in triggering concurrency bugs when kernel instrumentation is possible. A variant of CARDSHARK, named BLINDSHARK, enables developers to improve efficiency in triggering concurrency bugs without knowing their root causes, making the use of CARDSHARK practical.

In our evaluation of 12 real-world concurrency bugs, CARDSHARK and BLINDSHARK significantly reduce the needed time and the number of attempts to trigger concurrency bugs in the Linux kernel. Notably, CARDSHARK can deterministically trigger 10 out of the 12 concurrency bugs with a single attempt. Our evaluation shows that CARDSHARK significantly outperforms existing works in stabilizing concurrency bugs, making it a potential great help to developers in analyzing and fixing concurrency bugs.

Track 4

Blockchain II

Session Chair: Fan Zhang

Salon F

zkCross: A Novel Architecture for Cross-Chain Privacy-Preserving Auditing

Yihao Guo, Minghui Xu, Xiuzhen Cheng, and Dongxiao Yu, Shandong University; Wangjie Qiu, Beihang University; Gang Qu, University of Maryland; Weibing Wang and Mingming Song, Cloud Inspur Information Technology Co., Ltd.

Available Media

One of the key areas of focus in blockchain research is how to realize privacy-preserving auditing without sacrificing the system's security and trustworthiness. However, simultaneously achieving auditing and privacy protection, two seemingly contradictory objectives, is challenging because an auditing system would require transparency and accountability which might create privacy and security vulnerabilities. This becomes worse in cross-chain scenarios, where the information silos from multiple chains further complicate the problem. In this paper, we identify three important challenges in cross-chain privacy-preserving auditing, namely Cross-chain Linkability Exposure (CLE), Incompatibility of Privacy and Auditing (IPA), and Full Auditing Inefficiency (FAI). To overcome these challenges, we propose zkCross, which is a novel two-layer cross-chain architecture equipped with three cross-chain protocols to achieve privacy-preserving cross-chain auditing. Among these three protocols, two are privacy-preserving cross-chain protocols for transfer and exchange, respectively; the third one is an efficient cross-chain auditing protocol. These protocols are built on solid cross-chain schemes to guarantee privacy protection and audit efficiency. We implement zkCross on both local and cloud servers and perform comprehensive tests to validate that zkCross is well-suited for processing large-scale privacy-preserving auditing tasks. We evaluate the performance of the proposed protocols in terms of run time, latency, throughput, gas consumption, audit time, and proof size to demonstrate their practicality.

Pixel+ and Pixel++: Compact and Efficient Forward-Secure Multi-Signatures for PoS Blockchain Consensus

Jianghong Wei, State Key Laboratory of Integrated Service Networks (ISN), Xidian University, and State Key Laboratory of Mathematical Engineering and Advanced Computing; Guohua Tian, State Key Laboratory of Integrated Service Networks (ISN), Xidian University; Ding Wang, College of Cyber Science, Nankai University; Fuchun Guo and Willy Susilo, School of Computing and Information Technology, University of Wollongong; Xiaofeng Chen, State Key Laboratory of Integrated Service Networks (ISN), Xidian University

Available Media

Multi-signature schemes have attracted considerable attention in recent years due to their popular applications in PoS blockchains. However, the use of general multi-signature schemes poses a critical threat to the security of PoS blockchains once signing keys get corrupted. That is, after an adversary obtains enough signing keys, it can break the immutable nature of PoS blockchains by forking the chain and modifying the history from some point in the past. Forward-secure multi-signature (FS-MS) schemes can overcome this issue by periodically updating signing keys. The only FS-MS construction currently available is Drijvers et al's Pixel, which builds on pairing groups and only achieves forward security at the time period level.

In this work, we present new FS-MS constructions that either are free from pairing or capture forward security at the individual message level (i.e., fine-grained forward security). Our first construction Pixel+ works for a maximum number of time periods T. Pixel+ signatures consist of only one group element, and can be verified using two exponentiations. It is the first FS-MS from RSA assumption, and has 3.5x and 22.8x faster signing and verification than Pixel, respectively. Our second FS-MS construction Pixel++ is a pairing-based one. It immediately revokes the signing key's capacity of re-signing the message after creating a signature on this message, rather than at the end of the current time period. Thus, it provides more practical forward security than Pixel. On the other hand, Pixel++ is almost as efficient as Pixel in terms of signing and verification. Both Pixel+ and Pixel++ allow for non-interactive aggregation of signatures from independent signers and are proven to be secure in the random oracle model. In addition, they also support the aggregation of public keys, significantly reducing the storage overhead on PoS blockchains.

We demonstrate how to integrate Pixel+ and Pixel++ into PoS blockchains. As a proof-of-concept, we provide implementations of Pixel+ and Pixel++, and conduct several representative experiments to show that Pixel+ and Pixel++ have good concrete efficiency and are practical.

Max Attestation Matters: Making Honest Parties Lose Their Incentives in Ethereum PoS

Mingfei Zhang, Shandong University; Rujia Li and Sisi Duan, Tsinghua University

Available Media

We present staircase attack, the first attack on the incentive mechanism of the Proof-of-Stake (PoS) protocol used in Ethereum 2.0 beacon chain. Our attack targets the penalty of the incentive mechanism that penalizes inactive participation. Our attack can make honest validators suffer from penalties, even if they strictly follow the specification of the protocol. We show both theoretically and experimentally that if the adversary controls 29.6% stake in a moderate-size system, the attack can be launched continuously, so eventually all honest validators will lose their incentives. In contrast, the adversarial validators can still receive incentives, and the stake owned by the adversary can eventually exceed the one-third threshold (system assumption), posing a threat to the security properties of the system.

In practice, the attack feasibility is directly related to two parameters: the number of validators and the parameter MAX_ATTESTATION, the maximum number of attestations (i.e., votes) that can be included in each block. We further modify our attack such that, with the current system setup (900,000 validators and MAX_ATTESTATION=128), our attack can be launched continuously with a probability of 80.25%. As a result, the incentives any honest validator receives are only 28.9% of its fair share.

Sprints: Intermittent Blockchain PoW Mining

Michael Mirkin, Technion; Lulu Zhou, Yale University; Ittay Eyal, Technion; Fan Zhang, Yale University

Available Media

Cryptocurrencies and decentralized platforms have been rapidly gaining traction since Nakamoto's discovery of Bitcoin's blockchain protocol. Prominent systems use Proof of Work (PoW) to achieve unprecedented security for digital assets. However, the significant carbon footprint due to the manufacturing and operation of PoW mining hardware is leading policymakers to consider stark measures against them and various systems to explore alternatives. But these alternatives imply stepping away from key security aspects of PoW.

We present Sprints, a blockchain protocol that achieves almost the same security guarantees as PoW blockchains, but with an order-of-magnitude lower carbon footprint while increasing the number of mining rigs by a factor 1.27x. Our conservative estimate of environmental footprint uses common metrics, taking into account both power and hardware. To achieve this reduction, Sprints forces miners to mine intermittently. It interleaves Proof of Delay (PoD, e.g., using a Verifiable Delay Function) and PoW, where only the latter bears a significant resource expenditure. We prove that in Sprints the attacker's success probability is the same as that of legacy PoW. To evaluate practical performance, we analyze the effect of shortened PoW duration, showing a minor reduction in resilience (49% instead of 50%). We confirm the results with a full implementation using 100 patched Bitcoin clients in an emulated network.

Track 5

Autonomous and Automatic Systems

Session Chair: Ning Zhang

Salon G

A First Physical-World Trajectory Prediction Attack via LiDAR-induced Deceptions in Autonomous Driving

Yang Lou, City University of Hong Kong; Yi Zhu, State University of New York at Buffalo; Qun Song, Delft University of Technology; Rui Tan, Nanyang Technological University; Chunming Qiao, State University of New York at Buffalo; Wei-Bin Lee, Information Security Center, Hon Hai Research Institute, and Feng Chia University; Jianping Wang, City University of Hong Kong

Available Media

Trajectory prediction forecasts nearby agents' moves based on their historical trajectories. Accurate trajectory prediction (or prediction in short) is crucial for autonomous vehicles (AVs). Existing attacks compromise the prediction model of a victim AV by directly manipulating the historical trajectory of an attacker AV, which has limited real-world applicability. This paper, for the first time, explores an indirect attack approach that induces prediction errors via attacks against the perception module of a victim AV. Although it has been shown that physically realizable attacks against LiDAR-based perception are possible by placing a few objects at strategic locations, it is still an open challenge to find an object location from the vast search space in order to launch effective attacks against prediction under varying victim AV velocities.

Through analysis, we observe that a prediction model is prone to an attack focusing on a single point in the scene. Consequently, we propose a novel two-stage attack framework to realize the single-point attack. The first stage of prediction-side attack efficiently identifies, guided by the distribution of detection results under object-based attacks against perception, the state perturbations for the prediction model that are effective and velocity-insensitive. In the second stage of location matching, we match the feasible object locations with the found state perturbations. Our evaluation using a public autonomous driving dataset shows that our attack causes a collision rate of up to 63% and various hazardous responses of the victim AV. The effectiveness of our attack is also demonstrated on a real testbed car. To the best of our knowledge, this study is the first security analysis spanning from LiDAR-based perception to prediction in autonomous driving, leading to a realistic attack on prediction. To counteract the proposed attack, potential defenses are discussed.

On Data Fabrication in Collaborative Vehicular Perception: Attacks and Countermeasures

Qingzhao Zhang, Shuowei Jin, Ruiyang Zhu, Jiachen Sun, and Xumiao Zhang, University of Michigan; Qi Alfred Chen, University of California, Irvine; Z. Morley Mao, University of Michigan and Google

Available Media

Collaborative perception, which greatly enhances the sensing capability of connected and autonomous vehicles (CAVs) by incorporating data from external resources, also brings forth potential security risks. CAVs' driving decisions rely on remote untrusted data, making them susceptible to attacks carried out by malicious participants in the collaborative perception system. However, security analysis and countermeasures for such threats are absent. To understand the impact of the vulnerability, we break the ground by proposing various real-time data fabrication attacks in which the attacker delivers crafted malicious data to victims in order to perturb their perception results, leading to hard brakes or increased collision risks. Our attacks demonstrate a high success rate of over 86% on high-fidelity simulated scenarios and are realizable in real-world experiments. To mitigate the vulnerability, we present a systematic anomaly detection approach that enables benign vehicles to jointly reveal malicious fabrication. It detects 91.5% of attacks with a false positive rate of 3% in simulated scenarios and significantly mitigates attack impacts in real-world scenarios.

VOGUES: Validation of Object Guise using Estimated Components

Raymond Muller, Purdue University; Yanmao Man and Ming Li, University of Arizona; Ryan Gerdes, Virginia Tech; Jonathan Petit, Qualcomm; Z. Berkay Celik, Purdue University

Available Media

Object Detection (OD) and Object Tracking (OT) are an important part of autonomous systems (AS), enabling them to perceive and reason about their surroundings. While both OD and OT have been successfully attacked, defenses only exist for OD. In this paper, we introduce VOGUES, which combines perception algorithms in AS with logical reasoning about object components to model human perception. VOGUES leverages pose estimation algorithms to reconstruct the constituent components of objects within a scene, which are then mapped via bipartite matching against OD/OT predictions to detect OT attacks. VOGUES's component reconstruction process is designed such that attacks against OD/OT will not implicitly affect its performance. To prevent adaptive attackers from simultaneously evading OD/OT and component reconstruction, VOGUES integrates an LSTM validator to ensure that the component behavior of objects remains consistent over time. Evaluations in both the physical domain and digital domain yield an average attack detection rate of 96.78% and an FPR of 3.29%. Meanwhile, adaptive attacks against VOGUES require perturbations 30x stronger than previously established in OT attack works, significantly increasing the attack difficulty and reducing their practicality.

Adversary is on the Road: Attacks on Visual SLAM using Unnoticeable Adversarial Patch

Baodong Chen, The Ohio State University; Wei Wang and Pascal Sikorski, Saint Louis University; Ting Zhu, The Ohio State University

Available Media

Visual Simultaneous Localization and Mapping (vSLAM) plays a pivotal role in numerous emerging applications, including autonomous driving and robotic navigation. It mainly utilizes consecutive frames captured by image sensors to conduct localization and build high-definition maps. However, existing approaches mainly focus on building reliable and accurate vSLAM systems, while little work has been done to investigate the vulnerability of existing vSLAM systems.

To fill the gap, we introduce an AoR (Adversary is on the Road) attack, which can effectively alter localization and mapping results of widely used vSLAM systems without being detected by the legitimate user. To do this, we conducted in-depth investigations on existing vSLAM systems and found that these systems are very sensitive to environmental texture changes. Building upon this insight, we design a novel adversarial patch generation mechanism that can generate unnoticeable adversarial patches to attack existing vSLAM systems. We extensively evaluate the effectiveness of the AoR attack on industry-level vehicles, robotic platforms, and four well-known open-source datasets. The evaluation results show that the AoR attack can effectively attack existing vSLAM systems and introduce extremely high localization errors (up to 713%). To mitigate this attack, we also designed an innovative defense module to simultaneously detect abnormal environmental texture distributions and support reliable vSLAM. Our defense module is lightweight and has the potential to be applied to existing vSLAM systems.

Track 6

Crypto VI: Security Analysis

Session Chair: Sebastian Schinzel

Salon H

Cryptographic Analysis of Delta Chat

Yuanming Song, Lenka Mareková, and Kenneth G. Paterson, ETH Zurich

Available Media

We analyse the cryptographic protocols underlying Delta Chat, a decentralised messaging application which uses e-mail infrastructure for message delivery. It provides end-to-end encryption by implementing the Autocrypt standard and the SecureJoin protocols, both making use of the OpenPGP standard. Delta Chat's adoption by categories of high-risk users such as journalists and activists, but also more generally users in regions affected by Internet censorship, makes it a target for powerful adversaries. Yet, the security of its protocols has not been studied to date. We describe five new attacks on Delta Chat in its own threat model, exploiting cross-protocol interactions between its implementation of SecureJoin and Autocrypt, as well as bugs in rPGP, its OpenPGP library. The findings have been disclosed to the Delta Chat team, who implemented fixes.

ENG25519: Faster TLS 1.3 handshake using optimized X25519 and Ed25519

Jipeng Zhang, CCST, Nanjing University of Aeronautics and Astronautics; Junhao Huang, Guangdong Provincial Key Laboratory IRADS, BNU-HKBU United International College; Hong Kong Baptist University; Lirui Zhao, CCST, Nanjing University of Aeronautics and Astronautics; Donglong Chen, Guangdong Provincial Key Laboratory IRADS, BNU-HKBU United International College; Çetin Kaya Koç, CCST, Nanjing University of Aeronautics and Astronautics; Iğdır University; University of California Santa Barbara

Distinguished Paper Award Winner

Available Media

The IETF released RFC 8446 in 2018 as the new TLS 1.3 standard, which recommends using X25519 for key exchange and Ed25519 for identity verification. These computations are the most time-consuming steps in the TLS handshake. Intel introduced AVX-512 in 2013 as an extension of AVX2, and in 2018, AVX-512IFMA, a submodule of AVX-512 to further support 52-bit (integer) multipliers, was implemented on Cannon Lake CPUs.

This paper first revisits various optimization strategies for ECC and presents a more performant X25519/Ed25519 implementation using the AVX-512IFMA instructions. These optimization strategies cover all levels of ECC arithmetic, including finite field arithmetic, point arithmetic, and scalar multiplication computations. Furthermore, we formally verify our finite field implementation to ensure its correctness and robustness.

In addition to the cryptographic implementation, we further explore the deployment of our optimized X25519/Ed25519 library in the TLS protocol layer and the TLS ecosystem. To this end, we design and implement an OpenSSL ENGINE called ENG25519, which propagates the performance benefits of our ECC library to the TLS protocol layer and the TLS ecosystem. The TLS applications can benefit directly from the underlying cryptographic improvements through ENG25519 without necessitating any changes to the source code of OpenSSL and applications. Moreover, we discover that the cold-start issue of vector units degrades the performance of cryptography in TLS protocol, and we develop an auxiliary thread with a heuristic warm-up scheme to mitigate this issue.

Finally, this paper reports a successful integration of the ENG25519 into an unmodified DNS over TLS (DoT) server called unbound, which further highlights the practicality of the ENG25519. We also report benchmarks of TLS 1.3 handshake and DoT query, achieving a speedup of 25% to 35% for TLS 1.3 handshakes per second and an improvement of 24% to 41% for the peak server throughput of DoT queries.

Formal Security Analysis of Widevine through the W3C EME Standard

Stéphanie Delaune and Joseph Lallemand, Univ Rennes, CNRS, IRISA, France; Gwendal Patat, Fraunhofer SIT | ATHENE, Germany; Florian Roudot and Mohamed Sabt, Univ Rennes, CNRS, IRISA, France

Available Media

Streaming services such as Netﬂix, Amazon Prime Video, or Disney+ rely on the widespread EME standard to deliver their content to end users on all major web browsers. While providing an abstraction layer to the underlying DRM protocols of each device, the security of this API has never been formally studied. In this paper, we provide the ﬁrst formal analysis of Widevine, the most deployed DRM instantiating EME.

We deﬁne security goals for EME, focusing on media protection and usage control. Then, relying on the TAMARIN prover, we conduct a detailed security analysis of these goals on some Widevine EME implementations, reverse-engineered by us for this study. Our investigation highlights a vulnerability that could allow for unlimited media consumption. Additionally, we present a patched protocol that is suitable for both mobile and desktop platforms, and that we formally proved secure using TAMARIN.

Length Leakage in Oblivious Data Access Mechanisms

Grace Jia, Yale University; Rachit Agarwal, Cornell University; Anurag Khandelwal, Yale University

Available Media

This paper explores the problem of preventing length leakage in oblivious data access mechanisms with passive persistent adversaries. We show that designing mechanisms that prevent both length leakage and access pattern leakage requires navigating a three-way tradeoff between storage footprint, bandwidth footprint, and the information leaked to the adversary. We establish powerful lower bounds on achievable storage and bandwidth footprints for a variety of leakage profiles, and present constructions that perfectly or near-perfectly match the lower bounds.

Track 7

Crypto VII: Private Set Operations

Session Chair: Daniel S. Roche

Salon IJK

Unbalanced Circuit-PSI from Oblivious Key-Value Retrieval

Meng Hao, Nanyang Technological University; Weiran Liu and Liqiang Peng, Alibaba Group; Hongwei Li, Peng Cheng Laboratory; Cong Zhang, Institute for Advanced Study, BNRist, Tsinghua University; Hanxiao Chen and Tianwei Zhang, Nanyang Technological University

Available Media

Circuit-based Private Set Intersection (circuit-PSI) empowers two parties, a client and a server, each with input sets X and Y, to securely compute a function f on the intersection X∩Y while preserving the confidentiality of X∩Y from both parties. Despite the recent proposals of computationally efficient circuit-PSI protocols, they primarily focus on the balanced scenario where |X| is similar to |Y|. However, in many practical situations, a circuit-PSI protocol may be applied in an unbalanced context, where |X| is significantly smaller than |Y|. Directly applying existing protocols to this scenario poses notable efficiency challenges due to the communication complexity of these protocols scaling at least linearly with the size of the larger set, i.e., max(|X|,|Y|).

In this work, we put forth efficient constructions for unbalanced circuit-PSI, demonstrating sublinear communication complexity in the size of the larger set. Our key insight lies in formalizing unbalanced circuit-PSI as the process of obliviously retrieving values corresponding to keys from a set of key-value pairs. To achieve this, we propose a new functionality named Oblivious Key-Value Retrieval (OKVR) and design the OKVR protocol based on a new notion termed sparse Oblivious Key-Value Store (sparse OKVS). We conduct comprehensive experiments and the results showcase substantial improvements over the state-of-the-art circuit-PSI schemes, i.e., 1.84∼48.86x communication improvement and 1.50∼39.81x faster computation. Compared to a very recent unbalanced circuit-PSI work, our constructions outperform them by 1.18∼15.99x and 1.22∼10.44x in communication and computation overhead, respectively, depending on set sizes and network environments.

PEPSI: Practically Efficient Private Set Intersection in the Unbalanced Setting

Rasoul Akhavan Mahdavi, Nils Lukas, Faezeh Ebrahimianghazani, and Thomas Humphries, University of Waterloo; Bailey Kacsmar, University of Alberta; John Premkumar and Xinda Li, University of Waterloo; Simon Oya, University of British Columbia; Ehsan Amjadian, University of Waterloo and Royal Bank of Canada; Florian Kerschbaum, University of Waterloo

Available Media

Two parties with private data sets can find shared elements using a Private Set Intersection (PSI) protocol without revealing any information beyond the intersection. Circuit PSI protocols privately compute an arbitrary function of the intersection - such as its cardinality, and are often employed in an unbalanced setting where one party has more data than the other. Existing protocols are either computationally inefficient or require extensive server-client communication on the order of the larger set. We introduce Practically Efficient PSI or PEPSI, a non-interactive solution where only the client sends its encrypted data. PEPSI can process an intersection of 1024 client items with a million server items in under a second, using less than 5 MB of communication. Our work is over 4 orders of magnitude faster than an existing non-interactive circuit PSI protocol and requires only 10% of the communication. It is also up to 20 times faster than the work of Ion et al., which computes a limited set of functions and has communication costs proportional to the larger set. Our work is the first to demonstrate that non-interactive circuit PSI can be practically applied in an unbalanced setting.

Scalable Private Set Union, with Stronger Security

Yanxue Jia, Purdue University; Shi-Feng Sun, Shanghai Jiao Tong University; Hong-Sheng Zhou, Virginia Commonwealth University; Dawu Gu, Shanghai Jiao Tong University

Available Media

Private Set Union (PSU) protocol allows parties, each holding an input set, to jointly compute the union of the sets without revealing anything else. In the literature, scalable PSU protocols follow the split-execute-assemble'' paradigm (Kolesnikov et al., ASIACRYPT 2019); in addition, those fast protocols often use Oblivious Transfer as building blocks. Kolesnikov et al.~(ASIACRYPT 2019) and Jia et al.~(USENIX Security 2022), pointed out that certain security issues can be introduced in thesplit-execute-assemble'' paradigm. In this work, surprisingly, we observe that the typical way of invoking Oblivious Transfer also causes unnecessary leakage, and only the PSU protocols based on additively homomorphic encryption (AHE) can avoid the leakage. However, the AHE-based PSU protocols are far from being practical.

To bridge the gap, we also design a new PSU protocol that can avoid the unnecessary leakage. Unlike the AHE-based PSU protocols, our new construction only relies on symmetric-key operations other than base OTs, thereby being much more scalable. The experimental results demonstrate that our protocol can obtain at least 873.74 x speedup over the best-performing AHE-based scheme. Moreover, our performance is comparable to that of the state-of-the-art PSU protocol (Chen et al., USENIX Security 2023), which also suffers from the unnecessary leakage.

O-Ring and K-Star: Efficient Multi-party Private Set Intersection

Mingli Wu, The University of Hong Kong; Tsz Hon Yuen, Monash University; Kwan Yin Chan, The University of Hong Kong

Available Media

Multi-party private set intersection (mPSI) securely enables multiple parties to know the intersection of their sets without disclosing anything else. Many mPSI protocols are not efficient in practice. In this paper, we propose two efficient mPSI protocols that are secure against an arbitrary number of colluding parties. In the protocol O-Ring, we take advantage of the ring network topology such that the communication costs of the party with the largest workload can be cheaper than other mPSI protocols with a star topology. In the protocol K-Star, we take advantage of the star topology to support better concurrency such that the protocol can run fast. K-Star is suitable for applications with a powerful centralized server. Different from KMPRT (CCS'17) and CDGOSS (CCS'21) that rely on Oblivious Programmable PRF primitive, we simply utilize the cheaper Oblivious PRF (OPRF) and a data structure Oblivious Key-value Store (OKVS). We further propose two fine-grained optimizations for OKVS and OPRF in multi-party cases to improve runtime performance.

After extensive experiments, we demonstrate that both protocols run the fastest and achieve the lowest total communication costs compared with the state-of-the-art counterparts in most settings. Specifically, O-Ring/K-Star is respectively 1.6 × ∼ 48.3 × 1.6×∼48.3× and 4.0 × ∼ 39.8 × 4.0×∼39.8× (except one setting) cheaper than KMPRT (CCS'17) and CDGOSS (CCS'21) in the total communication costs. For the total running time, K-Star can be respectively 1.4 × ∼ 9.0 × 1.4×∼9.0× and 1.0 × ∼ 15.3 × 1.0×∼15.3× as fast as them in the LAN setting.

2:15 pm–2:45 pm

Coffee and Tea Break

Grand Ballroom Foyer

2:45 pm–3:45 pm

Track 1

Social Issues IV

Session Chair: Habiba Farrukh

Salon AB

Being Transparent is Merely the Beginning: Enforcing Purpose Limitation with Polynomial Approximation

Shuofeng Liu and Zihan Wang, The University of Queensland; Minhui Xue, CSIRO's Data61; Long Wang and Yuanchao Zhang, Information Security Department, Ant Finance; Guangdong Bai, The University of Queensland

Available Media

Obtaining the authorization of users (i.e., data owners) prior to data collection has become commonplace for online service providers (i.e., data processors), in light of the stringent data regulations around the world. However, it remains a challenge to uphold the principle of purpose limitation, which mandates that collected data should only be processed for the purpose that the data owner has originally authorized. In this work, we advocate algorithm specificity, as a means to enforce the purpose limitation principle. We propose AlgoSpec, which obscures data to restrict its usability solely to an authorized algorithm or algorithm group. AlgoSpec exploits the nature of polynomial approximation that given the input data and the highest order, any algorithm can be approximated with a unique polynomial. It converts the original authorized algorithm (or a part of it) into a polynomial and then creates a list of alternatives to the original data. To assess the efficacy and efficiency of AlgoSpec, we apply it to the entropy method and Naive Bayes classification under datasets of different magnitudes from 10^2 to 10^6. AlgoSpec significantly outperforms cryptographic solutions such as fully homomorphic encryption (FHE) in efficiency. On accuracy, it achieves a negligible Mean Squared Error (MSE) of 0.289 in the entropy method against computation over plaintext data, and identical accuracy (92.11%) and similar F1 score (87.67%) in the Naive Bayes classification.

DVSorder: Ballot Randomization Flaws Threaten Voter Privacy

Braden L. Crimmins and Dhanya Y. Narayanan, University of Michigan; Drew Springall, Auburn University; J. Alex Halderman, University of Michigan

Distinguished Paper Award Winner

Available Media

A trend towards publishing ballot-by-ballot election results has created new risks to voter privacy due to inadequate protections by election technology. These risks are manifested by a vulnerability we discovered in precinct-based ballot scanners made by Dominion Voting Systems, which are used in parts of 21 states and Canada. In a variety of scenarios, the flaw—which we call DVSorder—would allow attackers to link individuals with their votes and compromise ballot secrecy. The root cause is that the scanners assign pseudorandom ballot identifiers using a linear congruential generator, an approach known since the 1970s to be insecure. Dominion attempted to obfuscate the generator's output, but we show that it can be broken using only pen and paper to reveal the order in which all ballots were cast. Unlike past ballot randomization flaws, which typically required insider access to exploit or access to proprietary software to discover, DVSorder can be discovered and exploited using only public information.

In addition, the election sector's response to our findings provides a case study highlighting gaps in regulations and vulnerability management within this area of critical infrastructure. Although Dominion released a software update in response to DVSorder, some localities have continued to publish vulnerable data due to inadequate information sharing and mitigation planning, and at least one state has deferred addressing the flaw until after the 2024 presidential election, more than two years following our disclosure.

Navigating the Privacy Compliance Maze: Understanding Risks with Privacy-Configurable Mobile SDKs

Yifan Zhang, Indiana University Bloomington; Zhaojie Hu and Xueqiang Wang, University of Central Florida; Yuhui Hong, Indiana University Bloomington; Yuhong Nan, Sun Yat-sen University; XiaoFeng Wang, Indiana University Bloomington; Jiatao Cheng, Sun Yat-sen University; Luyi Xing, Indiana University Bloomington

Available Media

The rise of privacy laws like GDPR and CCPA has made privacy compliance a requirement for mobile apps. Yet, achieving it is difficult due to the apps' use of third-party SDKs with opaque data practices. Recently, to assist apps in complying with privacy laws, many leading third-party SDKs have started providing privacy APIs for configuring the SDK's data practices. Nevertheless, the extent to which such a paradigm, referred to as privacy-configurable SDKs (or PICO SDKs), truly enhances app privacy compliance remains unclear to the community.

This question can only be answered through a systematic measurement study, which is nontrivial and requires in-depth analysis of the implementation of privacy APIs in PICO SDKs, as well as the way they are utilized, sometimes through a "wrapper" SDK that encapsulates other SDKs. To address this challenge, we developed PICOSCAN, a privacy risk analysis framework targeting Android, one of the most common mobile platforms. PICOSCAN automatically analyzes the code of both apps and SDKs to detect practices that potentially invade user privacy. Applying PICOSCAN to 65 most popular PICO SDKs and over 48,000 Google Play apps, we uncovered significant privacy risks in today's Android ecosystem. A large number of them fail to correctly utilize privacy APIs as prescribed, and even when these APIs are used, they often do not align with user privacy preferences. Moreover, our study reveals that many wrapper SDKs do not accurately convey privacy configurations to the SDKs they encapsulate, resulting in compliance risks. Our findings expose systematic failures in the design, implementation, and usage of PICO SDKs, highlighting the urgent need for more effective solutions to enhance the privacy assurance of Android apps. We will open-source the framework and make the data produced by this study publicly available.

Enabling Developers, Protecting Users: Investigating Harassment and Safety in VR

Abhinaya S.B., Aafaq Sabir, and Anupam Das, North Carolina State University

Available Media

Virtual Reality (VR) has witnessed a rising issue of harassment, prompting the integration of safety controls like muting and blocking in VR applications. However, the lack of standardized safety measures across VR applications hinders their universal effectiveness, especially across contexts like socializing, gaming, and streaming. While prior research has studied safety controls in social VR applications, our user study (n = 27) takes a multi-perspective approach, examining both users' perceptions of safety control usability and effectiveness as well as the challenges that developers face in designing and deploying VR safety controls. We identify challenges VR users face while employing safety controls, such as finding users in crowded virtual spaces to block them. VR users also find controls ineffective in addressing harassment; for instance, they fail to eliminate the harassers' presence from the environment. Further, VR users find the current methods of submitting evidence for reports time-consuming and cumbersome. Improvements desired by users include live moderation and behavior tracking across VR apps; however, developers cite technological, financial, and legal obstacles to implementing such solutions, often due to a lack of awareness and high development costs. We emphasize the importance of establishing technical and legal guidelines to enhance user safety in virtual environments.

Track 2

IoT and CPS

Session Chair: Albert Levi

Salon CD

Demystifying the Security Implications in IoT Device Rental Services

Yi He and Yunchao Guan, Tsinghua University; Ruoyu Lun, China National Digital Switching System Engineering and Technological Research Center; Shangru Song and Zhihao Guo, Tsinghua University; Jianwei Zhuge and Jianjun Chen, Tsinghua University and Zhongguancun Laboratory; Qiang Wei and Zehui Wu, China National Digital Switching System Engineering and Technological Research Center; Miao Yu and Hetian Shi, Tsinghua University; Qi Li, Tsinghua University and Zhongguancun Laboratory

Available Media

Nowadays, unattended device rental services with cellular IoT controllers, such as e-scooters and EV chargers, are widely deployed in public areas around the world, offering convenient access to users via mobile apps.While differing from traditional smart homes in functionality and implementation, the security of these devices remains largely unexplored.In this work, we conduct a systematic study to uncover security implications in IoT device rental services.By investigating 17 physical devices and 92 IoT apps, we identify multiple design and implementation flaws across a wide range of products, which can lead to severe security consequences, such as forcing all devices offline, remotely controlling all devices, or hijacking all users' accounts of the vendors. The root cause is that rentable IoT devices adopt weak resource identifiers (IDs), and attackers can infer these IDs at scale and exploit access control flaws to manipulate these resources.For instance, rentable IoT products allow authenticated users to find and use any device from the rentable IoT apps via a device serial number, which can be easily inferred by attackers and combined with other vulnerabilities to exploit remote devices on a large scale.To identify these risks, we propose a tool, called IDScope, to automatically detect the weak IDs in apps and assess if these IDs can be abused to scale the exploitation scope of existing access control vulnerabilities.Finally, we identify 57 vulnerabilities in 28 products which can lead to various large-scale exploitation in 24 products and affect millions of users and devices by exploiting three types of weak IDs. The vendors have confirmed our findings and most issues have been mitigated with our assistance.

SAIN: Improving ICS Attack Detection Sensitivity via State-Aware Invariants

Syed Ghazanfar Abbas, Muslum Ozgur Ozmen, Abdulellah Alsaheel, Arslan Khan, Z. Berkay Celik, and Dongyan Xu, Purdue University

Available Media

Industrial Control Systems (ICSs) rely on Programmable Logic Controllers (PLCs) to operate within a set of states. The states are composed of variables that determine how sensor data is interpreted, configuration parameters are applied, and actuator commands are issued. Recent works have shown that attackers can manipulate these variables to compromise ICS safety and security. To detect such attacks, previous approaches have leveraged invariants—a set of rules defining the correct behavior of an ICS. However, these invariants suffer from a critical limitation: they are state-agnostic. This means they define variable ranges across all possible ICS states, leading to loosely bounded detection thresholds. Unfortunately, attackers can exploit these loose bounds and launch stealthy attacks that evade detection without violating such invariants.

In this paper, we introduce SAIN, an automated method to derive state-aware ICS invariants with tighter bounds and enforce them through a PLC-based monitor. SAIN first generates invariant templates by identifying the PLC program states, state transitions, and the inter-dependencies among sensing, actuation, and configuration variables within each state through program analysis. It then partitions the ICS data traces into state-specific sub-traces and quantifies the invariant templates with concrete, tighter bounds, as system-specific knowledge about the subject ICS. Lastly, it enforces the state-aware invariants through a run-time monitor. We evaluate SAIN on a Fischertechnik manufacturing plant and a chemical plant simulator against 17 attacks. SAIN protects the plants, on average, with a false positive rate of 2% and a run-time overhead of 3%.

Opportunistic Data Flow Integrity for Real-time Cyber-physical Systems Using Worst Case Execution Time Reservation

Yujie Wang, Ao Li, Jinwen Wang, Sanjoy Baruah, and Ning Zhang, Washington University in St. Louis

Available Media

With the proliferation of safety-critical real-time systems in our daily life, it is imperative that their security is protected to guarantee their functionalities. To this end, one of the most powerful modern security primitives is the enforcement of data flow integrity. However, the run-time overhead can be prohibitive for real-time cyber-physical systems. On the other hand, due to strong safety requirements on such real-time cyber-physical systems, platforms are often designed with enough reservation such that the system remains real-time even if it is experiencing the worst-case execution time. We conducted a measurement study on eight popular CPS systems and found the worst-case execution time is often at least five times the average run time.

In this paper, we propose opportunistic data flow integrity, OP-DFI, that takes advantage of the system reservation to enforce data flow integrity to the CPS software. To avoid impacting the real-time property, OP-DFI tackles the challenge of slack estimation and run-time policy swapping to take advantage of the extra time in the system opportunistically. To ensure the security protection remains coherent, OP-DFI leverages in-line reference monitors and hardware-assisted features to perform dynamic fine-grained sandboxing. We evaluated OP-DFI on eight real-time CPS. With a worst-case execution time overhead of 2.7%, OP-DFI effectively performs DFI checking on 95.5% of all memory operations and 99.3% of safety-critical control-related memory operations on average.

On Bridging the Gap between Control Flow Integrity and Attestation Schemes

Mahmoud Ammar, Ahmed Abdelraoof, and Silviu Vlasceanu, Huawei Research, Germany

Available Media

Control-flow hijacking attacks are still a major challenge in software security. Several means of protection and detection have been proposed but gaps still exist. To bridge such gaps, major processor manufacturers have designed and implemented several hardware security extensions in the new generations of processors. High-profile examples include Pointer Authentication (PA) and Branch Target Identification (BTI) technologies that are supported in the ARMv8.5-A processor architecture. Nevertheless, the direct enablement of these technologies would only provide coarse-grained security guarantees without any trustworthy evidence of runtime integrity.

To fill this gap, we propose CFA+, a practical hardware-assisted control flow attestation mechanism with prevention capabilities. CFA+ leverages the ARMv8.5-A's BTI security extension along with selective software instrumentation to enable lightweight always-on monitoring of the execution state without the need to maintain in-memory control-flow logs. The hybrid policy of CFA+ allows for either immediate prevention or quick detection of control-flow hijacks while providing trustworthy evidence of the runtime integrity status. CFA+ provides fine-grained security guarantees to complex software stacks while maintaining a high level of efficiency and scalability, surpassing state-of-the-art solutions. Our evaluation results show that CFA+ incurs less than 3% of runtime overhead on average when applied to a wide range of benchmark applications including SPEC CPU2006 suite and nginx.

Track 3

Crypto VIII: Side Channel

Session Chair: Yasemin Acar

Salon E

Windows into the Past: Exploiting Legacy Crypto in Modern OS's Kerberos Implementation

Michal Shagam and Eyal Ronen, Tel Aviv University

Available Media

The Kerberos protocol is used by millions of users and network administrators worldwide for secure authentication, key distribution, and access control management to enterprise networks and services. Since its initial public deployment in 1989, the protocol has undergone many revisions to incorporate new cryptographic primitives and improve security. For example, initially based solely on users' passwords and symmetric cryptographic primitives, current implementations also support smartcard-based authentication with asymmetric cryptographic primitives for improved security. However, this iterative revision process has resulted in implementations riddled with legacy crypto primitives and protocol designs.

In this work, we show how we can exploit this legacy crypto to completely break the security of the enterprise network. Firstly, while arguably more secure, smartcard-based authentication uses RSA encryption with the notorious PKCS #1 v1.5 padding scheme. Although the RSA decryption is done securely inside the smartcard, a non-constant time unpadding code runs on the client's CPU. This makes both Windows's and several Linux distributions' implementations vulnerable to the Bleichenbacher attack that can recover cryptographic session tokens. Secondly, we show that the RSA smartcard-based authentication does not provide forward secrecy to the cryptographic tokens that the server provisions to the client. Thirdly, we propose and analyze different algorithmic approaches to minimize the overhead required to handle noisy oracles in the Bleichenbacher attack. This general Bleichenbacher attack analysis may be of independent interest.

Finally, we demonstrate microarchitectural side channel-based end-to-end attacks on the Windows Kerberos implementation. We start by showing how to recover tokens used to encrypt session transferred remote files by Samba. We then show how to amplify the number of decryptions performed with a single user's PIN code input, allowing us to accelerate our attack and recover users' (and admins') credentials before expiration. In addition, we describe a remote attack vector that allows us to perform the attack and generate queries.

Divide and Surrender: Exploiting Variable Division Instruction Timing in HQC Key Recovery Attacks

Robin Leander Schröder, Fraunhofer SIT, Darmstadt, Germany and Fraunhofer Austria, Vienna, Austria; Stefan Gast, Graz University of Technology, Austria; Qian Guo, Lund University, Sweden

Available Media

We uncover a critical side-channel vulnerability in the Hamming Quasi-Cyclic (HQC) round 4 optimized implementation arising due to the use of the modulo operator. In some cases, compilers optimize uses of the modulo operator with compile-time known divisors into constant-time Barrett reductions. However, this optimization is not guaranteed: for example, when a modulo operation is used in a loop the compiler may emit division (div) instructions which have variable execution time depending on the numerator. When the numerator depends on secret data, this may yield a timing side-channel. We name vulnerabilities of this kind Divide and Surrender (DaS) vulnerabilities.

For processors supporting Simultaneous Multithreading (SMT) we propose a new approach called DIV-SMT which enables precisely measuring small division timing variations using scheduler and/or execution unit contention. We show that using only 100 such side-channel traces we can build a Plaintext-Checking (PC) oracle with above 90% accuracy. Our approach might also prove applicable to other instances of the DaS vulnerability, such as KyberSlash. We stress that exploitation with DIV-SMT requires co-location of the attacker on the same physical core as the victim.

We then apply our methodology to HQC and present a novel way to recover HQC secret keys faster, achieving an 8-fold decrease in the number of idealized oracle queries when compared to previous approaches. Our new PC oracle attack uses our newly developed Zero Tester method to quickly determine whether an entire block of bits contains only zero-bits. The Zero Tester method enables the DIV-SMT powered attack on HQC-128 to complete in under 2 minutes on our targeted AMD Zen2 machine.

With Great Power Come Great Side Channels: Statistical Timing Side-Channel Analyses with Bounded Type-1 Errors

Martin Dunsche, Marcel Maehren, and Nurullah Erinola, Ruhr University Bochum; Robert Merget, Technology Innovation Institute; Nicolai Bissantz, Ruhr University Bochum; Juraj Somorovsky, Paderborn University; Jörg Schwenk, Ruhr University Bochum

Available Media

Constant-time implementations are essential to guarantee the security of secret-key operations. According to Jancar et al. [42], most cryptographic developers do not use statistical tests to evaluate their implementations for timing side-channel vulnerabilities. One of the main reasons is their high unreliability due to potential false positives caused by noisy data. In this work, we address this issue and present an improved statistical evaluation methodology with a controlled type-1 error (α) that restricts false positives independently of the noise distribution. Simultaneously, we guarantee statistical power with increasing sample size. With the bounded type-1 error, the user can perform trade-offs between false positives and the size of the side channels they wish to detect. We achieve this by employing an empirical bootstrap that creates a decision rule based on the measured data.

We implement this approach in an open-source tool called RTLF and compare it with three different competitors: Mona, dudect, and tlsfuzzer. We further compare our results to the t-test, a commonly used statistical test for side-channel analysis. To show the applicability of our tool in real cryptographic network scenarios, we performed a quantitative analysis with local timing measurements for CBC Padding Oracle attacks, Bleichenbacher's attack, and the Lucky13 attack in 823 available versions of eleven TLS libraries. Additionally, we performed a qualitative analysis of the most recent version ofeach library. We find that most libraries were long-time vulnerable to at least one of the considered attacks, with side channels big enough likely to be exploitable in a LAN setting. Through the qualitative analysis based on the results of RTLF, we identified seven vulnerabilities in recent versions.

"These results must be false": A usability evaluation of constant-time analysis tools

Marcel Fourné, Paderborn University and MPI-SP; Daniel De Almeida Braga, Rennes University, CNRS, IRISA; Jan Jancar, Masaryk University; Mohamed Sabt, Rennes University, CNRS, IRISA; Peter Schwabe, MPI-SP and Radboud University; Gilles Barthe, MPI-SP and IMDEA Software Institute; Pierre-Alain Fouque, Rennes University, CNRS, IRISA; Yasemin Acar, Paderborn University and George Washington University

Available Media

Cryptography secures our online interactions, transactions, and trust. To achieve this goal, not only do the cryptographic primitives and protocols need to be secure in theory, they also need to be securely implemented by cryptographic library developers in practice.

However, implementing cryptographic algorithms securely is challenging, even for skilled professionals, which can lead to vulnerable implementations, especially to side-channel attacks. For timing attacks, a severe class of side-channel attacks, there exist a multitude of tools that are supposed to help cryptographic library developers assess whether their code is vulnerable to timing attacks. Previous work has established that despite an interest in writing constant-time code, cryptographic library developers do not routinely use these tools due to their general lack of usability. However, the precise factors affecting the usability of these tools remain unexplored. While many of the tools are developed in an academic context, we believe that it is worth exploring the factors that contribute to or hinder their effective use by cryptographic library developers.

To assess what contributes to and detracts from usability of tools that verify constant-timeness (CT), we conducted a two-part usability study with 24 (post) graduate student participants on 6 tools across diverse tasks that approximate real-world use cases for cryptographic library developers.

We find that all studied tools are affected by similar usability issues to varying degrees, with no tool excelling in usability, and usability issues preventing their effective use.

Based on our results, we recommend that effective tools for verifying CT need usable documentation, simple installation, easy to adapt examples, clear output corresponding to CT violations, and minimal noninvasive code markup. We contribute first steps to achieving these with limited academic resources, with our documentation, examples, and installation scripts.

Track 4

Web Security III: XSS and PHP

Session Chair: Musard Balliu

Salon F

Dancer in the Dark: Synthesizing and Evaluating Polyglots for Blind Cross-Site Scripting

Robin Kirchner, Technische Universität Braunschweig; Jonas Möller, Technische Universität Berlin; Marius Musch and David Klein, Technische Universität Braunschweig; Konrad Rieck, Technische Universität Berlin; Martin Johns, Technische Universität Braunschweig

Distinguished Paper Award Winner

Available Media

Cross-Site Scripting (XSS) is a prevalent and well known security problem in web applications. Numerous methods to automatically analyze and detect these vulnerabilities exist. However, all of these methods require that either code or feedback from the application is available to guide the detection process. In larger web applications, inputs can propagate from a frontend to an internal backend that provides no feedback to the outside. None of the previous approaches are applicable in this scenario, known as blind XSS (BXSS). In this paper, we address this problem and present the first comprehensive study on BXSS. As no feedback channel exists, we verify the presence of vulnerabilities through blind code execution. For this purpose, we develop a method for synthesizing polyglots, small XSS payloads that execute in all common injection contexts. Seven of these polyglots are already sufficient to cover a state-of-the-art XSS testbed. In a validation on real-world client-side vulnerabilities, we show that their XSS detection rate is on par with existing taint tracking approaches. Based on these polyglots, we conduct a study of BXSS vulnerabilities on the Tranco Top 100,000 websites. We discover 20 vulnerabilities in 18 web-based backend systems. These findings demonstrate the efficacy of our detection approach and point at a largely unexplored attack surface in web security.

Spider-Scents: Grey-box Database-aware Web Scanning for Stored XSS

Eric Olsson and Benjamin Eriksson, Chalmers University of Technology; Adam Doupé, Arizona State University; Andrei Sabelfeld, Chalmers University of Technology

Available Media

As web applications play an ever more important role in society, so does ensuring their security. A large threat to web application security is XSS vulnerabilities, and in particular, stored XSS. Due to the complexity of web applications and the difficulty of properly injecting XSS payloads into a web application, many of these vulnerabilities still evade current state-of-the-art scanners. We approach this problem from a new direction—by injecting XSS payloads directly into the database we can completely bypass the difficulty of injecting XSS payloads into a web application. We thus propose Spider-Scents, a novel method for grey-box database-aware scanning for stored XSS, that maps database values to the web application and automatically finds unprotected outputs. Spider-Scents reveals code smells that expose stored XSS vulnerabilities. We evaluate our approach on a set of 12 web applications and compare with three state-of-the-art black-box scanners. We demonstrate improvement of database coverage, ranging from 79% to 100% database coverage across the applications compared to the range of 2% to 60% for the other scanners. We systematize the relationship between unprotected outputs, vulnerabilities, and exploits in the context of stored XSS. We manually analyze unprotected outputs reported by Spider-Scents to determine their vulnerability and exploitability. In total, this method finds 85 stored XSS vulnerabilities, outperforming the union of state-of-the-art's 32.

Argus: All your (PHP) Injection-sinks are belong to us.

Rasoul Jahanshahi and Manuel Egele, Boston University

Available Media

Injection-based vulnerabilities in web applications such as cross-site scripting (XSS), insecure deserialization, and command injection have proliferated in recent years, exposing both clients and web applications to security breaches. Current studies in this area focus on detecting injection vulnerabilities in applications. Crucially, existing systems rely on manually curated lists of functions, so-called sinks, to detect such vulnerabilities. However, current studies are oblivious to the internal mechanics of the underlying programming language. In such a case, existing systems rely on an incomplete set of sinks, which results in disregarding security vulnerabilities. Despite numerous studies on injection vulnerabilities, there has been no study that comprehensively identifies the set of functions that an attacker can exploit for injection attacks.

This paper addresses the drawbacks of relying on manually curated lists of sinks to identify such vulnerabilities. We devise a novel generic approach to automatically identify the set of sinks that can lead to injection-style security vulnerabilities. To demonstrate the generality, we focused on three types of injection vulnerabilities: XSS, command injection, and insecure deserialization. We implemented a prototype of our approach in a tool called Argus to identify the set of PHP functions that deserialize user-input, execute operating system (OS) commands, or write user-input to the output buffer. We evaluated our prototype on the three most popular major versions of the PHPinterpreter. Argus detected 284 deserialization functions that allow adversaries to perform deserialization attacks, an order of magnitude more than the most exhaustive manually curated list used in related work. Furthermore, we detected 22 functions that can lead to XSS attacks, which is twice the number of functions used in prior work. To demonstrate thatArgus produces security-relevant findings, we integrated its results with three existing analysis systems– Psalm and RIPS, two static taint analyses, and FUGIO, an exploit generation tool. Themodifiedtoolsdetected 13 previously unknown deserialization and XSS vulnerabilities in WordPress and its plugins, of which 11 have been assigned CVE IDs and designated as high-severity vulnerabilities.

SSRF vs. Developers: A Study of SSRF-Defenses in PHP Applications

Malte Wessels and Simon Koch, Technische Universität Braunschweig; Giancarlo Pellegrino, CISPA Helmholtz Center for Information Security; Martin Johns, Technische Universität Braunschweig

Available Media

Server-side requests (SSR) are a potent and important tool for modern web applications, as they enable features such as link preview and web hooks. Unfortunately, naive usage of SSR opens the underlying application up to Server-Side Request Forgery – an underappreciated vulnerability risk. To shed light on this vulnerability class, we conduct an in-depth analysis of known exploitation methods as well as defenses and mitigations across PHP. We then proceed to study the prevalence of the vulnerability and defenses across 27,078 open-source PHP applications. For this we perform an initial data flow analysis, identifying attacker-controlled inputs into known SSR functions, followed up by a manual analysis of our results to gain a detailed understanding of the involved vulnerabilities and present defenses. Our results show that defenses are sparse. The hypermajority of our 237 detected data flows are vulnerable. Only two analyzed applications implement safe SSR features.

Since known defenses are not used and detected attacker-controlled flows are almost always vulnerable, we can only conclude that developers are still unaware of SSR abuses and the need to defend against them. Consequently, SSRF is a present and underappreciated danger in modern web applications.

Track 5

ML X: Privacy Inference II

Session Chair: Kavita Kumari

Salon G

How Does a Deep Learning Model Architecture Impact Its Privacy? A Comprehensive Study of Privacy Attacks on CNNs and Transformers

Guangsheng Zhang, Bo Liu, Huan Tian, and Tianqing Zhu, University of Technology Sydney; Ming Ding, Data 61, Australia; Wanlei Zhou, City University of Macau

Available Media

As a booming research area in the past decade, deep learning technologies have been driven by big data collected and processed on an unprecedented scale. However, privacy concerns arise due to the potential leakage of sensitive information from the training data. Recent research has revealed that deep learning models are vulnerable to various privacy attacks, including membership inference attacks, attribute inference attacks, and gradient inversion attacks. Notably, the efficacy of these attacks varies from model to model. In this paper, we answer a fundamental question: Does model architecture affect model privacy? By investigating representative model architectures from convolutional neural networks (CNNs) to Transformers, we demonstrate that Transformers generally exhibit higher vulnerability to privacy attacks than CNNs. Additionally, we identify the micro design of activation layers, stem layers, and LN layers, as major factors contributing to the resilience of CNNs against privacy attacks, while the presence of attention modules is another main factor that exacerbates the privacy vulnerability of Transformers. Our discovery reveals valuable insights for deep learning models to defend against privacy attacks and inspires the research community to develop privacy-friendly model architectures.

Reconstructing training data from document understanding models

Jérémie Dentan, Crédit Agricole SA and École Polytechnique, IP Paris; Arnaud Paran and Aymen Shabou, Crédit Agricole SA

Available Media

Document understanding models are increasingly employed by companies to supplant humans in processing sensitive documents, such as invoices, tax notices, or even ID cards. However, the robustness of such models to privacy attacks remains vastly unexplored.

This paper presents CDMI, the first reconstruction attack designed to extract sensitive fields from the training data of these models. We attack LayoutLM and BROS architectures, demonstrating that an adversary can perfectly reconstruct up to 4.1% of the fields of the documents used for fine-tuning, including some names, dates, and invoice amounts up to six-digit numbers. When our reconstruction attack is combined with a membership inference attack, our attack accuracy escalates to 22.5%.

In addition, we introduce two new end-to-end metrics and evaluate our approach under various conditions: unimodal or bimodal data, LayoutLM or BROS backbones, four fine-tuning tasks, and two public datasets (FUNSD and SROIE). We also investigate the interplay between overfitting, predictive performance, and susceptibility to our attack. We conclude with a discussion on possible defenses against our attack and potential future research directions to construct robust document understanding models.

Privacy Side Channels in Machine Learning Systems

Edoardo Debenedetti, ETH Zurich; Giorgio Severi, Northeastern University; Nicholas Carlini, Christopher A. Choquette-Choo, Matthew Jagielski, and Milad Nasr, Google DeepMind; Eric Wallace, UC Berkeley; Florian Tramèr, ETH Zurich

Available Media

Most current approaches for protecting privacy in machine learning (ML) assume that models exist in a vacuum. Yet, in reality, these models are part of larger systems that include components for training data filtering, output monitoring, and more. In this work, we introduce privacy side channels: attacks that exploit these system-level components to extract private information at far higher rates than is otherwise possible for standalone models. We propose four categories of side channels that span the entire ML lifecycle (training data filtering, input preprocessing, output post-processing, and query filtering) and allow for enhanced membership inference, data extraction, and even novel threats such as extraction of users' test queries. For example, we show that deduplicating training data before applying differentially-private training creates a side-channel that completely invalidates any provable privacy guarantees. We further show that systems which block language models from regenerating training data can be exploited to exfiltrate private keys contained in the training set—even if the model did not memorize these keys. Taken together, our results demonstrate the need for a holistic, end-to-end privacy analysis of machine learning systems.

FaceObfuscator: Defending Deep Learning-based Privacy Attacks with Gradient Descent-resistant Features in Face Recognition

Shuaifan Jin, He Wang, and Zhibo Wang, Zhejiang University; Feng Xiao, Palo Alto Networks; Jiahui Hu, Zhejiang University; Yuan He and Wenwen Zhang, Alibaba Group; Zhongjie Ba, Weijie Fang, Shuhong Yuan, and Kui Ren, Zhejiang University

Available Media

As face recognition is widely used in various security-sensitive scenarios, face privacy issues are receiving increasing attention. Recently, many face recognition works have focused on privacy preservation and converted the original images into protected facial features. However, our study reveals that emerging Deep Learning-based (DL-based) reconstruction attacks exhibit notable ability in learning and removing the protection patterns introduced by existing schemes and recovering the original facial images, thus posing a significant threat to face privacy. To address this threat, we introduce FaceObfuscator, a lightweight privacy-preserving face recognition system that first removes visual information that is non-crucial for face recognition from facial images via frequency domain and then generates obfuscated features interleaved in the feature space to resist gradient descent in DL-based reconstruction attacks. To minimize the loss in face recognition accuracy, obfuscated features with different identities are well-designed to be interleaved but non-duplicated in the feature space. This non-duplication ensures that FaceObfuscator can extract identity information from the obfuscated features for accurate face recognition. Extensive experimental results demonstrate that FaceObfuscator's privacy protection capability improves around 90% compared to existing privacy-preserving methods in two major leakage scenarios including channel leakage and database leakage, with a negligible 0.3% loss in face recognition accuracy. Our approach has also been evaluated in a real-world environment and protected more than 100K people's face data of a major university.

Track 6

Security Analysis V: ML

Session Chair: Hyungon Moon

Salon H

Hijacking Attacks against Neural Network by Analyzing Training Data

Yunjie Ge, Qian Wang, and Huayang Huang, Wuhan University; Qi Li, Tsinghua University; BNRist; Cong Wang, City University of Hong Kong; Chao Shen, Xi'an Jiaotong University; Lingchen Zhao, Wuhan University; Peipei Jiang, Wuhan University; City University of Hong Kong; Zheng Fang and Shenyi Zhang, Wuhan University

Available Media

Backdoors and adversarial examples are the two primary threats currently faced by deep neural networks (DNNs). Both attacks attempt to hijack the model behaviors with unintended outputs by introducing (small) perturbations to the inputs. However, neither attack is without limitations in practice. Backdoor attacks, despite the high success rates, often require the strong assumption that the adversary could tamper with the training data or code of the target model, which is not always easy to achieve in reality. Adversarial example attacks, which put relatively weaker assumptions on attackers, often demand high computational resources, yet do not always yield satisfactory success rates when attacking mainstream blackbox models in the real world. These limitations motivate the following research question: can model hijacking be achieved in a simpler way with more satisfactory attack performance and also more reasonable attack assumptions?

In this paper, we provide a positive answer with CleanSheet, a new model hijacking attack that obtains the high performance of backdoor attacks without requiring the adversary to temper with the model training process. CleanSheet exploits vulnerabilities in DNNs stemming from the training data. Specifically, our key idea is to treat part of the clean training data of the target model as "poisoned data", and capture the characteristics of these data that are more sensitive to the model (typically called robust features) to construct "triggers". These triggers can be added to any input example to mislead the target model, similar to backdoor attacks. We validate the effectiveness of CleanSheet through extensive experiments on five datasets, 79 normally trained models, 68 pruned models, and 39 defensive models. Results show that CleanSheet exhibits performance comparable to state-of-theart backdoor attacks, achieving an average attack success rate (ASR) of 97.5% on CIFAR-100 and 92.4% on GTSRB, respectively. Furthermore, CleanSheet consistently maintains a high ASR, with most ASR surpassing 80%, when confronted with various mainstream backdoor defense mechanisms.

False Claims against Model Ownership Resolution

Jian Liu and Rui Zhang, Zhejiang University; Sebastian Szyller, Intel Labs & Aalto University; Kui Ren, Zhejiang University; N. Asokan, University of Waterloo & Aalto University

Available Media

Deep neural network (DNN) models are valuable intellectual property of model owners, constituting a competitive advantage. Therefore, it is crucial to develop techniques to protect against model theft. Model ownership resolution (MOR) is a class of techniques that can deter model theft. A MOR scheme enables an accuser to assert an ownership claim for a suspect model by presenting evidence, such as a watermark or fingerprint, to show that the suspect model was stolen or derived from a source model owned by the accuser. Most of the existing MOR schemes prioritize robustness against malicious suspects, ensuring that the accuser will win if the suspect model is indeed a stolen model.

In this paper, we show that common MOR schemes in the literature are vulnerable to a different, equally important but insufficiently explored, robustness concern: a malicious accuser. We show how malicious accusers can successfully make false claims against independent suspect models that were not stolen. Our core idea is that a malicious accuser can deviate (without detection) from the specified MOR process by finding (transferable) adversarial examples that successfully serve as evidence against independent suspect models. To this end, we first generalize the procedures of common MOR schemes and show that, under this generalization, defending against false claims is as challenging as preventing (transferable) adversarial examples. Via systematic empirical evaluation we show that our false claim attacks always succeed in MOR schemes that follow our generalization, including in a real-world model: Amazon's Rekognition API.

Landscape More Secure Than Portrait? Zooming Into the Directionality of Digital Images With Security Implications

Benedikt Lorch and Rainer Böhme, University of Innsbruck

Available Media

The orientation in which a source image is captured can affect the resulting security in downstream applications. One reason for this is that many state-of-the-art methods in media security assume that image statistics are similar in the horizontal and vertical directions, allowing them to reduce the number of features (or trainable weights) by merging coefficients. We show that this artificial symmetrization tends to suppress important properties of natural images and common processing operations, causing a loss of performance. We also observe the opposite problem, where unaddressed directionality causes learning-based methods to overfit to a single orientation. These are vulnerable to manipulation if an adversary chooses inputs with the less common orientation. This paper takes a comprehensive approach, identifies and systematizes causes of directionality at several stages of a typical acquisition pipeline, measures their effect, and demonstrates for three selected security applications (steganalysis, forensic source identification, and the detection of synthetic images) how the performance of state-of-the-art methods can be improved by properly accounting for directionality.

Information Flow Control in Machine Learning through Modular Model Architecture

Trishita Tiwari, Cornell University; Suchin Gururangan, University of Washington; Chuan Guo, FAIR at Meta; Weizhe Hua, Google DeepMind; Sanjay Kariyappa, Georgia Institute of Technology; Udit Gupta, Cornell University; Wenjie Xiong, Virginia Tech; Kiwan Maeng, Pennsylvania State University; Hsien-Hsin S. Lee, Intel; G. Edward Suh, NVIDIA/Cornell University

Available Media

In today's machine learning (ML) models, any part of the training data can affect the model output. This lack of control for information flow from training data to model output is a major obstacle in training models on sensitive data when access control only allows individual users to access a subset of data. To enable secure machine learning for access-controlled data, we propose the notion of information flow control for machine learning, and develop an extension to the Transformer language model architecture that strictly adheres to the IFC definition we propose. Our architecture controls information flow by limiting the influence of training data from each security domain to a single expert module, and only enables a subset of experts at inference time based on the access control policy. The evaluation using large text and code datasets show that our proposed parametric IFC architecture has minimal (1.9%) performance overhead and can significantly improve model accuracy (by 38% for the text dataset, and between 44%–62% for the code datasets) by enabling training on access-controlled data.

Track 7

Cryptographic Protocols III

Session Chair: Lucy Simko

Salon IJK

POPSTAR: Lightweight Threshold Reporting with Reduced Leakage

Hanjun Li, Sela Navot, and Stefano Tessaro, University of Washington

Available Media

This paper proposes POPSTAR, a new lightweight protocol for the private computation of heavy hitters, also known as a private threshold reporting system. In such a protocol, the users provide input measurements, and a report server learns which measurements appear more than a pre-specified threshold. POPSTAR follows the same architecture as STAR (Davidson et al., CCS 2022) by relying on a helper randomness server in addition to a main server computing the aggregate heavy hitter statistics. While STAR is extremely lightweight, it leaks a substantial amount of information, consisting of an entire histogram of the provided measurements (but only reveals the actual measurements that appear beyond the threshold). POPSTAR shows that this leakage can be reduced at a modest cost (∼7× longer aggregation time). Our leakage is closer to that of Poplar (Boneh et al., S&P 2021), which relies however on distributed point functions and a different model which requires interactions of two non-colluding servers to compute the heavy hitters.

Privacy-Preserving Data Aggregation with Public Verifiability Against Internal Adversaries

Marco Palazzo and Florine W. Dekker, Cyber Security Group, Delft University of Technology; Alessandro Brighente, SPRITZ Security and Privacy Research Group, Università di Padova; Mauro Conti, SPRITZ Security and Privacy Research Group, Università di Padova & Cyber Security Group, Delft University of Technology; Zekeriya Erkin, Cyber Security Group, Delft University of Technology

Available Media

We consider the problem of publicly verifiable privacy-preserving data aggregation in the presence of a malicious aggregator colluding with malicious users. State-of-the-art solutions either split the aggregator into two parties under the assumption that they do not collude, or require many rounds of interactivity and have non-constant verification time.

In this work, we propose mPVAS, the first publicly verifiable privacy-preserving data aggregation protocol that allows arbitrary collusion, without relying on trusted third parties during execution, where verification runs in constant time. We also show three extensions to mPVAS: mPVAS+, for improved communication complexity, mPVAS-IV, for the identification of malicious users, and mPVAS-UD, for graceful handling of reduced user availability without the need to redo the setup. We show that our schemes achieve the desired confidentiality, integrity, and authenticity. Finally, through both theoretical and experimental evaluations, we show that our schemes are feasible for real-world applications.

PINE: Efficient Verification of a Euclidean Norm Bound of a Secret-Shared Vector

Guy N. Rothblum, Apple; Eran Omri, Ariel University and Ariel Cyber Innovation Center; Junye Chen and Kunal Talwar, Apple

Available Media

Secure aggregation of high-dimensional vectors is a fundamental primitive in federated statistics and learning. A two-server system such as PRIO allows for scalable aggregation of secret-shared vectors. Adversarial clients might try to manipulate the aggregate, so it is important to ensure that each (secret-shared) contribution is well-formed. In this work, we focus on the important and well-studied goal of ensuring that each contribution vector has bounded Euclidean norm. Existing protocols for ensuring bounded-norm contributions either incur a large communication overhead, or only allow for approximate verification of the norm bound. We propose Private Inexpensive Norm Enforcement (PINE): a new protocol that allows exact norm verification with little communication overhead. For high-dimensional vectors, our approach has a communication overhead of a few percent, compared to the 16-32x overhead of previous approaches.

DaCapo: Automatic Bootstrapping Management for Efficient Fully Homomorphic Encryption

Seonyoung Cheon, Yongwoo Lee, Dongkwan Kim, and Ju Min Lee, Yonsei University; Sunchul Jung and Taekyung Kim, CryptoLab. Inc.; Dongyoon Lee, Stony Brook University; Hanjun Kim, Yonsei University

Available Media

By supporting computation on encrypted data, fully homomorphic encryption (FHE) offers the potential for privacy-preserving computation offloading. However, its applicability is constrained to small programs because each FHE multiplication increases the scale of a ciphertext with a limited scale capacity. By resetting the accumulated scale, bootstrapping enables a longer FHE multiplication chain. Nonetheless, manual bootstrapping placement poses a significant programming burden to avoid scale overflow from insufficient bootstrapping or the substantial computational overhead of unnecessary bootstrapping. Additionally, the bootstrapping placement affects costs of FHE operations due to changes in scale management, further complicating the overall management process.

This work proposes DaCapo, the first automatic bootstrapping management compiler. Aiming to reduce bootstrapping counts, DaCapo analyzes live-out ciphertexts at each program point and identifies candidate points for inserting bootstrapping operations. DaCapo estimates the FHE operation latencies under different scale management scenarios for each bootstrapping placement plan at each candidate point, and decides the bootstrapping placement plan with minimal latency. This work evaluates DaCapo with deep learning models that existing FHE compilers cannot compile due to a lack of bootstrapping support. The evaluation achieves 1.21x speedup on average compared to manually implemented FHE programs.

3:45 pm–4:00 pm

Short Break

Grand Ballroom Foyer

4:00 pm–5:00 pm

Track 1

Measurement VII: Auditing and Best Practices II

Session Chair: Davide Balzarotti

Salon AB

SoK (or SoLK?): On the Quantitative Study of Sociodemographic Factors and Computer Security Behaviors

Miranda Wei, University of Washington; Jaron Mink, University of Illinois at Urbana-Champaign; Yael Eiger and Tadayoshi Kohno, University of Washington; Elissa M. Redmiles, Georgetown University; Franziska Roesner, University of Washington

Available Media

Researchers are increasingly exploring how gender, culture, and other sociodemographic factors correlate with user computer security and privacy behaviors. To more holistically understand relationships between these factors and behaviors, we make two contributions. First, we broadly survey existing scholarship on sociodemographics and secure behavior (151 papers) before conducting a focused literature review of 47 papers to synthesize what is currently known and identify open questions for future research. Second, by incorporating contemporary social and critical theories, we establish guidelines for future studies of sociodemographic factors and security behaviors that address how to overcome common pitfalls. We present a case study to demonstrate our guidelines in action, at-scale, that conduct a measurement study of the relationships between sociodemographics and de-identified, aggregated log data of security and privacy behaviors among 16,829 users on Facebook across 16 countries. Through these contributions, we position our work as a systemization of a lack of knowledge (SoLK). Overall, we find contradictory results and vast unknowns about how identity shapes security behavior. Through our guidelines and discussion, we chart new directions to more deeply examine how and why sociodemographic factors affect security behaviors.

IoT Market Dynamics: An Analysis of Device Sales, Security and Privacy Signals, and their Interactions

Swaathi Vetrivel, Brennen Bouwmeester, Michel van Eeten, and Carlos H. Gañán, Delft University of Technology

Available Media

We explore the relationship between the Security and Privacy (S&P) of IoT devices and their sales, considering the S&P signals in the context of these sales. We obtained expert S&P ratings of IoT devices from a European consumer association and the corresponding sales data from a leading Dutch online store. We complemented this with additional information like user ratings, the number of reviews and update support duration from two Dutch online stores. Our regression model shows that, holding other variables constant, a one-standard-deviation increase in S&P ratings corresponds to a noteworthy 56% boost in sales. Crucially, we observe a possible correlation between price and demand for S&P; at lower prices, the sales of IoT devices are directly proportional to the S&P rating, but this relationship diminishes as price increases. Further, we find that the presence of update support duration information, intended as a security signal, corresponds to higher S&P ratings and, all else being constant, also corresponds to a 69% increase in sales. While the exact causal mechanisms for the boost in sales remain unclear, our findings suggest positive incentives might be at play for IoT devices offering S&P at affordable prices and presenting relevant S&P information at the point of purchase.

The Unpatchables: Why Municipalities Persist in Running Vulnerable Hosts

Aksel Ethembabaoglu, Rolf van Wegberg, Yury Zhauniarovich, and Michel van Eeten, Delft University of Technology

Available Media

Many organizations continue to expose vulnerable systems for which patches exist, opening themselves up for cyberattacks. Local governments are found to be especially affected by this problem. Why are these systems not patched? Prior work relied on vulnerability scanning to observe unpatched systems, notification studies on remediating them, and on user studies of sysadmins to describe self-reported patching behavior, but they are rarely used together as we do in this study. We analyze scan data following standard industry practices and detect unpatched hosts across the set of 322 Dutch municipalities. Our first question is: Are these detections false positives? We engage with 29 security professionals working for 54 municipalities to collect ground truth.

All detections were accurate. Our approach also uncovers a major misalignment between systems that the responsible CERT attributes to the municipalities and the systems the practitioners at municipalities believe they are responsible for. We then interviewed the professionals as to why these vulnerable systems were still exposed. We identify four explanations for non-patching: unaware, unable, retired and shut down. The institutional framework to mitigate cyber threats assumes that vulnerable systems are first correctly identified, then correctly attributed and notified, and finally correctly mitigated. Our findings illustrate that the first assumption is correct, the second one is not and the third one is more complicated in practice. We end with reflections on how to better remediate vulnerable hosts.

Track 2

Hardware Security V: Embedded

Session Chair: Ning Zhang

Salon CD

Leveraging Semantic Relations in Code and Data to Enhance Taint Analysis of Embedded Systems

Jiaxu Zhao, Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences; Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences; Beijing Key Laboratory of Network Security and Protection Technology; Yuekang Li, The University of New South Wales; Yanyan Zou, Zhaohui Liang, Yang Xiao, Yeting Li, Bingwei Peng, Nanyu Zhong, and Xinyi Wang, Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences; Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences; Beijing Key Laboratory of Network Security and Protection Technology; Wei Wang, Institute of Information Engineering, Chinese Academy of Sciences; Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences; Beijing Key Laboratory of Network Security and Protection Technology; Wei Huo, Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences; Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences; Beijing Key Laboratory of Network Security and Protection Technology

Available Media

IoT devices have significantly impacted our daily lives, and detecting vulnerabilities in embedded systems early on is critical for ensuring their security. Among the existing vulnerability detection techniques for embedded systems, static taint analysis has been proven effective in detecting severe vulnerabilities, such as command injection vulnerabilities, which can cause remote code execution. Nevertheless, static taint analysis is faced with the problem of identifying sources comprehensively and accurately.

This paper presents Lara, a novel static taint analysis technique to detect vulnerabilities in embedded systems. The design of Lara is inspired by an observation that pertains to semantic relations within and between the code and data of embedded software: user input entries can be categorized as URIs or keys (data), and identifying their handling code (code) and relations can help systematically and comprehensively identify the sources for taint analysis. Transforming the observation into a practical methodology poses challenges. To address these challenges, Lara employs a combination of pattern-based static analysis and large language model(LLM)-aided analysis, aiming to replicate how human experts would utilize the findings during analysis and enhance it. The pattern-based static analysis simulates human experience, while the LLM-aided analysis captures the way human experts perceive code semantics. We implemented Lara and evaluated it on 203 IoT devices from 21 vendors. In general, Lara detects 556 and 602 more vulnerabilities than SaTC and Karonte while reducing false positives by 57.0% and 54.3%. Meanwhile, with more sources and sinks from Lara, EmTaint can detect 245 more vulnerabilities. To date, Lara has found 245 0-day vulnerabilities in 57 devices, all of which were confirmed or fixed with 162 CVE IDs assigned.

A Friend's Eye is A Good Mirror: Synthesizing MCU Peripheral Models from Peripheral Drivers

Chongqing Lei and Zhen Ling, Southeast University; Yue Zhang, Drexel University; Yan Yang and Junzhou Luo, Southeast University; Xinwen Fu, University of Massachusetts Lowell

Available Media

The extensive integration of embedded devices within the Internet of Things (IoT) has given rise to significant security concerns. Various initiatives have been undertaken to bolster the security of these devices at the software level, involving the analysis of MCU firmware and the implementation of automatic MCU rehosting methods. However, existing hardware-oriented rehosting techniques often face scalability challenges, while firmware-oriented approaches may have limited universality and fidelity. To address these limitations, we propose Perry, a system that synthesizes faithful and extendable peripheral models for MCUs. By extracting peripheral models from hardware drivers, Perry ensures compatibility and accurate emulation of targeted MCUs. The process involves gathering hardware metadata, analyzing driver code, capturing traces of peripheral accesses, and converting software beliefs into hardware behaviors. Perry is implemented with approximately 19,000 lines of code. A comprehensive evaluation of 75 firmware samples has showcased its effectiveness, consistency, universality, and scalability in generating hardware models for MCUs. Perry can efficiently synthesize hardware models consistent with the actual hardware and achieve a 74.24% unit test passing rate, outperforming the state-of-the-art techniques. The hardware models produced by Perry can faithfully emulate diverse firmware and can be readily expanded with minimal manual intervention. Through case studies, we show that Perry can help reproduce firmware vulnerabilities, discover specification-violation bugs in drivers, and fuzz RTOS for vulnerabilities. These case studies have led to the identification of two specification-violating bugs and the discovery of seven new vulnerabilities, underscoring Perry's potential to enhance various security-focused tasks.

SoK: Security of Programmable Logic Controllers

Efrén López-Morales, Texas A&M University-Corpus Christi; Ulysse Planta, CISPA Helmholtz Center for Information Security; Carlos Rubio-Medrano, Texas A&M University-Corpus Christi; Ali Abbasi, CISPA Helmholtz Center for Information Security; Alvaro A. Cardenas, University of California, Santa Cruz

Available Media

Billions of people rely on essential utility and manufacturing infrastructures such as water treatment plants, energy management, and food production. Our dependence on reliable infrastructures makes them valuable targets for cyberattacks. One of the prime targets for adversaries attacking physical infrastructures are Programmable Logic Controllers (PLCs) because they connect the cyber and physical worlds. In this study, we conduct the first comprehensive systematization of knowledge that explores the security of PLCs: We present an in-depth analysis of PLC attacks and defenses and discover trends in the security of PLCs from the last 17 of research. We introduce a novel threat taxonomy for PLCs and Industrial Control Systems (ICS). Finally, we identify and point out research gaps that, if left ignored, could lead to new catastrophic attacks against critical infrastructures.

Operation Mango: Scalable Discovery of Taint-Style Vulnerabilities in Binary Firmware Services

Wil Gibbs, Arvind S Raj, Jayakrishna Menon Vadayath, Hui Jun Tay, Justin Miller, Akshay Ajayan, Zion Leonahenahe Basque, Audrey Dutcher, and Fangzhou Dong, Arizona State University; Xavier Maso, unaffiliated; Giovanni Vigna and Christopher Kruegel, UC Santa Barbara; Adam Doupé, Yan Shoshitaishvili, and Ruoyu Wang, Arizona State University

Available Media

The rise of IoT (Internet of Things) devices has created a system of convenience, which allows users to control and automate almost everything in their homes. But this increase in convenience comes with increased security risks to the users of IoT devices, partially because IoT firmware is frequently complex, feature-rich, and very vulnerable. Existing solutions for automatically finding taint-style vulnerabilities significantly reduce the number of binaries analyzed to achieve scalability. However, we show that this trade-off results in missing significant numbers of vulnerabilities. In this paper, we propose a new direction: scaling static analysis of firmware binaries so that all binaries can be analyzed for command injection or buffer overflows. To achieve this, we developed MANGODFA, a novel binary data-flow analysis leveraging value analysis and data dependency analysis on binary code. Through key algorithmic optimizations in MANGODFA, our prototype Mango achieves fast analysis without sacrificing precision. On the same dataset used in prior work, Mango analyzed 27× more binaries in a comparable amount of time to the state-of-the-art in Linux-based user-space firmware taint-analysis SaTC. Mango achieved an average per-binary analysis time of 8 minutes compared to 6.56 hours for SaTC. In addition, Mango finds 56 real vulnerabilities that SaTC does not find in a set of seven firmware. We also performed an ablation study demonstrating the performance gains in Mango come from key algorithmic improvements.

Track 3

System Security V: Memory II

Session Chair: Hong Hu

Salon E

SCAVY: Automated Discovery of Memory Corruption Targets in Linux Kernel for Privilege Escalation

Erin Avllazagaj, Yonghwi Kwon, and Tudor Dumitraș, University of Maryland

Available Media

Kernel privilege-escalation exploits typically leverage memory-corruption vulnerabilities to overwrite particular target locations. These memory corruption targets play a critical role in the exploits, as they determine which privileged resources (e.g., files, memory, and operations) the adversary may access and what privileges (e.g., read, write, and unrestricted) they may gain. While prior research has made important advances in discovering vulnerabilities and achieving privilege escalation, in practice, the exploits rely on the few memory corruption targets that have been discovered manually so far.

We propose SCAVY, a framework that automatically discovers memory corruption targets for privilege escalation in the Linux kernel. SCAVY's key insight lies in broadening the search scope beyond the kernel data structures explored in prior work, which focused on function pointers or pointers to structures that include them, to encompass the remaining 90% of Linux kernel structures. Additionally, the search is bug-type agnostic, as it considers any memory corruption capability. To this end, we develop novel and scalable techniques that combine fuzzing and differential analysis to automatically explore and detect privilege escalation by comparing the accessibility of resources between executions with and without corruption. This allows SCAVY to determine that corrupting a certain field puts the system in an exploitable state, independently of the vulnerability exploited. SCAVY found 955 PoC, from which we identify 17 new fields in 12 structures that can enable privilege escalation. We utilize these targets to develop 6 exploits for 5 CVE vulnerabilities. Our findings show that new memory corruption targets can change the security implications of vulnerabilities, urging researchers to proactively discover memory corruption targets.

Voodoo: Memory Tagging, Authenticated Encryption, and Error Correction through MAGIC

Lukas Lamster, Martin Unterguggenberger, David Schrammel, and Stefan Mangard, Graz University of Technology

Available Media

Confidentiality, authenticity, integrity of data, and runtime security are ubiquitous concerns in modern computer systems. However, these security concerns have traditionally been addressed by separate mechanisms. Error-correcting codes (ECC) detect and correct DRAM errors, ensuring the integrity of stored data. Authenticated memory encryption provides data confidentiality and authenticity. Memory tagging enforces memory safety, thereby improving runtime security. The lack of a combined primitive increases system complexity, memory overheads, and the overall performance impact. In this work, we present Voodoo, the first combined scheme for authenticated encryption, DRAM error correction, and memory tagging. Our design extends the MAGIC mode for authenticated encryption and error correction proposed by Kounavis et al. With Voodoo, DRAM data is encrypted, and a tag-dependent message authentication code protects the integrity of the stored data while simultaneously allowing for the correction of DRAM faults. Thus, we can implement a wide range of tagged memory architectures without introducing additional memory requests or storage overheads. We present three tag encoding schemes providing up to 36 tag bits per cache line. Using the gem5 simulator, we implement and benchmark our design. Our evaluation shows a low runtime overhead of 1.4% on average compared to a system without any of the provided security features. We use a Monte-Carlo simulation of a DRAM fault model based on real-world DRAM fault behavior to demonstrate the corrective capabilities of Voodoo. Our results show that we consistently outperform traditional single-error correction, double-error detection (SEC-DED) codes in terms of error correction and detection. For multi-chip faults, Voodoo offers stronger error detection than commodity Chipkill solutions.

ShadowBound: Efficient Heap Memory Protection Through Advanced Metadata Management and Customized Compiler Optimization

Zheng Yu, Ganxiang Yang, and Xinyu Xing, Northwestern University

Available Media

In software development, the prevalence of unsafe languages such as C and C++ introduces potential vulnerabilities, especially within the heap, a pivotal component for dynamic memory allocation. Despite its significance, heap management complexities have made heap corruption pervasive, posing severe threats to system security. While prior solutions aiming for temporal and spatial memory safety exhibit overheads deemed impractical, we present ShadowBound, a unique heap memory protection design. At its core, ShadowBound is an efficient out-of-bounds defense that can work with various use-after-free defenses (e.g. MarkUS, FFMalloc, PUMM) without compatibility constraints. We harness a shadow memory-based metadata management mechanism to store heap chunk boundaries and apply customized compiler optimizations tailored for boundary checking. We implemented ShadowBound atop the LLVM framework and integrated three state-of-the-art use-after-free defenses. Our evaluations show that ShadowBound provides robust heap protection with minimal time and memory overhead, suggesting its effectiveness and efficiency in safeguarding real-world programs against prevalent heap vulnerabilities.

OPTISAN: Using Multiple Spatial Error Defenses to Optimize Stack Memory Protection within a Budget

Rahul George, University of California, Riverside; Mingming Chen and Kaiming Huang, The Pennsylvania State University; Zhiyun Qian, University of California, Riverside; Thomas La Porta, The Pennsylvania State University; Trent Jaeger, University of California, Riverside

Available Media

Spatial memory errors continue to be the cause of many vulnerabilities. While researchers have proposed several defenses to prevent exploitation of spatial memory errors, systems currently rely on defenses that only protect a small fraction of stack data (e.g., return addresses) and leave a window of vulnerability (e.g., by only enforcing on function returns). One proposal to address this problem is to place defenses at the lowest cost locations until a cost budget was met, but this approach only considers a single defense and does not account for the security implications of possible placements. In this paper, we propose the OptiSan system, which is the first system to apply multiple spatial memory defenses to maximize the number of objects protected from spatial memory errors within a cost budget. OptiSan analyzes each program to identify the stack objects that may be exploited by spatial memory errors, called usable targets, and estimates the overhead for individual defense operations, for both metadata management and spatial checks, to enable flexibility in placement choices. OptiSan applies this information in a novel Mixed-Integer Non-Linear Programming formulation to generate an optimal placement. We apply OptiSan to generate placements using a combination of identity-based (i.e., influential BaggyBounds) and location-based (i.e., widely used AddressSanitizer (ASan)) spatial memory defenses, finding that OptiSan utilizes the more effective Baggy Bounds defense broadly, augmenting it with ASan to increase the number of memory operations with usable targets protected by 18.4% on average across a set of benchmark and server programs. OptiSan shows that using multiple spatial memory defenses provides valuable flexibility to prevent the exploitation of many spatial memory errors within a cost budget.

Track 4

User Studies VIII: Cryptography

Session Chair: Sarah Meiklejohn

Salon F

The Challenges of Bringing Cryptography from Research Papers to Products: Results from an Interview Study with Experts

Konstantin Fischer, Ruhr University Bochum; Ivana Trummová, Czech Technical University in Prague; Phillip Gajland, Ruhr University Bochum and Max Planck Institute for Security and Privacy; Yasemin Acar, Paderborn University and The George Washington University; Sascha Fahl, CISPA - Helmholtz-Center for Information Security; Angela Sasse, Ruhr University Bochum

Available Media

Cryptography serves as the cornerstone of information security and privacy in modern society. While notable progress has been made in the implementation of cryptographic techniques, a substantial portion of research outputs in cryptography, which strive to offer robust security solutions, are either implemented inadequately or not at all. Our study aims to investigate the challenges involved in bringing cryptography innovations from papers to products.

To address this open question, we conducted 21 semistructured interviews with cryptography experts who possess extensive experience (10+ years) in academia, industry, and nonprofit and governmental organizations. We aimed to gain insights into their experiences with deploying cryptographic research outputs, their perspectives on the process of bringing cryptography to products, and the necessary changes within the cryptography ecosystem to facilitate faster, wider, and more secure adoption.

We identified several challenges including misunderstandings and miscommunication among stakeholders, unclear delineation of responsibilities, misaligned or conflicting incentives, and usability challenges when bringing cryptography from theoretical papers to end user products. Drawing upon our findings, we provide a set of recommendations for cryptography researchers and practitioners. We encourage better supporting cross-disciplinary engagement between cryptographers, standardization organizations, and software developers for increased cryptography adoption.

Why Aren't We Using Passkeys? Obstacles Companies Face Deploying FIDO2 Passwordless Authentication

Leona Lassak, Ruhr University Bochum; Elleen Pan and Blase Ur, University of Chicago; Maximilian Golla, CISPA Helmholtz Center for Information Security

Available Media

When adopted by the W3C in 2019, the FIDO2 standard for passwordless authentication was touted as a replacement for passwords on the web. With FIDO2, users leverage passkeys (cryptographic credentials) to authenticate to websites. Even though major operating systems now support passkeys, compatible hardware is now widely available, and some major companies now offer passwordless options, both the deployment and adoption have been slow. As FIDO2 has many security and usability advantages over passwords, we investigate what obstacles hinder companies from large-scale deployment of passwordless authentication. We conducted 28 semi-structured interviews with chief information security officers (CISOs) and authentication managers from both companies that have and have not deployed passwordless authentication, as well as FIDO2 experts. Our results shed light on the current state of deployment and perception. We highlight key barriers to adoption, including account recovery, friction, technical issues, regulatory requirements, and security culture. From the obstacles identified, we make recommendations for increasing the adoption of passwordless authentication.

"You have to read 50 different RFCs that contradict each other": An Interview Study on the Experiences of Implementing Cryptographic Standards

Nicolas Huaman and Jacques Suray, Leibniz University Hannover; Jan H. Klemmer, CISPA Helmholtz Center for Information Security; Marcel Fourné, Paderborn University; Sabrina Klivan, CISPA Helmholtz Center for Information Security; Ivana Trummová, Czech Technical University in Prague; Yasemin Acar, Paderborn University & The George Washington University; Sascha Fahl, CISPA Helmholtz Center for Information Security

Available Media

Implementing cryptographic standards is a critical process for the cryptographic ecosystem. Cryptographic standards aim to support developers and engineers in implementing cryptographic primitives and protocols. However, past security incidents suggest that implementing cryptographic standards can be challenging and might jeopardize software and hardware security. We need to understand and mitigate the pain points of those implementing cryptographic standards to support them better.

To shed light on the challenges and obstacles of implementing cryptographic standards, we conducted 20 semi-structured interviews with experienced cryptographers and cryptographic software engineers. We identify common practices when implementing standards, including the criticality of reference and third-party implementations, test vectors to verify implementations, and the open standard community as central support for questions and reviews of implementations.

Based on our findings, we recommend transparent standardization processes, strong (ideally formal) verification, improved support for comparing implementations, and covering updates and error handling in the standardization process.

A Mixed-Methods Study on User Experiences and Challenges of Recovery Codes for an End-to-End Encrypted Service

Sandra Höltervennhoff, Leibniz University Hannover; Noah Wöhler, CISPA Helmholtz Center for Information Security; Arne Möhle, Tutao GmbH; Marten Oltrogge, CISPA Helmholtz Center for Information Security; Yasemin Acar, Paderborn University and The George Washington University; Oliver Wiese and Sascha Fahl, CISPA Helmholtz Center for Information Security

Available Media

Recovery codes are a popular backup mechanism for online services to aid users who lost their passwords or two-factor authentication tokens in regaining access to their accounts or encrypted data. Especially for end-to-end encrypted services, recovery codes are a critical feature, as the service itself cannot access the encrypted user data and help users regain access. The way end-users manage recovery codes is not well understood. Hence, we investigate end-user perceptions and management strategies of recovery codes. Therefore, we survey users of an end-to-end encrypted email service provider, deploying recovery codes for accounts and encrypted data recovery in case of authentication credential loss. We performed an online survey with 281 users. In a second study, we analyzed 197 support requests on Reddit. Most of our participants stored the service provider's recovery code. We could identify six strategies for saving it, with using a password manager being the most widespread. Participants were generally satisfied with the service provider's recovery code. However, while they appreciated its security, its usability was lacking. We found obstacles, such as losing access to the recovery code or non-functioning recovery codes and security misconceptions. These often resulted from users not understanding the underlying security implications, e.g., that the support cannot access or restore their unencrypted data.

Track 5

ML XI: Physical Adversarial Attacks

Session Chair: Ryan Gerdes

Salon G

Devil in the Room: Triggering Audio Backdoors in the Physical World

Meng Chen, Zhejiang University; Xiangyu Xu, Southeast University; Li Lu, Zhongjie Ba, Feng Lin, and Kui Ren, Zhejiang University

Available Media

Recent years have witnessed deep learning techniques endowing modern audio systems with powerful capabilities. However, the latest studies have revealed its strong reliance on training data, raising serious threats from backdoor attacks. Different from most existing works that study audio backdoors in the digital world, we investigate the mismatch between the trigger and backdoor in the physical space by examining sound channel distortion. Inspired by this observation, this paper proposes TrojanRoom to bridge the gap between digital and physical audio backdoor attacks. TrojanRoom utilizes the room impulse response (RIR) as a physical trigger to enable injection-free backdoor activation. By synthesizing dynamic RIRs and poisoning a source class of samples during data augmentation, TrojanRoom enables any adversary to launch an effective and stealthy attack using the specific impulse response in a room. The evaluation shows over 92% and 97% attack success rates on both state-of-the-art speech command recognition and speaker recognition systems with negligible impact on benign accuracy below 3% at a distance of over 5m. The experiments also demonstrate that TrojanRoom could bypass human inspection and voice liveness detection, as well as resist trigger disruption and backdoor defense.

FraudWhistler: A Resilient, Robust and Plug-and-play Adversarial Example Detection Method for Speaker Recognition

Kun Wang, Zhejiang University; Xiangyu Xu, Southeast University; Li Lu, Zhongjie Ba, Feng Lin, and Kui Ren, Zhejiang University

Available Media

With the in-depth integration of deep learning, state-of-the-art speaker recognition systems have achieved breakthrough progress. However, the intrinsic vulnerability of deep learning to Adversarial Example (AE) attacks has brought new severe threats to real-world speaker recognition systems. In this paper, we propose FraudWhistler, a practical AE detection system, which is resilient to various AE attacks, robust in complex physical environments, and plug-and-play for deployed systems. Its basic idea is to make use of an intrinsic characteristic of AE, i.e., the instability of model prediction for AE, which is totally different from benign samples. FraudWhistler generates several audio variants for the original audio sample with some distortion techniques, obtains multiple outputs of the speaker recognition system for these audio variants, and based on that FraudWhistler extracts some statistics representing the instability of the original audio sample and further trains a one-class SVM classifier to detect adversarial example. Extensive experimental results show that FraudWhistler achieves 98.7% accuracy on AE detection outperforming SOTA works by 13%, and 84% accuracy in the worst case against an adaptive adversary.

π-Jack: Physical-World Adversarial Attack on Monocular Depth Estimation with Perspective Hijacking

Tianyue Zheng, Southern University of Science and Technology; Jingzhi Hu and Rui Tan, Nanyang Technological University; Yinqian Zhang, Southern University of Science and Technology; Ying He and Jun Luo, Nanyang Technological University

Available Media

Monocular depth estimation (MDE) plays a crucial role in modern autonomous driving (AD) by facilitating 3-D scene understanding and interaction. While vulnerabilities in deep neural networks (e.g., adversarial perturbations) have been exploited to compromise MDE, existing attacks face challenges in target accessibility and stealthiness. To address these limitations, we introduce pi-Jack, a novel physical-world attack on MDE via perspective hijacking. It is based on an observation that MDE relies heavily on perspective cues to infer depth, yet these cues can be manipulated by strategically placing common 3-D objects in AD scenes. With an optimization-based approach, pi-Jack "hijacks" the perspective information and alters the target pixels' depths perceived by the MDE model in a black-box manner. We also show via experiments that pi-Jack is effective across various MDE models and scenarios, confirming generalizability of perspective hijacking. Our extensive evaluations demonstrate that pi-Jack is effective across different target and attack vectors, and increases the mean depth error by over 14 meters. Moreover, in our end-to-end AD simulation, pi-Jack results in compromised lane change, sudden braking, and life-threatening collisions.

AE-Morpher: Improve Physical Robustness of Adversarial Objects against LiDAR-based Detectors via Object Reconstruction

Shenchen Zhu, Institute of Information Engineering, Chinese Academy of Sciences, China; School of Cyber Security, University of Chinese Academy of Sciences, China; Yue Zhao, Institute of Information Engineering, Chinese Academy of Sciences, China; Kai Chen, Institute of Information Engineering, Chinese Academy of Sciences, China; School of Cyber Security, University of Chinese Academy of Sciences, China; Bo Wang, Huawei Technologies Co., Ltd.; Hualong Ma and Cheng'an Wei, Institute of Information Engineering, Chinese Academy of Sciences, China; School of Cyber Security, University of Chinese Academy of Sciences, China

Available Media

LiDAR-based perception is crucial to ensure the safety and reliability of autonomous driving (AD) systems. Though some adversarial attack methods against LiDAR-based detectors perception models have been proposed, deceiving such models in the physical world is still challenging. While existing robustness methods focus on transforming point clouds to embed more robust adversarial information, our research reveals how to reduce the errors during the LiDAR capturing process to improve the robustness of adversarial attacks. In this paper, we present AE-Morpher, a novel approach that minimizes differences between the LiDAR-captured and original adversarial point clouds to improve the robustness of adversarial objects. It reconstructs the adversarial object using surfaces with regular shapes to fit the discrete laser beams. We evaluate AE-Morpher by conducting physical disappearance attacks that use a mounted adversarial ornament to conceal a car from models' detection results in both SVL Simulator environments and real-world LiDAR setups. In the simulated world, we successfully deceive the model up to 91.1% of the time when LiDAR moves towards the target vehicle from 20m away. On average, our method increases the ASR by 38.64% and reduces the adversarial ornament's projection area by 67.59%. For the real world, we achieve an average attack success rate of 71.4% over a 12m motion scenario. Moreover, adversarial objects reconstructed by our method can be easily physically constructed by human hands without the requirement of a 3D printer.

Track 6

Software Security + ML 2

Session Chair: Matthew Wright

Salon H

EaTVul: ChatGPT-based Evasion Attack Against Software Vulnerability Detection

Shigang Liu, CSIRO's Data61 and Swinburne University of Technology; Di Cao, Swinburne University of Technology; Junae Kim, Tamas Abraham, and Paul Montague, DST Group, Australia; Seyit Camtepe, CSIRO's Data61; Jun Zhang and Yang Xiang, Swinburne University of Technology

Available Media

Recently, deep learning has demonstrated promising results in enhancing the accuracy of vulnerability detection and identifying vulnerabilities in software. However, these techniques are still vulnerable to attacks. Adversarial examples can exploit vulnerabilities within deep neural networks, posing a significant threat to system security. This study showcases the susceptibility of deep learning models to adversarial attacks, which can achieve 100% attack success rate. The proposed method, EaTVul, encompasses six stages: identification of important adversarial samples using support vector machines, identification of important features using the attention mechanism, generation of adversarial data based on these features, preparation of an adversarial attack pool, selection of seed data using a fuzzy genetic algorithm, and the execution of an evasion attack. Extensive experiments demonstrate the effectiveness of EaTVul, achieving an attack success rate of more than 83% when the snippet size is greater than 2. Furthermore, in most cases with a snippet size of 4, EaTVul achieves a 100% attack success rate. The findings of this research emphasize the necessity of robust defenses against adversarial attacks in software vulnerability detection.

FVD-DPM: Fine-grained Vulnerability Detection via Conditional Diffusion Probabilistic Models

Miaomiao Shao and Yuxin Ding, Harbin Institute of Technology, Shenzhen

Available Media

Software vulnerabilities pose a significant threat to software security. Nevertheless, existing vulnerability detection methods still struggle to effectively identify vulnerabilities and pinpoint vulnerable statements. In this paper, we introduce FVD-DPM: a novel Fine-grained Vulnerability Detection approach via a conditional Diffusion Probabilistic Model. FVD-DPM formalizes vulnerability detection as a diffusion-based graph-structured prediction problem. Firstly, it generates a new fine-grained code representation by extracting graph-level program slices from the Code Joint Graph. Then, a conditional diffusion probabilistic model is employed to model the node label distribution in the program slices, predicting which nodes are vulnerable. FVD-DPM achieves both precise vulnerability identification (slice-level detection) and vulnerability localization (statement-level detection). We evaluate FVD-DPM on five collected datasets and compare it against nine state-of-the-art vulnerability detection approaches. Experimental results demonstrate that FVD-DPM significantly outperforms the baseline approaches across various evaluation settings.

A Wolf in Sheep's Clothing: Practical Black-box Adversarial Attacks for Evading Learning-based Windows Malware Detection in the Wild

Xiang Ling, Intelligent Software Research Center, Institute of Software, Chinese Academy of Sciences; Key Laboratory of System Software (Chinese Academy of Sciences); State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences; Zhiyu Wu, Zhejiang University; Bin Wang, Zhejiang Key Laboratory of Artificial Intelligence of Things (AIoT) Network and Data Security; Hangzhou Research Institute, Xidian University; Wei Deng, Zhejiang University; Jingzheng Wu, Intelligent Software Research Center, Institute of Software, Chinese Academy of Sciences; Key Laboratory of System Software (Chinese Academy of Sciences); State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences; Shouling Ji, Zhejiang University; Tianyue Luo, Intelligent Software Research Center, Institute of Software, Chinese Academy of Sciences; Yanjun Wu, Intelligent Software Research Center, Institute of Software, Chinese Academy of Sciences; Key Laboratory of System Software (Chinese Academy of Sciences); State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences

Available Media

Given the remarkable achievements of existing learning-based malware detection in both academia and industry, this paper presents MalGuise, a practical black-box adversarial attack framework that evaluates the security risks of existing learning-based Windows malware detection systems under the black-box setting. MalGuise first employs a novel semantics-preserving transformation of call-based redividing to concurrently manipulate both nodes and edges of malware's control-flow graph, making it less noticeable. By employing a Monte-Carlo-tree-search-based optimization, MalGuise then searches for an optimized sequence of call-based redividing transformations to apply to the input Windows malware for evasions. Finally, it reconstructs the adversarial malware file based on the optimized transformation sequence while adhering to Windows executable format constraints, thereby maintaining the same semantics as the original. MalGuise is systematically evaluated against three state-of-the-art learning-based Windows malware detection systems under the black-box setting. Evaluation results demonstrate that MalGuise achieves a remarkably high attack success rate, mostly exceeding 95%, with over 91% of the generated adversarial malware files maintaining the same semantics. Furthermore, MalGuise achieves up to a 74.97% attack success rate against five anti-virus products, highlighting potential tangible security concerns to real-world users.

Track 7

Crypto IX: Attacks

Session Chair: Sebastian Schinzel

Salon IJK

Leakage-Abuse Attacks Against Structured Encryption for SQL

Alexander Hoover, University of Chicago; Ruth Ng, Daren Khu, Yao'An Li, Joelle Lim, and Derrick Ng, DSO National Laboratories; Jed Lim, NUS High School of Mathematics and Science; Yiyang Song, Raffles Institution

Available Media

Structured Encryption (StE) enables a client to securely store and query data stored on an untrusted server. Recent constructions of StE have moved beyond basic queries, and now support large subsets of SQL. However, the security of these constructions is poorly understood, and no systematic analysis has been performed.

We address this by providing the first leakage-abuse attacks against StE for SQL schemes. Our attacks can be run by a passive adversary on a server with access to some information about the distribution of underlying data, a common model in prior work. They achieve partial query recovery against select operations and partial plaintext recovery against join operations. We prove the optimality and near-optimality of two new attacks, in a Bayesian inference framework. We complement our theoretical results with an empirical investigation testing the performance of our attacks against real-world data and show they can successfully recover a substantial proportion of queries and plaintexts.

In addition to our new attacks, we provide proofs showing that the conditional optimality of a previously proposed leakage-abuse attack and that inference against join operations is NP-hard in general.

RADIUS/UDP Considered Harmful

Sharon Goldberg, Cloudflare; Miro Haller and Nadia Heninger, UC San Diego; Mike Milano, BastionZero; Dan Shumow, Microsoft Research; Marc Stevens, Centrum Wiskunde & Informatica; Adam Suhl, UC San Diego

Available Media

The RADIUS protocol is the de facto standard lightweight protocol for authentication, authorization, and accounting (AAA) for networked devices. It is used to support remote access for diverse use cases including network routers, industrial control systems, VPNs, enterprise Wi-Fi including the Eduroam network, Linux Pluggable Authentication Modules, and mobile roaming and Wi-Fi offload.

We have discovered a protocol vulnerability in RADIUS that has been present for decades. Our attack allows a man-in-the-middle attacker to authenticate itself to a device using RADIUS for user authentication, or to assign itself arbitrary network privileges. Our attack exploits an MD5 chosen-prefix collision on the ad hoc RADIUS packet authentication construction to produce Access-Accept and Access-Reject packets with identical Response Authenticators, allowing our attacker to transform a reject into an accept without knowledge of the shared secret between RADIUS client and server.

We optimize the MD5 chosen-prefix attack to produce collisions online in less than five minutes, and show how to fit the collision blocks within RADIUS attributes that will be echoed back from the server. We demonstrate our attack in a variety of settings against popular RADIUS implementations. It is our hope that this attack will provide the impetus for vendors and the IETF to deprecate RADIUS over UDP, and to require RADIUS to run over secure channels with modern cryptographic privacy and integrity guarantees.

Key Recovery Attacks on Approximate Homomorphic Encryption with Non-Worst-Case Noise Flooding Countermeasures

Qian Guo and Denis Nabokov, Lund University; Elias Suvanto, ENS Lyon; Thomas Johansson, Lund University

Available Media

In this paper, we present novel key-recovery attacks on Approximate Homomorphic Encryption schemes, such as CKKS, when employing noise-flooding countermeasures based on non-worst-case noise estimation. Our attacks build upon and enhance the seminal work by Li and Micciancio at EUROCRYPT 2021. We demonstrate that relying on average-case noise estimation undermines noise-flooding countermeasures, even if the secure noise bounds derived from differential privacy as published by Li et al. at CRYPTO 2022 are implemented. This study emphasizes the necessity of adopting worst-case noise estimation in Approximate Homomorphic Encryption when sharing decryption results.

We perform the proposed attacks on OpenFHE, an emerging open-source FHE library garnering increased attention. We experimentally demonstrate the ability to recover the secret key using just one shared decryption output. Furthermore, we investigate the implications of our findings for other libraries, such as IBM's HElib library, which allows experimental estimation of the noise bounds. Finally, we reveal that deterministic noise generation utilizing a pseudorandom generator fails to provide supplementary protection.

Terrapin Attack: Breaking SSH Channel Integrity By Sequence Number Manipulation

Fabian Bäumer, Marcus Brinkmann, and Jörg Schwenk, Ruhr University Bochum

Distinguished Paper Award Winner

Distinguished Artifact Award Winner

Available Media

The SSH protocol provides secure access to network services, particularly remote terminal login and file transfer within organizational networks and to over 15 million servers on the open internet. SSH uses an authenticated key exchange to establish a secure channel between a client and a server, which protects the confidentiality and integrity of messages sent in either direction. The secure channel prevents message manipulation, replay, insertion, deletion, and reordering. At the network level, SSH uses the Binary Packet Protocol over TCP.

In this paper, we show that as new encryption algorithms and mitigations were added to SSH, the SSH Binary Packet Protocol is no longer a secure channel: SSH channel integrity (INT-PST, aINT-PTXT, and INT-sfCTF) is broken for three widely used encryption modes. This allows prefix truncation attacks where encrypted packets at the beginning of the SSH channel can be deleted without the client or server noticing it. We demonstrate several real-world applications of this attack. We show that we can fully break SSH extension negotiation (RFC 8308), such that an attacker can downgrade the public key algorithms for user authentication or turn off a new countermeasure against keystroke timing attacks introduced in OpenSSH 9.5. Further, we identify an implementation flaw in AsyncSSH that, together with prefix truncation, allows an attacker to redirect the victim's login into a shell controlled by the attacker.

We also performed an internet-wide scan for affected encryption modes and support for extension negotiation. We find that 71.6% of SSH servers support a vulnerable encryption mode, while 63.2% even list it as their preferred choice.

We identify two root causes that enable these attacks: First, the SSH handshake supports optional messages that are not authenticated. Second, SSH does not reset message sequence numbers when activating encryption keys. Based on this analysis, we propose effective and backward-compatible changes to SSH that mitigate our attacks.

USENIX Security '24 Technical Sessions

Papers and Proceedings

Wednesday, August 14

7:00 am–8:45 am

Continental Breakfast

8:45 am–9:15 am

Opening Remarks and Awards

9:15 am–10:15 am

Keynote Address

10:15 am–11:15 am

Coffee and Tea Break

11:15 am–12:15 pm

User Studies I: Social Media Platforms

Hardware Security I: Attacks and Defense

System Security I: OS

Network Security I: DDoS

ML I: Federated Learning

Security Analysis I: Source Code and Binary

Crypto I: Secret Key Exchange

12:15 pm–1:45 pm

1:45 pm–2:45 pm

Social Issues I: Phishing and Password

Side Channel I: Transient Execution

Mobile Security I

Web Security I

LLM for Security

Fuzzing I: Software

Differential Privacy I

2:45 pm–3:15 pm

Coffee and Tea Break

3:15 pm–4:15 pm

Deepfake and Synthesis

Hardware Security II: Architecture and Microarchitecture

System Security II: OS Kernel

Network Security II: Attacks

ML II: Fault Injection and Robustness

Security Analysis II: Program Analysis

Zero-Knowledge Proof I

4:15 pm–4:30 pm

Short Break

4:30 pm–5:30 pm

Measurement I: Fraud and Malware and Spam

Side Channel II: RowHammer

Forensics

ML for Security

LLM I: Attack and Defense