The full Proceedings published by USENIX for the symposium are available for download below. Individual papers can also be downloaded from the presentation pages. Copyright to the individual works is retained by the author(s).
RAID 2020 Full Proceedings (PDF)
View the full program on the RAID 2020 website.
Wednesday, October 14
SpecROP: Speculative Exploitation of ROP Chains
Atri Bhattacharyya and Andrés Sánchez, EPFL; Esmaeil M. Koruyeh, Nael Abu-Ghazaleh, and Chengyu Song, UC Riverside; Mathias Payer, EPFL
Speculative execution attacks, such as Spectre, reuse code from the victim’s binary to access and leak secret information during speculative execution. Every variant of the attack requires very particular code sequences, necessitating elaborate gadget-search campaigns. Often, victim programs contain few, or even zero, usable gadgets. Consequently, speculative attacks are sometimes demonstrated by injecting usable code sequences into the victim. So far, attacks search for monolithic gadgets, a single sequence of code which performs all the attack steps.
We introduce SpecROP, a novel speculative execution attack technique, inspired by classic code reuse attacks like Return-Oriented Programming, to tackle the rarity of code gadgets. The SpecROP attacker chains multiple, small gadgets by poisoning multiple control-flow instructions to perform the same computation as a monolithic gadget. A key difference from classic code reuse attacks is that control-flow transfers between gadgets use speculative targets rather than targets stored in memory or registers.
We categorize SpecROP gadgets into generic classes and demonstrate the abundance of such gadgets in victim libraries. Further, we explore the practicality of influencing multiple control-flow instructions on modern processors, and demonstrate an attack which uses gadget chaining to increase the leakage potential of a Spectre variant, SMoTherSpectre.
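As a toy illustration of why short chained gadgets are far more abundant than monolithic ones, the sketch below scans a byte string for sequences ending in an x86 `ret` byte (0xC3). This is a simplification only; real gadget finders disassemble instructions properly, and SpecROP's gadget classes are richer than "ends in ret".

```python
# Toy gadget scan: every byte run of length <= max_len that ends in 0xC3
# counts as a candidate gadget. Illustrative only; a real tool disassembles.

def find_gadgets(code: bytes, max_len: int) -> list:
    gadgets = []
    for i, b in enumerate(code):
        if b != 0xC3:                      # not a `ret` byte
            continue
        for length in range(1, max_len + 1):
            start = i - length + 1
            if start >= 0:                 # stay inside the code buffer
                gadgets.append(code[start:i + 1])
    return gadgets

code = bytes([0x89, 0xC8, 0xC3, 0x31, 0xC0, 0x5D, 0xC3])  # two rets
short = find_gadgets(code, max_len=2)                      # 4 short candidates
long_ = [g for g in find_gadgets(code, max_len=4) if len(g) == 4]  # only 1
```

Even in this seven-byte snippet, short candidates outnumber the single four-byte one, mirroring the paper's observation that chaining small gadgets sidesteps the scarcity of long monolithic ones.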
Never Trust Your Victim: Weaponizing Vulnerabilities in Security Scanners
Andrea Valenza, University of Genova; Gabriele Costa, IMT School for Advanced Studies Lucca; Alessandro Armando, University of Genova
The first step of every attack is reconnaissance, i.e., to acquire information about the target. A common belief is that there is almost no risk in scanning a target from a remote location. In this paper we falsify this belief by showing that scanners are exposed to the same risks as their targets. Our methodology is based on a novel attacker model where the scan author becomes the victim of a counter-strike. We developed a working prototype, called RevOK, and we applied it to 78 scanning systems. Out of them, 36 were found vulnerable to XSS. Remarkably, RevOK also found a severe vulnerability in Metasploit Pro, a mainstream penetration testing tool.
Camera Fingerprinting Authentication Revisited
Dominik Maier, Technische Universität Berlin; Henrik Erb, Patrick Mullan, and Vincent Haupert, Friedrich-Alexander-Universität Erlangen-Nürnberg
Authentication schemes that include smartphones are gaining popularity. Instead of storing keys in app-private storage, which privileged malware can clone, recent research proposes authentication with hardware fingerprints, arguing that they are harder for attackers to fake. Notably, the use of camera sensor fingerprints has recently been discussed. This paper revisits the eligibility of this camera sensor noise for authentication. The so-called Photo Response Non-Uniformity (PRNU) exploits production tolerances in the CMOS sensors commonly used in smartphone cameras to trace a photo to a specific phone and authenticate its user. We conducted the first large-scale study of PRNU on smartphones, with 56,630 images stemming from 3,809 individual devices across 1,036 models. Based on the collected dataset, we reproduce proposed authentication schemes and uncover caveats not discussed in prior work on authentication. In addition, we give constraints that an image used for authentication needs to satisfy to increase the reliability of the results. We provide novel insights, implement attacks against the proposed schemes, and discuss future improvements.
Dynamic Program Analysis
Binary-level Directed Fuzzing for Use-After-Free Vulnerabilities
Manh-Dung Nguyen and Sébastien Bardin, Univ. Paris-Saclay, CEA LIST, France; Richard Bonichon, Tweag I/O, France; Roland Groz, Univ. Grenoble Alpes, France; Matthieu Lemerre, Univ. Paris-Saclay, CEA LIST, France
Directed fuzzing focuses on automatically testing specific parts of the code by taking advantage of additional information such as (partial) bug stack traces, patches or risky operations. Key applications include bug reproduction, patch testing and static analysis report verification. Although directed fuzzing has received a lot of attention recently, hard-to-detect vulnerabilities such as Use-After-Free (UAF) are still not well addressed, especially at the binary level. We propose UAFuzz, the first (binary-level) directed greybox fuzzer dedicated to UAF bugs. The technique features a fuzzing engine tailored to UAF specifics, lightweight code instrumentation and an efficient bug triage step. Experimental evaluation for bug reproduction on real cases demonstrates that UAFuzz significantly outperforms state-of-the-art directed fuzzers in terms of fault detection rate, time to exposure and bug triaging. UAFuzz has also been proven effective in patch testing, leading to the discovery of 30 new bugs (7 CVEs) in programs such as Perl, GPAC and GNU Patch. Finally, we provide to the community a large fuzzing benchmark dedicated to UAF, built on both real codes and real bugs.
WearFlow: Expanding Information Flow Analysis To Companion Apps in Wear OS
Marcos Tileria and Jorge Blasco, Royal Holloway, University of London; Guillermo Suarez-Tangil, King's College London, IMDEA Networks
Smartwatches and wearable technology have proliferated in recent years, featuring seamless integration with a paired smartphone. Many mobile applications now come with a companion app that the mobile OS deploys on the wearable. These execution environments expand the context of mobile applications across more than one device, introducing new security and privacy issues. One such issue is that current information flow analysis techniques cannot capture communication between devices. This can lead to undetected privacy leaks when developers use these channels. In this paper, we present WearFlow, a framework that uses static analysis to detect sensitive data flows across mobile and wearable companion apps in Android. WearFlow augments taint analysis capabilities to enable inter-device analysis of apps. WearFlow models proprietary libraries embedded in Google Play Services and instruments the mobile and wearable app to allow for a precise information flow analysis between them. We evaluate WearFlow on a test suite purposely designed to cover different scenarios of Mobile-Wear communication, which we release as Wear-Bench. We also run WearFlow on 3K+ real-world apps and discover privacy violations in popular apps (10M+ downloads).
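The inter-device idea can be illustrated with a tiny reachability check: each app contributes an intra-app flow graph, and modeling the proprietary message channel as one extra edge (send to receive) is what lets a source on the phone reach a sink on the watch. All node names below are illustrative, not WearFlow's actual API model.

```python
from collections import deque

def reaches(edges, source, sink):
    """BFS over a flow graph: does tainted data flow from source to sink?"""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            return True
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

mobile = [("getLocation", "msg.send")]          # phone-side flow
wearable = [("msg.receive", "network.upload")]  # watch-side flow
channel = [("msg.send", "msg.receive")]         # modeled Wear OS channel

# without the channel model the leak is invisible; with it, it is found
leak_without = reaches(mobile + wearable, "getLocation", "network.upload")
leak_with = reaches(mobile + wearable + channel, "getLocation", "network.upload")
```

The single `channel` edge is the crux: a per-device analysis reports no flow, while the joined graph exposes the cross-device leak.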
MEUZZ: Smart Seed Scheduling for Hybrid Fuzzing
Yaohui Chen, Mansour Ahmadi, and Reza Mirzazade farkhani, Northeastern University; Boyu Wang, Stony Brook University; Long Lu, Northeastern University
Seed scheduling highly impacts the yields of hybrid fuzzing. Existing hybrid fuzzers schedule seeds based on fixed heuristics that aim to predict input utilities. However, such heuristics are not generalizable as there exists no one-size-fits-all rule applicable to different programs. They may work well on the programs from which they were derived, but not others.
To overcome this problem, we design a Machine learning-Enhanced hybrid fUZZing system (MEUZZ), which employs supervised machine learning for adaptive and generalizable seed scheduling. MEUZZ determines which new seeds are expected to produce better fuzzing yields based on the knowledge learned from past seed scheduling decisions made on the same or similar programs. MEUZZ extracts a series of features for learning via code reachability and dynamic analysis, which incurs negligible runtime overhead (in microseconds). MEUZZ automatically infers the data labels by evaluating the fuzzing performance of each selected seed. As a result, MEUZZ is generally applicable to, and performs well on, various kinds of programs.
Our evaluation shows MEUZZ significantly outperforms the state-of-the-art grey-box and hybrid fuzzers, achieving 27.1% more code coverage than QSYM. The learned models are reusable and transferable, which boosts fuzzing performance by 7.1% on average and improves 68% of the 56 cross-program fuzzing campaigns. When fuzzing 8 well-tested programs under the same configurations as used in previous work, MEUZZ discovered 47 deeply hidden and previously unknown bugs, among which 21 were confirmed and fixed by the developers.
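A minimal sketch of learned seed scheduling, under invented features and data: fit a linear utility model on (features, observed yield) pairs from past campaigns, then rank new seeds by predicted yield. MEUZZ's real feature set, labeling, and model are considerably richer.

```python
# Toy learned seed scheduler: online least-squares on two illustrative
# features (reachable-branch ratio, normalized input size).

def train(samples, lr=0.01, epochs=2000):
    w = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for feats, label in samples:
            pred = sum(wi * xi for wi, xi in zip(w, feats))
            err = pred - label
            w = [wi - lr * err * xi for wi, xi in zip(w, feats)]
    return w

def rank(w, seeds):
    return sorted(seeds, key=lambda s: -sum(wi * xi for wi, xi in zip(w, s[1])))

# past observations: (features, observed coverage gain) -- invented numbers
history = [([0.9, 0.1], 9.0), ([0.2, 0.8], 2.5), ([0.5, 0.5], 5.2)]
w = train(history)
seeds = [("seed_a", [0.1, 0.9]), ("seed_b", [0.8, 0.2])]
best = rank(w, seeds)[0][0]   # seed_b: more reachable branches
```

The point of the paper is that these weights transfer across programs, unlike hand-fixed heuristics baked into existing hybrid fuzzers.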
Tracing and Analyzing Web Access Paths Based on User-Side Data Collection: How Do Users Reach Malicious URLs?
Takeshi Takahashi, National Institute of Information and Communications Technology; Christopher Kruegel and Giovanni Vigna, University of California, Santa Barbara; Katsunari Yoshioka, Yokohama National University; Daisuke Inoue, National Institute of Information and Communications Technology
Web access exposes users to various attacks, such as malware infections and social engineering attacks. Despite ongoing efforts by security and browser vendors to protect users, some users continue to access malicious URLs. To provide better protection, we need to know how users reach such URLs. In this work, we collect web access records from users' devices using our browser extension. Differing from data collection on the network, user-side data collection enables us to discern users and web browser tabs, facilitating efficient data analysis. Then, we propose a scheme to extract an entire web access path to a malicious URL, called a hazardous path, from the access records. With all the hazardous paths extracted from the access records, we analyze the web access activities of users, considering initial accesses on the hazardous paths, risk levels of bookmarked URLs, time required to reach malicious URLs, and the number of concurrently active browser tabs when reaching such URLs. In addition, we propose a preemptive domain filtering scheme, which identifies domains leading to malicious URLs, called hazardous domains. We demonstrate the effectiveness of the scheme by identifying hazardous domains that are not included in blacklists.
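Hazardous-path extraction can be sketched as a backward walk over referrers within one browser tab. The record fields below (tab id, URL, referrer) are an illustrative simplification of the extension's actual logs.

```python
# Toy hazardous-path extraction: walk referrers backwards from the malicious
# URL, staying within the tab that reached it.

def hazardous_path(records, malicious_url):
    """records: list of (tab_id, url, referrer); referrer None = path start."""
    by_url = {(tab, url): ref for tab, url, ref in records}
    for tab, url, _ in records:
        if url == malicious_url:
            path = [url]
            ref = by_url[(tab, url)]
            while ref is not None:
                path.append(ref)
                ref = by_url.get((tab, ref))
            return list(reversed(path))
    return []

records = [
    (1, "search.example/results", None),
    (1, "blog.example/post", "search.example/results"),
    (1, "evil.example/payload", "blog.example/post"),
    (2, "news.example", None),               # unrelated tab, ignored
]
path = hazardous_path(records, "evil.example/payload")
# path: search.example/results -> blog.example/post -> evil.example/payload
```

Being able to key on the tab id is exactly the advantage of user-side collection that the abstract highlights over network-side capture.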
What's in an Exploit? An Empirical Analysis of Reflected Server XSS Exploitation Techniques
Ahmet Salih Buyukkayhan, Microsoft; Can Gemicioglu, Northeastern University; Tobias Lauinger, New York University; Alina Oprea, William Robertson, and Engin Kirda, Northeastern University
Cross-Site Scripting (XSS) is one of the most prevalent vulnerabilities on the Web. While exploitation techniques are publicly documented, to date there is no study of how frequently each technique is used in the wild. In this paper, we conduct a longitudinal study of 134k reflected server XSS exploits submitted to XSSED and OPENBUGBOUNTY, two vulnerability databases collectively spanning a time period of nearly ten years. We use a combination of static and dynamic analysis techniques to identify the portion of each archived server response that contains the exploit, execute it in a sandboxed analysis environment, and detect the exploitation techniques used. We categorize the exploits based on the exploitation techniques used and generate common exploit patterns. We find that most exploits are relatively simple, but there is a moderate trend of increased sophistication over time. For example, as automated XSS defenses evolve, direct code execution with <script> is declining in favour of indirect execution triggered by event handlers in conjunction with other tags, such as <svg onload. We release our annotated data, enabling researchers to create diverse exploit samples for model training or system evaluation.
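In the spirit of the paper's categorization, a toy tagger can separate direct execution via `<script>` from indirect execution via an event handler on another tag. The patterns below are deliberately simplified; the paper uses static and dynamic analysis in a sandbox, not regexes.

```python
import re

# Toy exploitation-technique tagging for reflected XSS payloads.

def classify(payload: str) -> str:
    p = payload.lower()
    if re.search(r"<script\b", p):
        return "direct:script"
    if re.search(r"<\w+[^>]*\bon\w+\s*=", p):
        return "indirect:event-handler"
    return "other"

labels = [classify('<script>alert(1)</script>'),
          classify('<svg onload=alert(1)>'),
          classify('plain text')]
```

The trend the paper reports, direct `<script>` execution declining in favour of event handlers, would show up in such labels aggregated over time.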
Mininode: Reducing the Attack Surface of Node.js Applications
Igibek Koishybayev and Alexandros Kapravelos, North Carolina State University
Evaluating Changes to Fake Account Verification Systems
Fedor Kozlov, Isabella Yuen, Jakub Kowalczyk, Daniel Bernhardt, and David Freeman, Facebook, Inc; Paul Pearce, Facebook, Inc and Georgia Institute of Technology; Ivan Ivanov, Facebook, Inc
Online social networks (OSNs) such as Facebook, Twitter, and LinkedIn give hundreds of millions of individuals around the world the ability to communicate and build communities. However, the extensive user base of OSNs provides considerable opportunity for malicious actors to abuse the system, with fake accounts generating the vast majority of harmful actions and content. Social networks employ sophisticated detection mechanisms based on machine-learning classifiers and graph analysis to identify and remediate the actions of fake accounts. Disabling or deleting these detected accounts is not tractable when the number of false positives (i.e., real users disabled) is significant in absolute terms. Using challenge-based verification systems, such as CAPTCHAs or phone confirmation, as a response to detected fake accounts can enable erroneously detected real users to recover their access, while also making it difficult for attackers to abuse the platform.
In order to maintain a verification system's effectiveness over time, it is important to iterate on the system to improve the real user experience and adapt the platform's response to adversarial actions. However, at present there is no established method to evaluate how effective each iteration is at stopping fake accounts and letting real users through. This paper proposes a method of assessing the effectiveness of experimental iterations for OSN verification systems, and presents an evaluation of this method against human-labelled ground truth data using production Facebook data. Our method reduces the volume of necessary human labelled data by 70%, decreases the time necessary for classification by 81%, has suitable precision/recall for making decisions in response to experiments, and enables continuous monitoring of the effectiveness of the applied experimental changes.
Thursday, October 15
SourceFinder: Finding Malware Source-Code from Publicly Available Repositories in GitHub
Md Omar Faruk Rokon, Risul Islam, Ahmad Darki, Evangelos E. Papalexakis, and Michalis Faloutsos, UC Riverside
Where can we find malware source code? This question is motivated by a real need: there is a dearth of malware source code, which impedes various types of security research. Our work is driven by the following insight: public archives, like GitHub, have a surprising number of malware repositories. Capitalizing on this opportunity, we propose SourceFinder, a supervised-learning approach to identify repositories of malware source code efficiently. We evaluate and apply our approach using 97K repositories from GitHub. First, we show that our approach identifies malware repositories with 89% precision and 86% recall using a labeled dataset. Second, we use SourceFinder to identify 7504 malware source code repositories, which arguably constitutes the largest malware source code database. Finally, we study the fundamental properties and trends of the malware repositories and their authors. The number of such repositories appears to be growing by an order of magnitude every 4 years, and 18 malware authors seem to be "professionals" with a well-established online reputation. We argue that our approach and our large repository of malware source code can be a catalyst for research studies, which are currently not possible.
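A much-simplified stand-in for this kind of supervised repository classification is a naive Bayes model over title words. The training examples below are invented; SourceFinder learns over richer repository metadata.

```python
import math
from collections import Counter

# Toy naive Bayes over repo-title words, with add-one smoothing.

def train(labeled):
    counts = {"mal": Counter(), "ben": Counter()}
    totals = Counter()
    for words, label in labeled:
        counts[label].update(words)
        totals[label] += 1
    return counts, totals

def classify(model, words):
    counts, totals = model
    vocab = set(counts["mal"]) | set(counts["ben"])
    def logp(label):
        denom = sum(counts[label].values()) + len(vocab)
        score = math.log(totals[label] / sum(totals.values()))
        for w in words:
            score += math.log((counts[label][w] + 1) / denom)
        return score
    return "mal" if logp("mal") > logp("ben") else "ben"

data = [(["mirai", "botnet", "source"], "mal"),
        (["keylogger", "windows"], "mal"),
        (["todo", "list", "app"], "ben"),
        (["weather", "app"], "ben")]
model = train(data)
pred_mal = classify(model, ["mirai", "variant"])
pred_ben = classify(model, ["todo", "app"])
```

The paper's precision/recall figures come from exactly this kind of labeled evaluation, just at 97K-repository scale with a stronger learner.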
HyperLeech: Stealthy System Virtualization with Minimal Target Impact through DMA-Based Hypervisor Injection
Ralph Palutke, Simon Ruderich, Matthias Wild, and Felix Freiling, Friedrich-Alexander-Universität Erlangen-Nürnberg
In the recent past, malware began to incorporate anti-forensic techniques in order to hinder analysts from gaining meaningful results. Consequently, methods that allow the stealthy analysis of a system became increasingly important. In this paper, we present HyperLeech, the first approach which uses DMA to stealthily inject a thin hypervisor into the memory of a target host, transparently shifting its operation into a hardware-accelerated virtual machine. For the code injection, we make use of external PCILeech hardware to enable DMA to the target memory. Combining the advantages of hardware-supported virtualization with the benefits provided by DMA-based code injection, our approach can serve analysts as a stealthy and privileged execution layer that enables powerful live forensics and atomic memory snapshots for already running systems. Our experiments revealed that HyperLeech is sufficient to virtualize multi-core Linux hosts without causing significant impact on a target’s processor and memory state during its installation, execution, and removal. Although our approach might be misused for malicious purposes, we conclude that it provides new knowledge to help researchers with the design of stealthy system introspection techniques that focus on preserving a target system’s state.
Effective Detection of Credential Thefts from Windows Memory: Learning Access Behaviours to Local Security Authority Subsystem Service
Patrick Ah-Fat and Michael Huth, Imperial College London; Rob Mead, Tim Burrell, and Joshua Neil, Microsoft
Malicious actors that have already penetrated an enterprise network will exploit access to launch attacks within that network. Credential theft is a common preparatory action for such attacks, as it enables privilege escalation or lateral movement. Elaborate techniques for extracting credentials from Windows memory have been developed by actors with advanced capabilities. The state of the art in identifying the use of such techniques is based on malware detection, which can only alert on the presence of specific executable files that are known to perform such techniques. Therefore, actors can bypass detection of credential theft by evading the static detection of malicious code. In contrast, our work focuses directly on the memory read access behaviour to the process that enforces the system security policy. We use machine learning techniques driven by data from real enterprise networks to classify memory read behaviours as malicious or benign. As we show that Mimikatz is a popular tool seen across Microsoft Defender Advanced Threat Protection (MDATP) to steal credentials, our aim is to develop a generic model that detects the techniques it employs. Our classifier is based on novel features of memory read events and the characterisation of three popular techniques for credential theft. We integrated this classifier in a detector that is now running in production and is protecting customers of MDATP. Our experiments demonstrate that this detector has excellent false negative and false positive rates, and does alert on true positives that previous detectors were unable to identify.
Network & Cloud Security
EnclavePDP: A General Framework to Verify Data Integrity in Cloud Using Intel SGX
Yun He, Institute of Information Engineering, Chinese Academy of Sciences, and School of Cyber Security, University of Chinese Academy of Sciences; Yihua Xu, Metropolitan College, Boston University; Xiaoqi Jia, Institute of Information Engineering, Chinese Academy of Sciences, and School of Cyber Security, University of Chinese Academy of Sciences; Shengzhi Zhang, Metropolitan College, Boston University; Peng Liu, Pennsylvania State University; Shuai Chang, Institute of Information Engineering, Chinese Academy of Sciences, and School of Cyber Security, University of Chinese Academy of Sciences
As cloud storage services become pervasive, remotely verifying the integrity of data outsourced to the cloud becomes challenging for users. Existing Provable Data Possession (PDP) schemes mostly resort to a Third Party Auditor (TPA) to verify the integrity on behalf of users, thus reducing their communication and computation burden. However, such schemes demand a fully trusted TPA, that is, placing the TPA in the Trusted Computing Base (TCB), which is not always a reasonable assumption. In this paper, we propose EnclavePDP, a secure and general data integrity verification framework that relies on Intel SGX to establish the TCB for PDP schemes, thus eliminating the TPA from the TCB. EnclavePDP supports both new and existing PDP schemes by integrating core functionalities of cryptography libraries into Intel SGX. We choose 10 existing representative PDP schemes, and port them into EnclavePDP with reasonable effort. By deploying EnclavePDP in a real-world cloud storage platform and running the 10 PDP schemes respectively, we demonstrate that EnclavePDP can eliminate the dependence on TPA and introduce reasonable performance overhead.
Robust P2P Primitives Using SGX Enclaves
Yaoqi Jia, ACM Member; Shruti Tople, Microsoft Research; Tarik Moataz, Aroki Systems; Deli Gong, ACM Member; Prateek Saxena and Zhenkai Liang, National University of Singapore
Peer-to-peer (P2P) systems such as BitTorrent and Bitcoin are susceptible to serious attacks from byzantine nodes that join as peers. Research has explored many adversarial models with additional assumptions, ranging from mild (such as pre-established PKI) to strong (such as the existence of common random coins). One such widely-studied model is the general-omission model, which yields simple protocols with good efficiency, but has been considered impractical or unrealizable since it artificially limits the adversary only to omitting messages.
In this work, we study the setting of a synchronous network wherein peer nodes have CPUs equipped with a recent trusted computing mechanism called Intel SGX. In this model, we observe that the byzantine adversary reduces to the adversary in the general-omission model. As a first result, we show that by leveraging SGX features, we eliminate any source of advantage for a byzantine adversary beyond that gained by omitting messages, making the general-omission model realizable. Second, we present new protocols that improve the communication complexity of two fundamental primitives — reliable broadcast and common random coins (or beacons) — in the synchronous setting, by utilizing SGX features. Our evaluation of 1000 nodes running on 40 DeterLab machines confirms our theoretical efficiency claims.
aBBRate: Automating BBR Attack Exploration Using a Model-Based Approach
Anthony Peterson, Northeastern University; Samuel Jero, Purdue University; Endadul Hoque, Syracuse University; David Choffnes and Cristina Nita-Rotaru, Northeastern University
BBR is a new congestion control algorithm proposed by Google that builds a model of the network path consisting of its bottleneck bandwidth and RTT to govern its sending rate, rather than packet loss (like CUBIC and many other popular congestion control algorithms). Loss-based congestion control has been shown to be vulnerable to acknowledgment manipulation attacks. However, no prior work has investigated how to design such attacks for BBR, nor how effective they are in practice. In this paper we systematically analyze the vulnerability of BBR to acknowledgment manipulation attacks. We create the first detailed BBR finite state machine and a novel algorithm for inferring its current BBR state at runtime by passively observing network traffic. We then adapt and apply a TCP fuzzer to the Linux TCP BBR v1.0 implementation. Our approach generated 30,297 attack strategies, of which 8,859 misled BBR about actual network conditions. From these, we identify 5 classes of attacks causing BBR to send faster, slower or stall. We also found that BBR is immune to acknowledgment burst, division and duplication attacks that were previously shown to be effective against loss-based congestion control such as TCP New Reno.
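The state-inference idea can be caricatured from rate dynamics alone: STARTUP roughly doubles the sending rate each RTT, DRAIN drops it sharply, and PROBE_BW holds a plateau. The thresholds below are illustrative inventions; the paper builds a full finite state machine covering all BBR states and transitions.

```python
# Toy BBR-state inference from passively observed per-RTT sending rates.

def infer_state(prev_rate, rate):
    ratio = rate / prev_rate
    if ratio > 1.5:
        return "STARTUP"      # exponential growth phase
    if ratio < 0.7:
        return "DRAIN"        # queue-draining rate drop
    return "PROBE_BW"         # steady-state plateau

rates = [10, 20, 40, 80, 50, 52, 51]   # Mbps samples, one per RTT
states = [infer_state(a, b) for a, b in zip(rates, rates[1:])]
```

Knowing the current state is what lets the fuzzer time acknowledgment manipulations to mislead BBR's path model.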
Cyber Threat Intelligence Modeling Based on Heterogeneous Graph Convolutional Network
Jun Zhao, Beihang University; Qiben Yan, Michigan State University; Xudong Liu, Bo Li, and Guangsheng Zuo, Beihang University
Cyber Threat Intelligence (CTI), as a collection of threat information, has been widely used in industry to defend against prevalent cyber attacks. CTI is commonly represented as Indicators of Compromise (IOCs) for formalizing threat actors. However, current CTI studies pose three major limitations: first, the accuracy of IOC extraction is low; second, isolated IOCs hardly depict the comprehensive landscape of threat events; third, the interdependent relationships among heterogeneous IOCs, which can be leveraged to mine deep security insights, are unexplored. In this paper, we propose a novel CTI framework, HINTI, to model the interdependent relationships among heterogeneous IOCs to quantify their relevance. Specifically, we first propose a multi-granular attention-based IOC recognition method to boost the accuracy of IOC extraction. We then model the interdependent relationships among IOCs using a newly constructed heterogeneous information network (HIN). To explore intricate security knowledge, we propose a threat intelligence computing framework based on graph convolutional networks for effective knowledge discovery. Experimental results demonstrate that our proposed IOC extraction approach outperforms existing state-of-the-art methods, and HINTI can model and quantify the underlying relationships among heterogeneous IOCs, shedding new light on the evolving threat landscape.
Detecting Lateral Movement in Enterprise Computer Networks with Unsupervised Graph AI
Benjamin Bowman, Craig Laprade, Yuede Ji, and H. Howie Huang, Graph Computing Lab, George Washington University
In this paper we present a technique for detecting lateral movement of Advanced Persistent Threats inside enterprise-level computer networks using unsupervised graph learning. Our detection technique utilizes information derived from industry standard logging practices, rendering it immediately deployable to real-world enterprise networks. Importantly, this technique is fully unsupervised, not requiring any labeled training data, making it highly generalizable to different environments. The approach consists of two core components: an authentication graph, and an unsupervised graph-based machine learning pipeline which learns latent representations of the authenticating entities, and subsequently performs anomaly detection by identifying low-probability authentication events via a learned logistic regression link predictor. We apply this technique to authentication data derived from two contrasting data sources: a small-scale simulated environment, and a large-scale real-world environment. We are able to detect malicious authentication events associated with lateral movement with a true positive rate of 85% and false positive rate of 0.9%, compared to 72% and 4.4% by traditional rule-based heuristics and non-graph anomaly detection algorithms. In addition, we have designed several filters to further reduce the false positive rate by nearly 40%, while reducing true positives by less than 1%.
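Stripped to its essence, the anomaly side of this pipeline assigns each authentication edge a probability and flags improbable ones. The count-based scoring below is a deliberately crude stand-in for the paper's learned embeddings and logistic-regression link predictor, with an invented log.

```python
from collections import Counter

# Toy authentication-graph anomaly scoring over (user, host) auth events.

def score(history, event):
    """Empirical probability of this authentication edge."""
    edges = Counter(history)
    return edges[event] / len(history)

history = ([("alice", "mail")] * 40 + [("bob", "files")] * 40 +
           [("alice", "files")] * 19 + [("bob", "dc01")] * 1)

routine = score(history, ("alice", "mail"))   # 0.4: routine access
rare = score(history, ("bob", "dc01"))        # 0.01: candidate lateral movement
```

Raw counts cannot generalize to never-seen-but-plausible edges; that gap is precisely what the learned latent representations in the paper address.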
An Object Detection based Solver for Google’s Image reCAPTCHA v2
Md Imran Hossen, Yazhou Tu, Md Fazle Rabby, and Md Nazmul Islam, University of Louisiana at Lafayette; Hui Cao, Xi'an Jiaotong University; Xiali Hei, University of Louisiana at Lafayette
Previous work showed that reCAPTCHA v2's image challenges could be solved by automated programs armed with Deep Neural Network (DNN) image classifiers and vision APIs provided by off-the-shelf image recognition services. In response to emerging threats, Google has made significant updates to its image reCAPTCHA v2 challenges that can render the prior approaches ineffective to a great extent.
In this paper, we investigate the robustness of the latest version of reCAPTCHA v2 against advanced object detection based solvers. We propose a fully automated object detection based system that breaks the most advanced challenges of reCAPTCHA v2 with an online success rate of 83.25\%, the highest success rate to date, and it takes only 19.93 seconds (including network delays) on average to crack a challenge. We also study the updated security features of reCAPTCHA v2, such as anti-recognition mechanisms, improved anti-bot detection techniques, and adjustable security preferences. Our extensive experiments show that while these security features can provide some resistance against automated attacks, adversaries can still bypass most of them. Our experiment findings indicate that the recent advances in object detection technologies pose a severe threat to the security of image captcha designs relying on simple object detection as their underlying AI problem.
Evasion Attacks against Banking Fraud Detection Systems
Michele Carminati, Luca Santini, Mario Polino, and Stefano Zanero, Politecnico di Milano
Machine learning models are vulnerable to adversarial samples: inputs crafted to deceive a classifier. Adversarial samples crafted against one model can be effective also against related models. Therefore, even without a comprehensive knowledge of the target system, a malicious agent can attack it by training a surrogate model and crafting evasive samples. Unlike the image classification context, the banking fraud detection domain is characterized by samples with few aggregated features. This characteristic makes conventional approaches hardly applicable to the banking fraud context.
In this paper, we study the application of AML techniques to the banking fraud detection domain. To this end, we identify the main challenges and design a novel approach to perform evasion attacks. Using two real bank datasets, we evaluate the security of several state-of-the-art fraud detection systems by deploying evasion attacks with different degrees of attacker's knowledge. We show that the outcome of the attack is strictly dependent on the target fraud detector, with an evasion rate ranging from 60% to 100%. Interestingly, our results show that the increase of attacker knowledge does not significantly increase the attack success rate, except for the full knowledge scenario.
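A minimal evasion sketch against an invented surrogate: if the attacker's surrogate model flags transfers at or above a threshold, splitting a fraudulent total into sub-threshold transactions evades it. Real fraud detectors aggregate features over time, which is exactly why the paper's attacks are harder to craft than this.

```python
# Toy evasion against a surrogate fraud detector (simple amount threshold).

def surrogate_is_fraud(amount, threshold=500.0):
    return amount >= threshold

def evade(total, threshold=500.0):
    """Split `total` into transfers that each evade the surrogate."""
    step = threshold - 0.01
    parts = []
    while total > 0:
        amt = min(step, total)
        parts.append(round(amt, 2))
        total = round(total - amt, 2)
    return parts

parts = evade(1200.0)                                   # 3 sub-threshold transfers
evaded = all(not surrogate_is_fraud(p) for p in parts)  # none are flagged
```

Transferability is what makes this threat model realistic: samples crafted against the surrogate often evade the real, unseen detector too.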
The Limitations of Federated Learning in Sybil Settings
Clement Fung, Carnegie Mellon University; Chris J. M. Yoon and Ivan Beschastnikh, University of British Columbia
Federated learning over distributed multi-party data is an emerging paradigm that iteratively aggregates updates from a group of devices to train a globally shared model. Relying on a set of devices, however, opens up the door for sybil attacks: malicious devices may be controlled by a single adversary who directs these devices to attack the system.
We consider the susceptibility of federated learning to sybil attacks and propose a taxonomy of sybil objectives and strategies in this setting. We describe a new DoS attack that we term training inflation and present several ways to carry out this attack. We then evaluate recent distributed ML fault tolerance proposals and show that these are insufficient to mitigate several sybil-based attacks. Finally, we introduce a defense against targeted sybil-based poisoning called FoolsGold, which identifies sybils based on the diversity of client updates. We show that FoolsGold exceeds state-of-the-art approaches when countering several types of poisoning attacks. Our work is open source and is available online: https://github.com/DistributedML/FoolsGold
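The diversity intuition behind FoolsGold can be sketched directly: sybils pushing the same poisoning objective submit unusually similar updates, so clients whose update closely mirrors another's get their aggregation weight cut. This is a simplification of the paper's per-feature, history-based scheme, with invented update vectors.

```python
import math

# Toy FoolsGold-style reweighting by maximum pairwise cosine similarity.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def weights(updates):
    w = []
    for i, u in enumerate(updates):
        max_sim = max(cosine(u, v) for j, v in enumerate(updates) if j != i)
        w.append(max(0.0, 1.0 - max_sim))   # similar to someone -> low weight
    return w

honest1 = [0.9, -0.2, 0.1]
honest2 = [-0.3, 0.8, 0.2]
sybil1 = [0.5, 0.5, -0.7]
sybil2 = [0.51, 0.49, -0.71]   # near-duplicate of sybil1
w = weights([honest1, honest2, sybil1, sybil2])
```

Honest clients, whose updates naturally diverge, keep most of their weight; the near-duplicate sybil pair is driven toward zero.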
GhostImage: Remote Perception Attacks against Camera-based Image Classification Systems
Yanmao Man and Ming Li, University of Arizona; Ryan Gerdes, Virginia Tech
In vision-based object classification systems, imaging sensors perceive the environment and then objects are detected and classified for decision-making purposes; e.g., to maneuver an automated vehicle around an obstacle or to raise an alarm to indicate the presence of an intruder in surveillance settings. In this work we demonstrate how the perception domain can be remotely and unobtrusively exploited to enable an attacker to create spurious objects or alter an existing object. An automated system relying on a detection/classification framework subject to our attack could be made to undertake actions with catastrophic results due to attacker-induced misperception.
We focus on camera-based systems and show that it is possible to remotely project adversarial patterns into camera systems by exploiting two common effects in optical imaging systems, viz., lens flare/ghost effects and auto-exposure control. To improve the robustness of the attack to channel effects, we generate optimal patterns by integrating adversarial machine learning techniques with a trained end-to-end channel model. We experimentally demonstrate our attacks using a low-cost projector, on three different image datasets, in indoor and outdoor environments, and with three different cameras. Experimental results show that, depending on the projector-camera distance, attack success rates can reach as high as 100%, even under targeted conditions.
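The channel-aware optimization can be pictured with a toy numerical example: the projected pattern passes through an affine channel model before reaching a linear "classifier", and gradient ascent on the target logit is taken with respect to the emitted pattern. The classifier, the channel, and every constant below are invented for illustration; the paper's attack uses real camera models and adversarial ML techniques:

```python
import numpy as np

W = np.array([[0.4, -1.2, 0.3],     # toy classifier: logits = W @ image
              [-0.7, 0.9, -0.1],
              [1.1, -0.3, 0.8]])
gain, offset = 0.6, 0.1             # channel: captured = gain*pattern + offset
scene = np.array([0.2, -0.1, 0.05]) # benign camera input (3 "pixels")
target = 0                          # class the attacker wants to boost

before = (W @ (scene + offset))[target]

pattern = np.zeros(3)
for _ in range(100):
    # d logits[target] / d pattern = gain * W[target] (chain rule through
    # the affine channel); ascend, then respect the physical constraint
    # that a projector can only add light.
    pattern = np.clip(pattern + 0.1 * gain * W[target], 0.0, 1.0)

after = (W @ (scene + gain * pattern + offset))[target]
```

Optimizing through the channel model is what makes the emitted pattern effective after the projector-to-camera distortion, rather than only in the digital domain.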
Friday, October 16
PLC-Sleuth: Detecting and Localizing PLC Intrusions Using Control Invariants
Zeyu Yang, Zhejiang University; Liang He, University of Colorado Denver; Peng Cheng and Jiming Chen, Zhejiang University; David K.Y. Yau, Singapore University of Technology and Design; Linkang Du, Zhejiang University
Programmable Logic Controllers (PLCs) are the foundation of control systems, which are, however, vulnerable to a variety of cyber attacks, especially in networked control systems. To mitigate this issue, we design PLC-Sleuth, a novel non-invasive intrusion detection/localization system for PLCs, grounded in a set of control invariants (i.e., the correlations between sensor readings and the concomitantly triggered PLC commands) that exist pervasively in all control systems. Specifically, taking the system's Supervisory Control and Data Acquisition (SCADA) log as input, PLC-Sleuth abstracts the system's control invariants into a control graph using data-driven structure learning, and then monitors the weights of the graph's edges to detect anomalies, which are in turn a sign of intrusion. We have implemented and evaluated PLC-Sleuth using both a prototype Secure Ethanol Distillation System (SEDS) and a realistically simulated Tennessee Eastman (TE) process.
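The invariant-monitoring idea can be illustrated numerically. The sketch below uses a plain Pearson correlation as the edge weight and a made-up drift threshold; PLC-Sleuth's actual structure learning over SCADA logs is more involved:

```python
import numpy as np

def invariant_weight(sensor, command):
    """Weight of one control-invariant edge: the correlation between a
    sensor series and the PLC command series it drives."""
    return float(np.corrcoef(sensor, command)[0, 1])

t = np.linspace(0, 10, 200)
sensor = np.sin(t)                                   # simulated sensor log
rng = np.random.default_rng(0)
command = 0.8 * sensor + 0.05 * rng.normal(size=t.size)

w_baseline = invariant_weight(sensor, command)       # strong invariant

# An attacker spoofing the command channel breaks the correlation,
# which shows up as a large drift in the learned edge weight.
spoofed = np.random.default_rng(1).normal(size=t.size)
drift = abs(invariant_weight(sensor, spoofed) - w_baseline)
alarm = drift > 0.3                                  # illustrative threshold
```

Because the edge weight is tied to a specific sensor/command pair, a drifting edge both detects the intrusion and localizes which control loop was tampered with.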
Software-based Realtime Recovery from Sensor Attacks on Robotic Vehicles
Hongjun Choi and Sayali Kate, Purdue University; Yousra Aafer, University of Waterloo; Xiangyu Zhang and Dongyan Xu, Purdue University
We present a novel technique to recover robotic vehicles (RVs) from various sensor attacks with so-called software sensors. Specifically, our technique builds a predictive state-space model based on generic system identification techniques. Sensor measurement prediction based on the state-space model runs as a software backup of the corresponding physical sensor. When physical sensors are under attack, the corresponding software sensors can isolate and recover the compromised sensors individually to prevent further damage. We apply our prototype to various sensor attacks on six RV systems, including a real quadrotor and a rover. Our evaluation results demonstrate that our technique can practically and safely recover the vehicle from various attacks on multiple sensors under different maneuvers, preventing crashes.
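A one-dimensional "software sensor" can be sketched in miniature: a state-space model identified offline predicts the next sensor value, and when the physical reading deviates too far from the prediction, the prediction substitutes for the reading. The model constants and the threshold below are assumed, not taken from the paper:

```python
A, B = 0.9, 0.5                            # identified model: x_next = A*x + B*u

def software_sensor(x_est, u, reading, threshold=1.0):
    """Return (value_to_use, attack_detected) for one time step."""
    pred = A * x_est + B * u               # model-based prediction
    if abs(reading - pred) > threshold:    # likely attack: isolate the sensor
        return pred, True                  # recover using the prediction
    return reading, False                  # benign: trust the hardware

value, attacked = software_sensor(1.0, 0.2, reading=1.02)     # consistent
recovered, flagged = software_sensor(1.0, 0.2, reading=10.0)  # spoofed
```

Running one such predictor per physical sensor is what lets the technique isolate and recover compromised sensors individually rather than rejecting the whole sensor suite.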
SIEVE: Secure In-Vehicle Automatic Speech Recognition Systems
Shu Wang, George Mason University; Jiahao Cao, George Mason University and Tsinghua University; Kun Sun, George Mason University; Qi Li, Tsinghua University and Beijing National Research Center for Information Science and Technology
Driverless vehicles are becoming an irreversible trend in our daily lives, and humans can interact with cars through in-vehicle voice control systems. However, the automatic speech recognition (ASR) module in these voice control systems is vulnerable to adversarial voice commands, which may cause unexpected behaviors or even accidents in driverless cars. Given the high demand for security assurance, it remains a challenge to defend in-vehicle ASR systems against adversarial voice commands from various sources in a noisy driving environment. In this paper, we develop a secure in-vehicle ASR system called SIEVE, which can effectively distinguish voice commands issued by the driver, passengers, or electronic speakers in three steps. First, it filters out multi-source voice commands played over multiple vehicle speakers by leveraging autocorrelation analysis. Second, it identifies whether a single-source voice command is from a human or an electronic speaker using a novel dual-domain detection method. Finally, it leverages the directions of voice sources to distinguish the driver's voice from those of the passengers. We implement a prototype of SIEVE and perform a real-world study under different driving conditions. Experimental results show SIEVE can defeat various adversarial voice commands against in-vehicle ASR systems.
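The first step (multi-speaker filtering via autocorrelation) lends itself to a small numerical sketch: a command replayed through several cabin speakers reaches the microphone as overlapping delayed copies, which produce a strong secondary autocorrelation peak. The signal lengths, the delay, and the peak threshold below are all invented:

```python
import numpy as np

def has_echo_peak(signal, min_lag=50, rel=0.3):
    """Return True if the signal's autocorrelation has a strong
    secondary peak, as delayed copies from multiple speakers would."""
    ac = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    ac = ac / ac[0]                    # normalize by zero-lag energy
    return bool(ac[min_lag:].max() > rel)

rng = np.random.default_rng(0)
voice = rng.normal(size=2000)          # stand-in for a live voice command
multi = voice + np.roll(voice, 120)    # same command plus a delayed copy

single_source = has_echo_peak(voice)   # no secondary peak
multi_source = has_echo_peak(multi)    # strong peak near lag 120
```

A live speaker produces one acoustic path with no such self-similar peak, so this check cheaply screens out commands injected through the car's own speaker array.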
Firmware and Low Level Security
μSBS: Static Binary Sanitization of Bare-metal Embedded Devices for Fault Observability
Majid Salehi and Danny Hughes, imec-Distrinet, KU Leuven; Bruno Crispo, imec-Distrinet, KU Leuven, and Trento University, Italy
A large portion of the already deployed Internet of Things (IoT) devices are bare-metal. In a bare-metal device, the firmware executes directly on the hardware with no intermediary OS. While bare-metal devices increase efficiency and flexibility, they are also subject to memory corruption vulnerabilities that are regularly uncovered. Fuzzing is an effective and popular software testing method to discover vulnerabilities. The effectiveness of fuzzing approaches relies on the fact that memory corruption faults, by violating existing security mechanisms such as the MMU, are observable and thus relatively easy to debug. Unfortunately, bare-metal devices lack such security mechanisms. Consequently, fuzzing approaches encounter silent memory corruptions with no visible effects, making debugging extremely difficult. This paper tackles this problem by proposing μSBS, a novel approach that, by statically instrumenting the binaries, makes memory corruptions observable. In contrast to prior work, μSBS does not need to reverse engineer the firmware. The approach is practical as it does not require a modified compiler and can perform policy-based instrumentation of firmware without access to source code. Evaluation of μSBS shows that it reduces security analyst effort, while discovering the same set of memory error types as prior work.
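The observability gap can be illustrated in miniature. The toy buffer below models the effect of the inserted checks: an out-of-bounds write trips an immediate, visible fault instead of silently corrupting a neighbor. The layout and sizes are invented, and μSBS itself rewrites firmware binaries, not Python:

```python
class SanitizedBuffer:
    """Toy model of sanitized memory: writes are bounds-checked against
    redzones placed around each allocation."""
    REDZONE = 4

    def __init__(self, size):
        self.size = size
        self.mem = bytearray(size + 2 * self.REDZONE)

    def store(self, index, value):
        if not 0 <= index < self.size:          # the inserted check
            raise RuntimeError(f"memory fault at offset {index}")
        self.mem[self.REDZONE + index] = value

buf = SanitizedBuffer(8)
buf.store(0, 0xFF)              # in-bounds write succeeds
try:
    buf.store(8, 0xAA)          # off-by-one lands in the redzone
    fault = None
except RuntimeError as exc:
    fault = str(exc)            # observable, instead of silent corruption
```

On an MMU-less bare-metal target the un-instrumented equivalent of the second store would simply clobber adjacent memory, which is exactly the silent failure mode that defeats fuzzers.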
BlueShield: Detecting Spoofing Attacks in Bluetooth Low Energy Networks
Jianliang Wu, Yuhong Nan, and Vireshwar Kumar, Purdue University; Mathias Payer, EPFL; Dongyan Xu, Purdue University
Many IoT devices are equipped with Bluetooth Low Energy (BLE) to support communication in an energy-efficient manner. Unfortunately, BLE is prone to spoofing attacks where an attacker can impersonate a benign BLE device and feed malicious data to its users. Defending against spoofing attacks is extremely difficult as security patches to mitigate them may not be adopted across vendors promptly; not to mention the millions of legacy BLE devices with limited I/O capabilities that do not support firmware updates.
As a first line of defense against spoofing attacks, we propose BlueShield, a legacy-friendly, non-intrusive monitoring system. BlueShield is motivated by the observation that all spoofing attacks result in anomalies in certain cyber-physical features of the advertising packets containing the BLE device’s identity. BlueShield leverages these features to detect anomalous packets generated by an attacker. More importantly, the unique design of BlueShield makes it robust against an advanced attacker with the capability to mimic all features. BlueShield can be deployed on low-cost off-the-shelf platforms, and does not require any modification in the BLE device or its user. Our evaluation with nine common BLE devices deployed in a real-world office environment validates that BlueShield can effectively detect spoofing attacks with very low false positive and false negative rates.
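A minimal version of the feature-based check might look as follows. The features (RSSI, advertising interval), the baselines, and the 3-sigma threshold are invented for illustration; BlueShield's actual detector is considerably more robust, in particular against attackers who mimic features:

```python
# Learned per-device baselines: feature -> (mean, std). Values invented.
baseline = {"rssi_dbm": (-62.0, 2.0), "adv_interval_ms": (100.0, 1.5)}

def is_spoofed(packet, k=3.0):
    """Flag an advertising packet whose cyber-physical features fall
    outside k standard deviations of the device's learned baseline."""
    return any(abs(packet[f] - mu) > k * sd
               for f, (mu, sd) in baseline.items())

benign = {"rssi_dbm": -61.0, "adv_interval_ms": 100.8}
attack = {"rssi_dbm": -45.0, "adv_interval_ms": 100.2}  # attacker is closer

benign_flag = is_spoofed(benign)
attack_flag = is_spoofed(attack)    # RSSI far outside the baseline
```

The appeal of monitoring from a separate vantage point is that neither the legacy BLE device nor its user needs any modification.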
Dark Firmware: A Systematic Approach to Exploring Application Security Risks in the Presence of Untrusted Firmware
Duha Ibdah, Nada Lachtar, Abdulrahman Abu Elkhail, Anys Bacha, and Hafiz Malik, University of Michigan, Dearborn
Compromising lower levels of the computing stack is attractive to attackers since malware that resides in layers spanning firmware and hardware is notoriously difficult to detect and remove. This trend raises concerns about the security of the system components that we have grown accustomed to trusting, especially as the number of supply chain attacks continues to rise. In this work, we explore the risks associated with application security in the presence of untrusted firmware. We present a novel firmware attack that leverages system management cycles to covertly collect data from the application layer. We show that system interrupts used for managing the platform can be leveraged to extract sensitive application data from outgoing requests even when the HTTPS protocol is used. We evaluate the robustness of our attack under diverse and stressful application usage conditions running on Ubuntu 18.04 and Android 8.1 operating systems. We conduct a proof-of-concept implementation of the attack using firmware configured to run with the aforementioned OSs and a mix of popular applications without disrupting the normal functionality of the system. Finally, we discuss a possible countermeasure that can be used to defend against firmware attacks.
A Framework for Software Diversification with ISA Heterogeneity
Xiaoguang Wang, SengMing Yeoh, and Robert Lyerly, Virginia Tech; Pierre Olivier, The University of Manchester; Sang-Hoon Kim, Ajou University; Binoy Ravindran, Virginia Tech
Software diversification is one of the most effective ways to defeat memory corruption based attacks. Traditional software diversification such as code randomization techniques diversifies program memory layout and makes it difficult for attackers to pinpoint the precise location of a target vulnerability. Some recent work in the architecture community uses diverse ISA configurations to defeat code injection or code reuse attacks, showing that dynamically switching the ISA on which a program executes is a promising direction for future security systems. However, most of these works either remain at the simulation stage or require extra effort to write the program.
In this paper, we propose HeterSec, a framework to secure applications utilizing a heterogeneous ISA setup composed of real-world machines. HeterSec runs on top of commodity x86_64 and ARM64 machines and gives the process the illusion that it runs on a multi-ISA chip multiprocessor (CMP) machine. With HeterSec, a process can dynamically select its underlying ISA environment. Therefore, a protected process would be capable of hiding the instruction set on which it executed or detecting abnormal program behavior by comparing execution results step-by-step from multiple ISA-diversified instances. To demonstrate the effectiveness of such a software framework, we implemented HeterSec on Linux and showcased its deployability by running it on a pair of x86_64 and ARM64 servers, connected over InfiniBand. We then conducted two case studies with HeterSec. In the first case, we implemented a multi-ISA moving target defense (MTD) system, which introduces uncertainty at the instruction set level. In the second case, we implemented a multi-ISA-based multi-version execution (MVX) system. The evaluation results show that HeterSec brings security benefits through ISA diversification with a reasonable performance overhead.
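The MVX case study reduces to a simple lockstep pattern: run both ISA-diversified variants on the same input and treat any divergence in their results as evidence of compromise. The sketch below uses ordinary Python callables as stand-ins for the two ISA instances:

```python
def mvx_step(variant_a, variant_b, inp):
    """Run both variants in lockstep; divergent results indicate an
    exploit that only succeeded on one ISA."""
    ra, rb = variant_a(inp), variant_b(inp)
    if ra != rb:
        raise RuntimeError("divergence detected: possible attack")
    return ra

# Healthy variants agree, so execution proceeds normally.
result = mvx_step(lambda x: x * 2, lambda x: x + x, 21)

# A hijacked variant returns something else, and the step aborts.
try:
    mvx_step(lambda x: x * 2, lambda x: 0xDEAD, 21)
    diverged = False
except RuntimeError:
    diverged = True
```

The security argument is that an attack payload crafted for x86_64 is very unlikely to produce identical results on ARM64, so step-by-step comparison exposes it.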
Confine: Automated System Call Policy Generation for Container Attack Surface Reduction
Seyedhamed Ghavamnia and Tapti Palit, Stony Brook University; Azzedine Benameur, Cloudhawk.io; Michalis Polychronakis, Stony Brook University
Reducing the attack surface of the OS kernel is a promising defense-in-depth approach for mitigating the fragile isolation guarantees of container environments. In contrast to hypervisor-based systems, malicious containers can exploit vulnerabilities in the underlying kernel to fully compromise the host and all other containers running on it. Previous container attack surface reduction efforts have relied on dynamic analysis and training using realistic workloads to limit the set of system calls exposed to containers. These approaches, however, cannot exhaustively capture all the code that may be needed by future workloads or rare runtime conditions, and are thus not appropriate as a generic solution.
Aiming to provide a practical solution for the protection of arbitrary containers, in this paper we present a generic approach for the automated generation of restrictive system call policies for Docker containers. Our system, named Confine, uses static code analysis to inspect the containerized application and all its dependencies, identify the superset of system calls required for the correct operation of the container, and generate a corresponding Seccomp system call policy that can be readily enforced while loading the container. The results of our experimental evaluation with 150 publicly-available Docker images show that Confine can successfully reduce their attack surface by disabling 145 or more system calls (out of 326) for more than half of the containers, which neutralizes 51 previously disclosed kernel vulnerabilities.
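Confine's final output is a standard Docker seccomp profile. Assuming static analysis had produced the (made-up) system call set below, the emitted deny-by-default policy would look roughly like this:

```python
import json

# System calls that static analysis (hypothetically) found reachable
# from the containerized application and its dependencies.
reachable = {"read", "write", "openat", "close", "mmap", "exit_group"}

profile = {
    "defaultAction": "SCMP_ACT_ERRNO",       # deny everything by default
    "architectures": ["SCMP_ARCH_X86_64"],
    "syscalls": [{
        "names": sorted(reachable),          # ...except the reachable set
        "action": "SCMP_ACT_ALLOW",
    }],
}

policy_json = json.dumps(profile, indent=2)
```

Such a profile can then be enforced when the container is loaded, e.g. via `docker run --security-opt seccomp=<profile file>`.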
sysfilter: Automated System Call Filtering for Commodity Software
Nicholas DeMarinis, Kent Williams-King, Di Jin, Rodrigo Fonseca, and Vasileios P. Kemerlis, Brown University
Modern OSes provide a rich set of services to applications, primarily accessible via the system call API, to support the ever-growing functionality of contemporary software. However, despite the fact that applications require access to only part of the system call API (to function properly), OS kernels allow full and unrestricted use of the entire system call set. This not only violates the principle of least privilege, but also enables attackers to utilize extra OS services, after seizing control of vulnerable applications, or escalate privileges further via exploiting vulnerabilities in less-stressed kernel interfaces.
To tackle this problem, we present sysfilter: a binary analysis-based framework that automatically (1) limits what OS services attackers can (ab)use, by enforcing the principle of least privilege with respect to the system call API, and (2) reduces the attack surface of the kernel, by restricting the system call set available to userland processes. We implement sysfilter for x86-64 Linux, and present a set of program analyses for constructing system call sets statically, and in a scalable, precise, and complete (safe over-approximation) manner. In addition, we evaluate our prototype in terms of correctness using 411 binaries (real-world C/C++ applications) and ≈38.5K tests to assert their functionality. Furthermore, we measure the impact of our enforcement mechanism(s), demonstrating minimal, or negligible, run-time slowdown. Lastly, we conclude with a large scale study of the system call profile of ≈30K C/C++ applications (from Debian sid), reporting insights that justify our design and can aid that of future (system call-based) policing mechanisms.
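The set-construction phase can be pictured as a reachability walk over the binary's call graph, where each function contributes the system calls it invokes directly. The graph, function names, and syscall sets below are made up; sysfilter itself recovers this information statically from x86-64 binaries:

```python
from collections import deque

call_graph = {                      # caller -> callees (safely over-approximated)
    "main": ["log_init", "serve"],
    "log_init": [],
    "serve": ["handle"],
    "handle": [],
    "debug_dump": [],               # unreachable from main
}
direct_syscalls = {                 # function -> syscalls it invokes directly
    "log_init": {"openat", "write"},
    "serve": {"socket", "bind", "listen", "accept4"},
    "handle": {"read", "write", "close"},
    "debug_dump": {"ptrace"},
}

def reachable_syscalls(entry="main"):
    """Collect every system call reachable from the entry point."""
    seen, out, work = {entry}, set(), deque([entry])
    while work:
        fn = work.popleft()
        out |= direct_syscalls.get(fn, set())
        for callee in call_graph.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                work.append(callee)
    return out

allowed = reachable_syscalls()      # everything except "ptrace"
```

Because the call graph is an over-approximation, the resulting set is a safe superset: it may allow a few extra calls, but it never breaks a legitimate execution path.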