The full Proceedings published by USENIX for the symposium are available for download below. Individual papers can also be downloaded from the presentation pages. Copyright to the individual works is retained by the author[s].
View the full program on the RAID 2019 website.
Monday, September 23, 2019
Jinghan Wang, University of California, Riverside; Yue Duan, Cornell University; Wei Song, Heng Yin, and Chengyu Song, University of California, Riverside
Coverage-guided greybox fuzzing has become one of the most prevalent techniques for finding software bugs. The coverage metric, which decides how a fuzzer selects new seeds, is an essential parameter of fuzzing and can greatly affect the results. While there is much existing work on the effectiveness of different coverage metrics in software testing, little is known about how different coverage metrics actually affect fuzzing results in practice. More importantly, it is unclear whether there exists one coverage metric that is strictly superior to all the others. In this paper, we report the first systematic study on the impact of different coverage metrics in fuzzing. To this end, we formally define and discuss the concept of sensitivity, which can be used to theoretically compare different coverage metrics. We then present several coverage metrics with their variants. We conduct a study of these metrics on the DARPA CGC dataset, the LAVA-M dataset, and a set of real-world applications (a total of 221 binaries). We find that, because each fuzzing instance has limited resources (time and computation power), (1) each metric has its unique merit in terms of flipping certain types of branches (and thus finding vulnerabilities) and (2) there is no grand-slam coverage metric that defeats all the others. We also explore combining different coverage metrics through cross-seeding, and the result is very encouraging: this pure fuzzing-based approach is able to crash at least as many binaries in the CGC dataset as a previous approach (Driller) that combines fuzzing and concolic execution, while using fewer computing resources.
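The notion of sensitivity can be illustrated with a toy sketch (not the paper's implementation): plain edge coverage versus an AFL-style bucketed hit-count metric, where the latter is more sensitive because it distinguishes execution traces that the former considers equivalent.

```python
# Toy comparison of two coverage metrics with different sensitivity.
# Names and the bucketing scheme are illustrative, not the paper's
# exact definitions.

def edge_coverage(trace):
    """Set of (prev_block, cur_block) edges seen in one execution."""
    return set(zip(trace, trace[1:]))

def bucket(n):
    """AFL-style power-of-two hit-count buckets."""
    for b in (1, 2, 3, 4, 8, 16, 32, 128):
        if n <= b:
            return b
    return 255

def hitcount_coverage(trace):
    """Edges annotated with bucketed hit counts (more sensitive)."""
    counts = {}
    for e in zip(trace, trace[1:]):
        counts[e] = counts.get(e, 0) + 1
    return {(e, bucket(c)) for e, c in counts.items()}

# Two traces over basic blocks, looping A->B a different number of times.
t1 = ["A", "B", "A", "B", "C"]
t2 = ["A", "B", "A", "B", "A", "B", "A", "B", "C"]

# Edge coverage considers the traces equivalent, so the second input is
# discarded; hit-count coverage keeps it because a loop ran longer.
same_edges = edge_coverage(t1) == edge_coverage(t2)
new_under_hitcount = hitcount_coverage(t2) - hitcount_coverage(t1)
```

A fuzzer using the more sensitive metric retains more seeds, which (as the study shows) is a trade-off rather than a pure win given finite time and compute.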
Rukayat Ayomide Erinfolami, Anh T Quach, and Aravind Prakash, Binghamton University
Due to the use of code pointers, polymorphism in C++ has been targeted by attackers and defenders alike. Vulnerable programs that violate the runtime object type integrity have been successfully exploited. Particularly, the virtual dispatch mechanism and the type confusion during casting have been targeted.
As a consequence, multiple defenses have been proposed in recent years to defend against attacks that target polymorphism. Particularly, compiler-based defenses incorporate design information—specifically class-hierarchy-related information—into the binary, and enforce runtime security policies to assert type integrity.
In this paper, we perform a systematic evaluation of the side-effects and unintended consequences of compiler-based security. Specifically, we show that the application of modern defenses makes reverse engineering and semantic recovery easy. In particular, we show that modern defenses "leak" class hierarchy information, i.e., design information, thereby deterring adoption in closed-source software. We consider a comprehensive set of 10 modern C++ defenses and show that 9 out of the 10 at least partially reveal design information as an unintended consequence of the defense. We argue for the necessity of design-leakage-aware defenses that are preferable for closed-source use.
Ali Davanian, Zhenxiao Qi, Yu Qu, and Heng Yin, University of California, Riverside
Whole-system dynamic taint analysis has many unique applications, such as malware analysis and fuzz testing. Compared with process-level taint analysis, it offers a wider analysis scope, better transparency, and tamper resistance. The main barrier to applying whole-system dynamic taint analysis in practice is the large slowdown, which can sometimes be up to 30 times. Existing optimization schemes either have considerable baseline overheads (when there is no tainted data) or rely on specific hardware features. In this paper, we propose an elastic whole-system dynamic taint approach and implement a prototype called DECAF++. Elastic whole-system dynamic taint analysis strives to perform taint analysis as infrequently as possible while maintaining precision and accuracy. Although similar ideas have been explored before for process-level taint analysis, we are the first to successfully achieve true elasticity for whole-system taint analysis via a pure software approach. We evaluated our prototype DECAF++ on nbench, apache bench, and SPEC CPU2006. Under taint analysis load, DECAF++ achieves 202% speedup on nbench and 66% speedup on apache bench. Under no taint analysis load with SPEC CPU2006, DECAF++ imposes only 4% overhead.
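The elasticity idea can be sketched with a toy taint tracker (purely illustrative, far simpler than DECAF++'s actual design): the expensive per-instruction taint logic runs only while tainted bytes are live, and execution takes a cheap fast path whenever the taint set is empty.

```python
class ElasticTaintTracker:
    """Toy model of elastic taint analysis: per-instruction taint
    propagation is paid for only while any tainted locations exist;
    with no taint in the system, the fast path skips it entirely."""
    def __init__(self):
        self.tainted = set()      # tainted locations (toy: just names)
        self.slow_path_steps = 0  # instructions that paid taint cost
    def taint(self, loc):
        self.tainted.add(loc)
    def execute(self, op, dst, src):
        if not self.tainted:      # fast path: no taint anywhere
            return
        self.slow_path_steps += 1
        if op == "mov":           # propagate taint through a move
            if src in self.tainted:
                self.tainted.add(dst)
            else:
                self.tainted.discard(dst)

tracker = ElasticTaintTracker()
tracker.execute("mov", "b", "a")             # no taint: zero cost
tracker.taint("user_input")
tracker.execute("mov", "reg", "user_input")  # slow path: taint flows
```

The key engineering challenge the paper addresses is switching between the two paths safely and cheaply in a whole-system emulator, without hardware support.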
Zhen Cheng, Zhejiang University; Xinrui Hou, Xidian University; Runhuai Li and Yajin Zhou, Zhejiang University; Xiapu Luo, The Hong Kong Polytechnic University; Jinku Li, Xidian University; Kui Ren, Zhejiang University
We performed the first systematic study of a new attack on Ethereum that steals cryptocurrencies. The attack exploits unprotected JSON-RPC endpoints in Ethereum nodes, which attackers can use to transfer Ether and ERC20 tokens to attacker-controlled accounts.
This study aims to shed light on the attack, including the malicious behaviors and profits of attackers. Specifically, we first designed and implemented a honeypot that could capture real attacks in the wild. We then deployed the honeypot and report results from the data collected over a period of six months. In total, our system captured more than 308 million requests from 1,072 distinct IP addresses. We further grouped attackers into 36 groups with 59 distinct Ethereum accounts. Among them, attackers in 34 groups were stealing Ether, while the other 2 groups were targeting ERC20 tokens. Further behavior analysis showed that attackers follow a three-step pattern to steal Ether. Moreover, we observed an interesting type of transaction, called a zero-gas transaction, which has been leveraged by attackers to steal ERC20 tokens. Finally, we estimated the overall profits of the attackers. To engage the whole community, the dataset of captured attacks is released at https://github.com/zjuicsr/eth-honey.
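For illustration, a request of the kind such attackers send to an unprotected endpoint might look as follows. The method name is the standard Ethereum JSON-RPC call; the helper name and addresses are placeholders. If the node manages an unlocked account, the node itself signs and broadcasts the transfer, so the attacker never needs a private key.

```python
import json

# Illustrative reconstruction of an attacker's request to an unprotected
# JSON-RPC endpoint. Addresses and function name are placeholders.
def steal_request(victim, attacker, value_wei, req_id=1):
    return json.dumps({
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "eth_sendTransaction",
        "params": [{
            "from": victim,           # node-managed (unlocked) account
            "to": attacker,           # attacker-controlled account
            "value": hex(value_wei),  # amount in wei, hex-encoded
        }],
    })
```

This also suggests why a honeypot works: the endpoint only needs to answer standard JSON-RPC methods convincingly to observe the full attack sequence.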
Vincent Ghiëtte, Harm Griffioen, and Christian Doerr, TU Delft
In SSH brute-forcing attacks, adversaries try many different username and password combinations in order to compromise a system. As such activities are easily recognizable in log files, sophisticated adversaries distribute brute-forcing attacks over a large number of origins. Effectively finding such distributed campaigns, however, proves to be a difficult problem.
In practice, when adversaries spread brute-forcing over multiple sources, they are likely to reuse the same software across all of these origins to simplify their operation and reduce cost. This means that if we are able to identify the tooling used in these attempts, we can cluster similar tool usage into likely collaborating hosts, and thus campaigns. In this paper, we demonstrate that it is possible to utilize cipher suites and SSH version strings to generate a unique fingerprint for the brute-forcing tool used by the attacker.
Based on a study using a large honeynet with over 4,500 hosts, which received approximately 35 million compromisation attempts over the period of one month, we are able to identify 49 tools from the collected data, which correspond to off-the-shelf tools, as well as custom implementations. The method is also able to fingerprint individual versions of tools, and by revealing mismatches between advertised and actually implemented features detect hosts that spoof identifying information. Based on the generated fingerprints, we are able to correlate login credentials to distinguish distributed campaigns. We uncovered specific adversarial behaviors, tactics and procedures, frequently exhibiting clear timing patterns and tight coordination.
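The fingerprinting idea can be sketched as follows; the choice of fields (version banner plus order-preserving algorithm name-lists from the key exchange) and the hash truncation are illustrative assumptions, not necessarily the paper's exact encoding.

```python
import hashlib

# Toy SSH client fingerprint: hash the version banner together with the
# algorithm lists the client offers in its KEXINIT message. Ordering is
# preserved, since implementations differ in the order they advertise.
def ssh_fingerprint(version_banner, kex_algos, ciphers, macs):
    material = ";".join([
        version_banner,
        ",".join(kex_algos),
        ",".join(ciphers),
        ",".join(macs),
    ])
    return hashlib.sha256(material.encode()).hexdigest()[:16]

# Two hosts running the same tool produce the same fingerprint...
fp_a = ssh_fingerprint("SSH-2.0-libssh2_1.4.3", ["curve25519-sha256"],
                       ["aes128-ctr", "aes256-ctr"], ["hmac-sha2-256"])
fp_b = ssh_fingerprint("SSH-2.0-libssh2_1.4.3", ["curve25519-sha256"],
                       ["aes128-ctr", "aes256-ctr"], ["hmac-sha2-256"])
# ...while a different cipher ordering betrays a different implementation,
# even when the advertised banner is identical (spoofed).
fp_c = ssh_fingerprint("SSH-2.0-libssh2_1.4.3", ["curve25519-sha256"],
                       ["aes256-ctr", "aes128-ctr"], ["hmac-sha2-256"])
```

This last case is the banner/implementation mismatch the paper uses to detect hosts that spoof identifying information.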
Chih-Yuan Lin and Simin Nadjm-Tehrani, Linköping Universitet
Supervisory Control and Data Acquisition (SCADA) systems operate critical infrastructures in our modern society despite their vulnerability to attacks and misuse. There are several anomaly detection systems based on the cycles of the polling mechanisms used in SCADA systems, but the feasibility of anomaly detection based on non-polling traffic, so-called spontaneous events, is not well studied. This paper presents a novel approach to modeling the timing characteristics of spontaneous events in an IEC-60870-5-104 network and exploits the model for anomaly detection. The system is tested with a dataset from a real power utility, with injected timing effects from two attack scenarios. One attack causes timing anomalies due to persistent malfunctioning in the field devices, and the other generates intermittent anomalies caused by malware on the field devices, which is considered stealthy. The detection accuracy and timing performance are promising for all the experiments with persistent anomalies. With intermittent anomalies, we found that our approach is effective for anomalies in low-volume traffic or attacks lasting over one hour.
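A minimal sketch of timing-based detection for spontaneous events, assuming a simple Gaussian inter-arrival model per event type and a three-sigma threshold (both are illustrative stand-ins, not the paper's actual model):

```python
from statistics import mean, stdev

# Learn the inter-arrival distribution of one event type from a clean
# training trace, then flag events whose gap deviates too far from it.
def train(timestamps):
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return mean(gaps), stdev(gaps)

def is_anomalous(gap, model, k=3.0):
    mu, sigma = model
    return abs(gap - mu) > k * sigma

# Spontaneous events from a field device, roughly every 10 seconds.
arrivals = [0.0, 10.0, 20.2, 29.8, 40.1, 50.0]
model = train(arrivals)
```

A persistently malfunctioning device shifts many gaps at once and is caught quickly; intermittent malware-induced anomalies shift only a few gaps, which is why the paper reports they are harder to detect in high-volume traffic.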
Amin Kharraz, University of Illinois at Urbana Champaign; Brandon L. Daley and Graham Z. Baker, MIT Lincoln Laboratory; William Robertson and Engin Kirda, Northeastern University
Targeted attacks via transient devices are not new. However, the introduction of BadUSB attacks has shifted the attack paradigm tremendously. Such attacks embed malicious code in device firmware and exploit the lack of access control in the USB protocol. In this paper, we propose USBESAFE as a mediator of the USB communication mechanism. By leveraging insights from millions of USB packets, we propose techniques to generate a protection model that can identify covert USB attacks by distinguishing BadUSB devices as a set of novel observations. Our results show that USBESAFE works well in practice, achieving a true positive (TP) rate of 95.7% with 0.21% false positives (FP), with latency as low as three malicious USB packets. We tested USBESAFE by deploying the model at several endpoints for 20 days and running multiple types of BadUSB-style attacks with different levels of sophistication. Our analysis shows that USBESAFE can detect a large number of mimicry attacks without introducing any significant changes to the standard USB protocol or the underlying systems. The performance evaluation also shows that USBESAFE is transparent to the operating system and imposes no discernible performance overhead during the enumeration phase or USB communication compared to the unmodified Linux USB subsystem.
Shijun Zhao, Institute of Software Chinese Academy of Sciences; Qianying Zhang, Capital Normal University Information Engineering College; Yu Qin, Wei Feng, and Dengguo Feng, Institute of Software Chinese Academy of Sciences
ARM specifications recommend that software residing in the TEE's (Trusted Execution Environment) secure world be located in on-chip memory to prevent board-level physical attacks. However, on-chip memory is very limited, placing significant limits on the TEE's functionality. The minimal-kernel operating system architecture addresses this problem by building a small kernel that executes the whole TEE system only in on-chip memory, on demand, and cryptographically protects all the data and code stored outside of the SoC. In this architecture, a small kernel is built inside the TEE OS kernel space and achieves minimal size by including only the components essential to executing and protecting the TEE system. The minimal kernel consists of a minimal demand-paging system, which sets the on-chip memory as the only working memory for the TEE system and the off-chip memory as a backing store, and a memory protection component, which provides confidentiality and integrity protection for the backing store. A Merkle-tree-based memory protection scheme, which reduces the requirement for on-chip memory, allows the minimal kernel to protect large trusted applications (TAs). This OS organization makes it possible to achieve the goal of physical security without losing any of the TEE's functionality. We have incorporated a prototype of the minimal kernel into OP-TEE, a popular open-source TEE OS. Our implementation requires a runtime footprint of only 100 KB of on-chip memory but can protect the entire OP-TEE kernel and TAs, which are dozens of megabytes.
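The Merkle-tree protection idea can be sketched as follows; the tree shape, hash choice, and page granularity are illustrative. Pages swapped to the off-chip backing store are hashed into a tree, so only the root must occupy scarce on-chip memory, yet any board-level tampering with a page changes the recomputed root.

```python
import hashlib

def h(data):
    return hashlib.sha256(data).digest()

# Minimal sketch of Merkle-tree integrity protection over off-chip pages.
def merkle_root(pages):
    level = [h(p) for p in pages]
    while len(level) > 1:
        if len(level) % 2:            # duplicate last node if level is odd
            level.append(level[-1])
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

Verifying a single page on a swap-in only needs the sibling hashes along one root-to-leaf path, which is why the scheme's on-chip memory requirement stays small even for large TAs.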
Flavio Toffalini, Singapore University of Technology and Design; Eleonora Losiouk and Andrea Biondo, University of Padua; Jianying Zhou, Singapore University of Technology and Design; Mauro Conti, University of Padua
The introduction of remote attestation (RA) schemes has allowed academia and industry to enhance the security of their systems. The commercial products currently available enable only the validation of static properties, such as application fingerprints, and do not handle runtime properties, such as control-flow correctness. This limitation has pushed researchers towards the identification of new approaches, called runtime RA. However, these mainly target embedded devices, which share very few features with complex systems, such as virtual machines in a cloud. A naive deployment of runtime RA schemes designed for embedded devices on complex systems faces scalability problems, such as the representation of complex control flows or a slow verification phase.
In this work, we present ScaRR: the first Scalable Runtime Remote attestation scheme for complex systems. Thanks to its novel control-flow model, ScaRR enables the deployment of runtime RA on any application regardless of its complexity, while also achieving good performance. We implemented ScaRR and tested it on the benchmark suite SPEC CPU 2017. We show that ScaRR can validate on average 2M control-flow events per second, clearly outperforming existing solutions.
Eric Gustafson, UC Santa Barbara; Marius Muench, EURECOM; Chad Spensky, Nilo Redini, and Aravind Machiry, UC Santa Barbara; Yanick Fratantonio, Davide Balzarotti, and Aurelien Francillon, EURECOM; Yung Ryn Choe, Sandia National Laboratories; Christopher Kruegel and Giovanni Vigna, UC Santa Barbara
The recent paradigm shift introduced by the Internet of Things (IoT) has brought embedded systems into focus as a target for both security analysts and malicious adversaries. Typified by their lack of standardized hardware, diverse software, and opaque functionality, IoT devices present unique challenges to security analysts due to the tight coupling between their firmware and the hardware for which it was designed. In order to take advantage of modern program analysis techniques, such as fuzzing or symbolic execution, with any kind of scale or depth, analysts must have the ability to execute firmware code in emulated (or virtualized) environments. However, these emulation environments are rarely available and cumbersome to create through manual reverse engineering, greatly limiting the analysis of binary firmware.
In this work, we explore the problem of firmware re-hosting, the process by which firmware is migrated from its original hardware environment into a virtualized one. We show that an approach capable of creating virtual, interactive environments in an automated manner is a necessity to enable firmware analysis at scale. We present the first proof-of-concept system aiming to achieve this goal, called PRETENDER, which uses observations of the interactions between the original hardware and the firmware to automatically create models of peripherals, and allows for the execution of the firmware in a fully-emulated environment. Unlike previous approaches, these models are interactive, stateful, and transferable, meaning they are designed to allow the program to receive and process new input, a requirement of many analyses. We demonstrate our approach on multiple hardware platforms and firmware samples, and show that the models are flexible enough to allow for virtualized code execution, the exploration of new code paths, and the identification of security vulnerabilities.
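The peripheral-modeling idea can be sketched with a toy replay model; the class name and the replay-with-last-value policy are illustrative assumptions, far simpler than PRETENDER's interactive, stateful models.

```python
# Toy memory-mapped peripheral model: reads of a device register are
# answered from values recorded on real hardware, falling back to the
# last seen (or written) value once the recording is exhausted.
class PeripheralModel:
    def __init__(self, recording):
        # recording: {register address: [observed read values, in order]}
        self.recording = {a: list(v) for a, v in recording.items()}
        self.last = {}
    def read(self, address):
        queue = self.recording.get(address)
        if queue:
            self.last[address] = queue.pop(0)
        return self.last.get(address, 0)   # unknown register: benign default
    def write(self, address, value):
        self.last[address] = value         # remember firmware-written state
```

A model like this lets the firmware boot and poll status registers inside an emulator with no physical device attached; the hard part, which PRETENDER automates, is learning models that stay plausible for inputs never seen during recording.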
Li Zhang, Jinan University; Jiongyi Chen, The Chinese University of Hong Kong; Wenrui Diao and Shanqing Guo, Shandong University; Jian Weng, Jinan University; Kehuan Zhang, The Chinese University of Hong Kong
Cryptographic functions play a critical role in the secure transmission and storage of application data. Although most crypto functions are well-defined and carefully implemented in standard libraries, in practice they can easily be misused or incorrectly encapsulated due to their error-prone nature and the inexperience of developers. This situation is even worse in the IoT domain, given that developers tend to sacrifice security for performance in order to suit resource-constrained IoT devices. Given the severity and pervasiveness of such bad practice, it is crucial to raise public awareness of this issue, find the misuses, and shed light on best practices.
In this paper, we design and implement CryptoREX, a framework to identify crypto misuse of IoT devices under diverse architectures and in a scalable manner. In particular, CryptoREX lifts binary code to a unified IR and performs static taint analysis across multiple executables. To aggressively capture and identify misuses of self-defined crypto APIs, CryptoREX dynamically updates the API list during taint analysis and automatically tracks the function arguments.
Running on 521 firmware images with 165 pre-defined crypto APIs, CryptoREX successfully discovered 679 crypto misuse issues in total, costing on average only 1,120 seconds per firmware image. Our study shows that 24.2% of the firmware images violate at least one misuse rule, and most of the discovered misuses were previously unknown. The misuses could result in sensitive data leakage, authentication bypass, password brute-forcing, etc. Our findings highlight the poor implementation and weak protection in today's IoT development.
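The rule-checking core can be sketched as follows; the rule set and the call-site representation are hypothetical, shown only to illustrate how lifted crypto call sites might be matched against misuse rules.

```python
# Illustrative misuse rules in the spirit of CryptoREX's checks: each
# rule inspects one recovered crypto call site (function name plus
# whatever argument facts taint analysis could resolve).
RULES = {
    "ecb_mode":     lambda c: c["func"] == "AES_set_mode"
                              and c["args"].get("mode") == "ECB",
    "constant_key": lambda c: c["func"].endswith("set_key")
                              and c["args"].get("key_is_constant"),
    "short_key":    lambda c: c["args"].get("key_bits", 256) < 128,
}

def check_call_site(call):
    """Return the names of all misuse rules this call site violates."""
    return [name for name, rule in RULES.items() if rule(call)]
```

Because IoT firmware often wraps the standard primitives in self-defined helpers, the real system must first discover which binary functions are crypto APIs at all, which is where its dynamic API-list updating comes in.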
Hamid Reza Ghaeini, Singapore University of Technology and Design; Matthew Chan, Rutgers University; Raad Bahmani and Ferdinand Brasser, TU Darmstadt; Luis Garcia, University of California, Los Angeles; Jianying Zhou, Singapore University of Technology and Design; Ahmad-Reza Sadeghi, TU Darmstadt; Nils Ole Tippenhauer, CISPA, Helmholtz Center for Information Security; Saman Zonouz, Rutgers University
Ensuring the integrity of embedded programmable logic controllers (PLCs) is critical for the safe operation of industrial control systems. In particular, a cyber-attack could manipulate the control logic running on the PLCs to bring the physical process of a safety-critical application into unsafe states. Unfortunately, PLCs are typically not equipped with the hardware support that would allow the use of techniques such as remote attestation to verify the integrity of the logic code. In addition, remote attestation so far has not been able to verify the integrity of the physical process controlled by the PLC.
In this work, we present PAtt, a system that combines remote software attestation with control process validation. PAtt leverages operation permutations—subtle changes in the operation sequences based on integrity measurements—which do not affect the physical process but yield unique traces of sensor readings during execution. By encoding integrity measurements of the PLC's memory state (software and data) into its control operations, our system allows remote verification of the integrity of the control logic based on the resulting sensor traces. We implemented the proposed system on a real PLC controlling a robot arm and demonstrate its feasibility. Our implementation enables the detection of attackers who manipulate the PLC logic to change the process state and/or report spoofed sensor readings (with an accuracy of 97% against tested attacks).
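The operation-permutation idea can be sketched as follows; encoding measurement bits into the ordering of commutative operation pairs is an illustrative toy, not PAtt's actual scheme.

```python
import hashlib

# Toy sketch: bits of an integrity measurement over the PLC's memory
# select between orderings of operation pairs that are interchangeable
# for the physical process, so the observable execution order (and hence
# the sensor trace) encodes the measurement.
def measurement_bits(program_memory, n_bits):
    digest = hashlib.sha256(program_memory).digest()
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(n_bits)]

def permuted_schedule(commutative_pairs, bits):
    """Each measurement bit flips the order of one commutative pair."""
    schedule = []
    for (a, b), bit in zip(commutative_pairs, bits):
        schedule += [b, a] if bit else [a, b]
    return schedule

pairs = [("move_x", "move_y"), ("open_valve_1", "open_valve_2")]
bits = measurement_bits(b"ladder-logic-image", len(pairs))
schedule = permuted_schedule(pairs, bits)
```

A remote verifier that knows the good firmware image can recompute the expected schedule and compare it against the ordering inferred from sensor readings; tampered logic yields, with high probability, a mismatching permutation.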
Kimia Zamiri Azar, Farnoud Farahmand, Hadi Mardani Kamali, Shervin Roshanisefat, and Houman Homayoun, George Mason University; William Diehl, Virginia Tech; Kris Gaj and Avesta Sasan, George Mason University
In this paper, we introduce a novel Communication and Obfuscation Management Architecture (COMA) to handle the storage of the obfuscation key and to secure communication to/from untrusted yet obfuscated circuits. COMA addresses three challenges related to obfuscated circuits. First, it removes the need to store the obfuscation unlock key on the untrusted chip. Second, it implements a mechanism by which the key sent to unlock an obfuscated circuit changes after each activation (even for the same device), transforming the key into a dynamically changing license. Third, it protects communication to/from the COMA-protected device and additionally introduces two novel mechanisms for the exchange of data to/from COMA-protected architectures: (1) a highly secure but slow double encryption, used for the exchange of keys and sensitive data, and (2) a high-performance and low-energy yet leaky encryption, secured by means of frequent key renewal. We demonstrate that, compared to state-of-the-art key management architectures, COMA reduces the area overhead by 14% while adding features including unique chip authentication, activation as a service (for IoT devices), reduced exposure to side-channel attacks on the key management architecture, and two new means of secure communication to/from a COMA-secured untrusted chip.
Tuesday, September 24, 2019
Privacy Enhancing Techniques
Shruti Tople, Microsoft; Yaoqi Jia, Ziliqa Research; Prateek Saxena, NUS
Oblivious RAM (ORAM) is a well-known cryptographic primitive for hiding data access patterns. However, the best known ORAM schemes require logarithmic computation time per access in the general case, which makes them infeasible for use in real-world applications. In practice, hiding data access patterns should incur constant latency per access.
In this work, we present PRO-ORAM, an ORAM construction that achieves constant latency per access for a large class of applications. PRO-ORAM theoretically and empirically guarantees this for read-only data access patterns, wherein data is written once and followed only by read requests. It makes hiding data access patterns practical for read-only workloads, incurring sub-second computational latency per access for data blocks of 256 KB over large (gigabyte-sized) datasets. PRO-ORAM supports throughputs of tens to hundreds of MBps for fetching blocks, which exceeds the network bandwidth available to average users today. Our experiments suggest that the dominant factor in the latency offered by PRO-ORAM is the network throughput of transferring the final blocks, rather than the computational latency of the protocol. At its heart, PRO-ORAM utilizes key observations that enable an aggressively parallelized algorithm for an ORAM construction and a permutation operation, as well as the use of a trusted computing technique (Intel SGX) that not only provides safety but also offers the advantage of lowering communication costs.
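The read-only observation at PRO-ORAM's core can be sketched with a toy store. This is illustrative only: the real construction hides the reshuffle cost by pipelining it across parallel SGX enclaves rather than reshuffling eagerly as below.

```python
import random

# Toy model: after a secret shuffle, serving each physical slot at most
# once makes the observed access sequence independent of the logical
# requests. When a slot would repeat, the store is re-shuffled.
class ReadOnlyStore:
    def __init__(self, blocks, seed=7):
        self.blocks = list(blocks)
        self.rng = random.Random(seed)
        self._shuffle()
    def _shuffle(self):
        self.perm = list(range(len(self.blocks)))  # logical -> physical slot
        self.rng.shuffle(self.perm)                # secret (in-enclave)
        self.used = set()
    def read(self, logical):
        if self.perm[logical] in self.used:        # slot already touched:
            self._shuffle()                        # repeating would leak
        slot = self.perm[logical]
        self.used.add(slot)
        return self.blocks[logical]                # payload fetched via slot
```

Because writes never happen after setup, the permutation only needs refreshing when slots are exhausted, which is what allows the amortized constant cost per access.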
Alfonso Iacovazzi, ST Engineering-SUTD Cyber Security Laboratory, Singapore University of Technology and Design; Daniel Frassinelli, CISPA, Helmholtz Center for Information Security, Germany; Yuval Elovici, Department of Software and Information Systems Engineering and Cyber Security Research Center, Ben-Gurion University of the Negev, Israel, and iTrust—Centre for Research in Cyber Security, Singapore University of Technology and Design, Singapore
Tor is a distributed network composed of volunteer relays which is designed to preserve the sender-receiver anonymity of communications on the Internet. Despite the use of the onion routing paradigm, Tor is vulnerable to traffic analysis attacks. In this paper we present Duster, an active traffic analysis attack based on flow watermarking that exploits a vulnerability in Tor's congestion control mechanism in order to link a Tor onion service with its real IP address. The proposed watermarking system embeds a watermark at the destination of a Tor circuit which is propagated throughout the Tor network and can be detected by our modified Tor relays in the proximity of the onion service. Furthermore, upon detection the watermark is cancelled so that the target onion service remains unaware of its presence. We performed a set of experiments over the real Tor network in order to evaluate the feasibility of this attack. Our results show that true positive rates above 94% and false positive rates below 0.05% can be easily obtained. Finally we discuss a solution to mitigate this and other traffic analysis attacks which exploit Tor's congestion control.
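The general flow-watermarking principle can be sketched as follows; the interval length, delay amount, and mean-offset decoder are generic delay-based stand-ins, not Duster's actual congestion-control-based embedding.

```python
# Toy flow watermark: the watermarker delays every packet inside the
# intervals selected by the bit pattern, and a cooperating observer
# decodes each bit from the mean within-interval arrival offset.
def embed(timestamps, bits, interval=1.0, delay=0.3):
    out = []
    for t in timestamps:
        idx = int(t // interval)
        out.append(t + delay if idx < len(bits) and bits[idx] else t)
    return out

def detect(timestamps, n_bits, interval=1.0, delay=0.3):
    bits = []
    for i in range(n_bits):
        offsets = [t - i * interval for t in timestamps
                   if i * interval <= t < (i + 1) * interval]
        bits.append(1 if offsets and sum(offsets) / len(offsets) > delay / 2
                    else 0)
    return bits

# Packets arrive early in each interval so the delay never spills over.
flow = [i + frac for i in range(4) for frac in (0.0, 0.1, 0.2)]
marked = embed(flow, [1, 0, 1, 0])
```

Recognizing the embedded pattern at a relay near the onion service is what links the service to its real IP address; cancelling the watermark after detection, as the paper does, keeps the target unaware.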
Konstantinos Solomos, FORTH; Panagiotis Ilia, University of Illinois at Chicago; Sotiris Ioannidis, FORTH; Nicolas Kourtellis, Telefonica Research
Although digital advertising fuels much of today’s free Web, it typically does so at the cost of online users’ privacy, due to the continuous tracking and leakage of users’ personal data. In search for new ways to optimize the effectiveness of ads, advertisers have introduced new advanced paradigms such as cross-device tracking (CDT), to monitor users’ browsing on multiple devices and screens, and deliver (re)targeted ads in the most appropriate screen. Unfortunately, this practice leads to greater privacy concerns for the end-user.
Going beyond the state-of-the-art, we propose a novel methodology for detecting CDT and measuring the factors affecting its performance, in a repeatable and systematic way. This new methodology is based on emulating realistic browsing activity of end-users, from different devices, and thus triggering and detecting cross-device targeted ads. We design and build Talon, a CDT measurement framework that implements our methodology and allows experimentation with multiple parallel devices, experimental setups and settings. By employing Talon, we perform several critical experiments, and we are able to not only detect and measure CDT with average AUC score of 0.78-0.96, but also to provide significant insights about the behavior of CDT entities and the impact on users’ privacy. In the hands of privacy researchers, policy makers and end-users, Talon can be an invaluable tool for raising awareness and increasing transparency on tracking practices used by the ad-ecosystem.
Android Security I
Nir Sivan, Ron Bitton, and Asaf Shabtai, Ben Gurion University of the Negev
In recent years we have witnessed a shift towards personalized, context-based services for mobile devices. A key component of many of these services is the ability to infer the current location and predict the future location of users based on location sensors embedded in the devices. Such knowledge enables service providers to present relevant and timely offers to their users and better manage traffic congestion, thus increasing customer satisfaction and engagement. However, such services suffer from location data leakage, which has become one of today's most concerning privacy issues for smartphone users. In this paper we focus specifically on location data that is exposed by Android applications via Internet network traffic in plaintext, without the user's awareness. We present an empirical evaluation involving the network traffic of real mobile device users, aimed at: (1) measuring the extent of relevant location data leakage in the Internet traffic of Android-based smartphone devices; (2) understanding the value of this data and the ability to infer users' points of interest (POIs); and (3) deriving a step-by-step attack aimed at inferring the user's POIs under realistic, real-world assumptions. This was achieved by analyzing the Internet traffic recorded from the smartphones of a group of 71 participants for an average period of 37 days. We also propose a procedure for mining and filtering location data from raw network traffic and utilize geolocation clustering methods to infer users' POIs. The key findings of this research center on the extent of this phenomenon in terms of both ubiquity and severity: we found that over 85% of the users' devices leaked location data, and the exposure rate of users' POIs, derived from the relatively sparse leakage indicators, is around 61%.
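The POI-inference step can be sketched with a naive geolocation clustering pass; the radius, the single-pass method, and the centroid rule are illustrative, not the paper's actual pipeline.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 6371000 * 2 * asin(sqrt(a))

# Naive single-pass clustering of leaked location fixes: a point joins
# the first cluster whose centroid lies within radius_m, otherwise it
# seeds a new cluster. Dense clusters become candidate POIs.
def cluster_pois(points, radius_m=200):
    clusters = []
    for p in points:
        for c in clusters:
            centroid = (sum(x for x, _ in c) / len(c),
                        sum(y for _, y in c) / len(c))
            if haversine_m(p, centroid) <= radius_m:
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters
```

Even sparse leakage indicators tend to pile up around home and work, which is why the attack recovers POIs from plaintext traffic alone.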
Wenrui Diao, Shandong University; Yue Zhang and Li Zhang, Jinan University; Zhou Li, University of California, Irvine; Fenghao Xu, The Chinese University of Hong Kong; Xiaorui Pan, Indiana University Bloomington; Xiangyu Liu, Alibaba Inc.; Jian Weng, Jinan University; Kehuan Zhang, The Chinese University of Hong Kong; XiaoFeng Wang, Indiana University Bloomington
Assistive technologies have been integrated into nearly all mainstream operating systems to assist users with disabilities or difficulties in operating their devices. On Android, Google provides app developers with accessibility APIs to make their apps accessible. Previous research has demonstrated that a variety of stealthy attacks can be launched by exploiting accessibility capabilities (with the BIND_ACCESSIBILITY_SERVICE permission granted). However, none of this work systematically studied the underlying design of the Android accessibility framework, leaving the security implications of deploying accessibility features not fully understood.
In this paper, we make the first attempt to systematically evaluate the usage of the accessibility APIs and the design of their supporting architecture. Through code review and a large-scale app scanning study, we find the accessibility APIs have been misused widely. Further, we identify a series of fundamental design shortcomings of the Android accessibility framework: (1) no restriction on the purposes of using the accessibility APIs; (2) no strong guarantee of the integrity of accessibility event processing; (3) no restriction on the properties of custom accessibility events. Based on these observations, we demonstrate two practical attacks—installation hijacking and notification phishing—as showcases. As a result, tens of millions of users are under these threats. The flaws and attack cases described in this paper have been responsibly reported to the Android security team and the corresponding vendors. In addition, we propose some improvement recommendations to mitigate these security threats.
Yue Duan, Cornell University; Lian Gao, Jie Hu, and Heng Yin, University of California Riverside
Third-party libraries, which are ubiquitous in Android apps, pose serious security threats to end users, as they rarely receive timely updates from app developers, leaving many security vulnerabilities unpatched. This outdatedness is mainly due to the fact that manually updating libraries can be non-trivial and time-consuming for app developers, since it usually involves code modifications. In this paper, we propose a technique that automatically generates non-intrusive updates for third-party libraries in Android apps. Given an Android app with an outdated library and a newer version of the library, we automatically update the old library in a way that is guaranteed to be fully backward compatible and has zero impact on the library's interactions with other components. To understand the potential impact of code changes, we propose a novel Value-sensitive Differential Slicing algorithm that leverages the diffing information between two versions of a library. The new slicing algorithm greatly reduces the over-conservativeness of traditional slicing while still preserving soundness with respect to update generation. We have implemented a prototype called LibBandAid. We further evaluated its efficacy on 9 popular libraries with 173 security commits across 83 different versions and 100 real-world open-source apps. The experimental results show that LibBandAid achieves a high average successful update rate of 80.6% for security vulnerabilities, and an even higher rate of 94.07% when potentially patchable vulnerabilities are also included.
Machine Learning & Watermarking
Fei Zuo, Bokai Yang, Xiaopeng Li, Lannan Luo, and Qiang Zeng, University of South Carolina
Despite the great achievements made by neural networks on tasks such as image classification, they are brittle and vulnerable to adversarial example (AE) attacks, which are crafted by adding human-imperceptible perturbations to inputs so that a neural-network-based classifier labels them incorrectly. In particular, L0 AEs are a widely discussed category of threats in which adversaries are restricted in the number of pixels they can corrupt. However, our observation is that, while L0 attacks modify as few pixels as possible, they tend to cause large-amplitude perturbations to the modified pixels. We consider this an inherent limitation of L0 AEs and thwart such attacks by both detecting and rectifying them. The main novelty of the proposed detector is that we convert the AE detection problem into a comparison problem by exploiting this inherent limitation of L0 attacks. More concretely, given an image I, it is pre-processed to obtain another image I'. A Siamese network, which is known to be effective at comparison, takes I and I' as an input pair to determine whether I is an AE. A trained Siamese network automatically and precisely captures the discrepancies between I and I' to detect L0 perturbations. In addition, we show that the pre-processing technique used for detection, inpainting, can also work as an effective defense, with a high probability of removing the adversarial influence of L0 perturbations. Thus, our system, called AEPECKER, demonstrates not only high AE detection accuracy but also a notable capability to correct classification results.
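The core intuition — that L0 perturbations touch few pixels but change them by a large amount, so a tampered image disagrees sharply with an inpainted copy of itself — can be illustrated with a toy stdlib-only sketch (all names hypothetical; the paper's actual detector is a trained Siamese network over I and I', not this simple threshold rule):

```python
from statistics import median

def inpaint(img):
    """Rebuild each pixel from the median of its 4-neighbours."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            nbrs = [img[ny][nx]
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1))
                    if 0 <= ny < h and 0 <= nx < w]
            out[y][x] = median(nbrs)
    return out

def looks_like_l0_ae(img, threshold=100):
    """Flag the image if any pixel deviates sharply from its inpainted value,
    standing in for the Siamese comparison of I and I'."""
    rec = inpaint(img)
    return any(abs(a - b) > threshold
               for row_a, row_b in zip(img, rec)
               for a, b in zip(row_a, row_b))

clean = [[50] * 5 for _ in range(5)]           # uniform grayscale patch
tampered = [row[:] for row in clean]
tampered[2][2] = 255                            # one large-amplitude pixel, as in an L0 attack
```

A real detector learns the comparison instead of thresholding it, but the inpainted copy plays the same role as I' in the abstract above.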
Jianqiang Wang, Shanghai Jiao Tong University; Siqi Ma, CSIRO DATA61; Yuanyuan Zhang and Juanru Li, Shanghai Jiao Tong University; Zheyu Ma, Northwestern Polytechnical University; Long Mai, Tiancheng Chen, and Dawu Gu, Shanghai Jiao Tong University
Memory corruption vulnerabilities are serious threats to software security and are often triggered by improper use of memory operation functions. Detecting memory corruptions relies on identifying memory operation functions and examining the corresponding manipulations applied to memory. Nevertheless, distinguishing memory operation functions is challenging, because both standard and customized memory operation functions are declared in real-world software. In this paper, we propose NLP-EYE, an NLP-based memory corruption detection system. NLP-EYE is able to identify memory operation functions automatically through semantic-aware source code analysis. It first creates a programming-language-friendly corpus in order to parse function prototypes. Based on similarity comparison utilizing both semantic and syntactic information, NLP-EYE identifies and labels both standard and customized memory operation functions. Finally, it uses symbolic execution to check whether a memory operation causes incorrect memory usage.
Instead of analyzing data dependencies across the entire source code, NLP-EYE focuses only on the memory operation parts. We evaluated the performance of NLP-EYE on seven real-world libraries and programs, including Vim, Git, and CPython. NLP-EYE successfully identified 27 null pointer dereferences, two double-frees, and three use-after-frees that had not been discovered before in the latest versions of the analysis targets.
Erman Ayday, Case Western Reserve University and Bilkent University; Emre Yilmaz, Case Western Reserve University; Arif Yilmaz, Bilkent University
In this work, we address the liability issues that may arise due to unauthorized sharing of personal data. We consider a scenario in which an individual shares his sequential data (such as genomic data or location patterns) with several service providers (SPs). In such a scenario, if his data is shared with other third parties without his consent, the individual wants to determine the service provider that is responsible for this unauthorized sharing. To provide this functionality, we propose a novel optimization-based watermarking scheme for sharing of sequential data. The proposed scheme guarantees with high probability that (i) a malicious SP that receives the data cannot identify the watermarked data points, (ii) even when multiple malicious SPs aggregate their data, they still cannot determine the watermarked data points, (iii) even if the unauthorized sharing involves only a portion of the original data or modified data (intended to damage the watermark), the corresponding malicious SP can be held responsible for the leakage, and (iv) the added watermark is compliant with the nature of the corresponding data. That is, if there are inherent correlations in the data, the added watermark preserves them. The proposed scheme also minimizes the utility loss due to changing certain parts of the data while providing the aforementioned security guarantees. Furthermore, we conduct a case study of the proposed scheme on genomic data and show its security and utility guarantees.
Smart Malware that Uses Leaked Control Data of Robotic Applications: The Case of Raven-II Surgical Robots
Keywhan Chung and Xiao Li, University of Illinois at Urbana-Champaign; Peicheng Tang, Rose-Hulman Institute of Technology; Zeran Zhu, Zbigniew T. Kalbarczyk, Ravishankar K. Iyer, and Thenkurussi Kesavadas, University of Illinois at Urbana-Champaign
In this paper, we demonstrate a new type of threat that leverages machine learning techniques to maximize its impact. We use the Raven-II surgical robot and its haptic feedback rendering algorithm as an application. We exploit ROS vulnerabilities and implement smart self-learning malware that can track the movements of the robot’s arms and trigger the attack payload when the robot is in a critical stage of a (hypothetical) surgical procedure. By keeping the learning procedure internal to the malicious node that runs outside the physical components of the robotic application, an adversary can hide most of the malicious activities from security monitors that might be deployed in the system. Also, if an attack payload mimics an accidental failure, it is likely that the system administrator will fail to identify the malicious intention and will treat the attack as an accidental failure. After demonstrating the security threats, we devise methods (i.e., a safety engine) to protect the robotic system against the identified risk.
Samuel Weiser, Luca Mayr, Michael Schwarz, and Daniel Gruss, Graz University of Technology
Trusted execution environments, such as Intel SGX, allow executing enclaves shielded from the rest of the system. This fosters new application scenarios not only in cloud settings but also for securing various types of end-user applications. However, these technologies also introduce new threats. Due to the strong isolation guarantees of SGX, enclaves can effectively hide malicious payloads from antivirus software. While these scenarios were already outlined years ago, we have witnessed functional attacks in the recent past. Unfortunately, no reasonable defense against enclave malware has been proposed so far.
In this work, we present the first practical defense mechanism protecting against various types of enclave misbehavior. By studying known and future attack vectors, we identify the root cause of the enclave malware threat as an overly permissive host interface for SGX enclaves, which leads to a dangerous asymmetry between enclaves and applications. To overcome this asymmetry, we design SGXJail, an enclave compartmentalization mechanism that makes use of flexible memory access policies. SGXJail effectively defeats a wide range of enclave malware threats while remaining compatible with existing enclave infrastructure. Our proof-of-concept software implementation confirms the efficiency of SGXJail on commodity systems. We furthermore present slight extensions to the SGX specification, which allow for even more efficient enclave compartmentalization by leveraging Intel memory protection keys. Apart from defeating enclave malware, SGXJail enables new use cases beyond the original SGX threat model. We envision SGXJail not only for site isolation in modern browsers, i.e., confining different browser tabs, but also for third-party plugin or library management.
Richard Li, University of Utah; Min Du, University of California Berkeley; David Johnson, Robert Ricci, Jacobus Van der Merwe, and Eric Eide, University of Utah
Kernel-resident malware remains a significant threat. An effective way to detect such malware is to examine the kernel memory of many similar (virtual) machines, as one might find in an enterprise network or cloud, in search of anomalies: i.e., the relatively rare infected hosts within a large population of healthy hosts. It is challenging, however, to compare the kernel memories of different hosts against each other. Previous work has relied on knowledge of specific kernels—e.g., the locations of important variables and the layouts of key data structures—to cross the "semantic gap" and allow kernels to be compared. As a result, those previous systems work only with the kernels they were built for, and they make assumptions about the malware being searched for.
Wednesday, September 25, 2019
Timothy Barron, Najmeh Miramirkhani, and Nick Nikiforakis, Stony Brook University
Domain names are a valuable resource on the web. Most domains are available to the public on a first-come, first-served basis, and once a domain is purchased, the owner keeps it for at least one year before choosing whether to renew. Common wisdom suggests that even if a domain name stops being useful to its owner, the owner will simply let the domain expire organically and choose not to renew it.
In this paper, contrary to common wisdom, we report on the discovery that domain names are often deleted before their expiration date. This is concerning because the practice offers no advantage to legitimate users, while malicious actors who delete domains may hamper forensic analysis of malicious campaigns, and registrars who delete domains instead of suspending them enable re-registration and continued abuse. Specifically, we present the first systematic analysis of early domain name disappearances from the largest top-level domains (TLDs). We find more than 386,000 cases where domain names were deleted before expiring, and we discover individuals with more than 1,000 domains deleted in a single day. Moreover, we identify the specific registrars that choose to delete domain names instead of suspending them. We compare lexical features of these domains, finding significant differences between domains that are deleted early, suspended, or expiring organically. Furthermore, we explore potential reasons for deletion, finding over 7,000 domain names squatting on more popular domains and more than 10,000 associated with malicious registrants.
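As a rough illustration of the kind of lexical comparison described above, here is a small stdlib-only sketch (the feature set is hypothetical, not the paper's): length, digit ratio, and character entropy are classic features that tend to separate algorithmically generated names from organic ones:

```python
import math
from collections import Counter

def lexical_features(domain):
    """Compute toy lexical features over the second-level label."""
    name = domain.split(".")[0]
    counts = Counter(name)
    # Shannon entropy over character frequencies
    entropy = -sum((c / len(name)) * math.log2(c / len(name))
                   for c in counts.values())
    digits = sum(ch.isdigit() for ch in name)
    return {"length": len(name),
            "digit_ratio": digits / len(name),
            "entropy": round(entropy, 3)}

f1 = lexical_features("example.com")      # organic-looking name
f2 = lexical_features("x7f9q2hd81.com")   # generated-looking name
```

A study such as this one would compute features like these over each population (early-deleted, suspended, organically expiring) and compare the resulting distributions.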
HinDom: A Robust Malicious Domain Detection System based on Heterogeneous Information Network with Transductive Classification
Xiaoqing Sun, Mingkai Tong, and Jiahai Yang, Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China; Liu Xinran, National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing, China; Liu Heng, China Electronics Cyberspace Great Wall Co., Ltd, Beijing, China
The domain name system (DNS) is a crucial part of the Internet, yet it has been widely exploited by cyber attackers. Apart from rendering static methods such as blacklists and sinkholes infeasible, evasive attackers can even bypass detection systems based on machine learning classifiers. As a solution to this problem, we propose a more robust domain detection system named HinDom. Instead of relying on local features, HinDom obtains a global view by constructing a heterogeneous information network (HIN) of clients, domains, IP addresses, and their diverse relationships. Moreover, its metapath-based transductive classification method enables HinDom to detect malicious domains with only a small fraction of labeled samples. As far as we know, this is the first work to apply HINs to malicious domain detection. We build a prototype of HinDom and evaluate it in CERNET2 and TUNET. The results reveal that HinDom is accurate and robust and can identify previously unknown malicious domains.
Daiki Chiba, Ayako Akiyama Hasegawa, and Takashi Koide, NTT Secure Platform Laboratories; Yuta Sawabe and Shigeki Goto, Waseda University; Mitsuaki Akiyama, NTT Secure Platform Laboratories
Cyber attackers create domain names that are visually similar to those of legitimate/popular brands by abusing valid internationalized domain names (IDNs). In this work, we systematize such domain names, which we call deceptive IDNs, and study the risks associated with them. In particular, we propose a new system called DomainScouter to detect various deceptive IDNs and calculate a deceptive IDN score, a new metric indicating the number of users that are likely to be misled by a deceptive IDN. We perform a comprehensive measurement study on the identified deceptive IDNs using over 4.4 million registered IDNs under 570 top-level domains (TLDs). The measurement results demonstrate that there are many previously unexplored deceptive IDNs targeting non-English brands or combining other domain squatting methods. Furthermore, we conduct online surveys to examine and highlight vulnerabilities in user perceptions when encountering such IDNs. Finally, we discuss the practical countermeasures that stakeholders can take against deceptive IDNs.
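The basic mechanism behind detecting visually similar IDNs can be sketched with a tiny "skeleton" mapping (the confusables table below is a hypothetical five-entry excerpt; real systems such as the one described here build on the full Unicode confusables data, and DomainScouter's actual scoring is considerably richer):

```python
# Hypothetical mini-table of visually confusable characters.
CONFUSABLES = {
    "\u0430": "a",  # Cyrillic а
    "\u0435": "e",  # Cyrillic е
    "\u043e": "o",  # Cyrillic о
    "\u0440": "p",  # Cyrillic р
    "\u0131": "i",  # Latin dotless ı
}

def skeleton(domain):
    """Map each confusable character to its ASCII look-alike."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in domain)

def is_deceptive(domain, brands):
    """Flag a domain whose skeleton matches a popular brand domain
    while the raw string differs from it."""
    s = skeleton(domain)
    return s != domain and s in brands

brands = {"apple.com", "google.com"}
```

For example, `"\u0430pple.com"` (with a Cyrillic а) skeletonizes to `"apple.com"` and would be flagged, while the genuine `"apple.com"` would not.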
Dynamically Finding Minimal Eviction Sets Can Be Quicker Than You Think for Side-Channel Attacks against the LLC
Wei Song, Institute of Information Engineering, CAS; Peng Liu, Pennsylvania State University
Minimal eviction sets are essential for conflict-based cache side-channel attacks targeting the last-level cache. In the most restricted case, where attackers have no control over the mapping from virtual addresses to cache sets, finding rather than computing minimal eviction sets becomes the only solution. It was long believed that finding minimal eviction sets is a slow process, until a recent discovery showed that they can be found in linear time.
This paper focuses on further improving the speed of finding minimal eviction sets. We conduct a systematic analysis of the existing algorithms using an ideal cache model. Our analysis shows that the latency upper bound of finding minimal eviction sets can be further reduced from O(w²n) to O(wn), that the average latency is substantially lower than the upper bound, and that the latency assumption used by recent defenses is significantly overestimated. We conduct practical experiments on three modern processors. Using a handful of new techniques proposed in this paper, including concurrent multithreaded execution to circumvent thrashing-resistant cache replacement policies, we demonstrate that minimal eviction sets can be found within a fraction of a second on all processors, including a recent Coffee Lake one. We also show, for the first time, that it is possible to find minimal eviction sets from fully random addresses, without fixing the page offset bits.
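The baseline group-testing reduction that this line of work starts from can be simulated against an ideal cache model (a Python sketch with hypothetical names; a real attack replaces the `evicts` oracle with a timing measurement on hardware, and the paper's contribution is making this process faster than the bound suggests):

```python
import random

W = 4  # cache associativity (ways per set)

def evicts(candidate, conflicting):
    """Ideal-cache oracle: a set evicts the victim iff it holds at
    least W addresses mapping to the victim's cache set."""
    return len(candidate & conflicting) >= W

def reduce_to_minimal(candidate, conflicting):
    """Group-testing reduction: split the working set into W+1 groups
    and discard any group whose removal still leaves an evicting set.
    By pigeonhole, such a group always exists while |s| > W."""
    s = list(candidate)
    while len(s) > W:
        k = W + 1
        groups = [s[i::k] for i in range(k)]
        for g in groups:
            rest = set(s) - set(g)
            if evicts(rest, conflicting):
                s = list(rest)
                break
    return set(s)

random.seed(1)
addrs = set(range(1000))                              # candidate pool
conflict = set(random.sample(sorted(addrs), 40))      # addresses in the victim's set
minimal = reduce_to_minimal(addrs, conflict)
```

The reduction ends with exactly `W` addresses, all of which map to the victim's cache set — a minimal eviction set under this idealized model.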
Wubing Wang, Yinqian Zhang, and Zhiqiang Lin, The Ohio State University
While Intel SGX provides confidentiality and integrity guarantees to programs running inside enclaves, side channels remain a primary concern for SGX security. Previous works have broadly considered side-channel attacks against SGX enclaves at the levels of pages, caches, and branches, using a variety of attack vectors and techniques. Most of these studies have exploited only the "order" attribute of memory access patterns (e.g., sequences of page accesses) as side channels. However, the other attribute of memory access patterns, "time", which characterizes the interval between two specific memory accesses, is mostly unexplored. In this paper, we present ANABLEPS, a tool that automates the detection of side-channel vulnerabilities in enclave binaries, considering both order and time. ANABLEPS leverages concolic execution and fuzzing techniques to generate input sets for an arbitrary enclave program, constructs an extended dynamic control-flow graph representation of execution traces using Intel PT, and automatically analyzes and identifies side-channel vulnerabilities using graph analysis.
Security in Data Centers and the Cloud
Abdulhakim Sabur, Ankur Chowdhary, and Dijiang Huang, Arizona State University; Myong Kang, Anya Kim, and Alexander Velazquez, Naval Research Lab
With an average network size approaching 8,000 servers, data-center networks need scalable security-state monitoring solutions. Using an attack graph (AG) to identify possible attack paths and assess network risk is a common approach. However, existing AG generation approaches suffer from the state-space explosion problem: the size of the AG increases exponentially as the number of services and vulnerabilities grows. To address this issue, we propose a network-segmentation-based scalable security-state management framework, called S3, which applies a divide-and-conquer approach: it creates multiple small-scale AGs (i.e., sub-AGs) by partitioning a large network into manageable smaller segments, and then merges them to establish the entire AG for the whole system. S3 utilizes an SDN-based distributed firewall (DFW) to manage service reachability among different network segments, thereby avoiding reconstruction of the entire system-level AG due to dependencies among vulnerabilities. A series of experiments demonstrates that S3 (i) reduces AG generation and analysis complexity by reducing the AG’s density compared to existing AG-based solutions, and (ii) utilizes the SDN-based DFW to provide a granular security management framework by incorporating security policies at the level of individual hosts and segments. S3 thus helps limit targeted low-and-slow attacks involving lateral movement.
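The divide-and-conquer idea can be pictured with a minimal toy model (all names and data are hypothetical; S3's actual AG construction works over services and vulnerabilities, not this bare edge list): build a sub-AG per segment from intra-segment exploits, then add only the cross-segment edges that the DFW policy permits at merge time.

```python
from collections import defaultdict

# Hypothetical toy inputs: host-to-segment map, exploitable host-to-host
# steps, and the segment-to-segment flows the distributed firewall allows.
segment_of = {"web1": "dmz", "web2": "dmz", "db1": "internal"}
exploits = [("web1", "web2"), ("web1", "db1"), ("web2", "db1")]
dfw_allows = {("dmz", "internal")}

def build_sub_ags(segment_of, exploits):
    """One small AG per segment, holding only intra-segment edges."""
    sub = defaultdict(list)
    for src, dst in exploits:
        if segment_of[src] == segment_of[dst]:
            sub[segment_of[src]].append((src, dst))
    return dict(sub)

def merge(sub_ags, segment_of, exploits, dfw_allows):
    """Stitch sub-AGs together, admitting only DFW-permitted cross edges."""
    edges = [e for es in sub_ags.values() for e in es]
    for src, dst in exploits:
        s, d = segment_of[src], segment_of[dst]
        if s != d and (s, d) in dfw_allows:
            edges.append((src, dst))
    return edges

sub = build_sub_ags(segment_of, exploits)
full = merge(sub, segment_of, exploits, dfw_allows)
```

Tightening `dfw_allows` shrinks the merged graph directly, which is the lever S3 uses against lateral movement.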
Wu Luo, Qingni Shen, Yutang Xia, and Zhonghai Wu, Peking University, Beijing, China
Container-based virtualization has been widely adopted and has had an unprecedented influence on traditional IT architecture. How to build trust in containers has become an important security issue as well. Despite substantial efforts to solve this issue, several challenges remain: how to avoid exposing information about the underlying host and other users' containers to a remote verifier, and how to measure the integrity status of a designated container, along with the services it relies on in the underlying host, and generate hardware-based integrity evidence. None of the current solutions can address these challenges and guarantee efficiency simultaneously.
In this paper, we present Container-IMA, a novel solution that copes with these challenges. We first analyze the essential evidence needed to validate the integrity of a designated container. We then partition the traditional Measurement Log (ML), which preserves privacy and decreases attestation latency. A container-based Platform Configuration Register (cPCR) mechanism is introduced to protect each ML partition with a hardware-based root of trust, and a corresponding attestation mechanism is proposed. We implement a prototype based on Docker. The experimental results demonstrate the effectiveness and efficiency of our solution.
Jiahao Cao, Tsinghua University and George Mason University; Zijie Yang, Tsinghua University; Kun Sun, George Mason University; Qi Li and Mingwei Xu, Tsinghua University; Peiyi Han, Beijing University of Posts and Telecommunications
By decoupling the control and data planes, Software-Defined Networking (SDN) enriches network functionality by deploying diverse applications on a logically centralized controller. While these applications reveal the presence or absence of internal network services and functionalities, they are deployed as black boxes, invisible to network users. In this paper, we show that an adversary can infer which applications run on SDN controllers by analyzing low-level, encrypted control traffic. Such information can help an adversary identify valuable targets, learn of the possible presence of network defenses, and thus devise a better plan for a later stage of an attack. We design deep-learning-based methods to accurately and efficiently fingerprint SDN applications from mixed control traffic. To evaluate the feasibility of the attack, we collect massive traces of control traffic from a real SDN testbed running various applications. Extensive experiments demonstrate that an adversary can identify various SDN applications with 95.4% accuracy on average.
Android Security II
Dario Nisi, EURECOM; Antonio Bianchi, University of Iowa; Yanick Fratantonio, EURECOM
Within the realm of program analysis, dynamic analysis approaches are at the foundation of many frameworks. In the context of Android security, the vast majority of existing frameworks perform API-level tracing (i.e., they aim at obtaining the trace of the APIs invoked by a given app) and use this information to determine whether the app under analysis contains unwanted or malicious functionality. However, previous works have shown that these API-level tracing and instrumentation mechanisms can be easily evaded, regardless of their specific implementation details. An alternative to API-level tracing is syscall-level tracing. This approach works at a lower level and extracts the sequence of syscalls invoked by a given app: the advantage is that it can be implemented in kernel space and thus cannot be evaded, and it is very challenging, if not outright impossible, for code running in user space to detect. However, while this approach offers stronger security guarantees, it suffers from a significant limitation: most of the semantics of the app’s behavior is lost. Syscalls are low-level and do not carry as much information as the semantically rich Android APIs. In other words, there is a significant semantic gap.
This paper presents the first exploration of how much it would take to bridge this gap and how challenging the endeavor would be. We propose an approach, an analysis framework, and a pipeline to gain insights into the peculiarities of this problem, and we show that it is much more challenging than previously thought.
Lun-Pin Yuan, Penn State University; Wenjun Hu, Palo Alto Networks Inc.; Ting Yu, Qatar Computing Research Institute; Peng Liu and Sencun Zhu, Penn State University
Android malware writers often use online malware scanners to check how well their malware can evade detection, and indeed we can find malware scan reports that were generated before the major outbreaks of such malware. If we could identify in-development malware before its deployment, we could develop effective defense mechanisms to prevent it from causing devastating consequences. To this end, we propose Lshand to discover undiscovered malware before day zero, which we refer to as negative-day malware. The challenges include scalability and the fact that malware writers apply detection evasion and submission anonymization techniques. Our approach is based on the observation that malware development is a continuous process, and thus malware variants inevitably share certain characteristics throughout the development process. Accordingly, Lshand clusters scan reports based on selective features and then performs further analysis on those seemingly benign apps that share similarity with malware variants. We implemented and evaluated Lshand on submissions to VirusTotal. Our results show that Lshand is capable of hunting down undiscovered malware at scale, and our manual analysis and a third-party scanner confirmed our negative-day malware findings to be malware or grayware.
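The clustering step can be pictured with a minimal greedy sketch (all feature names are hypothetical, and Lshand's actual feature selection and clustering are more elaborate): scan reports whose feature sets overlap strongly — say, a shared signing certificate and suspicious API calls — end up in the same cluster, so a seemingly benign submission landing in a malware-variant cluster becomes a candidate for further analysis.

```python
def jaccard(a, b):
    """Set-overlap similarity between two feature sets."""
    return len(a & b) / len(a | b)

def cluster_reports(reports, threshold=0.5):
    """Greedy single-pass clustering: a report joins the first cluster
    whose accumulated features it sufficiently overlaps, else it
    starts a new cluster."""
    clusters = []
    for name, feats in reports.items():
        for c in clusters:
            if jaccard(feats, c["feats"]) >= threshold:
                c["members"].append(name)
                c["feats"] |= feats
                break
        else:
            clusters.append({"feats": set(feats), "members": [name]})
    return clusters

# Hypothetical scan-report features (certificate, APIs, permissions).
reports = {
    "v1":    {"cert:abc", "api:sendSMS", "perm:SMS"},
    "v2":    {"cert:abc", "api:sendSMS", "perm:SMS", "api:hideIcon"},
    "other": {"cert:xyz", "api:getLocation"},
}
clusters = cluster_reports(reports)
```

Here `v1` and `v2` cluster together as likely variants of one development lineage, while the unrelated report stands alone.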
Aisha Ali-Gombe, Towson University; Sneha Sudhakaran, Louisiana State University; Andrew Case, Volatility Foundation; Golden G. Richard III, Louisiana State University
There is a growing need for post-mortem analysis in forensic investigations involving mobile devices, particularly when application-specific behaviors must be analyzed. This is especially true for architectures such as Android, where traditional kernel-level memory analysis frameworks such as Volatility face serious challenges in recovering and providing context for user-space artifacts. In this research work, we developed an app-agnostic userland memory analysis technique that targets the new Android Runtime (ART). Leveraging its latest memory allocation algorithm, region-based memory management, we developed a system called DroidScraper that recovers vital runtime data structures for applications by enumerating and reconstructing allocated objects from a process memory image. The results of our evaluation show that DroidScraper can recover and decode nearly 90% of all live objects across all allocated memory regions.