HotDep '10

WORKSHOP PROGRAM ABSTRACTS

Sunday, October 3, 2010
8:30 a.m.–10:00 a.m.

Storyboard: Optimistic Deterministic Multithreading
State-machine replication is a general approach to address the increasing importance of network-based services by improving their availability and reliability via replicated execution. If a service is deterministic, multiple replicas will produce the same results, and faults can be tolerated by means of agreement protocols. Unfortunately, real-life services are often not deterministic. One major source of non-determinism is multi-threaded execution with shared data access, in which the thread execution order is determined by the run-time system and the outcome may depend on which thread accesses data first. We present Storyboard, an approach that ensures deterministic execution of multi-threaded programs. Storyboard achieves this by utilizing application-specific knowledge to minimize costly inter-replica coordination and to exploit concurrency to a degree similar to non-deterministic execution. This is accomplished by forecasting a likely execution path, expressed as an ordered sequence of locks that protect critical sections. If the forecast is correct, a request executes in parallel with other running requests without further action. Only if the forecast turns out to be incorrect is an alternative execution path resolved by inter-replica coordination.
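As a rough illustration of the forecast idea, the sketch below (in Python; names such as Forecast, locks_needed, and coordinate_replicas are invented, not taken from the paper) executes a request optimistically as long as each lock it acquires matches the next entry of the predicted lock sequence, and falls back to inter-replica coordination on the first mismatch.

    # Minimal sketch, not the paper's implementation: a request runs without
    # coordination as long as its lock acquisitions follow the forecast
    # sequence; the first mismatch triggers inter-replica coordination.
    class Forecast:
        """A predicted, ordered sequence of lock names for one request."""
        def __init__(self, lock_sequence):
            self.expected = list(lock_sequence)
            self.position = 0

        def matches(self, lock_name):
            return (self.position < len(self.expected)
                    and self.expected[self.position] == lock_name)

        def advance(self):
            self.position += 1

    def execute_request(request, forecast, locks, coordinate_replicas):
        """locks: dict of threading.Lock-like objects; request and
        coordinate_replicas are hypothetical application hooks."""
        for lock_name in request.locks_needed():
            if forecast.matches(lock_name):
                with locks[lock_name]:            # forecast order preserved
                    request.run_critical_section(lock_name)
                forecast.advance()
            else:
                # Incorrect forecast: agree on an alternative execution
                # path with the other replicas before continuing.
                return coordinate_replicas(request, lock_name)
        return request.result()

The point of the sketch is the fast path: as long as the forecast holds, no messages are exchanged, so correct predictions cost nothing beyond normal locking.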

Scalable Agreement: Toward Ordering as a Service
Replicated state machines use agreement protocols such as Paxos to order client requests. These protocols are not scalable and can quickly become a performance bottleneck as the degree of fault tolerance and the demand for throughput increase. We propose a scalable agreement protocol that can utilize additional resources to provide higher throughput while guaranteeing linearizability for client requests. Our protocol can build on existing optimizations, as it uses protocols like Paxos as a building block. A preliminary performance evaluation shows a throughput increase of 50%–179% over a baseline strategy without adding any hardware; with additional hardware we achieve even higher performance gains.
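To make the scaling idea concrete, here is a toy model (not the paper's protocol; the round-robin merge rule and class names are assumptions) in which several independent ordering instances each sequence a disjoint share of the requests, and every replica merges the partial orders with the same deterministic rule, yielding one total order.

    # Generic illustration: each OrderingInstance stands in for a Paxos-like
    # agreement group that totally orders its own partition of the requests;
    # a fixed round-robin merge gives every replica the same global order.
    from collections import deque

    class OrderingInstance:
        def __init__(self):
            self.ordered = deque()

        def propose(self, request):
            # A real instance would run Paxos among its replicas; here we
            # just append, since each instance already emits a total order.
            self.ordered.append(request)

    def merge_round_robin(instances):
        """Deterministically interleave the partial orders of all instances."""
        merged = []
        while any(inst.ordered for inst in instances):
            for inst in instances:     # fixed instance order => same result everywhere
                if inst.ordered:
                    merged.append(inst.ordered.popleft())
        return merged

    # Example: two instances, requests sharded by request number.
    a, b = OrderingInstance(), OrderingInstance()
    for i, req in enumerate(["r0", "r1", "r2", "r3", "r4"]):
        (a if i % 2 == 0 else b).propose(req)
    print(merge_round_robin([a, b]))   # ['r0', 'r1', 'r2', 'r3', 'r4']

Because the instances run independently, adding more of them adds ordering throughput; the merge rule is what keeps the replicas' views identical.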

Active Quorum Systems
This paper outlines a flexible suite of object replication protocols that brings together Byzantine quorum system registers and state machine replication. These protocols enable the implementation of Byzantine fault-tolerant applications that make minimal assumptions about the environment and that, in almost all non-favorable executions, require at most two more communication steps than favorable executions.
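For readers unfamiliar with the register building block, the following is a textbook Byzantine-quorum read sketch, not the paper's protocol: with n = 3f + 1 replicas, a reader waits for n - f responses and accepts only a value vouched for by at least f + 1 replicas, so no set of f faulty replicas can forge it alone.

    # Textbook Byzantine-quorum register read (illustrative only).
    from collections import Counter

    def quorum_read(responses, f):
        """responses: list of (timestamp, value) pairs from n - f replicas.
        Returns the highest-timestamped value reported by >= f + 1 replicas."""
        counts = Counter(responses)
        vouched = [(ts, val) for (ts, val), c in counts.items() if c >= f + 1]
        if not vouched:
            return None   # no certifiable value yet; a real protocol would retry
        return max(vouched)[1]

    # Example with f = 1 (n = 4): one faulty replica cannot impose a bogus value.
    print(quorum_read([(3, "x"), (3, "x"), (3, "x"), (9, "evil")], f=1))  # "x"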

10:30 a.m.–Noon

We Crashed, Now What?
We present an in-depth analysis of the crash-recovery problem and propose a novel approach to recover from otherwise fatal operating system (OS) crashes. We show how an unconventional, but careful, OS design, aided by automatic compiler-based code instrumentation, offers a practical solution towards the survivability of the entire system. Current results are encouraging and show that our approach is able to recover even the most critical OS subsystems without exposing the failure to user applications or hampering the scalability of the system.
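As a loose, invented illustration of per-request recovery (the paper's actual OS design and compiler instrumentation are not shown here), the sketch below checkpoints a component's state before each request and rolls it back if the request crashes, so the failure is not exposed to applications.

    # Hypothetical recovery skeleton: checkpoint state per request, roll back
    # on a crash (modeled as an exception), keep the component available.
    import copy

    class RecoverableComponent:
        def __init__(self, initial_state):
            self.state = initial_state

        def handle(self, request, operation):
            checkpoint = copy.deepcopy(self.state)   # per-request checkpoint
            try:
                return operation(self.state, request)
            except Exception:
                self.state = checkpoint              # roll back, stay available
                return None                          # or re-queue the request

    # Example: a crashing operation leaves the component's state intact.
    fs = RecoverableComponent({"open_files": []})
    def buggy_open(state, req):
        state["open_files"].append(req)
        raise RuntimeError("driver crash")
    fs.handle("/etc/passwd", buggy_open)
    print(fs.state)   # {'open_files': []}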

Improved Device Driver Reliability Through Verification Reuse
Faulty device drivers are a major source of operating system failures. We argue that the underlying cause of many driver faults is the separation of two highly related tasks: device verification and driver development. These two tasks have a lot in common and result in software that is conceptually and functionally similar, yet kept totally separate. The result is a particularly bad case of duplication of effort: the verification code is correct but is discarded after the device has been manufactured; the driver code is inferior but is used in actual device operation. We claim that the two tasks, and the software they produce, can and should be unified, resulting in a drastic improvement in device-driver quality and a reduction in development cost and time to market. In this paper we discuss the technical issues involved in achieving such unification and present our solutions to these issues. We report the results of a case study that applies this approach to implement a driver for an Ethernet controller device.

Towards Automatically Checking Thousands of Failures with Micro-specifications
Recent data-loss incidents have shown that existing large distributed systems are still vulnerable to failures. To improve the situation, we propose two new testing approaches: a failure testing service (FTS) and a declarative testing specification (DTS). FTS enables us to systematically push a system into thousands of failure scenarios, leading us to many critical recovery bugs. With DTS, we introduce "micro-specifications": clear and concise specifications written in Datalog style, which enable developers to easily write, refine, and manage potentially hundreds of specifications.
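A hypothetical sketch of both ideas follows: failure scenarios are enumerated as (injection point, failure type) pairs, and a micro-specification is checked as a predicate over facts recorded after recovery. The fact schema and function names are ours, not the paper's.

    # Invented illustration of FTS-style scenario enumeration and a DTS-style
    # micro-specification check over recorded facts.
    import itertools

    FAILURE_TYPES = ["crash", "disk_error", "network_partition"]

    def enumerate_scenarios(io_points):
        """Every (injection point, failure type) pair is one failure scenario."""
        return list(itertools.product(io_points, FAILURE_TYPES))

    def check_replication_spec(facts, replication_factor=3):
        """Micro-specification, roughly: violation(B) :- block(B),
        count(replicas(B)) < replication_factor, recovery_finished.
        `facts` maps each block id to the set of nodes holding a replica
        after recovery has completed."""
        return [block for block, nodes in facts.items()
                if len(nodes) < replication_factor]

    # Example: a crash injected during the write of block "b2" left it
    # under-replicated after recovery, so the specification reports it.
    facts = {"b1": {"n1", "n2", "n3"}, "b2": {"n1"}}
    print(enumerate_scenarios(["write_b1", "write_b2"])[:2])
    print(check_replication_spec(facts))   # ['b2']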

1:30 p.m.–3:00 p.m.

Focus Replay Debugging Effort on the Control Plane
Replay debugging systems enable the reproduction and debugging of non-deterministic failures in production application runs. However, no existing replay system is suitable for datacenter applications like Cassandra, Hadoop, and Hypertable. On these large-scale, distributed, and data-intensive programs, existing replay methods either incur excessive production recording overheads or are unable to provide high-fidelity replay. In this position paper, we hypothesize and empirically verify that control-plane determinism is the key to record-efficient and high-fidelity replay of datacenter applications. The key idea behind control-plane determinism is that debugging does not always require a precise replica of the original application run. Instead, it often suffices to produce some run that exhibits the original behavior of the control plane: the application code responsible for controlling and managing data flow through a datacenter system.
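The record-side intuition can be sketched as follows (the class and the tagging interface are hypothetical, not the paper's system): control-plane inputs are logged verbatim for faithful replay, while bulk data-plane payloads are deliberately left unrecorded to keep production overhead low.

    # Sketch of selective recording under the assumption that each input can
    # be tagged as control plane or data plane.
    import json

    class ControlPlaneRecorder:
        def __init__(self, log_path):
            self.log = open(log_path, "a")

        def on_input(self, source, payload, is_control):
            if is_control:
                # Control-plane inputs (coordination, metadata, scheduling
                # decisions) are logged verbatim so replay can reproduce
                # the original control behavior.
                self.log.write(json.dumps({"src": source, "data": payload}) + "\n")
            # Data-plane inputs are not recorded: any replay run that
            # preserves control-plane behavior is considered good enough
            # for debugging purposes.

        def close(self):
            self.log.close()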

A Rising Tide Lifts All Boats: How Memory Error Prediction and Prevention Can Help with Virtualized System Longevity
Memory is the most frequently failing component that can cause system crashes, which significantly affects the emerging data centers that are based on system virtualization (e.g., clouds). Such environments differ from previously studied large systems and thus pose a renewed challenge to the reliability, availability, and serviceability (RAS) of today's production sites, which host large populations of commodity servers. This paper advocates addressing the problem by exploiting memory error characteristics and employing a cost-effective self-healing mechanism. Specifically, we propose a memory error prediction and prevention model that takes error events and system utilization as input, assesses memory error risk, and manipulates memory mappings accordingly (by page/DIMM replacement or VM live migration) to avoid potential damage and loss.
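The assess-then-act loop might look roughly like the sketch below; the scoring rule, thresholds, and action names are illustrative assumptions, not the paper's model.

    # Hypothetical risk assessment: repeated corrected errors concentrated on
    # few pages at high utilization suggest a failing DIMM region, and the
    # risk level selects a preventive action.
    def memory_risk(corrected_errors_last_day, distinct_pages, utilization):
        if corrected_errors_last_day == 0:
            return 0.0
        locality = corrected_errors_last_day / max(distinct_pages, 1)
        return min(1.0, 0.1 * corrected_errors_last_day + 0.2 * locality) * utilization

    def choose_action(risk):
        if risk >= 0.8:
            return "live-migrate VMs off the host, then replace the DIMM"
        if risk >= 0.4:
            return "retire the affected pages (remap them out of use)"
        return "keep monitoring"

    # Example: many corrected errors on only two pages of a busy host.
    print(choose_action(memory_risk(corrected_errors_last_day=12,
                                    distinct_pages=2, utilization=0.9)))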

A Design for Comprehensive Kernel Instrumentation
Dynamic binary instrumentation (DBI) has been used extensively at the user level to develop bug-finding and security tools, such as Memcheck and Program Shepherding. However, comprehensive DBI frameworks do not exist for operating system kernels, thwarting the development of dependability and security tools for kernels. In this paper, we identify the key challenges in designing an in-kernel DBI framework and propose a design that addresses them.

3:30 p.m.–4:30 p.m.

Behavior-Based Problem Localization for Parallel File Systems
We present a behavior-based problem-diagnosis approach for PVFS that analyzes a novel source of instrumentation (CPU instruction-pointer samples and function-call traces) to localize the faulty server and to enable root-cause analysis of the resource at fault. We validate our approach by injecting realistic storage and network problems into three different workloads (dd, IOzone, and PostMark) on a PVFS cluster.
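A generic peer-comparison sketch of this kind of localization (not the exact analysis used in the paper) normalizes each server's instruction-pointer samples into a per-function profile and flags the server whose profile deviates most from its peers.

    # Illustrative peer comparison over per-function sample profiles.
    def normalize(profile):
        total = sum(profile.values()) or 1
        return {fn: n / total for fn, n in profile.items()}

    def deviation(profile, peers):
        """L1 distance between one server's profile and the peer average."""
        functions = set(profile) | {fn for p in peers for fn in p}
        avg = {fn: sum(p.get(fn, 0) for p in peers) / len(peers) for fn in functions}
        return sum(abs(profile.get(fn, 0) - avg[fn]) for fn in functions)

    def localize(samples_by_server):
        profiles = {s: normalize(p) for s, p in samples_by_server.items()}
        scores = {s: deviation(p, [q for t, q in profiles.items() if t != s])
                  for s, p in profiles.items()}
        return max(scores, key=scores.get), scores

    # Example: ios3 spends most of its time in a lock routine, unlike its peers.
    samples = {"ios1": {"pvfs_write": 80, "lock_wait": 20},
               "ios2": {"pvfs_write": 78, "lock_wait": 22},
               "ios3": {"pvfs_write": 15, "lock_wait": 85}}
    print(localize(samples)[0])   # ios3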

What Consistency Does Your Key-Value Store Actually Provide?
Many key-value stores have recently been proposed as platforms for always-on, globally distributed, Internet-scale applications. To meet their needs, these stores often sacrifice consistency for availability. Yet few tools exist that can verify the consistency a key-value store actually provides and quantify the violations, if any. How can a user check whether a storage system meets its promise of consistency? If a system only promises eventual consistency, how bad is it really? In this paper, we present efficient algorithms that help answer these questions. By analyzing the trace of interactions between the client machines and a key-value store, the algorithms can report whether the trace is safe, regular, or atomic, and if not, how many violations it contains. We run these algorithms on traces of Pahoehoe, our eventually consistent key-value store, and find few or no violations, showing that it often behaves like a strongly consistent system during our tests.
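As a deliberately simplified taste of such trace checking (far weaker than the paper's algorithms; the trace format and the assumption that values carry increasing version numbers are ours), the sketch below counts reads that return a value older than a write that had already completed before the read began.

    # Simplified stale-read counter over a trace of completed operations on
    # one key; under the stated version assumption, such reads are
    # inconsistent with returning the latest completed write.
    def count_stale_reads(trace):
        """trace: list of dicts with keys
           op ('read'|'write'), start, finish, version."""
        writes = [e for e in trace if e["op"] == "write"]
        violations = 0
        for r in (e for e in trace if e["op"] == "read"):
            newest_completed = max((w["version"] for w in writes
                                    if w["finish"] < r["start"]), default=None)
            if newest_completed is not None and r["version"] < newest_completed:
                violations += 1
        return violations

    trace = [{"op": "write", "start": 0, "finish": 1, "version": 1},
             {"op": "write", "start": 2, "finish": 3, "version": 2},
             {"op": "read",  "start": 4, "finish": 5, "version": 1}]  # stale
    print(count_stale_reads(trace))   # 1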
