8:00 am–9:00 am |
Thursday |
Continental Breakfast
Mezzanine East/West
|
9:00 am–10:20 am |
Thursday |
Session Chair: Matt Welsh, Google
Bryce Kellogg, Vamsi Talla, Shyamnath Gollakota, and Joshua R. Smith, University of Washington
Awarded Best Paper! Wi-Fi has traditionally been considered a power-consuming communication system and has not been widely adopted in the sensor network and IoT space. We introduce Passive Wi-Fi, which demonstrates for the first time that one can generate 802.11b transmissions using backscatter communication, while consuming 3–4 orders of magnitude lower power than existing Wi-Fi chipsets. Passive Wi-Fi transmissions can be decoded on any Wi-Fi device including routers, mobile phones, and tablets. Building on this, we also present a network stack design that enables passive Wi-Fi transmitters to coexist with other devices in the ISM band, without incurring the power consumption of carrier sense and medium access control operations. We build prototype hardware and implement all four 802.11b bit rates on an FPGA platform. Our experimental evaluation shows that passive Wi-Fi transmissions can be decoded on off-the-shelf smartphones and Wi-Fi chipsets over distances of 30–100 feet in various line-of-sight and through-the-wall scenarios. Finally, we design a passive Wi-Fi IC that shows that 1 and 11 Mbps transmissions consume 14.5 and 59.2 µW respectively. This translates to 10,000x lower power than existing Wi-Fi chipsets and 1,000x lower power than Bluetooth LE and ZigBee.
Deepak Vasisht, MIT CSAIL; Swarun Kumar, Carnegie Mellon University; Dina Katabi, MIT CSAIL We present Chronos, a system that enables a single WiFi access point to localize clients to within tens of centimeters. Such a system can bring indoor positioning to homes and small businesses which typically have a single access point.
The key enabler underlying Chronos is a novel algorithm that can compute sub-nanosecond time-of-flight using commodity WiFi cards. By multiplying the time-of-flight with the speed of light, a MIMO access point computes the distance between each of its antennas and the client, hence localizing it. Our implementation on commodity WiFi cards demonstrates that Chronos’s accuracy is comparable to state-of-the-art localization systems, which use four or five access points.
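For illustration only, a minimal sketch of the time-of-flight-to-distance step the abstract describes (not Chronos's implementation; the example timing value is made up):

    # Sketch: convert a measured time-of-flight into a per-antenna distance.
    SPEED_OF_LIGHT_M_PER_S = 299_792_458  # metres per second

    def distance_m(time_of_flight_s: float) -> float:
        # Distance between one AP antenna and the client.
        return time_of_flight_s * SPEED_OF_LIGHT_M_PER_S

    # A sub-nanosecond timing error of 0.5 ns corresponds to roughly 15 cm:
    print(distance_m(0.5e-9))  # ~0.15 m

This conveys why sub-nanosecond time-of-flight resolution is the key to localizing clients to within tens of centimeters.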
Adriana B. Flores, Sadia Quadri, and Edward W. Knightly, Rice University Mobile devices have fewer antennas than APs due to size and energy constraints. This antenna asymmetry restricts uplink capacity to the client antenna array size rather than the AP’s. To overcome antenna asymmetry, multiple clients can be grouped into a simultaneous multi-user transmission to achieve a full-rank transmission that matches the number of antennas at the AP. In this paper, we design, implement, and experimentally evaluate MUSE, the first distributed and scalable system to achieve full-rank uplink multi-user capacity without control signaling for channel estimation, channel reporting, or user selection. Our experiments demonstrate full-rank multiplexing gains in the evaluated scenarios, showing linear gains as the number of users increases while maintaining constant overhead.
Sanjib Sur, Xinyu Zhang, and Parmesh Ramanathan, University of Wisconsin—Madison; Ranveer Chandra, Microsoft Research Due to high directionality and small wavelengths, 60 GHz links are highly vulnerable to human blockage. To overcome blockage, 60 GHz radios can use a phased-array antenna to search for and switch to unblocked beam directions. However, these techniques are reactive and only trigger after the blockage has occurred, so they take time to recover the link. In this paper, we propose BeamSpy, which can instantaneously predict the quality of 60 GHz beams, even under blockage, without costly beam searching. BeamSpy captures unique spatial and blockage-invariant correlation among beams through a novel prediction model, which we exploit to immediately select the best alternative beam direction whenever the current beam’s quality degrades. We apply BeamSpy to a run-time fast beam adaptation protocol and a blockage-risk assessment scheme that can guide blockage-resilient link deployment. Our experiments on a reconfigurable 60 GHz platform demonstrate the effectiveness of BeamSpy's prediction framework and its usefulness in enabling robust 60 GHz links.
|
10:20 am–10:50 am |
Thursday |
Break with Refreshments
Mezzanine East/West
|
10:50 am–12:10 pm |
Thursday |
Session Chair: Shyam Gollakota, University of Washington
Srinivas Narayana, Mina Tahmasbi, Jennifer Rexford, and David Walker, Princeton University Measuring the flow of traffic along network paths is crucial for many management tasks, including traffic engineering, diagnosing congestion, and mitigating DDoS attacks. We introduce a declarative query language for efficient path-based traffic monitoring. Path queries are specified as regular expressions over predicates on packet locations and header values, with SQL-like “groupby” constructs for aggregating results anywhere along a path. A run-time system compiles queries into a deterministic finite automaton. The automaton’s transition function is then partitioned, compiled into match-action rules, and distributed over the switches. Switches stamp packets with automaton states to track the progress towards fulfilling a query. Only when packets satisfy a query are the packets counted, sampled, or sent to collectors for further analysis. By processing queries in the data plane, users “pay as they go”, as data-collection overhead is limited to exactly those packets that satisfy the query. We implemented our system on top of the Pyretic SDN controller and evaluated its performance on a campus topology. Our experiments indicate that the system can enable “interactive debugging”—compiling multiple queries in a few seconds—while fitting rules comfortably in modern switch TCAMs and the automaton state into two bytes (e.g., a VLAN header).
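A simplified sketch of the state-stamping idea described above (the query, switch names, and packet representation are invented for illustration; this is not the Pyretic-based implementation):

    # Sketch: a hand-written automaton for the query "packets that enter at
    # switch s1 and later reach s3"; switches carry the state with the packet
    # and count it only once the query is satisfied.
    ACCEPT = 2

    def transition(state, switch):
        if state == 0 and switch == "s1":
            return 1
        if state == 1 and switch == "s3":
            return ACCEPT
        return state

    def on_hop(pkt, switch, counters):
        # The stamped state plays the role of the VLAN tag mentioned above.
        pkt["dfa_state"] = transition(pkt.get("dfa_state", 0), switch)
        if pkt["dfa_state"] == ACCEPT:
            counters[switch] = counters.get(switch, 0) + 1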
Victor Heorhiadi and Michael K. Reiter, University of North Carolina at Chapel Hill; Vyas Sekar, Carnegie Mellon University Realizing the benefits of SDN for many network management applications (e.g., traffic engineering, service chaining, topology reconfiguration) involves addressing complex optimizations that are central to these problems. Unfortunately, such optimization problems require (a) significant manual effort and expertise to express and (b) non-trivial computation and/or carefully crafted heuristics to solve. Our goal is to simplify the deployment of SDN applications using general high-level abstractions for capturing optimization requirements from which we can efficiently generate optimal solutions. To this end, we present SOL, a framework that demonstrates that it is possible to simultaneously achieve generality and efficiency. The insight underlying SOL is that many SDN applications can be recast within a unifying path-based optimization abstraction. Using this, SOL can efficiently generate near-optimal solutions and device configurations to implement them. We show that SOL provides comparable or better scalability than custom optimization solutions for diverse applications, allows a balancing of optimality and route churn per reconfiguration, and interfaces with modern SDN controllers.
Junaid Khalid, Aaron Gember-Jacobson, Roney Michael, Anubhavnidhi Abhashkumar, and Aditya Akella, University of Wisconsin—Madison Important Network Functions Virtualization (NFV) scenarios such as ensuring middlebox fault tolerance or elasticity require redistribution of internal middlebox state. While many useful frameworks exist today for migrating/cloning internal state, they require modifications to middlebox code to identify needed state. This process is tedious and manual, hindering the adoption of such frameworks. We present a framework-independent system, StateAlyzr, that embodies novel algorithms adapted from program analysis to provably and automatically identify all state that must be migrated/cloned to ensure consistent middlebox output in the face of redistribution. We find that StateAlyzr reduces man-hours required for code modification by nearly 20×. We apply StateAlyzr to four open source middleboxes and find its algorithms to be highly precise. We find that a large amount of, but not all, live state matters toward packet processing in these middleboxes. StateAlyzr’s algorithms can reduce the amount of state that needs redistribution by 600–8000× compared to naive schemes.
Chang Lan, Justine Sherry, Raluca Ada Popa, and Sylvia Ratnasamy, University of California, Berkeley; Zhi Liu, Tsinghua University It is increasingly common for enterprises and other organizations to outsource network processing to the cloud. For example, enterprises may outsource firewalling, caching, and deep packet inspection, just as they outsource compute and storage. However, this poses a threat to enterprise confidentiality because the cloud provider gains access to the organization’s traffic.
We design and build Embark, the first system that enables a cloud provider to support middlebox outsourcing while maintaining the client’s confidentiality. Embark encrypts the traffic that reaches the cloud and enables the cloud to process the encrypted traffic without decrypting it. Embark supports a wide range of middleboxes such as firewalls, NATs, web proxies, load balancers, and data exfiltration systems. Our evaluation shows that Embark supports these applications with competitive performance.
|
12:30 pm–2:00 pm |
Thursday |
Open Networking Summit Expo Hall Lunch
Expo Hall A/B
|
2:00 pm–3:40 pm |
Thursday |
Session Chair: Wyatt Lloyd, University of Southern California
Seyed K. Fayaz, Tianlong Yu, Yoshiaki Tobioka, Sagar Chaki, and Vyas Sekar, Carnegie Mellon University Checking whether a network correctly implements intended policies is challenging even for basic reachability policies (Can X talk to Y?) in simple stateless networks with L2/L3 devices. In practice, operators implement more complex context-dependent policies by composing stateful network functions; e.g., if the IDS flags X for sending too many failed connections, then subsequent packets from X must be sent to a deep-packet inspection device. Unfortunately, existing approaches in network verification have fundamental expressiveness and scalability challenges in handling such scenarios. To bridge this gap, we present BUZZ, a practical model-based testing framework. BUZZ’s design makes two key contributions: (1) Expressive and scalable models of the data plane, using a novel high-level traffic unit abstraction and by modeling complex network functions as an ensemble of finite-state machines; and (2) A scalable application of symbolic execution to tackle state-space explosion. We show that BUZZ generates test cases for a network with hundreds of network functions within two minutes (five orders of magnitude faster than alternative designs). We also show that BUZZ uncovers a range of both new and known policy violations in SDN/NFV systems.
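The context-dependent policy used as an example above can be captured as a small per-host state machine; the following sketch is purely illustrative (the threshold and action names are made up, and this is not BUZZ's data-plane model):

    # Sketch: an IDS model that flags a host after too many failed connections,
    # after which that host's packets must be steered to a DPI device.
    FLAG_THRESHOLD = 10

    class IdsModel:
        def __init__(self):
            self.failed = {}      # host -> failed-connection count
            self.flagged = set()  # hosts whose traffic must go to DPI

        def process(self, pkt):
            host = pkt["src"]
            if pkt.get("failed_connection"):
                self.failed[host] = self.failed.get(host, 0) + 1
                if self.failed[host] > FLAG_THRESHOLD:
                    self.flagged.add(host)
            return "dpi" if host in self.flagged else "forward"

Modeling each network function as such a finite-state machine, and composing an ensemble of them, is what makes the verification problem stateful rather than a simple reachability check.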
Colin Scott and Aurojit Panda, University of California, Berkeley; Vjekoslav Brajkovic, International Computer Science Institute; George Necula, University of California, Berkeley; Arvind Krishnamurthy, University of Washington; Scott Shenker, University of California, Berkeley, and International Computer Science Institute When troubleshooting buggy executions of distributed systems, developers typically start by manually separating out events that are responsible for triggering the bug (signal) from those that are extraneous (noise). We present DEMi, a tool for automatically performing this minimization. We apply DEMi to buggy executions of two very different distributed systems, Raft and Spark, and find that it produces minimized executions that are between 1X and 4.6X the size of optimal executions.
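To convey the signal-vs-noise framing, here is a naive greedy minimization loop; this is only an illustration of the goal, not DEMi's algorithm (the bug-reproduction oracle is assumed given):

    # Sketch: try dropping each event; keep the drop if the bug still reproduces.
    def minimize(events, still_triggers_bug):
        kept = list(events)
        i = 0
        while i < len(kept):
            candidate = kept[:i] + kept[i + 1:]
            if still_triggers_bug(candidate):
                kept = candidate   # the dropped event was noise
            else:
                i += 1             # the event is needed (signal); keep it
        return kept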
Yuliang Li and Rui Miao, University of Southern California; Changhoon Kim, Barefoot Networks; Minlan Yu, University of Southern California NetFlow has been a widely used monitoring tool with a variety of applications. NetFlow maintains an active working set of flows in a hash table that supports flow insertion, collision resolution, and flow removal. This is hard to implement in merchant silicon at data center switches, which have limited per-packet processing time. Therefore, many NetFlow implementations and other monitoring solutions have to sample or select a subset of packets to monitor. In this paper, we observe the need to monitor all the flows without sampling in short time scales. Thus, we design FlowRadar, a new way to maintain flows and their counters that scales to a large number of flows with small memory and bandwidth overhead. The key idea of FlowRadar is to encode per-flow counters with a small memory and constant insertion time at switches, and then to leverage the computing power at the remote collector to perform network-wide decoding and analysis of the flow counters. Our evaluation shows that the memory usage of FlowRadar is close to traditional NetFlow with perfect hashing. With FlowRadar, operators can get better views into their networks as demonstrated by two new monitoring applications we build on top of FlowRadar.
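The encode-at-switch / decode-at-collector split can be illustrated with a toy counting table that the collector "peels"; this is only a sketch under assumed hashing and cell parameters, not FlowRadar's actual data structure:

    import hashlib

    K, M = 3, 64  # illustrative: 3 hash functions, 64 encode cells

    def cells_for(flow_id: int):
        # Distinct cells this flow maps to (the hashing scheme is made up).
        return {int(hashlib.sha256(f"{i}:{flow_id}".encode()).hexdigest(), 16) % M
                for i in range(K)}

    def encode(flow_packet_counts):
        # Constant work per flow: update a few cells, no collision resolution.
        table = [{"flows": 0, "flow_xor": 0, "packets": 0} for _ in range(M)]
        for flow_id, pkts in flow_packet_counts.items():
            for c in cells_for(flow_id):
                table[c]["flows"] += 1
                table[c]["flow_xor"] ^= flow_id
                table[c]["packets"] += pkts
        return table

    def decode(table):
        # Collector-side peeling: repeatedly find cells holding exactly one flow.
        decoded, progress = {}, True
        while progress:
            progress = False
            for cell in table:
                if cell["flows"] == 1:
                    flow_id, pkts = cell["flow_xor"], cell["packets"]
                    decoded[flow_id] = pkts
                    for c in cells_for(flow_id):
                        table[c]["flows"] -= 1
                        table[c]["flow_xor"] ^= flow_id
                        table[c]["packets"] -= pkts
                    progress = True
        return decoded

    print(decode(encode({101: 7, 202: 3, 303: 12})))  # recovers all three counters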
Ítalo Cunha, Universidade Federal de Minas Gerais; Pietro Marchetta, University of Napoli Federico II; Matt Calder, Yi-Ching Chiu, and Brandon Schlinker, University of Southern California; Bruno V. A. Machado, Universidade Federal de Minas Gerais; Antonio Pescapè, University of Napoli Federico II; Vasileios Giotsas, University of California, San Diego/CAIDA; Harsha V. Madhyastha, University of Michigan; Ethan Katz-Bassett, University of Southern California Network operators measure Internet routes to troubleshoot problems, and researchers measure routes to characterize the Internet. However, they still rely on decades-old tools like traceroute, BGP route collectors, and Looking Glasses, all of which permit only a single query about Internet routes—what is the path from here to there? This limited interface complicates answering queries about routes such as "find routes traversing the Level3/AT&T peering in Atlanta," to understand the scope of a reported problem there.
This paper presents Sibyl, a system that takes rich queries that researchers and operators express as regular expressions, then issues and returns traceroutes that match even if it has never measured a matching path in the past. Sibyl achieves this goal in three steps. First, to maximize its coverage of Internet routing, Sibyl integrates together diverse sets of traceroute vantage points that provide complementary views, measuring from thousands of networks in total. Second, because users may not know which measurements will traverse paths of interest, and because vantage point resource constraints keep Sibyl from tracing to all destinations from all sources, Sibyl uses historical measurements to predict which new ones are likely to match a query. Finally, based on these predictions, Sibyl optimizes across concurrent queries to decide which measurements to issue given resource constraints. We show that Sibyl provides researchers and operators with the routing information they need—in fact, it matches 76% of the queries that it could match if an oracle told it which measurements to issue.
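A toy illustration of matching measured routes against a regular-expression query like the Level3/AT&T example above (the hop encoding, AS numbers, and city names are chosen for illustration; this is not Sibyl's query engine):

    import re

    def route_matches(query: str, hops) -> bool:
        # Render each hop as "ASN,city" and join with spaces so an ordinary
        # regular expression can be applied to the whole path.
        path = " ".join(f"{asn},{city}" for asn, city in hops)
        return re.search(query, path) is not None

    route = [("AS3356", "Atlanta"), ("AS7018", "Atlanta"), ("AS7018", "Dallas")]
    # A Level3 (AS3356) hop in Atlanta immediately followed by an AT&T (AS7018) hop in Atlanta:
    print(route_matches(r"AS3356,Atlanta AS7018,Atlanta", route))  # True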
Matthias Vallentin, University of California, Berkeley; Vern Paxson, University of California, Berkeley, and International Computer Science Institute; Robin Sommer, International Computer Science Institute and Lawrence Berkeley National Laboratory Network forensics and incident response play a vital role in site operations, but for large networks can pose daunting difficulties to cope with the ever-growing volume of activity and resulting logs. On the one hand, logging sources can generate tens of thousands of events per second, which a system supporting comprehensive forensics must somehow continually ingest. On the other hand, operators greatly benefit from interactive exploration of disparate types of activity when analyzing an incident.
In this paper, we present the design, implementation, and evaluation of VAST (Visibility Across Space and Time), a distributed platform for high-performance network forensics and incident response that provides both continuous ingestion of voluminous event streams and interactive query performance. VAST leverages a native implementation of the actor model to scale both intra-machine across available CPU cores, and inter-machine over a cluster of commodity systems.
|
3:40 pm–4:10 pm |
Thursday |
Break with Refreshments
Mezzanine East/West
|
4:10 pm–5:30 pm |
Thursday |
Session Chair: Siddhartha Sen, Microsoft Research
Shivaram Venkataraman, Zongheng Yang, Michael Franklin, Benjamin Recht, and Ion Stoica, University of California, Berkeley Recent workload trends indicate rapid growth in the deployment of machine learning, genomics and scientific workloads on cloud computing infrastructure. However, efficiently running these applications on shared infrastructure is challenging and we find that choosing the right hardware configuration can significantly improve performance and cost. The key to addressing the above challenge is having the ability to predict performance of applications under various resource configurations so that we can automatically choose the optimal configuration. Our insight is that a number of jobs have predictable structure in terms of computation and communication. Thus we can build performance models based on the behavior of the job on small samples of data and then predict its performance on larger datasets and cluster sizes. To minimize the time and resources spent in building a model, we use optimal experiment design, a statistical technique that allows us to collect as few training points as required. We have built Ernest, a performance prediction framework for large scale analytics, and our evaluation on Amazon EC2 using several workloads shows that our prediction error is low while having a training overhead of less than 5% for long-running jobs.
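A minimal sketch of the modeling idea (fit on small samples, then predict at larger data and cluster sizes); the feature set below is an assumption for illustration, not necessarily Ernest's exact model:

    import numpy as np

    def features(scale, machines):
        # Assumed terms: fixed cost, per-machine work, coordination, per-machine overhead.
        return [1.0, scale / machines, np.log(machines), machines]

    def fit(samples):
        # samples: list of (data_scale, machines, observed_time) from small runs.
        X = np.array([features(s, m) for s, m, _ in samples])
        y = np.array([t for _, _, t in samples])
        theta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return theta

    def predict(theta, scale, machines):
        return float(np.dot(theta, features(scale, machines)))

Optimal experiment design then chooses which (scale, machines) sample points to run so that the fit converges with as few training runs as possible.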
Asaf Cidon and Assaf Eisenman, Stanford University; Mohammad Alizadeh, MIT CSAIL; Sachin Katti, Stanford University Web-scale applications are heavily reliant on memory cache systems such as Memcached to improve throughput and reduce user latency. Small performance improvements in these systems can result in large end-to-end gains. For example, a marginal increase in hit rate of 1% can reduce the application layer latency by over 35%. However, existing web cache resource allocation policies are workload oblivious and first-come-first-serve. By analyzing measurements from a widely used caching service, Memcachier, we demonstrate that existing cache allocation techniques leave significant room for improvement. We develop Cliffhanger, a lightweight iterative algorithm that runs on memory cache servers, which incrementally optimizes the resource allocations across and within applications based on dynamically changing workloads. It has been shown that cache allocation algorithms underperform when there are performance cliffs, in which minor changes in cache allocation cause large changes in the hit rate. We design a novel technique for dealing with performance cliffs incrementally and locally. We demonstrate that for the Memcachier applications, on average, Cliffhanger increases the overall hit rate by 1.2%, reduces the total number of cache misses by 36.7%, and achieves the same hit rate with 45% less memory capacity.
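A toy sketch of an incremental, local reallocation step of the kind the abstract describes (this is not Cliffhanger's published algorithm, and how the hit-rate gradients are estimated is left abstract):

    # Sketch: move a small slice of cache memory from the application that
    # benefits least from extra cache to the one that benefits most, and
    # repeat as the estimates change with the workload.
    STEP_MB = 1

    def rebalance(allocations_mb, hit_rate_gradients):
        # allocations_mb: {app: MB of cache}
        # hit_rate_gradients: {app: estimated hit-rate gain per extra MB} (assumed given)
        winner = max(hit_rate_gradients, key=hit_rate_gradients.get)
        loser = min(hit_rate_gradients, key=hit_rate_gradients.get)
        if winner != loser and allocations_mb[loser] > STEP_MB:
            allocations_mb[loser] -= STEP_MB
            allocations_mb[winner] += STEP_MB
        return allocations_mb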
Qifan Pu and Haoyuan Li, University of California, Berkeley; Matei Zaharia, Massachusetts Institute of Technology; Ali Ghodsi and Ion Stoica, University of California, Berkeley Memory caches continue to be a critical component to many systems. In recent years, ever larger amounts of data have been placed in main memory, especially in shared environments such as the cloud. The nature of such environments requires resource allocations to provide both performance isolation for multiple users/applications and high utilization for the systems. We study the problem of fair allocation of memory cache for multiple users with shared files. We find that, surprisingly, no memory allocation policy can provide all three desirable properties (isolation-guarantee, strategy-proofness and Pareto-efficiency) that are typically achievable by other types of resources, e.g., CPU or network. We also show that there exist policies that achieve any two of the three properties. We find that the only way to achieve both isolation-guarantee and strategy-proofness is through blocking, which we efficiently adapt in a new policy called FairRide. We implement FairRide in a popular memory-centric storage system using an efficient form of blocking, called expected delaying, and demonstrate that FairRide can lead to better cache efficiency (2.6× over isolated caches) and fairness in many scenarios.
Mosharaf Chowdhury, University of Michigan; Zhenhua Liu, Stony Brook University; Ali Ghodsi and Ion Stoica, University of California, Berkeley, and Databricks Inc. In this paper, we study how to optimally provide isolation guarantees in multi-resource environments, such as public clouds, where a tenant’s demands on different resources (links) are correlated. Unlike prior work such as Dominant Resource Fairness (DRF) that assumes static and fixed demands, we consider elastic demands. Our approach generalizes canonical max-min fairness to the multi-resource setting with correlated demands, and extends DRF to elastic demands. We consider two natural optimization objectives: isolation guarantee from a tenant’s viewpoint and system utilization (work conservation) from an operator’s perspective. We prove that in non-cooperative environments like public cloud networks, there is a strong tradeoff between optimal isolation guarantee and work conservation when demands are elastic. Even worse, work conservation can even decrease network utilization instead of improving it when demands are inelastic. We identify the root cause behind the tradeoff and present a provably optimal allocation algorithm, High Utilization with Guarantees (HUG), to achieve maximum attainable network utilization without sacrificing the optimal isolation guarantee, strategyproofness, and other useful properties of DRF. In cooperative environments like private datacenter networks, HUG achieves both the optimal isolation guarantee and work conservation. Analyses, simulations, and experiments show that HUG provides better isolation guarantees, higher system utilization, and better tenant-level performance than its counterparts.
|
6:30 pm–8:00 pm |
Thursday |
Check out the cool new ideas and the latest preliminary work on display at the Poster Session and Happy Hour. Take advantage of an opportunity to mingle with colleagues who may be interested in the same area while enjoying complimentary food and drinks. The list of accepted posters is now available.
|