FAST '17 Training Program

Our Guarantee

If you're not happy, we're not happy. If you feel a tutorial does not meet the high standards you have come to expect from USENIX, let us know by the first break and we will change you to any other available tutorial immediately.

Continuing Education Units (CEUs)

USENIX provides Continuing Education Units for a small additional administrative fee. The CEU is a nationally recognized standard unit of measure for continuing education and training and is used by thousands of organizations.

Two half-day tutorials qualify for 0.6 CEUs. You can request CEU credit by completing the CEU section on the registration form. USENIX provides a certificate for each attendee taking a tutorial for CEU credit. CEUs are not the same as college credits. Consult your employer or school to determine their applicability.

Training Materials on USB Drives

Training materials will be provided to you on an 8 GB USB drive. If you'd like to access them during your class, remember to bring a laptop.

Monday, February 27, 2017

Half Day Morning

Write optimization refers to a set of techniques used to improve the performance of databases and file systems. Examples of write-optimized data structures include log-structured merge trees (LSMs) and Bε-trees. Systems that use such data structures include BetrFS, HBase, LevelDB, TableFS, TokuMX, and TokuDB.

This tutorial reviews write optimization from the perspectives of analysis and engineering. We provide a framework for understanding which data structure will perform well on which workloads.
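As a rough illustration of the idea behind log-structured merge trees, the sketch below buffers writes in memory and flushes them as sorted runs. It is a toy under stated assumptions, not the design of any system listed above: the class name `TinyLSM`, the `memtable_limit` parameter, and the in-memory "runs" standing in for on-disk files are all hypothetical.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: a mutable in-memory table plus immutable sorted runs."""

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}                    # absorbs recent writes
        self.runs = []                        # sorted (key, value) lists, newest first

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # One batch of sorted output stands in for a sequential disk write.
        self.runs.insert(0, sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:  # newest run first, so the latest value wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one; newer values overwrite older ones.
        merged = {}
        for run in reversed(self.runs):
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

The trade-off this sketch exposes is the one a workload analysis has to weigh: writes are cheap because they are batched, while a read may have to probe several runs until compaction merges them.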

M2 Understanding Large-Scale Storage Systems
Updated!
Brent Welch, Google
9:00 am–12:30 pm

This tutorial is oriented toward administrators and developers who manage and use large-scale storage systems. An important goal of the tutorial is to give the audience the foundation for effectively comparing different storage system options, as well as a better understanding of the systems they already have.

Cluster-based parallel storage technologies are used to manage millions of files and thousands of concurrent jobs, and to deliver performance that scales from tens to hundreds of GB/s. This tutorial will examine current state-of-the-art high-performance file systems and the underlying technologies employed to deliver scalable performance across a range of scientific and industrial applications.

The tutorial starts with a look at storage devices and SSDs in particular, which are growing in importance in all storage systems. Next we look at how a file system is put together, comparing and contrasting SAN file systems, scale-out NAS, object-based parallel file systems, and cloud-based storage systems.

Topics include SSD technology, scaling the data path, scaling metadata, fault tolerance, manageability, and cloud storage. Specific systems are discussed, including Ceph, Lustre, GPFS, PanFS, HDFS (Hadoop Distributed File System), and OpenStack.

Half Day Afternoon

M3 Innovations, Challenges, and Lessons Learned in HPC Storage Yesterday, Today, and Tomorrow
Gary A. Grider, Los Alamos National Laboratory
John Bent, Seagate Government Solutions
1:30 pm–5:00 pm

In this tutorial, we will introduce the audience to the lunatic fringe of extreme high-performance computing and its storage systems. The most difficult challenge in HPC storage is caused by millions (soon to be billions) of simultaneously writing threads. Although cloud providers handle workloads of comparable, or larger, aggregate scale, the HPC challenge is unique because the concurrent writers are modifying shared data.

We will begin with a brief history of HPC covering the previous few decades, bringing us into the petaflop era, which started in 2009. Then we will discuss the unique computational science in HPC so that the audience can understand why its storage challenges are unavoidable. We will then move into a discussion of archival storage and the hardware and software technologies needed to store today’s exabytes of data forever. From archive we will move into the parallel file systems of today and will end the lecture portion of the tutorial with a discussion of anticipated HPC storage systems of tomorrow. A particular focus will be namespaces that handle concurrent modifications to billions of entries, as we believe this will be the largest challenge in the exascale era.

The tutorial will end with a free-ranging, audience-directed panel.

Topics include:
  • A brief history lesson about the past 30 years of supercomputers
  • An understanding of what makes HPC unique and the storage challenges it entails
  • An overview of current HPC storage technologies such as burst buffers, parallel file systems, and archival storage
  • A glimpse into the future of HPC storage technologies for both hardware and software
  • Insights into unique research opportunities to advance HPC storage

M4 Persistent Memory Programming: Challenges and Solutions in Multiple Languages
Andy Rudoff, Data Center Group, Intel Corporation
1:30 pm–5:00 pm

Both Windows and Linux now contain support for Persistent Memory, an emerging non-volatile memory (NVM) technology. Persistent Memory is available today in the form of NVDIMMs and is expected to explode in capacity in the near future. Unlike other NVM technologies, such as SSDs, Persistent Memory provides a byte-addressable programming model, allowing direct memory access like DRAM, but retaining its contents across power loss. Technologies such as Intel’s 3D XPoint are expected to provide terabytes of NVM per CPU socket, with performance near DRAM speeds. The result offers applications a new tier for data placement in addition to the traditional memory and storage tiers: the persistent memory tier. While there are numerous ways for an OS to leverage Persistent Memory in a way that is transparent to the application, converting an application to be "persistent memory aware" will allow the highest performance benefit.

This tutorial will start with the basic SNIA NVM Programming Model used by operating systems to expose Persistent Memory to applications. We will walk through code examples showing how applications get access to Persistent Memory and we will pay special attention to safe programming practices such as flushing to persistence, atomic operations, and writing power-fail safe code. We will look at CPU instructions designed for atomic operations, cache flushing, and fencing, and how they interact with Persistent Memory.
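To give a flavor of the memory-mapped access pattern and of "flushing to persistence", here is a minimal sketch that uses an ordinary file as a stand-in for persistent memory. It is an assumption-laden illustration rather than the tutorial's own material: the helper names `persist_counter` and `load_counter` are hypothetical, and `mmap.flush()` (which wraps `msync` on POSIX systems) stands in for the CPU cache-flush and fence instructions that code running on real persistent memory would typically use.

```python
import mmap
import os
import struct


def persist_counter(path, value):
    """Store a 64-bit counter through a memory mapping, then flush it."""
    size = mmap.PAGESIZE
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)  # make sure the file backs the whole mapping
        with mmap.mmap(fd, size) as region:
            # A plain store into the mapping: byte-addressable, no write() call.
            region[0:8] = struct.pack("<Q", value)
            # Until the flush completes, the store is not guaranteed durable.
            region.flush()
    finally:
        os.close(fd)


def load_counter(path):
    """Read the counter back through the regular file interface."""
    with open(path, "rb") as f:
        return struct.unpack("<Q", f.read(8))[0]
```

A single aligned 8-byte store is used here deliberately. Updating a larger structure in a power-fail safe way needs more care, such as logging or transactions, which is the kind of problem the tutorial's higher-level library examples address.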

Next, the tutorial will provide a brief survey of available libraries, compilers, and research in this area. We will then walk through some more complex examples of persistent memory programming in C, C++, and Java. Using the open source NVM Libraries from http://pmem.io, we will show how to solve common programming pain points and how higher-level languages can help avoid common persistent memory programming mistakes.

Topics include:
  • The SNIA NVM Programming Model
  • How the Intel Architecture Supports Persistent Memory
  • The Challenges of Persistent Memory Programming
  • The Current State of the Persistent Memory Ecosystem
  • Programming Using the NVM Libraries from http://pmem.io
  • C, C++, and Java Persistent Memory Programming Techniques