- FAST '13 Home
Full Training Program
Half Day Morning
Jeff Darcy (T1) has worked on network and distributed storage problems for twenty years, including an instrumental role in developing MPFS (a precursor of modern pNFS) while at EMC and leading the HekaFS project more recently. He is currently a member of the GlusterFS architecture team at Red Hat, coordinating the integration of HekaFS's features and leading the asynchronous-replication development effort.
The trend toward moving computation into the cloud has resulted in new expectations for storage in the cloud. This tutorial will provide information necessary to build your own cloud-appropriate storage system.
Primarily, people who wish to implement their own task-specific cloud storage systems. Secondarily, those who wish to understand why existing cloud storage systems have been designed the way they are, and what tradeoffs they have made to achieve their respective goals.
- New requirements: Application-level users of cloud storage have come to expect a variety of data and consistency/ordering models well beyond those provided by traditional file, block, or relational-database systems.
- New constraints: Systems deployed in the cloud are often characterized by low levels of trust (user/user and user/provider) and a lack of hardware access or configuration flexibility.
- Techniques: Implementing a system to meet these new requirements and constraints will require a thorough knowledge of cluster and distributed-system techniques such as vector clocks, Merkle trees, Bloom filters, and various kinds of append-only storage.
- Case studies: Existing systems representing successful use of these techniques will be examined.
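As an illustration of one technique named above, the following is a minimal vector-clock sketch in Python. It is not taken from the tutorial materials; the function names (`increment`, `merge`, `compare`) and the dictionary representation are assumptions chosen for brevity.

```python
from typing import Dict

Clock = Dict[str, int]  # hypothetical representation: node name -> event counter


def increment(clock: Clock, node: str) -> Clock:
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: Clock, b: Clock) -> Clock:
    """Element-wise maximum: the history known after two replicas exchange state."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a: Clock, b: Clock) -> str:
    """Classify two clocks as 'equal', 'before', 'after', or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither history contains the other: a conflict


# Two replicas update independently, then replica A reconciles and writes again.
a1 = increment({}, "A")                  # {'A': 1}
b1 = increment({}, "B")                  # {'B': 1}
print(compare(a1, b1))                   # concurrent
a2 = increment(merge(a1, b1), "A")       # {'A': 2, 'B': 1}
print(compare(b1, a2))                   # before
```

A store that keeps a clock with each object version can use the "concurrent" result to detect conflicting updates instead of silently overwriting one of them.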
Jim Plank (T2) is a professor in the EECS department at the University of Tennessee. He has done research on fault-tolerant computing and storage systems for over 20 years. For the past eight years, his sole focus has been on the design, implementation, and performance of erasure codes in storage systems. He has published numerous papers on the topic, including a very popular tutorial on Reed-Solomon codes and a complete treatment of Minimum Density codes for RAID-6. His open-source libraries for Galois Field arithmetic and for general erasure-coding have been in widespread use by industry and academia.
Cheng Huang (T2) is a researcher at Microsoft Research, Redmond. He has worked extensively on erasure coding and invented technologies that have been incorporated in a wide variety of Microsoft products, such as in Lync for smooth video conferencing and in Xbox for bounding communication latency between consoles and the cloud. His latest work is LRC, a new class of erasure codes deployed in Windows Azure Storage, which saves the Microsoft Cloud millions of dollars (see http://research.microsoft.com/en-us/news/features/erasurecoding-090512.aspx) and also received the best paper award at USENIX ATC '12.
From disk arrays through clouds to archival systems, storage systems must tolerate failures and prevent data loss. Erasure coding provides the fundamental technology for storage systems to add redundancy and tolerate failures. This tutorial will cover the fundamentals of erasure coding, the mechanics of many erasure codes that apply to today's storage systems, and the properties of various erasure codes designed for a variety of storage scenarios.
- General matrix-based codes, starting with classic Reed-Solomon codes
- Galois Field arithmetic for erasure-coding, and how to implement it efficiently
- RAID-6 codes: RDP, EVENODD, Minimum Density, X-Code
- More general codes implemented with only the XOR operation: Generalized RDP/EVENODD
- Cauchy Reed-Solomon codes
- Open source library support for erasure codes
- The reconstruction problem and techniques to reduce bandwidth and I/O
- Regenerating codes
- Practical MDS codes with efficient reconstruction: Rotated Reed-Solomon
- Practical non-MDS codes with efficient reconstruction and their application in cloud storage: Pyramid codes, LRC and its deployment in Windows Azure Storage, PMDS
- Erasure coding for Flash
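To make the first few topics concrete, here is a minimal sketch of Galois Field arithmetic in GF(2^8) and a RAID-6 style P+Q encoding built on it. It is an illustration, not the tutorial's own material or any library's API: the function names are hypothetical, the field polynomial 0x11D and generator 2 are assumed (a common choice for RAID-6), and the bit-by-bit multiply favors clarity over the table-driven or SIMD techniques an efficient implementation would use.

```python
from typing import List, Optional, Tuple


def gf_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two bytes in GF(2^8): carry-less multiply reduced modulo `poly`."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= poly            # reduce when the degree reaches 8
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse via a^254 (the nonzero elements form a group of order 255)."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def encode_pq(data: List[bytes]) -> Tuple[bytes, bytes]:
    """P is the XOR of all blocks; Q is the sum of 2^i * block_i over GF(2^8)."""
    length = len(data[0])
    p, q = bytearray(length), bytearray(length)
    coeff = 1
    for block in data:
        for k, byte in enumerate(block):
            p[k] ^= byte
            q[k] ^= gf_mul(coeff, byte)
        coeff = gf_mul(coeff, 2)
    return bytes(p), bytes(q)


def recover_from_q(data: List[Optional[bytes]], q: bytes, lost: int) -> bytes:
    """Rebuild one missing data block using only Q (for example, when P is also lost)."""
    partial = bytearray(q)
    coeff, lost_coeff = 1, 1
    for i, block in enumerate(data):
        if i == lost:
            lost_coeff = coeff
        else:
            for k, byte in enumerate(block):
                partial[k] ^= gf_mul(coeff, byte)
        coeff = gf_mul(coeff, 2)
    inv = gf_inv(lost_coeff)
    return bytes(gf_mul(inv, x) for x in partial)


blocks = [b"fast", b"2013", b"code"]
p, q = encode_pq(blocks)
print(recover_from_q([blocks[0], None, blocks[2]], q, lost=1))  # b'2013'
```

Recovering two lost data blocks requires solving a 2x2 linear system over the same field, which is where the matrix-based view of Reed-Solomon codes becomes useful.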
Half Day Afternoon
Dr. Sudipta Sengupta (T3) is currently at Microsoft Research, where he is working on data center systems and networking for cloud computing, non-volatile memory for cloud/server applications, data deduplication, and storage virtualization. Previously, he spent five years at Bell Laboratories, the Research Division of Lucent Technologies. His work on primary data deduplication will ship as a new feature in Windows Server 2012. His work on flash-memory based key-value stores has been incorporated in the data processing and serving pipeline for multiple properties in Microsoft's Bing system. He designed the network topology and routing algorithms for VL2, a low-cost, flexible, and agile next generation data center network, which has been deployed in Microsoft's cloud data centers.
Dr. Sengupta received the IEEE William R. Bennett Prize and the IEEE Leonard G. Abraham Prize for his work on oblivious routing of Internet traffic. At Bell Labs, he received the President's Teamwork Achievement Award for technology transfer of research into Lucent products. At Microsoft, he received the Gold Star Award which recognizes excellence in leadership and contributions for Microsoft's long term success.
Dr. Sengupta has taught advanced courses/tutorials at many academic/research and industry conferences. He has published 75+ research papers in some of the top conferences, journals, and technical magazines. He has authored 45+ patents (granted or pending) in the area of computer systems, storage, and networking. Dr. Sengupta received a Ph.D. and an M.S. from MIT (Cambridge, MA, USA) and a B.Tech. from IIT-Kanpur (India), all in Computer Science. He was awarded the President of India Gold Medal at IIT-Kanpur for graduating at the top of his class across all disciplines.
The tutorial will serve to introduce the state of the art in data deduplication systems for storage. We will make the presentation of most of the material self-contained. We expect attendees to have some background in the basic concepts of storage systems.
The storage market is witnessing unprecedented growth, with enterprise storage growing 50–60% per year and cloud storage growing even faster. Data deduplication is the #1 feature for which customers ask when they invest in storage solutions. Data deduplication detects and eliminates redundancies in data, with the benefits applying to both storage capacity savings ("data at rest") and network bandwidth savings ("data on wire"). In addition to taming the growth in storage total-cost-of-ownership, the storage capacity savings can help to make high IOPS devices like flash-based SSDs more feasible in terms of cost. The network bandwidth savings can help to mitigate WAN bottlenecks, thus enabling user-to-cloud and hybrid private-public cloud storage scenarios.
Backup data deduplication has been around for about a decade, championed by early startups such as Data Domain. Recent developments bring data deduplication to the faster and more expensive primary storage tier, where the space savings are more valuable and translate to reductions in the amount of data that needs to be replicated, geo-replicated, cached, backed up, and transferred over the network.
In this tutorial, we will survey technologies in the data deduplication area at both the algorithmic and systems levels. We will follow the progression of ideas over time and identify current trends in research and industry. We will outline the challenges that need to be addressed going forward. Topics covered will include research aspects of the entire data deduplication pipeline—data chunking, data indexing, primary data access, storage maintenance operations—as well as case studies of commercially deployed systems.
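The chunking and indexing stages of such a pipeline can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of any deployed system: the gear-style rolling hash, the chunk-size parameters, the `chunk` function, and the `DedupStore` class are all hypothetical, and the index is an in-memory dictionary keyed by SHA-256 fingerprint.

```python
import hashlib
import random
from typing import Dict, List

# Hypothetical gear table: one pseudo-random 64-bit value per byte value.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = 0xFFFFFFFFFFFFFFFF


def chunk(data: bytes, bits: int = 6, min_size: int = 16, max_size: int = 256) -> List[bytes]:
    """Content-defined chunking: cut where the top `bits` bits of a rolling hash are zero."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & MASK64   # older bytes shift out of the 64-bit window
        size = i - start + 1
        if (size >= min_size and (h >> (64 - bits)) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])            # trailing partial chunk
    return chunks


class DedupStore:
    """Toy chunk store: each unique chunk is kept once, indexed by its fingerprint."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}

    def put(self, data: bytes) -> List[str]:
        """Store `data` and return its recipe (the ordered list of chunk fingerprints)."""
        recipe = []
        for c in chunk(data):
            fp = hashlib.sha256(c).hexdigest()
            self.chunks.setdefault(fp, c)      # duplicate chunks are not stored again
            recipe.append(fp)
        return recipe

    def get(self, recipe: List[str]) -> bytes:
        return b"".join(self.chunks[fp] for fp in recipe)

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())


payload = bytes(random.Random(1).getrandbits(8) for _ in range(4096))
store = DedupStore()
recipe = store.put(payload)
before = store.stored_bytes()
store.put(payload)                             # a second copy adds no new chunks
print(store.stored_bytes() == before)          # True
print(store.get(recipe) == payload)            # True
```

Because boundaries depend on content rather than fixed offsets, an insertion near the start of a file tends to disturb only nearby chunks, which is the usual motivation for content-defined over fixed-size chunking. Production systems also have to keep the fingerprint index efficient when it no longer fits in memory, which this sketch ignores.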
Graduate students and researchers working in the areas of storage, enterprise computing, cloud computing, and enterprise/Web services; practicing storage professionals in the technology industry, especially in enterprise and cloud data center space.
Dr. Sandeep Uttamchandani (T4) is the Technical Director for Storage at VMware. Sandeep has worked on a wide variety of enterprise storage products and technologies, and has been closely involved in the operational management of petabyte-scale, business-critical deployments. Sandeep holds 22 issued patents and has 28 peer-reviewed publications in key storage conferences including FAST, USENIX ATC, and SIGMOD. Prior to VMware, Sandeep was the Chief Architect for Advanced Storage Technologies at IBM GTS, and was responsible for shaping the technical storage strategy for a $10B services business. Previously, Sandeep was a Master Inventor at the IBM Storage Research Center at Almaden. He holds a Master's and a Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign (UIUC).
The key objective of this tutorial is to provide an understanding of how the design choices made for the key building blocks (such as metadata service, replication, locking, etc.) impact the overall properties of a shared-nothing storage architecture, and how those choices map to the application data model and storage workload requirements.
Eric Brewer coined the CAP theorem to convey that the design of a scale-out system involves trade-offs. CAP is commonly oversimplified to mean that between Consistency, Availability, and Partition tolerance, only two of the three attributes can be realized in a system. In general, the architecture of any shared nothing scale-out storage involves a collection of design choices and trade-offs that ultimately dictate the observable behavior of the system. Following are some choices involved in the design of a shared nothing storage solution:
- Data locality versus cluster scalability?
- Master versus masterless metadata architectures?
- Locking versus multi-version concurrency control?
- Strong versus eventual versus weak consistency?
- Replication versus RAID?
- Node-to-node communication: UDP versus TCP versus RDMA?
- Two-phase commit versus Paxos versus Multi-Paxos?
- In-memory data grids versus disk-based DAS architectures?
- Data models: ACID versus BASE (Basically Available, Soft state, Eventually consistent)?
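One of these trade-offs, strong versus eventual consistency under replication, can be illustrated with a small quorum sketch. The following Python is a simplified model under loud assumptions, not a protocol from the tutorial: the `QuorumRegister` class is hypothetical, there is a single writer that assigns version numbers, and replica selection is deliberately adversarial (writes reach only the first W replicas, reads contact only the last R) to show the worst case.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    version: int = 0
    value: Optional[str] = None


class QuorumRegister:
    """Single-writer register over N replicas with write quorum W and read quorum R."""

    def __init__(self, n: int, w: int, r: int) -> None:
        if not (1 <= w <= n and 1 <= r <= n):
            raise ValueError("quorum sizes must be between 1 and n")
        self.replicas: List[Replica] = [Replica() for _ in range(n)]
        self.w, self.r = w, r
        self.version = 0

    def write(self, value: str) -> None:
        """Worst case: only the first W replicas receive and acknowledge the write."""
        self.version += 1
        for rep in self.replicas[:self.w]:
            rep.version, rep.value = self.version, value

    def read(self) -> Optional[str]:
        """Worst case: contact the last R replicas and return the newest value seen."""
        contacted = self.replicas[-self.r:]
        return max(contacted, key=lambda rep: rep.version).value


strong = QuorumRegister(n=3, w=2, r=2)   # R + W > N: read and write sets must overlap
strong.write("v1")
strong.write("v2")
print(strong.read())                     # v2

weak = QuorumRegister(n=3, w=1, r=1)     # R + W <= N: a read can miss the latest write
weak.write("v1")
print(weak.read())                       # None
```

When R + W > N, every read quorum intersects every write quorum, so a read sees the latest acknowledged write. Smaller quorums lower latency and improve availability, at the cost of possibly stale reads until replicas converge.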
We will start the tutorial with a bare-bones skeleton of the architecture, then incrementally populate the building blocks. For each building block, we discuss popular design choices, followed by an interactive discussion on the implications of mixing and matching these building blocks (for example, matching coarse-grained data sharding for better data locality with appropriate patterns for scaling and distributed data recovery). The tutorial assumes a basic knowledge of distributed systems. Additionally, to better appreciate the under-the-hood exploration, we expect an awareness of the cloud storage landscape and a high-level understanding of the popular solutions.
Storage architects, engineers, administrators, and students who are interested in a deep dive into the building blocks and design patterns of software-defined shared-nothing storage architectures (a.k.a. cloud storage).