
Understanding Software Dynamics

by Dick Sites
March 3, 2022
Book review
Authors: 
Rik Farrow
Article shepherded by: 
Rik Farrow

I started reading this book in December, and am still reading it as of March 2022. I needed that much time as there is a lot to digest in Sites' book. Also, I've enjoyed reading it, and like other books I enjoy reading, I often put it down when I've finished a section I want to spend more time thinking about.

While you might think that a book with this title would only be important to programmers, its audience should be a lot wider. SREs, operating systems designers, realtime systems designers, and hardware designers will all find much useful information in this book. The author's focus is on uncovering the subtle causes of long-tail latencies, but there is much more to learn here.

The book is divided into four parts. The first two parts, more than half the book, explain measurement and observation, covering the tools and techniques needed to understand the design of KUTrace while also providing great advice for SREs and programmers. In the first seven chapters, Sites demonstrates the importance of measuring the four major components of computer systems: CPU, memory, disks/SSDs, and network. He includes Jeff Dean's famous chart depicting the approximate time for completing various system activities, such as reading from L1 cache, reading from main memory on a cache miss, or reading from a disk. Sites adds to this a column providing the order of magnitude for each of the times given.
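As a rough illustration of what such an order-of-magnitude column looks like, here is a minimal Python sketch. The latency values are approximate, commonly quoted figures that I am assuming for illustration; they are not copied from the book, and the function and table names are my own.

```python
import math

# Approximate, commonly quoted latencies in nanoseconds.
# These are illustrative assumptions, not figures taken from the book.
APPROX_LATENCY_NS = {
    "L1 cache reference": 0.5,
    "main memory reference": 100,
    "disk seek": 10_000_000,
}

def order_of_magnitude(ns):
    """Return the exponent of the power of ten nearest to ns (in nanoseconds)."""
    return round(math.log10(ns))

def magnitude_table(latencies):
    """Pair each activity with its time and its order-of-magnitude exponent."""
    return [(name, ns, order_of_magnitude(ns)) for name, ns in latencies.items()]

for name, ns, exponent in magnitude_table(APPROX_LATENCY_NS):
    print(f"{name:<24}{ns:>14,} ns   about 10^{exponent} ns")
```

Seeing the exponents side by side makes the point of the extra column: the gap between a cache hit and a disk access is about seven orders of magnitude, which matters far more than the exact digits.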

Sites strongly encourages readers to estimate how long their systems should take to complete transactions. He starts with simple arithmetic program examples, running long loops to make operations requiring nanoseconds take long enough to easily measure, then pointing out where compiler optimizations will completely wipe out loops that do nothing further with their variables. In chapter five, he starts with a matrix operation, one that is memory bound, and shows how interference in the way cache lines get chosen slows down the matrix multiplication. In the end, Sites has reorganized how the data is accessed, improving performance by an order of magnitude. He then challenges readers, as an exercise, to improve his sample program and squeeze out another 20% performance gain.
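The long-loop idea can be sketched in a few lines of Python. This is my own minimal example, not code from the book, and the function name is hypothetical. Python's interpreter overhead dominates the result, so the number it prints says more about the interpreter than about the hardware; the point is only the method of timing many iterations and dividing. Returning the accumulated sum mirrors the habit of using a loop's result so that an optimizing compiler (in a compiled language) cannot discard the loop.

```python
import time

def measure_add_ns(iterations=1_000_000):
    """Time a long loop of additions; return (ns per iteration, loop result)."""
    total = 0
    start = time.perf_counter_ns()
    for i in range(iterations):
        total += i
    elapsed_ns = time.perf_counter_ns() - start
    # The result is returned and used by the caller, so the loop's work
    # stays observable rather than being dead code.
    return elapsed_ns / iterations, total

per_iteration_ns, checksum = measure_add_ns()
print(f"{per_iteration_ns:.1f} ns per iteration (checksum {checksum})")
```

Writing down an estimate before running the measurement, and then explaining any gap between the two, is the habit Sites is asking readers to build.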

In Part two, observations, Sites provides clear information about how to log, collect, and display information. Following his theme of measurement, he points out how best to log data so as to avoid slowing down the very systems you need to observe. The focus is on being able to instrument systems in production, where the maximum acceptable slowdown is 1% or less. Meeting that limit constrains how data is collected, how often, and how it is stored, all with a strict focus on usability and efficiency. He goes as far as describing how data is best used in dashboards.
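A little arithmetic shows how tight a 1% budget is. The sketch below is mine, and the per-event costs in it are hypothetical values chosen only to show the scale of the trade-off, not measurements from the book.

```python
def max_events_per_second(cost_ns_per_event, budget_fraction=0.01):
    """Events per second, per CPU core, that fit inside the overhead budget."""
    budget_ns_per_second = budget_fraction * 1_000_000_000
    return budget_ns_per_second / cost_ns_per_event

# Hypothetical per-event logging costs in nanoseconds.
for cost_ns in (20, 200, 2_000):
    rate = max_events_per_second(cost_ns)
    print(f"{cost_ns:>5} ns/event: at most {rate:,.0f} events/s within a 1% budget")
```

Under these assumptions, a tenfold increase in the cost of recording one event cuts the number of events you can afford to record by the same factor, which is why the details of how and how often data is collected matter so much.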

While this might seem to have strayed far from programming, Sites points out that you need the ability to accurately measure the systems you are observing, and that monitoring that distorts what you are measuring is useless for finding the source of long-tail latency. Profiling, for example, can show you where your code executes most often based on timer interrupts, yet miss those rare occasions that are causing the very long-tail latency that you are striving to uncover.
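To see why aggregate views hide rare slow events, consider a small synthetic example of my own (the data is invented for illustration, and the helper name is hypothetical): 1,000 requests, of which 990 take 1 ms and 10 take 100 ms.

```python
import math
import statistics

def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Synthetic latencies in milliseconds: mostly fast, with a few very slow requests.
latencies_ms = [1] * 990 + [100] * 10

print("mean   :", statistics.mean(latencies_ms), "ms")
print("median :", statistics.median(latencies_ms), "ms")
print("p99.9  :", nearest_rank_percentile(latencies_ms, 99.9), "ms")
print("max    :", max(latencies_ms), "ms")
```

The mean and median both look healthy, while the slowest requests take a hundred times longer than the typical one. A view that summarizes where time is usually spent, as a sampling profile does, says almost nothing about those few requests.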

Part three describes the design of KUTrace, kernel-user trace. KUTrace is a complete toolchain that includes kernel patches, a kernel module, and tools for converting the millions of data points into comprehensible figures. The postprocessing steps start from the trace output and proceed through inserting log observations, correcting time skews between systems, and converting into JSON, ending with an SVG and HTML page that can display the data in useful form using browsers.

Part four provides examples of using KUTrace. You can get a feel for these chapters by reading Sites' June 2020 ;login: article that incorporates examples from the book (https://www.usenix.org/system/files/login/articles/login_summer20_05_sit...). This final section, entitled Reasoning, covers execution, slow instruction execution, waiting for CPU, memory, disk, network, software locks, queues, and timers. Like other places in the book, I found words of wisdom here that anyone interested in improving the performance of software services can learn from. When discussing running multiple instances of the same program, Sites writes:

Mixing programs that run well against themselves likely will encounter little if any interference.

That was in a chapter examining interference between compute-bound programs. It appears obvious, but so do a lot of things you can read in this book. And they aren't really obvious until they have been explained.

The HTML files created using the toolchain contain an incredible amount of information. The diagrams use symbols, text labels, 256 colors, and even Morse code, and it takes practice to make sense of what you are seeing. The illustrations in both the print and electronic versions of the book are in color, but I sometimes needed to magnify figures so I could see the details I was missing. For example, small, pointed triangles indicating the IPC at different points overlapped so closely that I couldn't make them out without magnification. Younger eyes may not have any trouble with the illustrations. When working with the HTML files in a browser, you can zoom in, mark areas of interest, and select other ways to display the data.

In the Preface, Sites mentions that he got many helpful suggestions while teaching graduate-level courses after retiring from Google. Unpacking this a bit, you can imagine that this is a book for graduate students and advanced, professional programmers, written by an older man who worked at Google for many years. I think that any senior CS student or professional can benefit by reading this book. While all the material in the first half of the book leads up to the use of KUTrace, the first two parts are worth reading on their own by anyone who wants to better understand the systems they are building and using.

Understanding Software Dynamics

by Richard L. Sites

Addison-Wesley, 2022, 465 pages

ISBN-13: 978-0-13-758973-9

Article Categories: 
Operating Systems
Programming
Hardware
Last updated February 8, 2023
Authors: 

Rik Farrow has been a consultant for 40 years. He has written two books, as well as worked as the technical editor for a UNIX magazine and for two editions of a popular operating system book. He also taught UNIX system administration and Internet security during the 90s internationally, and worked as a volunteer for USENIX program and steering committees. Rik has been the editor of ;login: since 2005.
