USENIX Book Review

 

Adrian Cockcroft with Richard Pettit
Sun Performance and Tuning: Java and the Internet
Sun Microsystems Press, 1998. ISBN 0-13-095249-4. Pp. 587. $38.00

Reviewed by Andrew Hume <andrew@research.att.com>

If you own or run a Sun system, then you must get this book. Nitpicking aside, there is such a wealth of information and experience presented here that it would be folly not to read this book. I'd prefer it wasn't so necessary to have this book to make my Sun perform well. I'd prefer that more than a few of the Sun support hotline troops had read it. Never mind; just get the book.

The book covers many areas: chapter 1 is a quick summary of tips and hints for those too impatient to read the whole book. Chapters 2 and 3 cover the general principles of performance management and measurement, including reviews of various commercial products. Chapters 4 and 5 cover "the Internet," or at least how to deal with HTTP servers and Java. Chapter 6 covers the way-too-complicated area of how to optimize code, particularly with respect to instruction sets and architectures.

Chapter 7 covers high-level application tuning, that is, how an application deals with Solaris, such as tracing and file systems. Chapter 8 is a long and involved discussion of disks, RAID boxes, and controllers. Chapter 9 covers networking, and chapter 10 talks about processor analysis (mutexes, memory, CPU caches). Chapter 11 is a gory description of various Sun architectures, down to memory, I/O bus, and backplane issues. Chapter 12 talks about various system caches (generic, filesystem, and networking). Chapter 13 details how Solaris uses your RAM and swap space.

Chapter 14 covers many of the kernel algorithms pertinent to, or controllable by, the user and how to measure and tune them. Chapter 15 covers, sometimes in exasperating detail, the techniques for getting useful metrics out of Solaris. Chapters 16 and 17 cover the SymbEL language and environment, which is a domain-specific language aimed at performance monitoring and analysis. Appendix A is a terse summary of the useful kernel tunables, and appendix B is a compendium of useful references and resources. An index rounds out the book, which totals 587 pages.

In general, this book is a good, solid read; the information is accurate and well presented. The various sections on disks, in particular, are excellent, and the detailed example (pp. 186-194) of figuring out a small disk I/O problem is outstanding. The discussions of the various CPU and machine architectures, and how to optimize applications for them, are also very good. Finally, the se system is just a good idea, and I appreciated that all the scripts described in the book come with the se software and can be tried as you go.

It is rather a pity that such a good book has a few small problems. As I nitpick, if I seem a tad testy, it's only because I have been having a tuning battle with Solaris for several months.

The book is not well proofread. I picked five pages at random and found typos or other errors on two of them (for example, the contents line [page xv] for page 432). There is too much Sun official line for my taste. (How many times do we need to be told Sun bought Encore? And that Encore's RAID systems are really good? I would guess once, not three or four times.) And the gratuitous "official position" on dynamic libraries is very defensive in tone and doesn't address the real problems that they bring to applications. And the constant touting of Sun products wears thin really quickly. (If the RSM2000 RAID box is that good, why is there still a problem on reboots with Solaris not recognizing all the RSM2000 devices? The stock reply of "just one more reboot -r" sounds like it might have come from Redmond.)

All the maxims presented seem true, but several have important caveats missing. For example, on page 171, we have "It is pointless putting large amounts of memory inside a hardware RAID controller and trying to cache reads with it." Ordinarily, this is true; the disk buffer cache eliminates these references. Except, as Cockcroft mentions later on, if your performance requirements mandate using the disk as a raw or direct I/O device, then the buffer cache is not involved.

There is a fair amount of chest pounding on Sun performance, especially disk and file system. I simply don't believe any of it (although I'd like to). Claims are presented without any setup or measurement details. For example, on page 215, Cockcroft asserts the RSM2000 RAID can sustain 66MB/s. I've been trying to do this for a few months; the best I can get is 45MB/s. I'd love to know what the details are behind the 66MB/s (for example, reading or writing? striped? RAID? block size? kernel parameters?). And as for trying to read any file at 20-30MB/s (page 181), you will run into problems with any filesystem, not just UFS. It would really be helpful here if Cockcroft could actually give real details on how you might accomplish such speed. And although the StorageTek Redwood drives are fast, they are not 15MB/s (page 184). Their rated speed is about 11MB/s; on my system, we occasionally see 10MB/s or so.

Cockcroft exercised one of my pet peeves when he discusses ptime(1) on page 158; it provides "accurate and high-resolution process timing." He gives an example of comparing two timings: one from long ago where (apparently) usr+sys was 0.6 seconds, and a recent one where usr+sys was 0.014 seconds, and concludes a speedup of 43. This is such sloppy science that Cockcroft must have had a brain cloud when he wrote it. Ptime's precision is apparently 0.5 ms (although, charmingly, this is not stated on the manual page), but the accuracy is rather less than this. As an example, I ran the same command with ptime three times on my Sun E10000; the usr+sys were 28.2+14.5, 31.4+18.3, and 33.6+16.7. (I rounded up to the nearest 0.1s.) Here the accuracy is about ±10%, or 3s. So much for millisecond precision.
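
The arithmetic behind this complaint can be checked with a minimal Python sketch. It uses only the figures quoted above; the function name and the choice of half the min-to-max range as the measure of spread are assumptions made for illustration, not anything taken from the book or from ptime itself.

```python
# Timings quoted in the review (seconds of usr+sys CPU time).
old_cpu = 0.6      # the timing "from long ago"
new_cpu = 0.014    # the recent timing
speedup = old_cpu / new_cpu

# Three ptime runs of the same command on the reviewer's E10000: (usr, sys).
runs = [(28.2, 14.5), (31.4, 18.3), (33.6, 16.7)]
usr_times = [usr for usr, _ in runs]
totals = [usr + sys_t for usr, sys_t in runs]


def half_range_fraction(samples):
    """Half of the min-to-max spread, as a fraction of the mean.

    A crude, hypothetical stand-in for a "plus or minus" error bar.
    """
    mean = sum(samples) / len(samples)
    return (max(samples) - min(samples)) / 2 / mean


print(f"claimed speedup: {speedup:.1f}")
print(f"usr spread:      +/-{half_range_fraction(usr_times):.1%}")
print(f"usr+sys spread:  +/-{half_range_fraction(totals):.1%}")
```

With run-to-run variation of this size on a single machine, a ratio of two single measurements taken on different systems at different times carries far less than the two significant digits that "43" suggests.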

Finally, I regret Cockcroft didn't say more about variables I have found to be fairly important. The tunable maxphys specifies the maximum size of a physical I/O to disk; its default value is 128KB, which is way too low for many disks. The xcal column from mpstat represents, we are told, the number of cross-processor interrupts per second; what are good/bad values here? The answer probably varies by configuration, but I bet there is a heuristic, as there is for many other values (such as the scan rate sr).

Despite these problems, let me emphasize that I liked this book, and regard it as mandatory for any serious Sun user or administrator.

 

First posted: 5th November 1998 jr
Last changed: 5th November 1998 jr