FeatureUSENIX

 

NT to the max...(NoT)


by Neil Gunther
<ngunther@ricochet.net>

Neil Gunther is founder and principal consultant for Performance Dynamics Company in Mountain View, CA. Dr. Gunther has worked in Silicon Valley for 18 years. He is a member of IEEE, ACM, and CMG.

To NT or NoT to NT? As someone who analyzes the performance of large-scale UNIX servers, I finally decided to face this nagging question by attending the recent USENIX Windows NT Workshop in Seattle. Seattle's weather being in the 90s all week was as much of a surprise as having the trade names USENIX and NT in the same conference title! Surprises, however, turned out to be a theme of the conference.

Overall, I came away being much more impressed with NT than I had expected, and I'm very glad I attended. But there were some unexpected low points too, and it's one of those I would like to bring to your attention. It has to do with a topic near and dear to the hearts of many UNIX sysadmins and others involved in managing large UNIX systems: server scalability [1]. It is also the next frontier for Microsoft.

Naturally, one presentation I was looking forward to was the keynote address by Jim Gray, "Windows NT to the Max - Just How Far Can It Scale Up?" Gray is a respected figure in the database community whose career spans companies like Tandem, DEC, and now Microsoft. Moreover, he's possibly best known as one of the leading evangelists [2] for standardizing database benchmarks that ultimately led to the formation of the Transaction Processing Performance Council (a.k.a. TPC).

Imagine my surprise to hear from Gray that NT could outperform UNIX by scaling to billions of transactions at prices much lower than UNIX platforms. Imagine my dismay at some of the subtle misinformation invoked to make a point! OK. You can't imagine my dismay: either you weren't there or the details were unfamiliar to you. Either way, I would like to examine some points in Gray's presentation with a view to separating the technical "wheat" from the marketing "chaff." Since this will require excursions off the beaten UNIX path, more details will appear in forthcoming ;login: articles. In this opening salvo, there is only space to highlight the issues I plan to revisit. Eventually, I hope this discussion will better help you understand the scalability of both NT and UNIX.

What's Wrong With This Picture?

So, what did the man say? Figure 1 represents OLTP (On-Line Transaction Processing) throughput using the TPC Benchmark-C. The TPC-C benchmark workload models inventory control in a distributed warehouse, and performance is measured in transactions per minute (or tpmC) [3]. The benchmark must be audited by the TPC and documented before being announced publicly; otherwise it's a technical violation of TPC rules.

Figure 1 appeared in Gray's presentation with the title: "NT Scales Better Than Solaris." Although not a usage violation of TPC data, the reader needs to exercise extreme caution when reading comparative charts of this type. The major variables in any TPC benchmark are:

  • Platform (e.g., Intel or Sun processors or disks)
  • Operating system (e.g., NT or Solaris)
  • RDBMS (relational database management software; e.g., SQLServer or Sybase)
  • Application (e.g., the TPC-C benchmarking workload)

The performance analyst's golden rule is: Only change one thing at a time [1]. In Figure 1 there are many things that are different.

Figure 1. Comparative server scalability (no 4-way result was published by Sun)

Here are some changed variables to watch out for:

  • Are all SQLServer data points measured on the same Wintel architecture?
  • SQLServer and Sybase are not the same RDBMS.
  • Are there any Sybase results on Wintel platforms, and how do they compare?
  • Microsoft has full control over SQLServer code and its performance.
  • Why are there no TPC-C data points above a 6-way Wintel server?
  • Do other RDBMSs scale better than SQLServer on Wintel platforms?
  • How contemporaneous were these measurements?

Incidentally, Gray publicly blamed Intel-based hardware for limiting scalability performance above 6-way servers. We'll return to these points in a subsequent article.

Billions and Billions!

To give some idea of the future potential of large-scale NT servers, Gray reported on several scalability projects:

  • Online Atlas (1.1 million US place names with SPIN-3 images)
  • Tandem's "Two-Ton" (DSS workload on 64 CPUs with 2 TBytes on 480 disks)
  • Compaq "Debit-Credit" (OLTP workload on 140 CPUs and 900 disks)

The last of these projects apparently supports over 1 billion database transactions per day. These large-scale prototype systems are nontrivial to construct and the results in themselves are very impressive. Note that 1 billion transactions per day is about 700,000 transactions per minute. But Microsoft has apparently entered the Carl Sagan zone because these are not TPC-C transactions or any other TPC benchmark transactions. Why not? Since Gray did not explain this point to the audience, I asked him about it afterward. He told me they couldn't publish a TPC-C benchmark because SQLServer failed to handle "transparency" (a TPC benchmarking technical requirement).

Clearly, the intended message for the audience was that SQLServer is more than capable of exhibiting superior TPC-like benchmark performance; they just couldn't hack an official TPC number because of an annoying technicality. This is precisely the kind of self-promotional clandestine benchmarking that TPC was formed to discourage.

There is another problem in using a debit-credit workload. As the name indicates, debit-credit applies a simple ATM banking transaction like the TPC-A benchmark (now defunct), not the more complex transaction of TPC-C. A rule of thumb states that TPC-A throughput is about five times greater than TPC-C throughput [1]. Applying that ratio to the roughly 700,000 debit-credit transactions per minute quoted above, one might guess that the Compaq OLTP throughput would be on the order of 100,000 TPC-C equivalent transactions per minute. For historical reference, Tandem reported 20,000 tpmC over three years ago on a 120-node MIPS-R4000 Himalaya server.
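For readers who want to check the arithmetic, here is a minimal back-of-envelope sketch. It assumes only the two figures quoted in the text: 1 billion debit-credit transactions per day, and the rule-of-thumb ratio of five between TPC-A-like and TPC-C throughput. The function names are hypothetical, chosen for illustration, and the result is a rough estimate, not an audited TPC number.

```python
# Back-of-envelope conversion of the quoted debit-credit rate.
# Assumptions (from the text, not from any audited benchmark):
#   - 1 billion debit-credit transactions per day
#   - TPC-A-like throughput is about 5x TPC-C throughput (rule of thumb)

MINUTES_PER_DAY = 24 * 60


def per_minute(transactions_per_day):
    """Convert a daily transaction count to transactions per minute."""
    return transactions_per_day / MINUTES_PER_DAY


def tpmc_equivalent(debit_credit_tpm, ratio=5.0):
    """Scale a debit-credit rate down by the rule-of-thumb ratio."""
    return debit_credit_tpm / ratio


debit_credit_tpm = per_minute(1_000_000_000)
estimate = tpmc_equivalent(debit_credit_tpm)

print(f"{debit_credit_tpm:,.0f} debit-credit transactions per minute")
print(f"{estimate:,.0f} TPC-C equivalent transactions per minute (estimate)")
```

The first figure comes out just under 700,000 per minute, and the second just under 140,000, which is the same order of magnitude as the 100,000 guessed above. The ratio is only a rule of thumb, so the estimate should be read as a rough bound and not as a tpmC result.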

Cluster (un)Availability

Attracted by the desire to support billions of desktop clients, Gray emphasized Microsoft's focus on cluster technologies. The essential idea is to strap multiple servers together using a high-performance interconnect network to enable both scalable performance and reliability (i.e., no single point of failure; see [1]). But the commercial cluster concept is not new.

There are several historical precedents for scalable clusters that support commercial database workloads: in particular, Tandem for OLTP, Teradata for DSS (Decision Support Systems), and the IBM Parallel Sysplex. More recently Siemens-Pyramid and Sun Microsystems have developed and are continuing to develop UNIX cluster-based database server products.

Despite Gray's claim that "NT clusters are easy," perhaps the most telling indicator of the current state of NT cluster technology was the failure to demonstrate 2-node failover! Another surprise: after the big wind-up it was the small stuff that bombed!

Conclusion

I hope my review has given you some sense of my surprise. Be aware that I am not anti-NT by any means. On the contrary, from my standpoint NT looks like modern UNIX (commodity Mach) with integrated windows, something I've wanted since my days at Xerox PARC. So, maybe it would be kinder and more accurate to retitle this opening piece: "NT to the Max . . . (NoT) . . . Yet." More, next time.

References

  1. N.J. Gunther, The Practical Performance Analyst, McGraw-Hill, 1997. In press. See <http://members.aol.com/CoDynamo/Book.toc.htm> for publication status.
  2. Anon et al., "A Measure of Transaction Processing Power," Datamation, 31(7): 112-118, 1985.
  3. The interested reader can learn more about the TPC-C and TPC-D at <www.tpc.org>.

 

First posted: 3rd December 1997 efc
Last changed: 3rd December 1997 efc