The following paper was originally published in the Proceedings of the Tenth USENIX System Administration Conference, Chicago, IL, USA, Sept. 29 - Oct. 4, 1996.

For more information about the USENIX Association contact:
1. Phone: (510) 528-8649
2. FAX: (510) 548-5738
3. Email: office@usenix.org
4. WWW URL: https://www.usenix.org

Shuse: Multi-Host Account Administration
Henry Spencer - SP Systems

ABSTRACT

At the beginning of 1995, Sheridan College urgently needed an organized way of administering a large number of user accounts spread across multiple Unix systems. With 6000+ accounts on a network that had recently undergone dramatic and ill-coordinated growth, the situation was already nearly unmanageable; with the user population forecast to double in autumn, disaster loomed. NIS served reasonably well for the simple task of distributing password files, but maintaining the master copy was proving problematic, creating directories and configuration files for new users was a very ad-hoc process, and there was no obvious place to record assorted supplementary information.

The response was to create a new software package, dubbed ``Shuse'' for ``Sheridan user management''. A central daemon maintains the user database, which is in a fully extensible text-based format. Rather than use a commercial database package, the daemon simply keeps the entire database in its (virtual) memory, and the master copy on disk is optimized for rapid updates rather than efficient access. (RAM is cheaper than database packages nowadays.) Update requests go to the central daemon; it invokes auxiliary processes on other hosts as necessary to create, destroy, and move user files.

Shuse is written essentially entirely in Expect, an extended variant of Tcl. Inter-host communication is done by using Expect's process-control primitives to fire up telnet processes; bulk data transfer is done via NFS. About 100 lines of C code, in three small auxiliary programs, provide services that are not present in Expect. A not-accidental byproduct of this approach is near-automatic portability and correct functioning even in a heterogeneous network.

Shuse is in operational use, currently administering over 20,000 user accounts (the forecasts were low). Various problems were encountered along the way, some easily solved and some requiring considerable unforeseen effort. The use of Expect has been a clear success, performance problems were easily resolved, and the central-daemon approach has worked well.

The Problem

Sheridan College[1] is a large community college, with several campuses located in the outer suburbs of Toronto. Its computing facilities are centered on a set of DEC Alphas running DEC UNIX (formerly named OSF/1), although there are also large numbers of PCs, a scattering of high-end graphics machines for the animation courses, and a wide variety of odds and ends (everything from 486/Pentium boxes running BSD/OS to one or two moldering VMS machines).

----------------
[1] The one area where LISA attendees might perhaps have heard of Sheridan is that its computer-animation program has an international reputation.
----------------
The facilities have expanded enormously in the last few years, and there has been only limited advance planning on how to deal with the rapid growth. Sheridan has been moving steadily toward an accounts-for-everyone policy, exacerbating the usual difficulties of large numbers of accounts on multiple hosts. In early 1995 there were over 6000 users; this was forecast to double within the year. Many of the users have very little experience with computing, especially shared multi-user computing, and the combination of heavy course loads, non-technical backgrounds, and high turnover limits what can be done with user education. (Instructions like ``change your password by doing an rlogin to server so-and-so and running yppasswd'' are worse than useless.)

By the beginning of 1995, rapid growth and limited planning had made the situation almost intolerable. Funding shortages prevented major increases in staff, routine maintenance chores like creation of new accounts were absorbing large amounts of staff time (to the point where software problems were not getting solved because nobody had time), and continued growth threatened total collapse. Improvements were urgently needed, and in particular, really had to be in place before the September 1995 onslaught. At this point, I was brought in to Do Something About It.

Some potential difficulties were not present. All hosts do share a common password file, partly because they share file systems quite extensively via NFS. While there is a lot of heterogeneity ``around the edges'', the major servers are all the same type of machine running the same operating system. (This didn't help as much as one might think, however, because it was not clear that this simple situation would persist.) The number of hosts involved as major participants is quite small: the primary problem was large numbers of users, not large numbers of computers. Finally, the network and the major servers have been reasonably reliable, and it was decided that there was no requirement to preserve full functionality in the presence of dead servers or partitioned networks.

Existing Software

An admittedly rather cursory look at existing solutions to this problem revealed little that seemed helpful. Some system suppliers offer proprietary account-management software for networks of their machines, but Sheridan's network was already slightly heterogeneous and might easily become more so, so a system-specific approach was unattractive. The supplier packages also have an annoying habit of being menu-driven GUI-based interactive programs, which may be easy to use when creating a single account, but are severely unsuited to environments where 5000 accounts must be created in a week or two. Besides, DEC would be the only reasonable supplier for such a thing in this case, and DEC didn't appear to offer anything suitable.

The MIT Athena project has a Service Management System [1] which addresses this problem. (Indeed, it is somewhat similar to what we eventually built.) Unfortunately, it relies on a commercial database package and on other Athena software, and this didn't look like it was going to drop easily into Sheridan's existing environment. At the time, we were not aware of GeNUAdmin [2] or AGUS [3], which might perhaps have been suitable.

Finally, one thing that was very clear was that Sheridan wanted a definitive solution to the problem, not a temporary bandage for the wound. This precluded various ad-hoc solutions which might have postponed the crisis at the cost of more effort later.
Design

Although some complications have been added to the original design, the basic elements have been fairly stable.

The fundamental approach was largely determined by consideration of one issue: coordinating updates. The orthodox way to do this, in a shared system, is some kind of locking protocol... but that presents problems in an NFS environment. None of the usual Unix file-based locking techniques work reliably with NFS's shoddy imitation of Unix file semantics. Reliable locking in such an environment requires using a supplementary protocol to consult a daemon somewhere; this is the approach taken by NFS's own locking primitives, but unfortunately they are notoriously buggy.

Given that we were going to have to implement our own daemon anyway, the obvious approach was just to have it do all the work. Locking is unnecessary when all requests are funneled through a single ``secretary'' process. A dedicated central-information-server system was available to serve as the daemon's host, and its reliability and uptime were sufficiently good that the extra complications of distributed redundancy could be avoided. So we decided to implement a single central daemon, which would respond to queries, perform database updates, and invoke subordinates on other servers as necessary.

With this approach, synchronization is a non-issue, since only one process ever modifies the database. At least for starters, we decided to avoid re-introducing concurrency via threads: there is a single stream of control, operating in an event-based loop. The payoffs are a complete absence of locking overhead, fast updates, and vastly easier debugging (that last being particularly attractive in view of the hard deadline).

This approach was even more attractive because it permits a very useful optimization: the central daemon can be left running permanently, and can simply cache the entire database in its memory. One might think that this technique would be suitable only for small databases, but done well, it scales up quite effectively - memory is cheap. In any case, a brief analysis indicated that the amount of information for a particular user was unlikely to exceed a few hundred bytes, and this would require only a few megabytes for the expected user population. (More importantly, some quick tests showed that nothing dire would happen if this prediction was significantly exceeded.)

Obviously, it is still necessary to have an on-disk copy of the database, so that updates will survive both planned downtime and crashes. With all read-only accesses satisfied from the daemon's memory, the on-disk copy can be optimized for cheap and simple updates rather than rapid access to large amounts of data. After some quick feasibility testing, we decided to simply make each user's data a separate file in a simple text format. This does make the daemon's startup rather slow, since it has to read thousands of tiny files, but with a permanently-running daemon this is not needed often. Experiments indicated that on an otherwise-quiet system, reading 10-20,000 small files took only a few minutes, which seemed tolerable for a relatively rare event. We considered subdividing the files into a directory tree, but experiments indicated that just keeping them all in one directory was quite workable on a modern system.
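In outline, the startup load is nothing more than a loop over that directory. A minimal sketch (the procedure and variable names are hypothetical, and it assumes one field-value pair per line, as in the format shown in Figure 1 below):

    # Read every per-user file in dbdir into the in-memory array "db",
    # keyed by "user,field".  Each file holds one field per line.
    proc loaddb {dbdir} {
        global db
        foreach f [glob -nocomplain $dbdir/*] {
            set user [file tail $f]
            set fd [open $f r]
            while {[gets $fd line] >= 0} {
                # first word is the field name, the rest is its value
                if {[regexp {^([^ ]+) +(.*)$} $line junk field value]} {
                    set db($user,$field) $value
                }
            }
            close $fd
        }
    }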
For the file format itself, we briefly considered various extended versions of the classical passwd-file format, but decided against it. Any format with a fixed number of fields suffers when requirements change, as witness all the creative things that have been done with the ``GCOS'' field in the passwd file. While it would be necessary to generate a passwd file from the database, we wanted the database itself to be flexible and extensible, so it could contain all the information about a user and would not need supplementing with auxiliary databases as new needs appeared. We did opt for a text-based format, partly just for simplicity, partly because this makes a wide variety of Unix tools useful for setting up the database or doing emergency surgery on it. A user's database file looks something like Figure 1.

    name      spencerh
    passwd    76hgfu645fmvt
    passwd@   806860944 (Thu Jul 27 12:02:24 1995) spencerh
    uid       8172
    gid       15
    home      /home/apollo/it/spencerh
    shell     /bin/sh
    server    it
    schema    n
    status    active
    status@   807477421 (Thu Aug 3 15:17:01 1995) root
    fullname  Henry Spencer
    workphone 1-416-555-4444
    office    E108
    mailname  henry.spencer
    changed   807988782 (Wed Aug 9 13:19:42 1995) root

    Figure 1: User database entry

The server field contains a code identifying the system the user's home directory resides on; there is a separate control file which maps server codes to host names, to simplify changes in host configuration. The schema field contains a code indicating how to build an initial home directory for a new user; it is passed as an argument to the script that actually builds the directory. Fields like office and workphone contain information that is assembled into a suitable ``GCOS'' field for the passwd file; since we want the Shuse database to be the primary database, not a derived one, we store the information broken down by meaning, instead of a preformatted version appropriate to one specific version of the passwd file.

A single centralized daemon could not do the entire job. In particular, when creating or deleting users, it would be necessary to operate as root on the host holding the user's home directory, and the limitations of NFS required that to be done locally. This was also necessary for a different reason: Sheridan mounts only subsets of its filesystems on its individual servers, so the central server host cannot necessarily see the filesystem which would have to be updated. It seemed that it would be necessary for the daemon to invoke an auxiliary program on the other, ``slave'', servers. (Having this done as needed by the daemon, rather than at regular intervals by cron, would propagate updates more quickly and avoid unnecessary overhead.)

We decided that the auxiliary program would read a description of what users should be on its host, then examine the host to find out which users were actually present, and then act to correct any discrepancies. This seemed likely to produce more robust operation than having the daemon send update commands to the auxiliary.

At this point we were starting to need names for the programs. We dubbed the whole system ``Shuse'' (pronounced like ``shoes''), the daemon ``shused'', and the slave-server auxiliary program ``shusetie''.
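The core of the comparison shusetie performs is straightforward. A sketch, with hypothetical helper procedures and an artificially tidy directory layout (as discussed later, real layouts are much less cooperative):

    # Compare who should be on this host against who actually is, and
    # correct the differences.  "shouldbe" maps user name to home
    # directory; mkhome and rmhome are hypothetical helpers that build
    # or archive a home directory.
    proc reconcile {} {
        global shouldbe
        # enumerate home directories actually present (this assumes a
        # fixed /home/<disk>/<dept>/<user> layout)
        foreach dir [glob -nocomplain /home/*/*/*] {
            set actual([file tail $dir]) $dir
        }
        foreach user [array names shouldbe] {
            if {![info exists actual($user)]} {
                mkhome $user $shouldbe($user)   ;# missing: create it
            }
        }
        foreach user [array names actual] {
            if {![info exists shouldbe($user)]} {
                rmhome $user $actual($user)     ;# surplus: remove it
            }
        }
    }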
The one other piece of machinery which had to be fitted in, somehow, was a user interface for talking to shused. The actual user interface was a somewhat secondary concern, especially since it seemed that there might have to be more than one, but we needed a way to talk to the daemon. To simplify implementation and separate the major concerns somewhat, we decided to have a separate ``gatekeeper'' program, shusedgate, invoked by inetd as required. The gatekeepers implement whatever authentication of credentials is appropriate, and then pass commands to the daemon and responses back, communicating with the daemon via a set of FIFOs. Apart from network interface and authentication, their role is to enforce timeouts on interaction and shield the daemon from possible interference by uncooperative users.

When network connections get involved, security is an obvious worry. The right way to deal with this in the long run, clearly, is with encryption. As a stopgap measure, since Sheridan already relied heavily on NFS being trustworthy[2], we decided that network connections would pass only ``hey, wake up and look at this'' requests, with all crucial information being passed via the file system.

----------------
[2] Not necessarily a safe assumption, but that's the way it was.
----------------

Although the initial user community was to be mostly the system administrators, a fairly flexible permission scheme was clearly desirable. At one extreme, users had to be able to change their own passwords (we briefly considered using more traditional mechanisms for that, but decided that having Shuse handle everything was simpler than dividing the responsibility). At the other extreme, the system administrators had to be able to make fairly arbitrary changes. And there are a variety of interesting levels in between, such as help-desk personnel, who should be able to interrogate the database and do some limited operations like changing passwords but should not be permitted to do more drastic alterations.

To provide a flexible permission scheme, we tag each daemon operation with a ``category'', and a control file specifies which categories of operations are open to which users. At one extreme, a few read-only operations are in category ``harmless'' and are available to everyone. At the other, arbitrary editing operations and the ability to shut down the daemon are in category ``overlord'', which is restricted to a small set of users calling only from the central server machine itself.
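The check itself is little more than a table lookup. A sketch, with a hypothetical control-file syntax:

    # Decide whether a caller may perform an operation in a given
    # category.  The control file, itself Tcl code, is expected to set
    # entries like:
    #     set perm(harmless) {*}
    #     set perm(overlord) {root@central shuse@central}
    # (the category names are real; the file syntax is hypothetical)
    proc allowed {category user host} {
        global perm
        if {![info exists perm($category)]} {
            return 0                    ;# unknown category: deny
        }
        foreach pat $perm($category) {
            if {[string match $pat $user@$host]} {
                return 1
            }
        }
        return 0
    }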
Implementation Approach

With the design outlined, implementation started. The main constraint on it was that Shuse simply had to be functioning for the September 1995 student intake.

With some trepidation, we decided that Shuse would be written essentially entirely in Tcl [4,5]. Experiments suggested that performance would be adequate, and the use of a very-high-level language looked like it would speed up development considerably. Crucial portions could always be re-coded in C if necessary. This basic approach was quite successful in earlier projects [6], and Tcl seemed a better choice than the Unix shell (which tends to be slow and clumsy unless the task at hand is suited to the Unix utilities) or Perl (which tends to be ugly and unmaintainable).

We quickly settled on Expect [7,8], one of the most popular Tcl extensions, rather than ``raw'' Tcl. Initially, this was done because we anticipated uses for Expect's ability to start and control other processes. It turned out that Expect also has a number of small amenities which make it a more complete programming environment than Tcl, which was envisioned as a minimal extension language rather than an independent programming language. (For example, Expect can catch Unix signals.) We used Expect's special I/O primitives in fairly minor ways (although see the discussion of shuselace later). The one area where we used Expect's facilities more seriously was in calls to slave servers, to invoke shusetie.

The actual invocation of shusetie was done by inetd on the slave server, but the daemon did have to be able to make a call across the network. Rather than add primitives for this, or adopt one of the existing Tcl networking extensions[3], we used Expect's primitives to invoke telnet, specifying a non-standard port number to reach shusetie instead of telnetd. This may sound a little ugly, but in fact it is quite simple and practical, and has the bonus that portability is almost automatic: all the system-specific complications of networking are invisible.

----------------
[3] Tcl itself acquired networking primitives in release 7.5, early in 1996, but that was about a year late for Shuse.
----------------

To make Shuse easier to maintain, each of its programs reads in a configuration file as part of startup. Rather than parsing the configuration file and interpreting its contents, the program simply ``sources'' it, running it as part of the program's own Tcl source. This makes it possible for the configuration file to contain not only the obvious variable settings - permissions, pathnames, etc. - but also Tcl procedures. As a case in point, the procedure that builds a passwd line from a Shuse database entry resides in shused's configuration file, so that arbitrary changes in passwd format can be accommodated without diving into the main sources.

Gritty Details

Certain aspects of the implementation posed unexpected difficulties. We anticipated some of these, while others came as unpleasant surprises. (In some cases, we accepted marginally-satisfactory early solutions simply to get Shuse functioning in time; not all of the issues mentioned here were fully resolved for September.)

We had originally envisioned that extracts from the database, such as the passwd file or the user lists for shusetie, would simply be assembled by shused as necessary. It turns out that digging through the whole database every time such a thing is needed is relatively costly. This means that (for example) building a passwd file is expensive, and looking up a user by student number or mailbox name is very slow. Moreover, the information rarely changes much, so rediscovering it each time is wasteful. Shused now builds internal auxiliary databases at startup, and updates them accordingly when relevant information changes. In some cases, the more costly updates are postponed until shused appears to be idle. For example, after startup shused uses idle time to build a copy of each user's passwd line, and keeps those copies around. This permits pumping out a complete copy of the passwd file in a few seconds, when needed.
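To give the flavor of the sourced-configuration arrangement, a sketch of the relevant fragment of a configuration file (the procedure and array names are hypothetical; the fields follow Figure 1):

    # Fragment of shused's configuration file, executed at startup with
    # "source".  Because the file is Tcl, it can define procedures as
    # well as set variables; here, how to turn a database entry into a
    # passwd line.
    proc mkpwline {user} {
        global db
        set gcos "$db($user,fullname), $db($user,office), $db($user,workphone)"
        return [join [list $user $db($user,passwd) $db($user,uid) \
                $db($user,gid) $gcos $db($user,home) $db($user,shell)] ":"]
    }

A change to the passwd format is then an edit to the configuration file, not to shused's main sources.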
The single-threaded design of the daemon is awkward when long-running chores have to be done, because an interactive request should not be delayed arbitrarily waiting for such a chore to finish. Long-running chores must either be broken up into small pieces, so that interactive requests need not wait too long, or be farmed out to auxiliary processes to take them out of the critical path entirely.

A particular problem area is that an update of a slave server can be very slow. The bigger slave servers contain thousands of home directories, and merely enumerating them all for comparison with a user list is a slow operation when user load is heavy. The biggest performance problem in early Shuse operations was long delays in interactive requests when shused was waiting for responses from shusetie running on a slow slave. When the systems were busy, the extremes of the response time were utterly unacceptable.

We briefly considered multi-threading shused, but apart from certain practical problems - it's not something Tcl does well - it seemed unnecessarily general for what was, after all, a somewhat specialized problem. We tackled this one from the other end: during startup, shused spins off a ``flunky'' process, dubbed shuselace[4], which does all calls to the slave servers. The flunky makes calls to shused using (almost) the standard user command interface, with minor special privileges. (Expect's I/O primitives make it trivial for shused to listen for input from two sources instead of one.) Shused itself maintains a queue of work to be done by the flunky, and provides ``user'' interfaces which do things like removing one item from the queue. The flunky uses a shused command to pick up a work item to be done (e.g., ``update server nova''), goes away and does it (taking as long as necessary), and then uses another shused command to report success or failure.

----------------
[4] In retrospect, we should probably have named the slave-server program shuselace and the central-server flunky shusetie, since the flunky manipulates the slave-server programs rather than vice versa, but it's too late now.
----------------

The first version of the flunky was mostly code transplanted intact from the innards of shused, and setting it up took only a day or two's work. It was entirely successful, and response time has never again been a significant problem.
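The queue machinery at the heart of this is simple. In outline (the command and variable names are hypothetical, and the failure handling shown - simply requeueing - is cruder than the real thing):

    # The daemon's work queue.  shused appends work items; the flunky
    # fetches and reports on them via (almost) ordinary daemon commands.
    set workq {}

    # shused queues a work item, e.g. "update nova"
    proc addwork {item} {
        global workq
        lappend workq $item
    }

    # flunky command: hand out the next work item, or "" if none pending
    proc getwork {} {
        global workq
        if {[llength $workq] == 0} {
            return ""
        }
        set item [lindex $workq 0]
        set workq [lrange $workq 1 end]
        return $item
    }

    # flunky command: report the outcome; a failed item is requeued
    proc donework {item status} {
        global workq
        if {$status != "ok"} {
            lappend workq $item
        }
    }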
During development of Shuse, we were generally preoccupied with the daemon and its auxiliaries, and did not give much attention to the user interface. We obviously needed some sort of command interface to test the daemon, so a simple program that sends the daemon a single command gradually appeared, more as a debugging tool than a finished user interface. Naturally enough, it was fairly promptly pressed into service as a user interface. While it is somewhat inefficient - for bulk operations, one would prefer to be able to send the daemon more than one command at a time - it works sufficiently well that there has been little incentive to replace it. In particular, it is exactly what is wanted for writing scripts.

The one additional user interface that had to be provided was a naive-user password changer. We re-implemented the passwd command (and yppasswd as well) as an Expect script that requests old and new passwords, does appropriate checks[5], and then calls shused to make the change. (Naturally, shused itself also does some checks before permitting the change!) This required adding another auxiliary C program, 30 lines of code which invokes the password-encryption routines and outputs the result.

----------------
[5] We note that it is vastly easier to change or improve the is-this-a-good-password test when the program is written in a very-high-level interpretive language!
----------------

The idea of making slave-server updates idempotent, by having shusetie compare existing users against a list of users who should be there, was a good one. It turned out to be a bit harder to implement than we expected. For one thing, it's purely and simply difficult to enumerate all the home directories on a server unless the server's directory structures are laid out to make this easy. For another, the comparison approach handles additions and deletions relatively easily, but can't be gracefully extended to handle moving or renaming users. We ended up doing substantial revisions to the structure of both shuselace and shusetie to implement a more general command facility within them, so shused could order specific operations done. Moreover, this involves some relatively fancy footwork to ensure that such operations are not lost if one of the servers crashes at an inopportune moment, and also some slightly more sophisticated authentication to assure shusetie that the thing sending commands is really shuselace.

One particular problem in the implementation of shusetie was disk quotas. The so-called user interface of the quota system is a disgrace to Unix: inflexible, interactive only, and completely lacking in reasonable primitives for system administration. To cap it off, DEC reinvented the wheel here: when they implemented a new filesystem type, instead of extending the existing quota commands to handle it, they added a new set in parallel - with the same crippling deficiencies in functionality, and some unhelpful changes in data format - so that on a mixed system, you may need to edit a user's disk quotas twice, with two different commands, to get them all!

Fortunately, Expect came to the rescue here. Since the quota facility does at least let you choose which editor you want to use to edit the quota data, we originally thought we'd just have it invoke ed, which we could drive with an Expect script. In the end, it turned out to be simpler to move some of the intelligence into the editor, so shusetie now manipulates the environment to make the quota commands invoke a little customized editor written in Expect.
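In outline, shusetie points the EDITOR environment variable at the custom editor and then runs the normal quota command. The ``editor'' itself might look something like this sketch (the environment-variable names and the quota-line format matched here are hypothetical, and the real editor copes with two different formats):

    # quotaed: a non-interactive "editor" for quota data.  The quota
    # command invokes it (via the EDITOR environment variable) with the
    # name of a temporary file of quota lines; we rewrite the limits in
    # place and exit.  The desired limits arrive in environment
    # variables set by shusetie.
    set tmp [lindex $argv 0]
    set fd [open $tmp r]
    set lines [split [read -nonewline $fd] \n]
    close $fd
    set fd [open $tmp w]
    foreach line $lines {
        # the exact format varies between quota implementations; the
        # pattern here is purely illustrative
        regsub {soft = [0-9]+, hard = [0-9]+} $line \
            "soft = $env(SOFTQ), hard = $env(HARDQ)" line
        puts $fd $line
    }
    close $fd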
There is still the annoyance of having to do this twice, via two different sets of quota commands, but some careful design of the editing primitives made it possible to do this fairly painlessly.

Successes

The bottom line is: it works. We're still discovering things that need improvement, but the September 1995 crisis was averted, and as of May 1996 the system was managing over 20,000 user accounts. (The predictions turned out to be low - instead of doubling, the user base more than tripled.) Response time is good since the implementation of shuselace, and the staff workload for routine administrative chores is declining.

The central-daemon approach is a weak point in theory but it seems to be adequate in practice. Our opinion is that unless unusual requirements are present, it's better to put effort into making a central server reliable than into making the software do without one. Compared to a more distributed approach, a single central daemon enormously simplifies debugging, synchronization, and management.

Using Expect was a big win. We couldn't possibly have met the schedule using C or the equivalent; in fact, we barely met it using Expect. Very few of the problems were a consequence of the interpretive language, and many of the rapid and simple solutions were a consequence of it.

Although Tcl, and hence Expect, is extensible, we did not find it necessary to do this. The option to add language extensions written in C always existed, but in practice we found that the few missing primitives were more easily implemented as separate programs, invoked as needed. For example, the gatekeeper invokes a 30-line C program, which does a getpeername() and a gethostbyaddr() and prints the result: the name of the host an incoming call is from.

The in-memory-database approach works well. We did end up adding some more RAM to the central server. (We note with some annoyance that conventional system interfaces are too ready to page out seemingly-idle process memory, and don't provide a way to say ``let this process make as much of a pig of itself as it wants, and don't page it out unless you really must''.) The response time for simple database queries is entirely dominated by the communications arrangements.

The update performance of the file-per-user on-disk database has been excellent, and although it takes several minutes for the daemon to start up and read in all those files, this is a minor nuisance rather than a serious problem. We have occasionally contemplated implementing facilities to dump out the daemon's in-memory database in some form that would permit rapid reloading, given that most daemon restarts are planned, but to date it hasn't been worth the trouble. The extensible text-based format of the database entries themselves has permitted a number of unplanned additions and amendments. There will surely be more.

While interest in sophisticated user interfaces, e.g., for the help desk, remains, the simple send-one-command interface has been amazingly successful. In particular, an extensive body of scripts has grown up to reflect local policy and frequently-run database operations. We very strongly believe that we made the right decision: do the command-line interface first, leave the fancy graphics for later.

Problems

Not everything went smoothly. Apart from the implementation difficulties mentioned earlier, some broader issues deserve mention.

As one might predict, the customer wishlist changed and grew once an initial system was operating.
Things that weren't even mentioned in the original specifications turned out to be major issues that needed substantial reworking of the software. For example, the original design included a very simple facility for automatically executing commands at specific times, vaguely modelled on the Unix at command, and this saw such extensive use that some major re-engineering work was needed to make it more practical and efficient.

The original design had little ad-hoc protocols for each communications path. Only the protocol used between the gatekeeper and the daemon was fully fleshed out and pinned down. Since then, many of the paths which originally needed very little sophistication have grown to need the full nine yards; for example, shusetie now provides a full command interface to shuselace. One thrust of recent work has been to encapsulate the gatekeeper-daemon protocol in a library module, and convert everything to use it; this is almost complete, and has been a definite success.

Telnet connections, while adequate for commands, are suboptimal for bulk data transfer. Early versions of shused operations which returned very large amounts of data had mysterious problems with little bits of data loss. Debugging this was difficult, but we eventually established that the problem was in telnet, not in Shuse - it would seem that we were overstressing something in DEC's telnet or pty implementation. As a workaround, the few operations which routinely need to transfer large amounts of data were revised to do the transfer via the file system. The exact cause of the problem was never fully determined, and in fact we suspect that a system upgrade somewhere along the way may have fixed it. The new protocol library checks the length of data transferred in all operations, as a precaution.

We're also interested in the possibility of reimplementing some of the Shuse telnet communications paths using Tcl's new networking primitives. While this is of no real importance for Shuse's internal communications, improving the user-interface response time would be nice, and it looks like most of the time spent there is in setup and teardown overhead rather than actual communication.

One area that has not yet been fully sorted out is logging and trouble reporting. After some unsatisfactory early experiences with coordinating multiple log files, the protocol library and some other facilities were extended slightly to let everything send log entries to the main daemon. This has helped, but we still need to do some more work in the area; it's particularly difficult to get satisfactory reporting in cases where final execution of an operation has to be delayed, e.g., because a server is down. Queueing up the work until it can be done is only half the job. It would be useful to have shused (or supporting software) maintain a current-status report on each slave server, to make ongoing problems more visible.

As mentioned earlier, response-time constraints and the single-threaded nature of the daemon require that time-consuming internal operations be broken up into smaller pieces. This has gotten easier as experience has accumulated, but that experience really needs to be distilled into a set of library routines that would make it relatively painless. There are a few infrequent operations which would benefit from being split up, but which are still in one piece because it's too much trouble.
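The pattern involved looks roughly like this sketch (it assumes a Tcl whose event loop provides the after command, and the per-user chore is hypothetical):

    # Work through a long list of users a slice at a time, rescheduling
    # ourselves through the event loop so that interactive requests can
    # be served between slices.
    proc dochunk {todo} {
        foreach user [lrange $todo 0 49] {
            refresh $user                    ;# hypothetical per-user chore
        }
        set rest [lrange $todo 50 end]
        if {[llength $rest] > 0} {
            after 100 [list dochunk $rest]   ;# come back for the next slice
        }
    }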
The current implementation of Shuse is very much organized around running a single database: the users. In practice, this has been adequate for Sheridan's needs, and extensions in this area have been low priority. For example, Sheridan makes relatively limited use of Unix groups, so the group database is still managed manually. This would be less satisfactory for an installation which made more sophisticated use of group memberships.

In retrospect, the exact split of responsibilities between the software contractor and the Sheridan staff was not quite right. In particular, shusetie invokes external shell scripts to do things like creating or deleting user home directories. This works, but it would work better if the scripts were better integrated; in particular, error diagnosis would be improved. The lack of integration is an artifact of the responsibility split, and was largely forced on us by time constraints, but it's still a blemish.

Conclusion

Despite being written in an interpretive language and having to fire up other programs for, e.g., network communication, Shuse works fine and does the job. Performance has not been a problem since minor design errors were corrected, new functionality is easily added, and the system is coping well with databases of 20,000+ users. Sheridan is happy, and commercial marketing of the software is being explored.

Acknowledgements

Although I wrote almost all of the Shuse software, a number of other people were involved in various ways. Cheri Weaver, then head of Sheridan's system-administration group, got me into all this :-) in the first place. John Barber, her successor, has happily funded ongoing work and enhancements. Simon Galton handled the Sheridan side of Shuse, during its initial transition into production use, with skill and no small amount of bravery (``you're going on vacation WHEN, Henry?!?''). Seela Balkissoon, Rob Naccarato, Trevor Stott, and the other members of Sheridan's CSG have patiently used, commented on, and complained about Shuse while it was struggling towards operational maturity.

Availability

The Shuse software belongs to Sheridan College. Times are hard for educational institutions in Ontario, and there is local commercial interest in Shuse, so at this time it is not available free.

Author Information

Henry Spencer is a freelance software engineer and author. His degrees are from the University of Saskatchewan and the University of Toronto. He is the author of several freely-redistributable software packages, notably the original public-domain getopt, the redistributable regular-expression library, and the awf text formatter, and is co-author of C News. He is currently immersed in the complexities of implementing POSIX regular expressions. He can be reached as henry@zoo.toronto.edu.

References

[1] Mark A. Rosenstein, Daniel E. Geer, & Peter J. Levine, The Athena Service Management System, in Proceedings of the Usenix Technical Conference, Winter 1988 (Dallas), Usenix Association, 1988.

[2] Magnus Harlander, Central System Administration in a Heterogeneous Unix Environment: GeNUAdmin, in Proceedings of the Eighth Systems Administration Conference (LISA 94, San Diego), Usenix Association, 1994.

[3] Paul Riddle, Paul Danckaert, & Matt Metaferia, AGUS: An Automatic Multi-Platform Account Generation System, in Proceedings of the Ninth Systems Administration Conference (LISA 95, Monterey), Usenix Association, 1995.

[4] John K. Ousterhout, Tcl: An Embeddable Command Language, in Proceedings of the Usenix Technical Conference, Winter 1990 (Washington), Usenix Association, 1990.

[5] John K. Ousterhout, Tcl and the Tk Toolkit, Addison-Wesley, 1994.
[6] Geoff Collyer & Henry Spencer, News Need Not Be Slow, in Proceedings of the Usenix Technical Conference, Winter 1987 (Washington), Usenix Association, 1987.

[7] Don Libes, Expect: Curing Those Uncontrollable Fits of Interaction, in Proceedings of the Usenix Technical Conference, Summer 1990 (Anaheim), Usenix Association, 1990.

[8] Don Libes, Exploring Expect, O'Reilly & Associates, 1995.