Works In Progress
Summaries by Gordon Galligher
The Works in Progress (WIP) session has been a mainstay at USENIX conferences. It continues to be a vehicle for people working on interesting projects who want to share that information with others in the community. It can also serve as a "validation" of sorts, allowing presenters to see whether their idea approaches a particular problem in a sound manner. This year's session was no different, with six presentations ranging from potential new paging cache systems that speed up kernels, to "disk-caching disk" technology that speeds up disk writes of small files, to LPRng, the new Next Generation version of LPR.
The main goal of this work is to improve the performance of the paging systems used by virtual memory-based operating systems. The annotation approach allows the programmer to give "hints" to the paging algorithm about the use of a particular page, the priority of that page in the Grand Scheme of Things, how and when it can be flushed, etc. There are two methods by which this annotation scheme is used: declarative and operational.
In the declarative method, there are specified events that cause the paging system to check the various declarations. In the operational method, the system is more dynamic: it asks whether a page needs to be flushed, prefetched, or written back and, if so, checks the declarations. Noritaka has a prototype of this new system running on a Pentium 90 with 16 MB of RAM using the MACH-lite kernel. He has noticed some overhead from the declarative methods when accessing memory and making system calls. When accessing all pages sequentially, he noticed a 50% speed improvement over the normal virtual memory system available in MACH-lite.
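To make the idea of paging hints concrete, here is a minimal sketch in Python. It is not the presenter's system: the class `HintedPager`, its `hint` method, and the integer priority scale are all hypothetical names chosen for illustration, and the policy shown (plain LRU, biased so that low-priority pages are evicted first) is only one simple way a pager could use programmer-supplied hints.

```python
from collections import OrderedDict


class HintedPager:
    """Toy page cache: LRU eviction biased by programmer-supplied hints.

    Hypothetical illustration only; not the MACH-lite prototype.
    """

    def __init__(self, frames):
        self.frames = frames            # number of physical page frames
        self.resident = OrderedDict()   # resident pages, least recently used first
        self.priority = {}              # page -> hint (higher means keep longer)
        self.faults = 0

    def hint(self, page, priority):
        """Record a programmer hint about how important a page is."""
        self.priority[page] = priority

    def access(self, page):
        """Touch a page, faulting it in (and evicting a victim) if needed."""
        if page in self.resident:
            self.resident.move_to_end(page)   # mark as most recently used
            return
        self.faults += 1
        if len(self.resident) >= self.frames:
            self._evict()
        self.resident[page] = None

    def _evict(self):
        # Lowest hinted priority goes first; ties fall back to LRU order,
        # because min() returns the first minimal item it encounters.
        victim = min(self.resident, key=lambda p: self.priority.get(p, 0))
        del self.resident[victim]


def count_faults(trace, frames, hints=None):
    """Replay a page-access trace and return the number of page faults."""
    pager = HintedPager(frames)
    for page, priority in (hints or {}).items():
        pager.hint(page, priority)
    for page in trace:
        pager.access(page)
    return pager.faults


# Page 0 is reused repeatedly while pages 1..6 stream past it.
trace = [0, 1, 2, 0, 3, 4, 0, 5, 6, 0]
plain = count_faults(trace, frames=2)
hinted = count_faults(trace, frames=2, hints={0: 10})
print(plain, hinted)
```

With only two frames, plain LRU keeps losing page 0 to the streaming pages, while a single hint keeps it resident. The toy does not model the overhead of checking hints, which the prototype did exhibit.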
Glen Back presented work done by his graduate student, John Lougal, who has written a general-purpose C++ class library for use with X programming. It was not John's intent to do this, but all other libraries came up "short" in a number of areas for his project. Back presented a nice visual example of what John provides in his library, including built-in printing support, file dialog boxes, etc. He mentioned that it was more useful for a document processing system, which can support multiple "background" tasks as well as multiple displays. This library is free to all, but it is not under the GNU General Public License. To get more information, the following URL was provided: http://www.cco.caltech.edu/~jafl/jx.
For small offices and engineering organizations, use of the computing facilities typically involves a number of small physical updates to files on the disk. When one is working on a code fragment, for example, it might involve editing a very small file, saving it, and then recompiling, which might generate a number of other small files that the compiler uses as temporary space holders. The newest disks on the market, however, typically favor very large writes in a single operation, and their speed numbers are based on large blocks of data written simultaneously. This dichotomy has led to some systems appearing to perform poorly, when in truth they perform just as the hardware design intended.
Qing Yeng, a graduate school professor, has his students working on a Disk Caching Disk (DCD) method, which uses a write cache between the application and the storage medium. This part is not so novel; write caching has been done for years. What his students have come up with, however, is a write cache that uses a physical disk instead of memory.
The design goal is to get one or two orders of magnitude of performance improvement without having to change the operating system. One way to realize this improvement is a hook into the filesystem that traps small writes (between 0.5 MB and 1 MB) and sends them to a hard disk using a log-based filesystem. When that disk is then idle, the recent writes are all batched up and written to the actual data disk in a single write. The log-based filesystem can either be on a separate physical hard disk (a physical DCD), or it can be a "partition" of the actual data disk (a logical DCD).
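The write path described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the Solaris driver: the class `DiskCachingDisk`, the `small_write_limit` parameter, and the `on_idle` callback are hypothetical names, the "disks" are in-memory Python structures, and the rule for routing large writes around the log is a guess at one reasonable policy.

```python
class DiskCachingDisk:
    """Toy model of a disk-caching disk (hypothetical, in-memory only).

    Small writes are appended to a sequential log; when the system is
    idle, the log is collapsed and flushed to the data disk in one batch.
    """

    def __init__(self, small_write_limit):
        self.small_write_limit = small_write_limit  # bytes; assumed threshold
        self.log = []              # append-only list of (block, data)
        self.data_disk = {}        # block number -> data
        self.data_disk_writes = 0  # number of write operations issued

    def write(self, block, data):
        if len(data) <= self.small_write_limit:
            # Small write: a cheap sequential append to the log disk.
            self.log.append((block, data))
        else:
            # Large write: goes straight to the data disk. Drop any older
            # logged copy so a later flush cannot overwrite newer data.
            self.log = [(b, d) for b, d in self.log if b != block]
            self._write_data_disk({block: data})

    def read(self, block):
        # The newest logged copy wins over whatever is on the data disk.
        for logged_block, data in reversed(self.log):
            if logged_block == block:
                return data
        return self.data_disk.get(block)

    def on_idle(self):
        """Batch all logged writes into a single data-disk write."""
        if not self.log:
            return
        batch = {}
        for block, data in self.log:
            batch[block] = data    # later entries supersede earlier ones
        self.log.clear()
        self._write_data_disk(batch)

    def _write_data_disk(self, batch):
        self.data_disk.update(batch)
        self.data_disk_writes += 1


dcd = DiskCachingDisk(small_write_limit=4096)
dcd.write(7, b"draft 1")
dcd.write(9, b"object file")
dcd.write(7, b"draft 2")
writes_before_idle = dcd.data_disk_writes
dcd.on_idle()
print(writes_before_idle, dcd.data_disk_writes, dcd.read(7))
```

Three small writes reach the data disk as one batched operation, which is the effect the design relies on. A real implementation would also have to persist the log's index so that logged writes survive a crash; the sketch ignores that.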
In his tests on a single workstation, the logical DCD performed 200 times faster, and a physical DCD performed 3 times faster than the normal filesystem. On the departmental fileserver, the logical DCD approach performed 170 times faster, and the physical DCD performed 4 times faster. There was no mention as to why the logical DCD outperformed the physical DCD.
This is currently implemented as a device driver on Solaris kernels, but there are plans to implement it as embedded software on a disk controller card. There are further ideas to create a specialized piece of hardware to perform this function.
The Revocation of Unread Mail project was written for a specific subset of users at MIT who wanted to allow the sender to rescind a previously sent email message. It is based on a cooperative setup that allows the sender to revoke email from the recipient's mailbox if it has not been read. There were a number of assumptions, for example that messages that have been read are not kept in the spool area. Kevin mentioned that there are a number of weak areas in terms of the assumption of trust in who can revoke mail. It was seen more as a convenience factor than something that was widely needed.
This particular project was met with relative coolness by the audience. One person questioned the ethics of allowing this, mentioning that once you send a USPS letter, it cannot be revoked. Kevin correctly pointed out that there were situations where the letter could be retrieved, so this particular analogy did not hold. A reference implementation using MH and slocal was mentioned.
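The revocation rule described for this project can be sketched in a few lines. This is a hypothetical model, not the MH and slocal reference implementation: the `MailSpool` class and its method names are invented for illustration, and the sender's identity is simply taken on trust, which mirrors the weakness Kevin acknowledged.

```python
class MailSpool:
    """Toy recipient spool holding only unread messages (hypothetical)."""

    def __init__(self):
        self.unread = {}   # message id -> (sender, body)

    def deliver(self, msg_id, sender, body):
        self.unread[msg_id] = (sender, body)

    def read(self, msg_id):
        # Assumption from the talk: read messages leave the spool area.
        sender, body = self.unread.pop(msg_id)
        return body

    def revoke(self, msg_id, requester):
        """Remove a message only if it is still unread and the requester
        claims to be its sender. The claim is not authenticated here."""
        entry = self.unread.get(msg_id)
        if entry is None or entry[0] != requester:
            return False
        del self.unread[msg_id]
        return True


spool = MailSpool()
spool.deliver("m1", "alice", "meeting at 3")
spool.deliver("m2", "alice", "sent to the wrong list")
spool.read("m1")
results = [
    spool.revoke("m1", "alice"),   # already read, so it cannot be revoked
    spool.revoke("m2", "bob"),     # bob is not the sender
    spool.revoke("m2", "alice"),   # unread and requested by the sender
]
print(results)
```

Because the requester's name is accepted as given, anyone able to forge it could revoke mail; a deployed version would need some form of sender authentication.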
The RC5 algorithm is a block-cipher encryption model. This particular project pointed out several mathematical possibilities where faults may occur at various points in the algorithm. The information was presented rapidly due to the time constraints of the WIP program, and the heavy mathematical content was difficult to grasp. For more information on these situations, the reader is referred to the following URL: http://snafu.mit.edu/~fubob/rc5-dfa-paper.ps.
This is a very interesting proposition in which the existing data infrastructure (Ethernet) can be used to carry voice traffic as well as data traffic. The goal is to contain rising PBX costs by effectively doing away with the PBX completely. She is also going to look into doing video conferencing over the same lines. A lot of features are possible with this technology, such as decoupling the actual numbers from a particular station, the ability to easily forward and exchange information, message taking directly on the computers, etc. The current port works only on Windows and Macintosh workstations because they are using the CallPath software from IBM to do the majority of the implementation, with LDAP providing the directory services function. She expects to have this deployed and available within a year.
Patrick Powell worked extensively on making changes to the BSD LPR program, and this is the long-awaited next version. It has authentication and hooks for encryption methods: a general interface into which PGP, Kerberos, etc., can be plugged. Some of the other new features include:
Originally published in ;login: Vol. 22, No. 2, April 1997.