OPTIMISTIC LOOKUP OF WHOLE NFS PATHS IN A SINGLE OPERATION

                            Dan Duchamp
                        Columbia University

                             Abstract

VFS lookup code examines and translates path names one component at a
time, checking for special cases such as mount points and symlinks.
VFS calls the NFS lookup operation as necessary.  NFS employs caching
to reduce the number of lookup operations that go to the server.
However, when part or all of a path is not cached, NFS lookup
operations go back to the server. Although NFS's caching is effective,
component-by-component translation of an uncached path is inefficient,
enough so that lookup is typically the operation most commonly
processed by servers.  We study the effect of augmenting the VFS
lookup algorithm and the NFS protocol so that a client can ask a
server to translate an entire path in a single operation. The
preconditions for a successful request are usually but not always
satisfied, so the algorithm is optimistic.  This small change can
deliver substantial improvements in client latency and server load.


1      INTRODUCTION

The NFS lookup operation frequently goes "over the wire" from client
to server.  For example, on the main file servers of Columbia's
Computer Science department, lookups constitute approximately 31% of
all NFS operations serviced.  This makes lookup the most common
operation in our environment, followed closely by null and getattr,
and then by read.1  Similar results are typical at other installations;
lookup is the most common NFS operation to reach the server, or at
least one of the two or three most common.2

These results are obtained despite the presence of a cache (called
"directory-name lookup cache," or DNLC) that is quite effective in
mapping a (directory-vnode, name-within-directory) tuple to the vnode
for the name.  Measuring in the same environment, we found an average
DNLC hit rate of 75% for name lookups in NFS file systems.
Apparently, the NFS client side calls lookup so often that a quarter
of the calls (namely, the DNLC misses) are sufficient, by themselves,
to make lookup the most common operation at the server.

The seemingly high number of over-the-wire lookups has led us to
wonder if they are all necessary and whether some steps might be taken
to reduce their number.  The most obvious approach is to increase the
effectiveness of DNLC. We did this, in two ways:

   1.  The DNLC implementation of SunOS 4.1.3 will not cache a
       (directory-vnode, name-within-directory) tuple if the name is
       more than 15 characters long.  Preliminary measurements
       indicated that a non-negligible fraction (12%) of DNLC misses
       were caused by component names being longer than 15 characters.
       Accordingly, we increased the maximum name size to 31
       characters.

       This change reduced to zero the number of DNLC misses due to
       over-long component names.  However, the effect on the number
       of NFS lookups was negligible (a fraction of a percent).
       Investigation revealed that over-long names occurred
       predominantly in the local UNIX file system (UFS) rather
       than in remote NFS file systems.4  This finding is obviously
       site-dependent and workload-dependent, so it might still be
       worthwhile to raise the 15-character limit, though perhaps to
       some number smaller than 31.

   2.  In the typical configuration of SunOS 4.1.3, the size of DNLC
       is set according to the formula "17*MAXUSERS + 90."  MAXUSERS
       is set to 48, leading to 906 cache entries.  We doubled this
       number, with the result of increasing the hit rate for NFS
       lookups by about half a percent. 
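
The two tuning knobs above can be made concrete with a minimal sketch.
The function and constant names below are hypothetical and are not
taken from the SunOS 4.1.3 sources; only the formula and the 15 and 31
character limits come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constants mirroring the limits quoted in the text. */
enum { DNLC_OLD_NAME_MAX = 15, DNLC_NEW_NAME_MAX = 31 };

/* Number of DNLC entries given by the formula "17*MAXUSERS + 90". */
int dnlc_entries(int maxusers)
{
    assert(maxusers >= 0);
    return 17 * maxusers + 90;
}

/* Returns 1 if a component name is short enough to be cached. */
int dnlc_cacheable(const char *name, size_t name_max)
{
    assert(name != NULL);
    return strlen(name) <= name_max;
}
```

With MAXUSERS set to 48, dnlc_entries(48) gives the 906 entries cited
above, and doubling it gives 1812.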

The two changes together resulted in increasing the DNLC hit rate for
NFS lookups by less than one percent.  We conclude that most lookup
operations that go to the server are for pathnames that have not been
looked up before, or else were looked up in the "distant past."

So the simple approach of increasing DNLC size will, by itself, not
substantially reduce lookup traffic to the server.  This should not be
surprising, since DNLC has been available for many years, and its
performance has presumably been tuned with some care.  At least in our
environment, it seems that the size of DNLC has been set to beyond the
point of diminishing returns.

To substantially reduce lookup traffic to the server requires a more
efficient method for looking up "new" pathnames.  In the next two
sections we describe and evaluate such a method.


2      PATH  LOOKUP  ALGORITHM

Roughly speaking, the existing lookup algorithm used at the VFS level is:

dir = vnode for start of path;
for (;;) {
   component = next_component(path);
   if (component is "..") {
      /* ".." may not escape the process' root */
      if (goes beyond process' root)
         return error;
      /* step back "up" across any mount points */
      while (dir is a mount point) {
         dir = cross back over mount point;
         if (goes beyond process' root)
            return error;
      }
   }
   /* dispatch to the underlying file system (NFS, UFS, ...) */
   vnode = VOP_LOOKUP(dir, component);
   if (reached end of path)
      return vnode;
   /* step "down" across any mount points */
   while (vnode is mounted on)
      vnode = root of overlaid f/s;
   if (vnode is symlink)
      prepend symlink to remaining path;
   else
      dir = vnode;
}

The VOP_LOOKUP macro expands to call the lookup operation of the right
type of underlying file system (e.g., NFS, HSFS,5 etc.).  That
operation may use DNLC to reduce the number of lookup calls that go to
the server; for example, both NFS and UFS do this.
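
The next_component step of the loop above can be sketched in ordinary C.
This is an illustration only, not the SunOS code: the signature, the
buffer handling, and the helper path_length are assumptions made for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the next component of *cursor into buf and advance *cursor past
 * it.  Returns 1 if a component was produced, 0 at end of path, and -1
 * if the component does not fit in buf.  Runs of '/' count as one
 * separator.
 */
int next_component(const char **cursor, char *buf, size_t buflen)
{
    const char *p;
    size_t len;

    assert(cursor != NULL && *cursor != NULL && buf != NULL && buflen > 0);
    p = *cursor;
    while (*p == '/')
        p++;
    if (*p == '\0') {
        *cursor = p;
        return 0;
    }
    len = strcspn(p, "/");
    if (len >= buflen)
        return -1;
    memcpy(buf, p, len);
    buf[len] = '\0';
    *cursor = p + len;
    return 1;
}

/* Count the components of a path, or return -1 if one is too long. */
int path_length(const char *path)
{
    char buf[256];
    int n = 0, rc;

    while ((rc = next_component(&path, buf, sizeof buf)) == 1)
        n++;
    return rc < 0 ? -1 : n;
}
```

Each component produced this way costs one VOP_LOOKUP call in the
existing algorithm, so path_length also bounds the number of lookups
needed for a completely uncached path.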

This algorithm translates component names into vnodes one-by-one,
testing for three major special cases at each iteration:

   1.  the vnode is a symlink

   2.  the vnode is mounted-on

   3.  the component is ".."

These special cases form the main reason why lookup happens one
component at a time.  Symlinks are hardest to handle, since they are a
source of uncertainty. That is, a component cannot be known to be a
symlink until the server indicates that it is, and expansion of the
symlink can change the path arbitrarily.  In particular, the
unpredictability of the content of symlinks means that not all mount
points are evident in a pathname when lookup begins.  Crossing a mount
point is a major operation, as it potentially changes the server to
which lookup operations should be directed. Finally, references to the
parent directory (i.e., ".."  or "dot-dot") might also lead to
crossing a mount point (in the "up" direction, as opposed to the
"down" direction of the previous case).

An additional reason why the VFS lookup algorithm proceeds
component-by-component is that the NFS protocol was designed to
contain no pathname syntax, in order to keep operating system
dependent detail out of the protocol specification [7].  Since NFS
and UFS are the major file systems below
the VFS layer, VFS algorithms have been designed to cater to their
constraints.


2.1      OVERVIEW

The design of the VFS lookup algorithm is sensible, since every
component must be checked for the special cases.  However, the
component-by-component analysis of the path is the cause of the large
number of lookups that go to the server.  If there were no special
cases, then whole paths could be looked up in a single server
operation.

In fact, the special cases seldom arise.  Measuring in the same
environment mentioned earlier -- name lookups generated by a
multi-user workload applied to eight servers over several days -- we
found that 97.7% of paths resolved by the VFS lookup algorithm crossed
no mount points and 98.9% contained no symlinks.

Our work capitalizes on these facts.  We develop a "path lookup"
operation that can translate several components of a path. This
operation assumes that the path includes no special cases. After a
path-lookup, we apply some checks for the special cases. If any is
found, then further path-lookups may be necessary, and it is possible
that some of the work performed by the first path-lookup may have been
wasted.  Hence, path-lookup is an "optimistic" operation.  The number
of path-lookup operations and the extent to which some of them may
perform wasted work varies for each path.  However, for the
overwhelming majority of paths, a single path-lookup suffices to
translate the path into a vnode.  At worst, the path-lookup call will
translate only the first component; so the ordinary lookup operation
is the degenerate case of path-lookup.

Our approach is, first, to add a path-to-vnode cache at the VFS level
and, second, to augment NFS as necessary to look up whole paths
whenever the path cache misses.  Specifically, an additional
path-lookup call is added to the NFS protocol; this call accepts a
pathname which the server translates until the first symlink (if any)
is encountered.  The response contains three fields:

   1.  The longest symlink-free prefix of the path.  The prefix may be
       null.

   2.  The file handle for the prefix.

   3.  The untranslated suffix of the path, with the first symlink
       expanded and prepended.  The suffix will be null if the path
       contains no symlink.

Note that our additional VFS-level path cache is separate from and
logically above DNLC.  DNLC is used within individual file systems;
the path cache is used within the VFS lookup code only.  Also, the
results of path-lookup cannot be used to fill entries in DNLC, since
DNLC maps component to vnode, whereas the path cache maps path to
vnode.

The path-lookup call should be directed only to servers that are
capable of handling it.  The proper approach would be to alter the
MOUNT protocol so that, at mount time, the file server indicates if it
can handle path-lookup, and, if so, which types of pathname syntax it
understands.  The client would then store this information in the
struct vfs for that mount.  However, to reduce the number of required
protocol changes, our code assumes that every mounted file system
understands the call, and tries it.  If a "bad operation" RPC error
occurs, or if the RPC succeeds but the server indicates that it cannot
handle the syntax of the pathname, the server's inability is recorded
in the struct vfs.  Besides avoiding a change to the MOUNT protocol,
this approach has the advantage of slightly easing incremental
deployment.  A disadvantage is that a server's limits are repeatedly
re-discovered (once per mount), and automounters -- which are
increasingly common -- tend to enormously increase the number of times
that a file system is (un)mounted.
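
The try-it-and-remember policy can be sketched as follows.  Everything
named here is hypothetical: struct toy_vfs stands in for the real
struct vfs, the status codes are invented for the example, and the RPC
is replaced by a function pointer so the sketch is self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcomes of one path-lookup RPC attempt. */
enum pl_status { PL_OK, PL_BAD_OPERATION, PL_BAD_SYNTAX, PL_OTHER_ERROR };

/* Toy stand-in for the per-mount record (the real struct vfs is larger). */
struct toy_vfs {
    int no_pathlookup;   /* set once the server proves unable */
    int rpc_attempts;    /* path-lookup RPCs sent for this mount */
};

/* Hypothetical RPC stub type: sends one path-lookup to the server. */
typedef enum pl_status (*pl_rpc_fn)(const char *path);

/*
 * Returns 1 if path-lookup translated the path, 0 if the caller should
 * fall back to the regular component-by-component algorithm.
 */
int try_path_lookup(struct toy_vfs *vfs, pl_rpc_fn rpc, const char *path)
{
    enum pl_status st;

    assert(vfs != NULL && rpc != NULL && path != NULL);
    if (vfs->no_pathlookup)
        return 0;                   /* inability already recorded */
    vfs->rpc_attempts++;
    st = rpc(path);
    if (st == PL_BAD_OPERATION || st == PL_BAD_SYNTAX)
        vfs->no_pathlookup = 1;     /* remembered for the life of the mount */
    return st == PL_OK;
}

/* Two fake servers for experimentation. */
enum pl_status old_server(const char *path)
{
    (void)path;
    return PL_BAD_OPERATION;
}

enum pl_status new_server(const char *path)
{
    (void)path;
    return PL_OK;
}
```

Because the flag lives in the per-mount record, it is lost at unmount,
which is why an automounter causes the server's limits to be
re-discovered repeatedly.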


2.2      DETAILS

This section explains how path lookup adjusts to the three special
cases: symlinks, mount points, and dot-dot.


2.2.1       SYMLINKS

Any component of a path may be a symlink, and symlinks may expand to
anything. Therefore, the servers and directories visited while
translating a path are not predictable simply by inspecting the
initial path.

For an example, consider the path "./x/y/z" illustrated in Figure 1.
If x were a mount point, then y/z should be resolved in a different
file system than it would be if x were not a mount point.  The client
could detect if x were a mount point, since the client knows its mount
points.  However, x could also be a symlink that would expand to w,
which in turn may or may not be a mount point, leading to the same
predicament.

The catch-22 is that the client cannot know which server to contact
until it knows whether the path "is what it seems to be" and the
client cannot know that a path is what it seems to be until its
components have been looked up at the server.

To break the cycle, we optimistically assume that the path contains no
special cases.  Referring to the example above, the client would look
up the path x/y/z on the server for directory "." (which is
necessarily the right server for the lookup of x).  The server
responds with a partition of x/y/z into a symlink-free prefix and a
suffix that begins with the expansion of the first symlink, if any
exists.

The reason that the path-lookup operation translates only to the first
symlink and not to the end of the path is that the optimistic
assumption may be false.  If the path does contain a special case, the
server is probably wasting some effort translating a path that is
different from the one that should be translated.  In such cases, the
client must have enough information to survive the false assumption.
We chose to have the server return from the path-lookup as soon as it
encounters information that might signal that the optimistic
assumption is false.  Note that the server cannot return when the path
crosses a mount point because the mount points that are relevant are
those on the client and, practically speaking, the server cannot know
the paths of the client's mount points.  So the server is doing all it
can by returning when it encounters a symlink.  Having the server
return on every symlink has essentially no effect on performance
(because of the rarity of special cases) and somewhat simplifies the
client (since the client need retain information for and check for
only two of the three special cases).


2.2.2       MOUNT  POINTS

After the path-lookup returns, the client will examine the
symlink-free prefix for mount points.  If no mount point is found,
then the prefix was translated on the correct server.  So the
algorithm repeats by sending the suffix, if any, to the same server.
If the path of some mount point is contained in the prefix, then the
path lookup may have been directed to the wrong server, so the portion
of the path (prefix and suffix) below the first mount point is sent to
the server for the mounted file system (assuming that it understands
path-lookup).

The reason for the mount-point check is that a server that looks up a
path does so with respect to its name space; however, the semantics of
file name translation demand that a path be translated with respect to
the name space of the client.  Consider the example in Figure 2.
During the translation of /usr/local/gnu/bin/emacs, the name
gnu/bin/emacs is translated by Server A because the client has mounted
that server's file system on its name /usr/local.  However, the client
has also mounted Server B's file system on the name /usr/local/gnu.
Therefore, the correct translation is that of bin/emacs with respect
to Server B's file system, rather than gnu/bin/emacs with respect to
Server A's file system.  So the translation provided by Server A may
be wrong.
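
The check in this example can be sketched with a plain linear scan over
the client's mount points.  This is a simplified illustration, not the
implementation described in this paper: the function names are
hypothetical, and paths are assumed to be absolute, normalized, and
free of dot-dot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if dir is a prefix of path that ends on a component boundary. */
int is_path_prefix(const char *dir, const char *path)
{
    size_t n;

    assert(dir != NULL && path != NULL);
    n = strlen(dir);
    if (strncmp(dir, path, n) != 0)
        return 0;
    if (n > 0 && dir[n - 1] == '/')
        return 1;               /* dir is "/" (or ends in '/') */
    return path[n] == '/' || path[n] == '\0';
}

/*
 * Among the client's mount points, find the shallowest one that lies
 * strictly below start_dir and on the way to path.  Returns its index
 * in mounts[], or -1 if the path stays in the starting file system.
 */
int first_mount_crossed(const char *start_dir, const char *path,
                        const char *const mounts[], size_t nmounts)
{
    int best = -1;
    size_t i, start_len, best_len = 0;

    assert(start_dir != NULL && path != NULL && mounts != NULL);
    start_len = strlen(start_dir);
    for (i = 0; i < nmounts; i++) {
        size_t len = strlen(mounts[i]);

        if (len <= start_len || !is_path_prefix(start_dir, mounts[i]))
            continue;           /* not below the starting point */
        if (!is_path_prefix(mounts[i], path))
            continue;           /* not on the way to path */
        if (best < 0 || len < best_len) {
            best = (int)i;
            best_len = len;
        }
    }
    return best;
}

/* The part of path below a mount point, without the leading '/'. */
const char *below_mount(const char *mount, const char *path)
{
    const char *rest;

    assert(is_path_prefix(mount, path));
    rest = path + strlen(mount);
    return *rest == '/' ? rest + 1 : rest;
}
```

For the example above, starting in /usr/local with mounts on
/usr/local and /usr/local/gnu, the scan reports the /usr/local/gnu
mount, and below_mount yields bin/emacs as the name to send to
Server B.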

In order to have enough information to check for mount points, the
client accumulates the symlink-free prefixes returned from all
path-lookup calls.  After each call returns, the current accumulated
symlink-free prefix is compared against all mount points.  In order to
provide fast search through all mount points, we added a trie index
that points to all NFS mount points.  The trie stores absolute path
names, as shown in Figure 3.  However, the pathnames generated by a
process are resolved relative to either its current root (curroot) or
current working directory (cwd).  This twist presents no problem: the
vnodes for cwd and curroot are available, and each contains a pointer
to its struct vfs; by definition, these structures are represented in
the trie by their complete, absolute pathnames.  Therefore, a pathname
lies in a file system different from the one housing the starting
point iff:

   1.  There is another mount point farther down the branch of the
       trie housing the struct vfs of the starting point.

   2.  The pathname is not embedded in the trie between the struct
       vfs of the starting point and the next struct  vfs.
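
A component-wise trie of mount points might look like the sketch
below.  It is an assumption-laden illustration rather than our kernel
code: nodes come from a small fixed pool, all names are hypothetical,
and the two conditions above are approximated by counting the mount
points passed on the way to a path.  Under the assumption that the
pathname lies below the starting point, the pathname has left the
starting file system when its count exceeds the starting point's count.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TRIE_MAX_NODES = 64, TRIE_NAME_MAX = 32 };

struct trie_node {
    char name[TRIE_NAME_MAX];   /* one path component */
    int is_mount;               /* nonzero if a file system is mounted here */
    struct trie_node *child;    /* first child */
    struct trie_node *sibling;  /* next node with the same parent */
};

struct trie {
    struct trie_node pool[TRIE_MAX_NODES];  /* pool[0] is the root, "/" */
    size_t used;
};

void trie_init(struct trie *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof *t);
    t->used = 1;
}

/* Find the child of parent named name[0..len); optionally create it. */
static struct trie_node *trie_child(struct trie *t, struct trie_node *parent,
                                    const char *name, size_t len, int create)
{
    struct trie_node *n;

    for (n = parent->child; n != NULL; n = n->sibling)
        if (strlen(n->name) == len && memcmp(n->name, name, len) == 0)
            return n;
    if (!create || t->used == TRIE_MAX_NODES || len >= TRIE_NAME_MAX)
        return NULL;
    n = &t->pool[t->used++];
    memcpy(n->name, name, len);     /* pool was zeroed, so name is terminated */
    n->sibling = parent->child;
    parent->child = n;
    return n;
}

/* Record an absolute mount point path.  Returns 0, or -1 if out of room. */
int trie_add_mount(struct trie *t, const char *path)
{
    struct trie_node *n;

    assert(t != NULL && path != NULL && path[0] == '/');
    n = &t->pool[0];
    for (;;) {
        size_t len;

        path += strspn(path, "/");
        len = strcspn(path, "/");
        if (len == 0)
            break;
        n = trie_child(t, n, path, len, 1);
        if (n == NULL)
            return -1;
        path += len;
    }
    n->is_mount = 1;
    return 0;
}

/* Count the mount points that are prefixes of an absolute path. */
int trie_mounts_on_path(struct trie *t, const char *path)
{
    struct trie_node *n;
    int count;

    assert(t != NULL && path != NULL && path[0] == '/');
    n = &t->pool[0];
    count = n->is_mount;
    for (;;) {
        size_t len;

        path += strspn(path, "/");
        len = strcspn(path, "/");
        if (len == 0)
            break;
        n = trie_child(t, n, path, len, 0);
        if (n == NULL)
            break;              /* left the trie: no deeper mount points */
        count += n->is_mount;
        path += len;
    }
    return count;
}
```

The walk stops as soon as the path leaves the trie, so its cost is
bounded by the depth of the deepest mount point rather than by the
number of mount points.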

Our implementation platform, SunOS 4.1.3, keeps the path of all its
mount points only in the file /etc/mtab.  For three reasons, we made
changes so that the name of a mount point is also kept in the
associated struct vfs.  First, for performance: the pathnames of mount
points have to be accessed on every path lookup.  Second, to avoid
race conditions: after initiating an I/O to access /etc/mtab the
kernel would continue; the kernel's next operation might be another
that accesses or manipulates /etc/mtab.  Finally, as part of earlier
work [10], we had already written some code to store mount point names
in struct vfs.


2.2.3       DOT-DOT

Dot-dot must be handled with care similar to that for mount points.
The reason is the same: the server will interpret dot-dot with respect
to its name space, whereas the required semantics are with respect to
the client's name space.  Usually the two interpretations are the
same.  The only exception is if dot-dots in the path result in going
above the root of the remote file system.6 For example, suppose
/usr/local is an exported file system; then the path "/usr/local/.."
refers to /usr on the client, not the server.

Unfortunately, we thought of no simple and clean check for and
adjustment to the possibility of backing up over the root of the
containing file system.  The rub is that it is messy for the client to
remember the path of every starting point (i.e., process cwd or
curroot).  So before each call to path-lookup, the client checks if
the path goes above the starting point (i.e., cwd or curroot) of the
translation.  If so, the path lookup is aborted and the regular VFS
lookup algorithm is used.  This conservative approach means that any
path that begins with dot-dot is not processed using path-lookup.
Similarly, after the final path-lookup operation responds, the client
checks the symlink-free prefix; if the prefix would back up over the
starting point, then the path lookup algorithm is aborted.
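
A minimal sketch of the conservative dot-dot check follows.  The
function name is hypothetical, and the check is purely lexical: it is
meaningful only for a symlink-free path, which is why the text applies
it again to the symlink-free prefix after the final path-lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if a relative path, taken component by component, ever
 * climbs above the directory it starts from; 0 otherwise.  "." and
 * empty components leave the depth unchanged.
 */
int climbs_above_start(const char *path)
{
    int depth = 0;

    assert(path != NULL);
    for (;;) {
        size_t len;

        path += strspn(path, "/");
        len = strcspn(path, "/");
        if (len == 0)
            break;
        if (len == 2 && path[0] == '.' && path[1] == '.') {
            if (--depth < 0)
                return 1;       /* backed up over the starting point */
        } else if (!(len == 1 && path[0] == '.')) {
            depth++;
        }
        path += len;
    }
    return 0;
}
```

A path such as x/../y passes the check, while any path that begins
with dot-dot fails it and is handed to the regular VFS lookup
algorithm.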


2.2.4       PATH  CACHE

The path cache is referenced from within the new VFS lookup algorithm
and from within the NFS code for validating caches.  Most of the code
for the path cache was copied and adapted from DNLC.

Since NFS provides no means for a server to call back to a client, the
path cache can contain outdated information.7 Stale cache entries are
removed by NFS's normal timeout-driven checking of vnode attributes
and, since the cache is managed LRU, by aging the oldest entry during
an insert operation.  All these traits are shared with DNLC.

One difference between the path cache and DNLC is that explicit
deletion (such as when a file is deleted or renamed) is handled
slightly differently.  The delete and rename operations delete from
the path cache by vnode since a vnode is a unique ID and since it
would be difficult to construct a path for the target.  DNLC entries
can be deleted by either vnode or component name.
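
The cache behavior described in this section can be sketched as a tiny
fixed-size table.  This is an illustration under stated assumptions,
not the code adapted from DNLC: the names are hypothetical, an integer
stands in for a vnode pointer, and a linear scan replaces hashing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { PC_SLOTS = 4, PC_PATH_MAX = 128 };

struct pc_entry {
    char path[PC_PATH_MAX];
    int vnode_id;               /* stand-in for a struct vnode pointer */
    unsigned long last_used;    /* 0 means the slot is free */
};

struct path_cache {
    struct pc_entry slot[PC_SLOTS];
    unsigned long clock;
};

void pc_init(struct path_cache *pc)
{
    assert(pc != NULL);
    memset(pc, 0, sizeof *pc);
}

/* Returns the cached vnode id for path, or -1 on a miss. */
int pc_lookup(struct path_cache *pc, const char *path)
{
    size_t i;

    assert(pc != NULL && path != NULL);
    for (i = 0; i < PC_SLOTS; i++)
        if (pc->slot[i].last_used != 0 &&
            strcmp(pc->slot[i].path, path) == 0) {
            pc->slot[i].last_used = ++pc->clock;
            return pc->slot[i].vnode_id;
        }
    return -1;
}

/* Insert a mapping, reusing a free slot or evicting the LRU entry. */
void pc_insert(struct path_cache *pc, const char *path, int vnode_id)
{
    size_t i, victim = 0;

    assert(pc != NULL && path != NULL && strlen(path) < PC_PATH_MAX);
    for (i = 1; i < PC_SLOTS; i++)      /* free slots have last_used 0 */
        if (pc->slot[i].last_used < pc->slot[victim].last_used)
            victim = i;
    for (i = 0; i < PC_SLOTS; i++)      /* prefer an existing entry */
        if (pc->slot[i].last_used != 0 &&
            strcmp(pc->slot[i].path, path) == 0)
            victim = i;
    strcpy(pc->slot[victim].path, path);
    pc->slot[victim].vnode_id = vnode_id;
    pc->slot[victim].last_used = ++pc->clock;
}

/* Explicit deletion by vnode, as for delete and rename. */
int pc_purge_vnode(struct path_cache *pc, int vnode_id)
{
    size_t i;
    int purged = 0;

    assert(pc != NULL);
    for (i = 0; i < PC_SLOTS; i++)
        if (pc->slot[i].last_used != 0 &&
            pc->slot[i].vnode_id == vnode_id) {
            pc->slot[i].last_used = 0;
            purged++;
        }
    return purged;
}
```

Purging by vnode removes every path that maps to the target in one
pass, without the cache having to reconstruct any of those paths.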


2.2.5       PROTOCOL  CHANGE

The definition of the new call added to version 2 of the NFS protocol
is:

struct pathlookupargs {
        nfspath pathname;
        int syntax;
};

struct pathlookupokres {
        nfs_fh file;
        fattr attributes;
        nfspath prefix;
        nfspath suffix;
};

union pathlookupres switch (nfsstat status) {
case NFS_OK:
        pathlookupokres pathlookupres;
default:
        void;
};

pathlookupres
NFSPROC_PATH_LOOKUP(pathlookupargs) = 18;

A new "unintelligible syntax" error code was necessary.  However,
since, at the server, path lookup is simply an iterative application
of the regular lookup operation, all the same access constraints
apply.


2.2.6       SERVER  CHANGE

To implement the path-lookup operation on the server side, we stole
code from other NFS operations, especially lookup.  Essentially, the
path-lookup implementation is that of lookup with two main additions:

   1.  Instead of translating a single component, the code iterates
       over components until it reaches the end, a symlink, or an error.

   2.  When a symlink is encountered, it is read (by calling the
       server-side operation to read a symlink) and prepended to the
       remaining untranslated path.
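
The server-side iteration can be sketched against a toy name space.
This is not the code in nfs_server.c: the flat table, the status
values, and all names are assumptions made for the example, and file
handles and attributes are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum { PL_MAX = 256 };

/* Toy server name space: a flat table keyed by path from the export root. */
struct toy_node {
    const char *path;
    const char *link_target;    /* NULL unless the node is a symlink */
};

struct pl_result {
    int status;                 /* 0 = OK, -1 = no such file, -2 = too long */
    char prefix[PL_MAX];        /* longest symlink-free prefix */
    char suffix[PL_MAX];        /* expanded symlink plus untranslated rest */
};

static const struct toy_node *toy_find(const struct toy_node *tab, size_t n,
                                       const char *path)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(tab[i].path, path) == 0)
            return &tab[i];
    return NULL;
}

/* Translate components until the end, the first symlink, or an error. */
struct pl_result toy_path_lookup(const struct toy_node *tab, size_t n,
                                 const char *path)
{
    struct pl_result r;
    char cand[PL_MAX];

    assert(tab != NULL && path != NULL);
    memset(&r, 0, sizeof r);
    for (;;) {
        const struct toy_node *node;
        size_t len;
        int w;

        path += strspn(path, "/");
        len = strcspn(path, "/");
        if (len == 0)
            return r;           /* end of path: suffix stays empty */
        w = snprintf(cand, sizeof cand, "%s%s%.*s",
                     r.prefix, r.prefix[0] ? "/" : "", (int)len, path);
        if (w < 0 || (size_t)w >= sizeof cand) {
            r.status = -2;
            return r;
        }
        node = toy_find(tab, n, cand);
        if (node == NULL) {
            r.status = -1;
            return r;
        }
        path += len;
        if (node->link_target != NULL) {    /* stop at the first symlink */
            w = snprintf(r.suffix, sizeof r.suffix, "%s%s",
                         node->link_target, path);
            if (w < 0 || (size_t)w >= sizeof r.suffix)
                r.status = -2;
            return r;
        }
        strcpy(r.prefix, cand);             /* extend the symlink-free prefix */
    }
}
```

If x/y is a symlink whose content is w, a request for x/y/z returns the
prefix x and the suffix w/z, leaving the client to check x for mount
points before it sends w/z onward.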


3      EVALUATION

This algorithm is implemented in SunOS 4.1.3 and is part of the
operating system regularly booted on nine SparcStations.  Fewer than a
thousand lines of code were added.  On the server side, only a small
addition was made to the module that implements NFS operations
(nfs_server.c).  Most changes were on the client side: vfs_lookup.c
received major changes, and a few other modules were modified to
store the path name of mount points in the struct vfs.

Before implementing, we studied the distribution of lengths of
pathnames.  The longer the pathname given to the path lookup
algorithm, the greater the upside potential.  Measuring in the same
environment as noted earlier -- eight major multi-user departmental
file servers -- we found considerable variation in path length
distribution from machine to machine and time to time.  However, on
average, paths of length 4 were most common, with paths of length 3
the next most common.  Paths of length 3 or 4 together accounted for
over 70%
of lookups.  Of the remainder, most were shorter.

To measure the effect of our changes, we used a kernel build as a
benchmark; all major kernel sources, libraries, and include files were
on a remote file system.  Within a short period of time, a kernel
build opens a large number of files in a relatively small number of
directories.  Kernel builds provide a friendly test for path lookup:
the average path length is somewhat longer (4.6) than the more
comprehensive number noted above. Because of background activity, the
results varied a little, but on average the build ran 8% faster with
path lookup in effect.  Eight percent is a substantial speedup
considering that it is the effect of changing only the lookup
operation.

Measuring the effect path lookup has on the server is at once easier
and harder than measuring client latency.  Measuring the number of
server operations is easy, but each path lookup operation can be
expected to
perform more work than an ordinary NFS lookup.  We used nfsstat to
measure the number of operations serviced and vmstat to record
processor idle time and I/O operations.  For a set of kernel build
benchmarks, the number of NFS operations declined 20% and processor
idle time averaged 16% higher with path lookup in effect.  Apparently,
processor overhead for handling NFS requests is substantial.


3.1      VIOLATION  OF  NFS  DESIGN  PRINCIPLE

As noted earlier, the addition of path-lookup violates the
longstanding design decision to keep the NFS protocol free of path
syntax.

One possible rejoinder is that, while it is true that it is desirable
to keep operating system specifics out of the NFS protocol, this
design decision was made several years ago; since then, NFS, though
widely ported, has received almost all its use on DOS and UNIX
platforms.  We question whether the substantial negative impact on
performance caused by component-by-component lookup is acceptable
considering that the abstraction offered by omitting pathname syntax
"abstracts" over effectively only two implementations.

Another, possibly better, rejoinder would be to re-design our protocol
change so that it accepted and returned not paths, but rather vectors
of opaque components.  It would still be necessary to exchange an
indication of how to interpret the components, but at least the
letter, if not the spirit, of the original design principle would be
preserved. We have not yet made this change to the protocol.


4      RELATED  WORK

As noted in the introduction, lookup, getattr, and null comprise the
vast majority (over 80%) of NFS operations handled by our main
servers.  The high number of null operations is attributable to our
use of the Amd automounter [5]; Amd periodically "pings" every mounted
file system, and NFS null is the ping operation.  While the high
number of null operations can thus be dismissed as site-specific, the
dominance of getattr and lookup is typical for most NFS installations
-- as the operation mix in nhfsstone indicates.  So naturally there
has been interest in sharply reducing the frequency of these
operations.

Most such interest has focused on getattr, although in most
experiments the motivation has not been simply to reduce the frequency
of getattr but rather to improve the consistency guarantee provided to
the client by an NFS server.  One early experiment is "Spritely NFS"
[6], in which a callback scheme similar to that in Sprite [4] was
added to NFS with the intention of providing strict cache consistency
and improving performance by eliminating the overhead of refreshing
the attribute cache with getattr. The ideas in Spritely NFS were used
in modified form in "NQNFS" [3] which is a second protocol available
from the NFS implementation of 4.4BSD. NQNFS differs from Spritely NFS
in that the latter requires the server to keep state indicating the
cache status of files at clients; however, NQNFS borrows the "lease"
idea from Gray and Cheriton [2] in order to avoid the need for servers
to keep state across failures.  In NQNFS, a client is allowed to cache
a file for a specified period of time.  The only recovery action a
server must take is to wait until such time as all of its leases must
have expired.  Neither of these systems addresses the issue of
reducing lookups.

More recently, the specification for version 3 of NFS [8] included,
among its many changes, the requirement to return attributes as a side
effect of every appropriate NFS operation.  The intention is to reduce
the number of separate getattr operations that must be invoked in
order to verify attribute cache consistency.

Cache consistency and NFS version 3 are a bit far afield, but we are
not aware of any attempts -- besides DNLC -- to reduce the cost and/or
number of NFS lookup operations.  However, the notion of looking up
and caching whole or partial paths has been proposed before for new
file system designs [9, 1].

In 1986 Welch and Ousterhout described "prefix tables" [9].  Prefix
tables are useful in an environment where a shared global hierarchy of
files is partitioned into "domains," which are spread across servers.
Each client maintains a prefix table that maps file name prefixes to
the servers on which the associated domains reside.  Prefix table
entries are hints: if a file is not where a table says it is, shorter
prefixes are tried until the file is found.  If a client has no
prefixes at all for a file (as will be the case initially), it
broadcasts the file name to all servers.  Relevant prefix/server
mappings are returned by all servers that have such mappings.  In this
way, prefix table information can be easily propagated without the
requirement that any two clients have precisely the same table.  And
because prefix tables contain only hints that need not be correct,
this method avoids creating either an availability or a consistency
problem.

The prefix table idea is quite similar to our work; however, there is
one major difference between the model of file system use in NFS and
that in the prefix table proposal.  Welch and Ousterhout describe a
construct called a "remote link," which is apparently a replacement
for the idea of client mounts. Distinct domains are stitched together
with remote links, which are server-side mounts; that is, the client
has no control over how to overlay domains on top of one another --
the information is encoded in remote links in the file system, and all
clients see the same arrangement of domains into a hierarchy.
Implementing static mounts on the server side is a significant
simplification for whole-path translation (and a significant loss of
flexibility for the client).  In our design, the server returns after
encountering a symlink and the client must check the symlink-free path
for mount points -- both of these features exist only because the
server cannot know the client's mount points.  In Welch and
Ousterhout's system, unlike in NFS, there is no complication with
having the server expand a symlink and continue translating it without
contacting the client.  In their work, the only time a server returns
a pathname to the client not completely translated is when some
component of the path crosses a domain boundary: either dot-dot in the
upward direction or a remote link in the downward direction.  In these
cases, the client must check its prefix table to learn which server to
send the remainder of the path to.  In summary, prefix tables are an
elegant idea but one targeted for a significantly different and easier
model of file system definition and use.

In [1], Cheriton and Mann describe a naming system that is scalable
enough to encompass the world and general enough to name many types of
objects (not just files -- processes, windows, network connections,
etc.).  Since the features that draw most of their design attention are
those that permit scaling to enormous size, it is hard to make a
meaningful comparison between our work and theirs.  Their system has
the notion of looking up and caching whole or partial pathnames.
However, they reject the notion of client mounts on the grounds that
such client-specific name space management operations do not scale
well and stand in the way of forming a consistent global name space.
The notion of symbolic links seems absent from their design,
presumably on similar grounds.


5      SUMMARY

Measurements of NFS pathnames and lookup performance yield several
pronounced facts:

    o  The hit rate of DNLC is not easily improved; nevertheless,
       enough lookup operations miss in DNLC that lookup is the
       operation that most commonly goes over the wire to the
       server.

    o  Average path length is long enough that translating a
       completely uncached path will often require as many as 3 or 4
       lookup operations.

    o  Paths given to the VFS lookup algorithm almost never contain
       symlinks or cross mount points, and so can typically be
       translated at the server in a single operation. 

Given these facts, one may question whether the elegance and relative
simplicity of the VFS lookup algorithm -- which translates uncached
pathnames one component at a time -- is sufficient compensation for
its high overhead.

Indeed, in our kernel build benchmarks the number of NFS operations
reaching the server declined 20%, and server processor idle time
averaged 16% higher, when path lookup was used instead of component
lookup.  Also, path lookup can have a noticeable effect on client
latency for workloads that are open-intensive.

Path lookup is not hard to implement, the only tricky aspect being
that a translated path must be examined even after a "successful"
translation in order to ensure that the middle of the translated path
did not cross a mount point.  If it did, then the pathname looked up
at the server is not the path that should have been looked up -- the
portion of the pathname below the mount point has different meaning on
the two different servers.


6      ACKNOWLEDGEMENTS

Andreas Prodromidis helped gather many of the reported statistics.
Margo Seltzer suggested making the arguments/results of the
path-lookup call be vectors of opaque components.

This work was supported in part by ONR grant number N00014-93-1-0315,
and by National Science Foundation CISE Institutional Infrastructure
grant number CDA-90-24735.


REFERENCES

 [1]  D. R. Cheriton and T. P. Mann.
      Decentralizing a Global Naming Service for Improved Performance
       and Fault Tolerance.
      ACM Trans. Computer Systems, 7(2):147-183, May 1989.

 [2]  C. G. Gray and D. R. Cheriton.
      Leases:  An Efficient Fault-Tolerant Mechanism for Distributed
       File Cache Consistency. 
      In Proc. Twelfth ACM Symp. Operating Systems Principles, pages
       202-210, December 1989.

 [3]  R. Macklem.
      Not Quite NFS, Soft Cache Consistency for NFS.
      In Proc. 1994 Winter USENIX, pages 261-278, January 1994.

 [4]  M. N. Nelson, B. B. Welch, and J. K. Ousterhout.
      Caching in the Sprite Network File System.
      ACM Trans. Computer Systems, 6(1):134-154, February 1988.

 [5]  J. Pendry and N. Williams.
      Amd - The 4.4 BSD Automounter.
      Imperial College of Science, Technology, and Medicine, London,
       5.3 alpha edition, March 1991.

 [6]  V. Srinivasan and J. C. Mogul.
      Spritely NFS: Implementation and Performance of
       Cache-Consistency Protocols.
      Research report 89/5, DEC Western Research Lab, May 1989.

 [7]  Sun Microsystems, Inc.
      NFS: Network File System Protocol Specification.
      RFC 1094, IETF Network Working Group, March 1989.

 [8]  Sun Microsystems, Inc.
      NFS: Network File System Version 3 Protocol Specification.
      June 25, 1993.
      Available from gatekeeper.dec.com:/pub/standards/nfs/nfsv3.ps.Z

 [9]  B. Welch and J. Ousterhout.
      Prefix Tables: A Simple Mechanism for Locating Files in a
       Distributed System.
      In Sixth Intl. Conf. Distributed Computing Systems, pages
       184-189, May 1986.

[10]  E. Zadok and D. Duchamp.
      Discovery and Hot Replacement of Replicated Read-Only File
       Systems, with Application to Mobile Computing.
      In Proc. 1993 Summer USENIX, pages 69-85, June 1993.


Dan Duchamp is an Associate Professor of Computer Science at Columbia
University.  His current research interest is the set of issues in
mobile computing.  For his initial efforts in this area, he has been
named an Office of Naval Research Young Investigator.

Mail address: Computer Science Department, Columbia University, 500
West 120th Street, New York, NY 10027.  Email address:
djd@cs.columbia.edu.


___________________________________________________

1. This data was gathered by the nfsstat utility on eight file
servers, all running SunOS version 4.1.3 and NFS version 2.  The total
number of NFS operations, including null, was nearly 4 million.  The
frequency of the other common operations was 27.5%, 22.7%, and 4.8%
for null, getattr, and read, respectively. Across the eight servers
there was considerable variance among the relative frequencies, but,
in every case, lookup, getattr, and null were by far the most common
operations.

2. For example, the most common operations in the nhfsstone benchmark
are, in order: lookup (34%), read (22%), write (15%), and getattr
(13%).

3. Following convention, we call the argument to VFS lookup a path, or
pathname.  A path consists of a sequence of components.  The process
of mapping a path or a component to a vnode we call translation or
resolution.

4. The DNLC module is defined at the VFS level, and is callable by any
underlying file system, such as NFS or UFS.

5. The High Sierra file system, for CD-ROM.

6. The NFS server checks for and prevents this case on every lookup.

7. The same is true for DNLC and, indeed, for a client-side cache of
any type of information about remote NFS files.