Automated Upgrades in a Lab Environment

     Paul Riddle - University of Maryland, Baltimore County

                            ABSTRACT

     Back in the late 80s and early 90s, when disk drives were
expensive, it was more economical to buy one server and configure
it with enough disk space to support several "diskless"
workstations.  Now that disks are cheaper, most workstations come
with internal disks that contain an entire bootable operating
system.

     Most vendors provide ways of automatically upgrading
multiple "diskless" workstations; unfortunately, the same is not
true for "diskfull" configurations.  Upgrading "diskfull"
workstations typically involves either a lot of manpower or a lot
of tedious, repetitive work.  In any network of moderate or large
size, something needs to be done to automate the upgrade process.

     This paper describes a scheme which we use to upgrade our
various networks of Silicon Graphics workstations.
Interestingly, it relies on the same technology that allows
"diskless" workstations to boot over the network.

                          Introduction

     Our upgrade scheme works using diskless booting.  Each
workstation boots over the network from another workstation,
which we designate as an "upgrade server."  Once booted, the
workstation runs an upgrade script (written in Perl [1]) which
partitions its system disk, creates filesystems, installs an
operating system distribution, and then installs customized
system files.  When finished, the workstation reboots from its
system disk.  This scheme allows for unattended system upgrades
and has proven to be quite flexible; we have used it to upgrade
two separate networks of SGI Indigos from Irix 4.0.5 to Irix 5.2.

          What We Were Looking For In An Upgrade Scheme

Automation

     The upgrade procedure should not require that we physically
visit every workstation.  This is a problem in our environment
where many workstations are located in private offices to which
we don't have easy access.  Visiting each machine also requires a
lot of manpower and can be error-prone; operator errors can lead
to machines being upgraded incompletely, improperly, or not at
all.

Flexibility

     An upgrade scheme should be able to deal gracefully with
different sized system disks, different models of a vendor's
workstations, etc.  It should be able to repartition and create
filesystems on the machine's local disk, if necessary.

Reliability

     The upgrade procedure should be reliable.  It should never
leave a machine in a partially-upgraded state.  If an upgrade is
interrupted or otherwise fails, it should pick up where it left
off, or start over the next time the machine is rebooted.  It
should have some way of notifying the system administrator when
an upgrade fails or completes successfully.

Speed and Convenience

     Upgrading should be reasonably fast and should not require a
lot of downtime.  Alternatively, it should be automated to the
point where it can be done overnight, when there is less demand
for workstations and network bandwidth.

                         Our Environment

     The University of Maryland, Baltimore County is one of the
largest educational installations of Silicon Graphics (SGI)
equipment in the country.  There are approximately 200 SGI
workstations on campus, spread out over about 8 different
administrative domains and 10 subnets.  The abundance of SGIs
required us to come up with some way of keeping them up-to-date
with the latest release of Irix (SGI's flavor of UNIX).  We chose
two different workstation networks to use as "guinea pigs" for
testing our upgrade scheme.

     For one upgrade environment, we used three student labs
consisting of a total of about 90 SGI Indigos, some with entry
level (RPC) graphics, and others with extended (XS24 or Elan)
graphics.  Each workstation has a 420-megabyte internal system
disk.  The workstations are spread over two different subnets.

     A second upgrade environment consisted of about 30 SGI
Indigos, mainly with low-end graphics, and 10 SGI Indy systems.
Some of these machines have 420-megabyte system disks and others
have 1-gigabyte system disks.  All are on the same subnet.

     For both environments, the task was to upgrade from some
revision of Irix 4.0.5 (4.0.5F in some cases and 4.0.5H in
others) to Irix 5.2.

                  Alternatives To Our Approach

     We evaluated several other methods of upgrading before
choosing to implement one based on diskless booting.  Each of
these has its advantages, but fails to meet our requirements in
one or more ways.

Upgrading Systems Individually

     The most obvious and straightforward upgrade strategy is
simply to upgrade systems manually, one at a time.  We discarded
this idea quickly because it was too time-consuming.  It also
requires physically visiting each workstation.  Additionally,
manually upgrading a workstation is a tedious process which
involves many steps.  When many workstations are upgraded in this
way, it can lead to subtle differences and inconsistencies
between systems.

Manual Disk "Cloning"

     A faster method is to manually upgrade one system of each
type, and then upgrade the rest of the machines using a
sector-by-sector disk copy.  This is much faster and more
reliable than upgrading individually, but still requires
physically visiting every machine.  Disk cloning also doesn't
extend to systems with differing system disk geometries.  For
example, you can't clone a 1-gigabyte system disk onto a
420-megabyte system disk, because the image is larger than the
disk it would be written to.  Operator
error also creeps into the picture; although you're less likely
to end up with inconsistencies between systems, there is still a
good chance that machines can be missed or otherwise improperly
upgraded.

     Although this method doesn't really meet our needs, we did
use it for a while because it is simple and straightforward.
Trained student employees provided the manpower.

Upgrading Running Systems With rdist[2]

     Still another approach was to use rdist or a similar tool to
upgrade a running system[3].  This worked well for a minor OS
revision, but was not capable of handling a major revision such
as upgrading from Irix 4.0.5H to Irix 5.2.

Using Unused Swap Space For Upgrade Filesystem

     Another method was to use unused swap space to create an
upgrade filesystem.  SGIs allow swap to be removed from a running
system, so it was possible to dynamically delete enough swap to
create room for the upgrade filesystem, boot from there, and
upgrade the system disk over the network.  However, this approach
is not 100% reliable, since there's a chance that adequate swap
space may not be available at upgrade time.  Also, this approach
doesn't allow for repartitioning the system disk during the
upgrade, since part of the disk is in use as the upgrade
filesystem.

                          Our Solution

     In designing an upgrade scheme, we worked to come up with a
solution that satisfied all of our criteria: automation,
flexibility, reliability, and speed.  An important requirement
was to avoid having to visit each workstation individually.  This
ruled out any solution involving disk "cloning" or upgrading
individually from CD-ROM.  We worked around this by having
workstations copy the operating system over the network from a
server.

     In order to do this, the workstation needs to be booted to a
state where its network interface is operational and its system
disk is not being used.  Enter diskless booting.

     Diskless booting is an attractive solution because it allows
for complete control of the system disk when performing the
upgrade.  The disk can be reformatted, repartitioned, mounted,
unmounted, etc. at will.  However, diskless booting is not
without its problems.  The booting protocol requires that the
upgrade server be located on the same logical network as the
client being upgraded.  Many simultaneous upgrades can place an
undesirable load on the network.  The next section describes how
we worked around the former problem.  For the latter problem, we
place limits on the number of simultaneous upgrades at the
expense of time.

                      The Upgrade Procedure

     Doing upgrades is a three-step process.  First, you need to
configure each upgrade server to support diskless clients.  Then,
you must do a prototype installation for each different type of
environment you are supporting.  Finally, each workstation needs
to be configured to boot from the upgrade server and then
rebooted to start the upgrade process.

Configuring Servers For Diskless Booting

     The first step in configuring the upgrade server is to build
the diskless booting area.  Let's assume that the hostname for
the upgrade server is sonata.  The upgrade area is rooted on
sonata under /upgrade.

     The upgrade area contains everything that a workstation
needs to boot diskless over the network and perform its upgrade
procedure.  A minimal number of OS files are necessary to support
a diskless environment.  All prototype and site-dependent
distribution trees also live under the upgrade area.

     Prototype distributions are located under /upgrade/proto.
Under recent releases of Irix, machines with different graphics
boards and/or processors require slightly different installations
of the operating system.  Each installation requires a separate
prototype tree.  For example, if your site has R4000 Indigos with
entry (RPC) graphics and R3000 Indigos with Elan graphics, you
would need two prototype distributions, which might be called
/upgrade/proto/4krpc and /upgrade/proto/3kelan.  (The names are
arbitrary; you can choose whatever names you want.)  Under these
trees would be two prototype Irix installations, one for each
machine architecture.  Prototype distributions are either disk
images or filesystem images generated by dump; the next section
describes how to generate them.

     We were able to work around the need for multiple prototype
distributions by making several modifications to the default Irix
distribution provided by SGI.  This was done at the cost of a few
extra megabytes of disk space on each system, which we decided
was an acceptable tradeoff.  The specific details of our
modifications are beyond the scope of this paper, but we will
make them available via FTP along with the rest of our upgrade
tools.

     Site distribution trees are located in /upgrade/dist.  Once
the client has copied the appropriate prototype distribution, it
uses rdist to copy selected site distribution trees.  These trees
contain system files which need to be modified from the defaults
supplied by SGI, and any additional site-dependent files which
need to live on the workstation's local disk.

     The main purpose of site distribution trees is to separate
customized files from standard files.  This reduces the
possibility that customized files will be lost when doing an
upgrade.

     The /upgrade tree must be exported to all clients.  Each
client mounts /upgrade as its root filesystem.  To allow for
multiple simultaneous upgrades, we export /upgrade read-only and
take pains to ensure that the clients do not try to write to it.

Building Prototype Environments

     To build prototype installations, we manually upgraded one
of each type of workstation and then copied the resulting
installation onto an external hard disk.  Putting the
distributions on an external disk allowed us to move them around
from machine to machine, thereby enabling us to set up
installation servers on different subnets.  We found that a 1.6
gigabyte external drive was large enough to hold two separate
prototype installations.

     For networks with identically sized system disks, we used dd
[4] to copy the disk image over to the external drive.  This is
the fastest way to do things.  Unfortunately, it doesn't work on
networks where workstations have system disks with differing
geometries.  In this case, we used dump [5], which is slower, but
works on any disk regardless of geometry and partitioning.  Dump
also requires a separate prototype file for each filesystem on
the client's disk.  For example, rather than a single disk image,
we might have two separate dump images called
/upgrade/proto/3kelan.root and /upgrade/proto/3kelan.usr.
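
     As an illustration of this naming convention, the following
shell sketch prints the prototype file or files a client would
copy for a given machine type and copy method.  The function name
proto_images, the PROTO_DIR variable, and the default filesystem
list are assumptions made for the example; they are not taken
from our upgrade tools.

```shell
# Sketch only: map a machine type and copy method to prototype files.
# PROTO_DIR and proto_images are illustrative names.
PROTO_DIR=${PROTO_DIR:-/upgrade/proto}

# proto_images TYPE METHOD [FILESYSTEM...]
#   METHOD dd   -> one whole-disk image
#   METHOD dump -> one dump image per filesystem (default: root usr)
proto_images() {
    ptype=$1
    method=$2
    shift 2
    case $method in
        dd)
            printf '%s\n' "$PROTO_DIR/$ptype"
            ;;
        dump)
            if [ $# -eq 0 ]; then
                set -- root usr
            fi
            for fs in "$@"; do
                printf '%s\n' "$PROTO_DIR/$ptype.$fs"
            done
            ;;
        *)
            echo "unknown copy method: $method" >&2
            return 1
            ;;
    esac
}
```

     For example, "proto_images 3kelan dump" prints the two dump
image names given above, one per line.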

     Once we built all of the prototypes, we attached the
external drive to the upgrade server and mounted it under
/upgrade/proto.

The Upgrade Procedure

     Workstations upgrade themselves using the following
procedure.  First, each client must be configured to boot
diskless from its upgrade server.  On Silicon Graphics boxes,
this is done by setting two variables in non-volatile RAM (nvram)
on each client:
 client# nvram diskless 1
 client# nvram bootfile \
     'bootp()sonata:/usr/etc/boot/upgrade/unix'
 client# /etc/reboot

     We did this for each of our clients using a simple shell
script.  The same commands could also be pushed out with rdist
or scheduled with cron.
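
     A script of this kind might look like the following sketch.
It is not our production script: the names run, configure_client,
UPGRADE_SERVER, and DRY_RUN are illustrative, and it assumes that
the administrator can reach each client with rsh.  With DRY_RUN
set to 1 (the default here) it only prints the commands it would
run.

```shell
#!/bin/sh
# Sketch only: point a client at the upgrade server and reboot it.
# UPGRADE_SERVER, DRY_RUN, run and configure_client are illustrative names.
UPGRADE_SERVER=${UPGRADE_SERVER:-sonata}
DRY_RUN=${DRY_RUN:-1}

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# configure_client HOST: set the two nvram variables, then reboot HOST.
# The bootfile value is single-quoted so that the remote shell does not
# try to interpret the parentheses.
configure_client() {
    client=$1
    run rsh "$client" nvram diskless 1
    run rsh "$client" nvram bootfile \
        "'bootp()${UPGRADE_SERVER}:/usr/etc/boot/upgrade/unix'"
    run rsh "$client" /etc/reboot
}
```

     Calling configure_client once per hostname, for instance
from a loop over a list of lab machines, covers a whole network.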

     When the workstation reboots, it loads the kernel image
specified in nvram and mounts /upgrade via NFS[6] from the
upgrade server, sonata, as its root filesystem.  It then starts
up the init process, which in turn runs /etc/rc.

     /etc/rc begins by reading the "netaddr" variable from nvram,
which contains the client's IP address.  It looks up this value
in the /etc/hosts file to determine the system's hostname, and
then configures the network interface.
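
     The address-to-hostname lookup can be sketched in shell as
follows.  This is an illustration rather than an excerpt from our
/etc/rc; the function name lookup_hostname and its optional
second argument are assumptions made for the example.

```shell
# Sketch only: find the hostname for an IP address in a hosts-format file.
# lookup_hostname ADDR [HOSTS_FILE]
# Prints the first name listed for ADDR; returns non-zero if ADDR is absent.
lookup_hostname() {
    addr=$1
    hosts=${2:-/etc/hosts}
    awk -v addr="$addr" '
        { sub(/#.*/, "") }                  # drop comments
        $1 == addr && NF >= 2 {             # address in column 1, name in 2
            print $2
            found = 1
            exit
        }
        END { exit !found }                 # status 1 when nothing matched
    ' "$hosts"
}
```

     In /etc/rc the address argument would be the value of the
netaddr variable read from nvram.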

     Next, /etc/rc execs the upgrade program, which is a Perl
script.  The script has four basic functions:
 1 Repartition the local disk (optional).  If desired, the
   existing partitions can be used.
 2 Create new filesystems on the local disk.
 3 Copy a prototype Irix distribution from the upgrade area to
   the local disk.  As mentioned earlier, the prototype
   distributions are virgin Irix distributions installed
   directly from CDROM, with no modifications.  The copy is
   done using dd or dump, invoked using a remote shell from the
   client to the upgrade server.  A sample dd command would look
   something like
 rsh sonata dd ibs=32768 obs=1450 \
    if=/upgrade/proto/3kelan | \
    dd ibs=1450 obs=32768 \
    of=/dev/rdsk/dks0d1vol
   The blocking factor is determined by taking the MTU of SGI's
   ethernet interface (1500 bytes) and subtracting 50 bytes for
   TCP overhead, which gives 1450.
   The output from the disk image is sent directly to the
   client's raw disk device.
 4 Copy any number of site-specific distribution trees on top of
   the new OS distribution.  This is where all customized system
   files are installed.  The copy is done by invoking rdist on
   the server via remote shell.  For example:
 rsh sonata rdist -c \
    /upgrade/dist/3kelan client-hostname:/
   Once this has finished, the upgrade script resets the nvram
   variables and reboots the system with the newly installed
   operating system.
   If any part of the upgrade is interrupted (due to someone
   turning off or resetting the machine, power failure, etc.),
   the upgrade procedure will start over when the system
   reboots, because the nvram variables still point at the
   upgrade server until the final step resets them.
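
     The block-size arithmetic in step 3 and the resulting dd
pipeline can be sketched as follows.  The variable and function
names are illustrative, and copy_command only prints the pipeline
instead of running it, so the sketch is safe to try on any
machine.

```shell
# Sketch only: derive the network-side block size and print the copy pipeline.
# All names here are illustrative.
ETHER_MTU=1500      # standard Ethernet MTU, in bytes
TCP_OVERHEAD=50     # allowance for protocol overhead, as in step 3
DISK_BS=32768       # block size used on the disk side of each dd

# Print the block size used on the network side of the pipe.
net_block_size() {
    echo $((ETHER_MTU - TCP_OVERHEAD))
}

# copy_command SERVER IMAGE DEVICE: print the pipeline without running it.
copy_command() {
    nbs=$(net_block_size)
    printf '%s\n' \
        "rsh $1 dd ibs=$DISK_BS obs=$nbs if=$2 | dd ibs=$nbs obs=$DISK_BS of=$3"
}
```

     Called as "copy_command sonata /upgrade/proto/3kelan
/dev/rdsk/dks0d1vol", this prints the sample dd command from
step 3 on a single line.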

                    Performance Observations

     We found that it took approximately 20 minutes to copy a
420-megabyte disk image over a lightly loaded ethernet using dd.
Using dump, the procedure took about 30 minutes.  By comparison,
a direct sector-by-sector disk copy took around 10 minutes.
Unfortunately, ethernet doesn't have quite the bandwidth required
to upgrade more than one workstation at a time.  We found that
the most efficient way to get the upgrade done was to write a
script that upgrades each workstation sequentially, and let it
run overnight.
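
     The shape of such a sequential driver, together with a rough
time estimate, is sketched below.  The functions start_upgrade
and wait_for_host are hypothetical hooks that only print what
they would do; how a real script starts an upgrade and detects
its completion is not shown.  The estimate counts only the image
copy time quoted above and ignores the rdist step and the
reboots.

```shell
# Sketch only: upgrade hosts one at a time and estimate the total time.
# start_upgrade and wait_for_host are hypothetical placeholders.
PER_HOST_MINUTES=${PER_HOST_MINUTES:-20}   # dd copy time quoted above

start_upgrade() { echo "start $1"; }
wait_for_host() { echo "wait $1"; }

# upgrade_all HOST...: process the hosts strictly in order, one at a time.
upgrade_all() {
    for host in "$@"; do
        start_upgrade "$host"
        wait_for_host "$host"
    done
}

# estimate_hours COUNT: hours needed for COUNT sequential copies, rounded up.
estimate_hours() {
    echo $(( ($1 * PER_HOST_MINUTES + 59) / 60 ))
}
```

     By this estimate, 30 workstations need about 10 hours of
copying, which fits in one night, while 90 workstations need
about 30 hours and would have to be spread over several nights
or several upgrade servers.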

                           Future Work

     Currently, our scheme requires that the root, swap and usr
partitions be allocated to specific partitions on the local disk.
This works fine for just about any workstation.  However, we
would like to expand this to be a bit more flexible and support
customized configurations.

     Our scheme is also very dependent on NFS.  We'd like to
eliminate NFS from the picture (except where it is required for
diskless booting) and switch to a different method of copying the
prototype areas.  FTP[7] appears to be a very attractive
solution.

     Maintaining separate prototype distributions can eat up a
lot of disk space.  We have a couple of ideas which would
alleviate this problem.  One is to use actual running systems as
prototypes; however, this would require a different upgrade
server for each individual machine type, which may be difficult
to do in terms of network topology.  Another solution, one which
we may implement in the future, would be to have each client do a
direct install from a CDROM server.  Unfortunately, this tends to
be a slow process, and also requires a front end (such as
expect[8]) to drive the installation process.  Expect scripts
would need to be tailored for each different Irix release, which
would be tedious and problematic.

     The current method we use to initiate an upgrade is something
of a kludge.  It would be nice to have a server/client type
protocol which allows the admin to start upgrades remotely and
monitor their progress.

                       Author Information

     Paul Riddle is a Systems Programmer with Academic Computing
Services at the University of Maryland, Baltimore County (UMBC).
He has been working at UMBC since 1989.  When he graduated in
1992, he made the transition from underpaid student to full-time
employee.  Currently, Paul works with sendmail and DNS, and helps
to keep the student labs running, among other things.  Someday he
hopes to become motivated enough to get a Master's degree, too.
Reach him via U.S. Mail at The University of Maryland, Baltimore
County; 5401 Wilkens Avenue; Baltimore, MD 21228.  Reach him
electronically at paulr@umbc.edu.

                          Availability

     We expect to have the final version of our upgrade software
ready by September 1, 1994.  It will be available via anonymous
FTP from ftp.umbc.edu in the directory /pub/sgi/upgrade.

                           References

 [1] Wall, L., & Schwartz, R., Programming Perl, O'Reilly &
     Associates, Inc., 1990.
 [2] "rdist(1C) Manual Page," IRIX Reference Manual, Silicon
     Graphics, 1993.
 [3] Manning, C., & Irvin, T., "Upgrading 150 Workstations in a
     Single Setting," Proc. 7th Usenix Systems Administration
     Conference (LISA VII), 1993.
 [4] "dd(1M) Manual Page," IRIX Reference Manual, Silicon
     Graphics, 1993.
 [5] "dump(1M) Manual Page," IRIX Reference Manual, Silicon
     Graphics, 1993.
 [6] "NFS Protocol Specification," Networking on the Sun
     Workstation, Sun Microsystems, 1986.
 [7] Postel, J., & Reynolds, J., "File Transfer Protocol (FTP),"
     RFC 959, Network Information Center, 1985.
 [8] Libes, D., "Using expect to Automate System Administration
     Tasks," Proc. 4th Usenix Systems Administration Conference
     (LISA IV), 1990.