% The following paper was originally published in the Proceedings of the
% 1997 USENIX Annual Technical Conference, Anaheim, California,
% January 6-10, 1997.

\documentclass{usenixart}

\def\a#1{{\it #1}\/}
\def\f#1{{\tt #1}\/}
\def\p#1{{\it #1}\/}
\def\m#1{{\tt #1}}

\sloppy
\flushbottom

\title{Cget, Cput, and Stage - Safe File Transport Tools for the Internet}

\author{Bill Cheswick \\
{\it Bell Laboratories, Lucent Technologies}\\
{ches@bell-labs.com}}

\begin{document}

\twocolumn[\maketitle
\begin{quote}
\begin{center}
{\large Abstract}
\end{center}
\p{Cget}, \p{cput}, and \p{stage} are three simple programs that implement
authenticated or encrypted file transfers on the Internet. \p{Cget} and
\p{cput} read and write files to a remote host, and \p{stage} ensures that a
remote directory accurately mirrors a local master directory. These routines
use private key cryptography for authentication and privacy between pairs of
secured hosts. They are simple, paranoid Unix tools that can be used to
support systems that operate in a hostile environment.
\end{quote}
\vspace{5ex}]

\section{Introduction}

A host can be made reasonably resistant to compromise from the Internet if
it isn't running any dangerous network services. Most hosts don't come from
the manufacturer configured in this manner---they have to be stripped of all
of their network services by hand. Then only the desired services are
installed. If the network services are secure (a big ``if''), then machines
are much harder to breach.

Two such secure hosts can exchange files and remain reasonably secure if
there is a safe file transport service available. This file transport
service would have to be resistant to all the popular attacks found on the
Internet today, including the recent IP spoofing\cite{rtm} and TCP
hijacking\cite{Joncheray}. If the data is sensitive, then the service would
also have to ensure privacy.

Such a service is needed often these days. In particular, publicly-available
Internet services like FTP and HTTP are often provided by hosts running on
the dirty side of the firewall, usually in a DMZ. How can we administer such
hosts, perhaps from the relative safety of another host behind a firewall?
How do we install new programs, or install new content?

The standard out-of-the-box network services do not provide this security.
In fact, they have a history of jeopardizing their servers. FTP has been a
constant source of trouble: its passwords are easy to sniff, its protocol
has various flaws, and various servers have had security
holes~\cite[CA-88:01, CA-92:09, CA-93:06, CA-94:07, CA-94:08,
CA-95:16]{cert1}. \p{Rcp} relies on address-based authentication and the
integrity of the Domain Name System, which is easy to
fool~\cite{bellovin,Vixie95}. It is also susceptible to IP spoofing
attacks~\cite{Tsutomu}.
NFS has a variety
%% cite NFS holes
%% cite RPC holes
of weaknesses: it can be fooled with address spoofing, root handles can be
sniffed or guessed, and it relies on RPC services that have security
weaknesses of their own.

These services are frequent targets for successful hackers. Still, they are
often employed because they are widely available, and most developers are
familiar with them---even when they are clearly unsuited for the job.
Developers don't have time to build new tools: the frenzy of Internet hype
and growth can leave management focused on time-to-market issues. Security
is often left to the last minute, and patched in after the design is
finished.

Marcus Ranum and I faced these problems in the fall of 1995. We encountered
{\em ad hoc} solutions using standard, dangerous Internet services. For
example, billing data was transferred in the clear using FTP. Developers
cast about for ways to transfer configuration files and other important
data. The standard tools they used were jeopardizing some very important
hosts.

We wanted a very simple solution, in the tradition of small Unix tools. This
is not a very tall order: it requires a simple file transfer program running
some strong cryptographic or authentication routines, and a shared secret
key. We were only setting up a few point-to-point links, so it was easy to
distribute a secret key to each end of a connection. We didn't want an
authentication server, and we didn't need public key cryptography. We
envisioned that a pair of simple programs, plus a key file and a
configuration file, was all that was needed. A simple implementation meant
that more people were likely to understand and use the software, even if
they were in a hurry to make a deadline.

There were a variety of possible off-the-shelf solutions available at the
time: our problems were not new. Most of them appeared to have adequate
security, but none were as simple as we desired. (Some of these are
discussed in section \ref{othersolutions}.) We have a better chance of
avoiding security bugs if the programs are small and simple.

Marcus and I built three programs. Each was as simple as possible, and
written with all the minimalism and paranoia we could muster. \p{Cget} reads
a single file from a remote server, and \p{cput} writes a file back.
\p{Cget} and \p{cput} are quite primitive: they do not support file
deletion, directory creation, or program execution. (They were originally
named \p{get} and \p{put}, but that clashed with SCCS's routines of the same
name.)

\p{Stage} mirrors a master directory to a slave host. Files and directories
are created, deleted, and downloaded as needed. It is ideal for updating an
external web server or FTP archive from an internal staging host.
\p{Unstage}, a recent addition, works in reverse, updating a local copy of a
remote master directory. It can be used to suck logs from a server, or
perhaps get a distribution from a master tree.

Our initial cryptographic routines used DES encryption. For many
applications, this is overkill. Most transfers are not secret---FTP and Web
data are generally intended for public viewing. If we kept strong
encryption, it would make the software release difficult and unlikely. So
the crypto layer has been rewritten---the interface and code are cleaner
than in the first version. The new crypto routines use only HMAC keyed
message digests to protect the conversation.
If my authentication protocol is OK (always a big ``if''), an eavesdropper
may watch or interrupt a session, but cannot modify or replay a session
without detection.

The next section describes the authentication protocol and some
cryptographic issues. Section \ref{interfacelibrary} describes the user
interface to the cryptographic protocols. Server design issues are explored
in section \ref{server}. The \p{stage} service is discussed in some detail
in Section \ref{stage}. Section \ref{uses} has a couple of applications for
these programs, including the confinement of an arbitrary TCP service.
Section \ref{Vulnerabilities} covers vulnerabilities, and section
\ref{performance} has some performance figures. A tiny sample of the related
work is discussed in section \ref{othersolutions}. Section \ref{further}
describes some enhancements to and limitations of these routines, and
availability information is in section \ref{availability}. Appendix A
describes the staging protocol.

\section{The Authentication Protocol}
\label{protocol}

The client and server exchange messages in SSL format, although we do not
use SSL's complex key setup. Each message contains a two byte length field,
the payload, and a 16 byte binary digest. I use the HMAC\cite{hmac} digest
with MD5, which appears to be headed for general usage on the Internet. An
HMAC digest of a message $M$ using key $k$ is shown as
$$ [M]_k $$

The client and server share a secret key, $K_{ss}$. The protocol uses
challenges in both directions to derive session keys. The challenges are
sixteen random bytes encoded in pairs of hex digits. The session key for
each end is derived from $K_{ss}$ and the challenge from the other end:
\begin{eqnarray*}
K_s & = & [C_c]_{K_{ss}}\\
K_c & = & [C_s]_{K_{ss}}
\end{eqnarray*}
The server writes with $K_s$, the client with $K_c$.

The initial exchange between client $C$ and server $S$ proceeds as follows:
\begin{center}
\begin{tabular}{lll}
Message 1& $C \rightarrow S:$ & $N, C_c, [N, C_c, S_c]_0$ \\
Message 2& $S \rightarrow C:$ & $C_s, [C_s, S_s]_{K_s}$ \\
Message 3& $C \rightarrow S:$ & $\hbox{``{\tt OK}''}, [\hbox{``{\tt OK}''}, S_c]_{K_c}$
\end{tabular}
\end{center}
Here $N$ is a service name (see section \ref{servicenames}), $C_c$ is the
client's challenge, $C_s$ is the server's challenge, and $S_c$ and $S_s$ are
sequence numbers. The sequence numbers are four bytes long. $S_c$ starts at
zero, and $S_s$ starts at $2^{31}$.

Message 1 delivers the client's challenge. It uses the key 0, which helps us
detect casual probes of the service. Message 2 proves to the client that the
server is using the new challenge and has the secret key, and it provides
the server's challenge. Message 3 has a trivial payload, but proves to the
server that the client is using the fresh challenge and the secret key.

The session keys are used to prevent replay attacks using messages from
previous sessions. If we simply keyed our digests with $K_{ss}$, an attacker
could replay a previous session, perhaps replacing a new file with some
older one. Each end uses a different session key so a message can't be
played back to its originator. Similarly, the sequence numbers prevent
replays of earlier messages in the same session. Without these, an attacker
might hijack the TCP session and replay earlier messages. The sequence
numbers also differentiate the hashes from each end of the conversation.
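For concreteness, the two digest operations just described can be sketched
in a few lines of C. OpenSSL's HMAC-MD5 here is only a stand-in for the
digest routines we actually use; the function names are illustrative, and
the sketch assumes the digest covers the payload followed by the four-byte
sequence number, as in the messages above:
\begin{verbatim}
/*
 * Sketch of the keying described above.  OpenSSL's
 * HMAC-MD5 stands in for our digest routines; the
 * names and field ordering are illustrative.
 */
#include <openssl/evp.h>
#include <openssl/hmac.h>

#define DIGLEN 16       /* MD5 digest size */

/* K_s = [C_c]_Kss; swap challenges for K_c. */
void
session_key(unsigned char *kss, int ksslen,
    unsigned char *chal, int clen,
    unsigned char key[DIGLEN])
{
        unsigned int n = DIGLEN;

        HMAC(EVP_md5(), kss, ksslen, chal, clen,
            key, &n);
}

/* [payload, seq]_key: the 16-byte trailer on
 * each message.  The sequence number is digested
 * but never transmitted. */
void
message_digest(unsigned char key[DIGLEN],
    unsigned char *payload, int plen,
    unsigned long seq, unsigned char dig[DIGLEN])
{
        unsigned char sbuf[4];
        unsigned int n = DIGLEN;
        HMAC_CTX *ctx = HMAC_CTX_new();

        sbuf[0] = seq >> 24; sbuf[1] = seq >> 16;
        sbuf[2] = seq >> 8;  sbuf[3] = seq;
        HMAC_Init_ex(ctx, key, DIGLEN, EVP_md5(), NULL);
        HMAC_Update(ctx, payload, plen);
        HMAC_Update(ctx, sbuf, 4);
        HMAC_Final(ctx, dig, &n);
        HMAC_CTX_free(ctx);
}
\end{verbatim}
Keying the trailer with a per-session, per-direction key and folding in the
sequence number is what makes both cross-session and within-session replays
detectable.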
Although the client can force the server to use a specific challenge, and therefore the same key, it can't finish the protocol initialization without using the server's fresh challenge and the secret key. A man-in-the-middle can't change a message without detection: he cannot determine the session key without the secret key, and he cannot obtain the right answers from an additional connection to the server, since the session key will be different. This is a simple protocol, and it looks like it ought to do the job, but I am not a cryptographer, and history teaches that it is hard to get cryptographic protocols right. Though I don't see how this protocol can be abused, I'd feel better if each session key were based on both challenges. \subsection{Administrative concerns} Protocol setup can interact with administrative concerns. Our original protocol was terse and unhelpful when, say, one end had the wrong key. The obscurity may have added some security, but it sure didn't help our users who were trying to set up the service. The setup may fail for a number of reasons: the key is wrong or missing, the key file is not readable, the service is non-existent, etc. Each of these errors means that the server cannot return the correct digest in its first message. If the digest is wrong during protocol setup, the client checks the payload for the string ``{\tt remote reported an error:}''. If present, the rest of the message is an error message from the server describing the problem. Until the exchange is complete, the protocol is subject to attack. In particular, it is possible for an attacker to inject a false error message during this setup phase. \subsection{Keys} Secret keys are appropriate here: we don't need public key cryptography. The usual complaint about secret keys is the distribution problem---how do we move them around securely? For us, these programs are only employed in a handful of hosts, involving perhaps a dozen services and their keys. Our keys are printed in hex bytes or base 64 encoding, both human-readable. They can be distributed by hand or over the phone when the service is installed. We type ours in at the consoles of the hosts involved. This might not scale to a large setup, but one could imagine an ISP allowing a thousand customers to use \p{stage} to update a thousand separate web directories. It would not be much harder to distribute a binary key than a password that a user has to remember, and the client wouldn't need an account on the web server. (This is always a good feature: users are annoying, and tend to disrupt security arrangements.) The secret keys are generated by a program named \p{makekey}. Its keys, and the protocol's session challenges, are generated with \p{truerand}\cite{cryptolib}. (\p{Truerand} runs a counter in a tight CPU loop while waiting for an alarm timeout some milliseconds later. The bottom two or three bits of the counter are considered random.) We use full random binary keys: there are no passwords that a user must remember. 
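For illustration, the idea behind \p{truerand} can be sketched in a few
lines. This is not the cryptolib implementation, and a real key generator
would gather and whiten many more samples than this:
\begin{verbatim}
/*
 * Sketch of the truerand idea: run a counter in
 * a tight loop until an alarm fires, and keep
 * only the bottom two or three bits.  Not the
 * cryptolib implementation.
 */
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t ticked;

static void tick(int signo) { ticked = 1; }

unsigned int
randbits(int nbits)   /* nbits should be 2 or 3 */
{
        volatile unsigned long count = 0;

        ticked = 0;
        signal(SIGALRM, tick);
        ualarm(16000, 0);   /* ~16 ms from now */
        while (!ticked)
                count++;
        return count & ((1 << nbits) - 1);
}

int
main(void)
{
        int i;
        unsigned int byte = 0;

        /* build one key byte, two bits at a time */
        for (i = 0; i < 4; i++)
                byte = (byte << 2) | randbits(2);
        printf("0x%02x\n", byte);
        return 0;
}
\end{verbatim}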
\section{Interface Routines}
\label{interfacelibrary}

To use the crypto routines, the client uses the following code:
\begin{verbatim}
fd = tcpconnect(host, port);
ep = start_client_crypto(fd, key, srv_nam);
if (ep != 0) {
        /* error */
}
n = cread(fd, buf, sizeof(buf));
n = cwrite(fd, gt, ngt);
if (n < 0) {
        cperror("writing gunilla table");
        exit(1);
}
\end{verbatim}
and the server uses
\begin{verbatim}
fd = bindto(service_port);
ep = start_server_crypto(fd);
if (ep) {
        perror(ep);
        exit(1);
}
n = cwrite(fd, "hello", 6);
n = cread(fd, buf, sizeof(buf));
...
\end{verbatim}
(Some error processing is simplified for clarity.) \p{Cread} and \p{cwrite}
are analogous to the standard I/O routines. \p{Cperror} reports standard or
cryptographic errors. This protocol preserves message delimiters: each
\p{cread} will return only the bytes sent by the corresponding write. The
maximum message size is $2^{16}-1$. The server must supply a routine named
\p{setservice}, which obtains the secret key or returns an error message.

\section{Services, servers, and server trust}
\label{server}

Our server programs assume that they have no friends. For example, they shed
privilege early to minimize the code we must trust. The server programs
obtain the key for the calling host, \p{chroot} to a target directory, and
change their user id and group to some less-privileged account. All inputs
from the client are carefully checked for pathological values.

We have two servers: \p{getd} for \p{cget} and \p{cput}, and \p{staged} for
\p{stage} and \p{unstage}. They operate on different TCP ports, and are
called by \p{inetd}.

\subsection{Service names}
\label{servicenames}

Originally, these routines were keyed to a host's numeric IP address. A
single client host could \p{cget}/\p{cput} to one area of a server, and
\p{stage} to another. If others needed to \p{stage} to the same server, they
would have to connect from a different client host.

The IP address of the caller was used to select the proper key and service
configuration on the server. The connection had to originate from that IP
address, though this provided only slight security; the secret key provides
the real security.

This approach worked well for simple setups, but the one-service-per-client
limitation became inconvenient as the services were used more. Also, a
traveling host couldn't access a fixed server, because the client's IP
address was unpredictable. We could provide additional services on different
ports on the server, but this is awkward.

The new authentication protocol includes a service name. The key and other
information are based on this name. A single client host can have a number
of services on a given server. Different users can have different access to
the same server, all controlled at the client end by read access to the
relevant key file.

This trust model shouldn't be pushed too far: Unix \a{root} accounts are
generally not very resistant to user attack. A serving host should extend
about the same level of trust to all services from a given client.

\subsection{Trusting the Server Software}

So far, I have assumed that
\begin{enumerate}
\item we control the server machine entirely, and
\item we trust the server code until it drops privileges.
\end{enumerate}
Neither assumption may be true if we would like to persuade someone else
(say, an ISP) to run our servers on their host. They would be more willing
to run our software if we don't need \a{root} permission, and perhaps even
more willing if they can contain our software with their own \p{chroot}.
The problem is that \p{chroot} requires \a{root} permission, and it is hard
to change the user id safely with standard shell commands after the
\p{chroot}. We have to include a \p{setuid} program within the software
``jail'' to change from user \a{root}, and the user with access to this
directory might find a way to disable this program.

The \p{chroot} program needs an option to set the UID of the executing
program. I wrote a trivial version of \p{chroot} named \p{jail} to do this.
I can give this tiny program to the ISP. It's only a few lines long: they
can examine it, trust it, and confine our servers nicely with it.

We have some problems when the server is enclosed in the jail. How does the
program obtain its key, unless the key is stored within the jail itself? We
may wish to keep the key secret from the user. If the key is stored in the
jail, a remote user may have undesirable read access to it. It could be
piped in through a file descriptor opened by \p{jail}, but that's a bit
awkward to set up. The key could also be a parameter to the server program,
but that can make it visible to other users on the serving host through the
\p{ps} command.

It would also be nice to let the jailed server issue \p{syslog} messages.
This requires possibly-dangerous special files inside the jail, or some
mechanism for \p{jail} to perform the \p{openlog} and pass the file
descriptor to the server in the jail.

I am not satisfied with the solutions to either of these problems.
\p{Chroot} is a good start, but Unix lacks adequate confinement primitives.

%%\section{Cget}

\section{Stage}
\label{stage}

Once the crypto routines were working for \p{cget} and \p{cput}, \p{stage}
was an obvious application. \p{Stage} runs through a local master directory,
comparing each file and directory with the contents of the remote slave
directory.

The master directory for a service is identified by an entry for that
service in a configuration file on the client, usually in
\f{/usr/local/etc/stage.conf}. Like \p{cget}/\p{cput}, \p{stage} uses the
service name to look up the appropriate key.

\p{Staged} also uses the service name to determine the target directory,
user id, and file permission mask. It uses \p{chroot} to confine itself to
the target directory. This is good: it is somewhat more complex than
\p{getd}, and therefore more likely to have bugs. It does check its input
from the client carefully: strings can't be too long, ``{\tt ..}'' is not
allowed in path specifications, etc.

\p{Stage} will update a file if it has changed. A file is considered changed
if its modification date or length is different, or if its MD5 checksum is
different. The checksum is time-consuming---an option suppresses this check.
When a file is copied over, its modification date is set to match the master
copy, if the operating system on the server allows it.

Ordinary users can use \p{stage} to update all or some portion of the master
directory. It only takes a few seconds to check and transfer a few
megabytes. \p{Stage} does not exit until the update is complete: there is no
queuing mechanism involved. Should the program abort or fail for some
reason, it can be rerun to ensure that the directories match.

The user can \p{stage} a file or a directory. Either must appear under the
master directory. If he stages a non-existent path, that path will be
deleted on the server, if it exists. Hence:
\begin{verbatim}
rm -rf foo
stage remote foo
\end{verbatim}
will delete a directory or file named \f{foo} at the other end.
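The update test described above can be sketched as follows. The
\f{remote\_stat}, \f{remote\_md5}, and \f{local\_md5} helpers are
hypothetical stand-ins for the \f{st} and \f{cs} requests of the staging
protocol (Appendix A), not names from the distributed source:
\begin{verbatim}
/*
 * Sketch of stage's file comparison: push the
 * file if length or mtime differ, or, when
 * checksumming is on, if the MD5 sums differ.
 * The remote_* and local_md5 helpers are
 * hypothetical stand-ins for protocol calls.
 */
#include <string.h>
#include <sys/stat.h>

struct rstat { long mtime, size; }; /* "st" reply */

extern int remote_stat(char *, struct rstat *);
extern int remote_md5(char *, unsigned char *);
extern int local_md5(char *, unsigned char *);

int
changed(char *fn, int do_checksum)
{
        struct stat sb;
        struct rstat rs;
        unsigned char l[16], r[16];

        if (stat(fn, &sb) < 0 ||
            remote_stat(fn, &rs) < 0)
                return 1;   /* missing: changed */
        if (sb.st_size != rs.size ||
            sb.st_mtime != rs.mtime)
                return 1;
        if (!do_checksum)
                return 0;   /* cheap checks only */
        if (local_md5(fn, l) < 0 ||
            remote_md5(fn, r) < 0)
                return 1;
        return memcmp(l, r, 16) != 0;
}
\end{verbatim}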
There is a subtle distinction here:
\begin{verbatim}
stage remote *
\end{verbatim}
and
\begin{verbatim}
stage remote .
\end{verbatim}
are not the same. The first entry will update all existing files and
directories in the current directory. The second will do that, plus delete
any remote entries that don't appear locally.

\p{Stage} makes no special provisions for special files or soft or hard
links. It makes no special provisions for files or directories that change
during the staging process.

The user does not have to have read access to the key file: \p{stage} could
be \p{setuid} to an account or group that has read permission for the key.

\p{Staged} serves \p{stage} requests on the remote ser\-ver. It consults
\f{/usr/\allowbreak{}local/\allowbreak{}etc/\allowbreak{}staged.conf}, which
has one line for each supported service name. Each line contains the service
name, the slave directory, and an optional UID and umask. File ownership is
not propagated to the server: the slave files are owned by the account that
the service is configured for. \p{Staged} logs all file activities via the
\p{syslog} facility.

The \p{stage} client uses a little protocol to control the remote server. It
is very simple (see Appendix A), a subset of basic file system access
primitives. It could be optimized to improve performance.

\section{Applications}
\label{uses}

These routines have been well-received, and are employed in a number of
places in AT\&T and Lucent. In our lab, \p{stage} has been providing quick
and simple support for
\f{ftp.\allowbreak{}research.\allowbreak{}att.\allowbreak{}com} for over a
year, and now supports
\f{ftp.\allowbreak{}research.\allowbreak{}bell-labs.\allowbreak{}com} and
Lucent's Web service on \f{www.\allowbreak{}lucent.\allowbreak{}com}. We had
to make a proxy version of \p{stage} for this last service to support it
through our older firewalls. It replaced a clunky FTP-based implementation
that was unable to delete files on the slave server.

\p{Cput} has been especially useful for moving new binaries to external
hosts. One researcher has been transporting large, extremely sensitive
databases over networks having dubious or unknown security. The encrypted
version of \p{cput} is vital for him.

We have also supported our various external dirty Unix hosts with these
tools. Automated jobs use \p{cput} to update mail alias files and
internally-generated name server configuration files. \p{Cget} fetches daily
log files so users can monitor access to the FTP directory without requiring
logins or more invasive access to the server. Hal Purdy at AT\&T Research
has ported the encrypting \p{staged} to Windows 95 to support a roomful of
PCs outside the firewall.

\subsection{An Example: updating mail alias files}

Traditional Unix file system permissions can be used to provide finer
control over access to the target directory. For example, we have a slave
mail directory on an external server that contains programs and data files.
The master host is allowed to overwrite and create the mail alias files, but
is not allowed to change the executable files in the slave directory. We
could just split them into separate directories, but we are used to the
current configuration.

The executable files are owned and writable by account \a{bin}. The
directory and the alias files are owned by \a{daemon}. This lets the master
host update and add new alias files, but it doesn't have write permission
for the executable files.
If we supplied complete Unix file system semantics, they could delete the
executables (since they have write permission on the directory) and create
new ones. But \p{cput} won't delete a file, only replace it. There's one
additional protection: we can have the umask in \p{getd}'s configuration
file clear the executable bits in downloaded files.

If we rely on Unix's built-in file permissions, we reduce the amount of
special-case per\-mis\-sion-checking software in \p{getd}. Smaller programs
are safer.

\subsection{Bob's TCP service}

Bob has a service that he would like to offer the world. I don't know much
about it. It looks at some files, answers queries on a TCP port, and writes
logs. It doesn't need to be \a{root}, or use any fancy system services or
files. It's just a program.

I like Bob, and I trust him (mostly). His service is not only harmless, it
appears to be quite worthwhile. He'd like to run it on our external server.
I'd like to help, but I don't want to jeopardize a host that we've taken
great pains to secure. I don't want an error on Bob's part to reduce that
host's security. How far do we have to trust him?

We can lock Bob's software and files inside a \p{chroot} environment. He
won't be running as \a{root}, and we won't leave any dangerous programs in
his jail, so we are pretty confident that he can't get access to the rest of
our file system. We add the following single line to \f{/etc/inetd.conf}:
\begin{verbatim}
z3950 stream tcp nowait root /sbin/chroot chroot
    /usr/bob /bin/su - bob -c /tree/bin/zsrv
\end{verbatim}
We use \p{chroot} to confine Bob, then \p{su} to give up \a{root} privilege,
before Bob ever gets to execute an instruction. The security measures are
right out there in the open, on the line where an auditor can see and
understand them.

Unfortunately, \f{/bin/su} is inside his jail, so we use \p{jail} instead:
\begin{verbatim}
z3950 stream tcp nowait root /sbin/jail jail
    -u 99 -g 2 /usr/bob /tree/bin/zsrv
\end{verbatim}
\p{Jail} is just like the \p{chroot} command with the additional {\bf -u}
and {\bf -g} options to set the user id and group. His program, \p{zsrv},
should be linked with static libraries: it greatly simplifies the setup of
the jail.

We split Bob's directory (\f{/usr/bob}) into two subdirectories: one
(\f{/usr/bob/tree}) that he can stage into, and one (\f{/usr/bob/log}) that
he can \p{cget} or \p{unstage} from. He can update his server software and
other files under \f{/tree} at will. He can change the network server, but
it's always owned and executed by account \a{bob}, not \a{root}, so he is
very unlikely to get out of his jail.

What can Bob, or someone who has hacked into Bob's account, do to us? Most
of the problems are of the denial-of-service type:
\begin{itemize}
\item {File System Full.} He can fill the partition that holds
\f{/usr/bob/tree}. If this is a concern, we can put him in his own
partition. Then it only breaks his program, though it may cause annoying
log messages elsewhere.
\item {Core dumps.} These can fall under the file-system-full problem. The
\p{chroot} environment assures that the core dump will go in his directory,
not somewhere else.
\item {CPU hog.} If he uses too much of the CPU, we can \p{nice} him down
before he starts.
\item {Memory Full.} He could eat up or thrash memory.
\item {Open Network Connections.} Our jail doesn't stop Bob from opening
outgoing network connections. This could be abused in a few ways. In
particular, he could try to embarrass us or take advantage of the good name
of our serving host.
Since these can be spoofed anyway, it shouldn't be an unusual problem.
\end{itemize}

Bob has been running his Z39.50 service for over a year. Aside from a few
core dumps and some occasional configuration changes while companies lurch
apart, Bob hasn't been a problem.

\section{Vulnerabilities}
\label{Vulnerabilities}

The administrator is often the source of security problems. It is easy to
leave key files around with unintended read permissions or incorrect
ownership.

If the server is compromised, the game is lost: our original goal was to
protect the server. If the client is compromised, the intruder can gain
access to all the services allowed to that host. The server's paranoia can
block further spread of such an attack, limiting the intruder's access to a
particular directory on the server. This can be an effective barrier that
adds another layer to the depth of security.

If either host is compromised, then so are the keys, since they are stored
in files. This alarms some people, but the keys are only as valuable as the
host---once the host is compromised, the key is useless anyway. We've
attempted to limit the extent of such a catastrophe by limiting the trust
each host has for another. The theft of a key can compromise the privacy of
old sessions of the encrypting transport. We don't change keys often.

Session interruption can be a problem. If important security files are
transmitted, the user must note that an attacker can abort a transfer. For
example, \p{tcpwrapper}~\cite{venema} has a list of permitted connections in
\f{/etc/hosts.allow} and a stoplist in \f{/etc/hosts.deny}. If they are
transmitted in that order, there is a time when the new \f{/etc/hosts.allow}
is in place without the exceptions. This time window can be enlarged by
interfering with the second transfer. An administrator must bear these
concerns in mind when moving security-related files. In this case, the
\f{/etc/hosts.deny} file should be transmitted first.

As mentioned before, the protocol's error messages may be subject to social
engineering during setup. It's an unlikely attack, since the timing is tight
and the administrator is likely to be watching both the client and the
server during initial configuration anyway.

As always, a network service accessible to the public can receive
denial-of-service attacks. Even if the digest is wrong, enough incoming
packets can swamp any service, making it unavailable to its intended users.

\section{Performance}
\label{performance}

We have found that these routines work with acceptable speed. With our first
protocol, using Eric Young's DES encryption library, we could transfer 500
KB/s through the localhost port of an SGI running with 150 MHz CPUs.

In another test, our entire 700 megabyte FTP directory was copied between
two fast SGI hosts through a fire\-wall in about 47 minutes. When \p{stage}
was rerun just after this transfer, without any updates, the check took
about 330 CPU seconds and a little over 32 minutes. With checksumming turned
off, it took six minutes.

We don't expect that an entire large directory will be staged more than
about once a day. Users find the instant-update feature handy, and tend to
stage the little bits they change quite often. The protocol is not
especially efficient: it could be enhanced to speed up the file-checking
phase.

%% comparative times for cget

\section{Related Work}
\label{othersolutions}
% track

We have reinvented a well-rounded wheel. There are a number of software
solutions available, some well-suited to their environments.
We want a very light-weight solution. Private keys are fine, without
authentication servers or fancy certificates. If the infrastructure is in
place for such tools, use them: \p{rcp} with Kerberos would be fine, if we
were running Kerberos in the first place. But most of our customers don't
run Kerberos.

\p{Ssh}\cite{ssh} was fairly new when we did this work. It provides a
transport protocol that will probably be quite secure---it is under repair
at this writing. In late 1995, \p{ssh} was a beta release. But for our uses,
\p{ssh} has too many features---it's too large. It offers optional features
we don't want, like X11 transport and login facilities. We didn't want to
introduce these additional services onto our secure hosts. The code is more
complicated, and there are more things to misconfigure. (We do use \p{ssh}
in other places: it's a nice package.)

There are a number of mirroring and general software distribution packages
available. The best known is \p{rdist}~\cite{rdist}. \p{Rdist} typically
uses \p{rcp} or \p{ssh} for transport. The former is not appropriate, but
\p{ssh} is a good choice for supporting \p{rdist}, and it has its
enthusiastic supporters. There's a lot of mechanism there for our simpler
applications. \p{Rdist} has earned three CERT
advisories~\cite{cert4,cert5,cert6}, which also makes us nervous.

Two other transport programs are \p{filetsf}~\cite{filetsf} and
\p{Mirror}~\cite{mirror}, a Perl program. \p{Mirror} uses FTP for transport;
\p{filetsf} uses \p{lpr}/\p{lpd}. Again, we don't want these additional
services on our safe machines. Other file transfer programs have appeared
recently, such as \p{SSLftp}. See \cite{ylo} for a wide assortment of
related tools.

We've seen other batch approaches to these problems. A file can be signed or
encrypted with PGP, and transported by FTP or even email. These batch
processes are bulky and unsatisfying. They require action by special
accounts, often initiated by a polling program run by \p{cron}. \p{Stage}
provides immediate updates, initiated by the end user.

Lower-level encryption, like IPSEC and IPv6, offers more-general solutions.
We could use various existing tools if these were deployed. Unfortunately,
they are not widely available yet. We needed these tools a year ago.

\section{Limitations and further work}
\label{further}

Although these routines are fairly straightforward, some users have
prevailed upon us to make some minor enhancements.

\p{Cput} does not have an option to run a program when the transfer is
completed---a feature found in some file transport programs. In one
application, the receiving host must scan the receiving directory with a
\p{cron} job looking for new files to process. These files are large (on the
order of a gigabyte) and take a while to transfer. The \p{cron} job needs to
know when the file is available. We set the file permissions to 0000 until
the transfer is complete.

The same application needed a unique file name on the destination host. An
optional string (``{\tt \%u}'') in the destination file name ensures a
unique file name.

These routines make no effort to deal with a file that gets shorter during
the transfer. The user should ensure that the source files don't change
during transfer.

\p{Stage} makes no attempt to lock the target directory. If two people stage
to the same part of a target directory at the same time, the results are
undefined. \p{Stage} can also overwrite programs while they are executing,
causing core dumps in many versions of Unix.
Since it doesn't handle special files or links, it is probably unsuitable
for updating a remote root directory. One could teach \p{stage} about hard
and symbolic links, but it would add a lot of complexity to the program,
which doesn't seem to be worth it.

There is no mechanism here for a client user to determine how much disk
space is available on his external partition. The easiest solution is to
install the master directory on a partition of the same size as the slave's
partition. The user can monitor his inside usage. \p{Stage} makes no special
effort to delete external files before installing new ones, so the outside
partition could conceivably fill up during an update if it were nearly full.

Some users wanted more control over the update process. \p{Stage}'s scan can
take a long time, particularly if the directory tree has many gigabytes and
checksumming is used. These users wrote scripts to create a list of files to
update, and \p{xargs} can feed this list to \p{stage}. I had to add a
parameter to suppress the descent below directories that were not mentioned
on the command line, so we wouldn't do more work than these scripts wanted.

Cryptography can be no better than the quality of the keys. It is hard to
generate key material with general purpose computers. I rely entirely on
\p{truerand} to get this right. I did generate 10 megabytes of random data
from \p{truerand} (it took several days) and had Eric Grosse, one of our
local numerical analysts, run it through a suite of randomness tests. It
passed.

There have been some problems reported with a slightly-restricted version of
MD5 lately. Perhaps MD5 will fall soon. It is possible that HMAC using MD5
would still be safe: HMAC frustrates some attacks on its hash primitive. In
any case, I will switch to SHA1.

In general, we like these routines the way they are, and are resistant to
creeping featurism. We like their simplicity, and their interaction with
standard Unix tools.

\subsection{The Joint Ventures Problem}

Joint ventures often occur between two companies that don't otherwise trust
each other. Many such joint ventures only need to share a directory tree.
This can reside on a neutral host somewhere. The contents of the directory
tree, or the existence of the venture itself, may be highly proprietary.

The \p{stage} command offers most of the functionality needed to implement
such an arrangement. \p{Unstage} reverses the file transfer: the remote
directory is the master and the local directory is the slave. Users can
share their work with these routines.

These two tools lack only a locking mechanism, which would reserve a
subdirectory or file. For example, assume that two authors work for separate
companies, but need shared access to the source for their book. One could
lock a chapter that he is working on, and the other would have only read
access to that chapter. The chapter could be staged back and the lock
released.

I've tried to come up with some simple mechanism to enforce locking using
file system permissions in the master directory, without a satisfactory
result. It would be nice to change the owner of a locked directory, but that
requires more privilege than I am willing to give the server software.

\section{Availability}
\label{availability}

The early DES versions of these routines are freely available to AT\&T and
Lucent employees, and may be found on the companies'
Intranets~\cite{sourceurl}. I will not attempt to distribute these.
Marcus has published his original \p{get} and \p{put} routines, with the
original crypto API but not the DES routines~\cite{mjrget}.

I expect to have publication clearance for the authentication-only versions
for non-commercial use in time for this conference. I am keen to release
these routines to the general public. A general release will expose them to
public review and possible improvement. Good cryptography and secure
programming are hard to do---it is in our corporate interest to run these
routines through the wringer. See \cite{release} for obtaining this
software.

\section{Acknowledgements}

Marcus Ranum wrote the initial versions of \p{cget} and \p{cput}. Andrew
Hume entrusted many gigabytes of sensitive data to early versions of
\p{cput}, and made several helpful suggestions. Hal Purdy ported \p{staged}
to Windows 95. Lorette Archer, Steve Bellovin, Matt Blaze, John Linderman,
Adam Moskowitz, and Bob Waldstein gave helpful suggestions and feedback on
the software or this paper.

\begin{thebibliography}{10}

\bibitem{sourceurl}
{\raggedright\tt https://netlib.bell-labs.com/1127/\allowbreak{}ropes/\hfil{}\allowbreak{}crio.tar.Z}

\bibitem{mirror}
{\raggedright\tt ftp://src.doc.ic.ac.uk/\allowbreak{}packages/\hfil{}mirror/}

\bibitem{hmac}
Bellare, M., Canetti, R., and Krawczyk, H., {\it Keyed Hash Functions for
Message Authentication}, Advances in Cryptology -- CRYPTO~96 Proceedings,
Lecture Notes in Computer Science, Springer-Verlag Vol. 1109, N. Koblitz,
ed., 1996, pps.~1--15.

\bibitem{bellovin}
Bellovin, Steven M., {\it Using the Domain Name System for System
Break-ins}, Fifth USENIX Security Conference Proceedings, pps.~199--208,
June 1995.

\bibitem{cert1}
{Computer Emergency Response Team (CERT).\break{} See {\tt
ftp://ftp.cert.org/pub/cert\verb|_|\hfil\break{}advisories}.}

\bibitem{cert4}
Computer Emergency Response Team (CERT), ``/usr/ucb/rdist Vulnerability'',
CA-91:20, Oct.~1991. (superseded)

\bibitem{cert5}
Computer Emergency Response Team (CERT), ``SunOS /usr/ucb/rdist
Vulnerability'', CA-94:\allowbreak{}04, Mar.~1994. (superseded)

\bibitem{cert6}
Computer Emergency Response Team (CERT), ``Vulnerability in rdist'',
CA-96.14, Aug.~1996.

\bibitem{rdist}
Cooper, Michael A., {\it Overhauling Rdist for the '90s}, Proceedings of the
Sixth Systems Administration Conference (LISA VI), pps.~175--188, Long
Beach, CA, October 1992.

%\bibitem{conehead}
% Andrew Koenig, {\it Automatic Software Dis\-tri\-bu\-tion},
% Proceedings of the USENIX Summer Conference, pps.~312--322, Salt Lake
% City, Utah, June 1984.

\bibitem{Joncheray}
Joncheray, Laurent, {\it A Simple Active Attack Against TCP}, Proceedings of
the Fifth Usenix Unix Security Symposium, pps.~7--19, Salt Lake City, Utah,
June 1995.

\bibitem{cryptolib}
Lacy, J.B., Mitchell, D.P., and Schell, W.M., {\it CryptoLib: Cryptography
in Software}, UNIX Security Symposium IV Proceedings, USENIX Association,
1993, pps.~1--17.

\bibitem{rtm}
Morris, Robert, {\it A Weakness in the 4.2BSD Unix TCP/IP Software},
Computing Science Technical Report 117, AT\&T Bell Laboratories, Murray
Hill, NJ, February 1985.

\bibitem{filetsf}
Sellens, John, {\it filetsf: A File Transfer System Based on lpr/lpd},
Proceedings of the Ninth Systems Administration Conference (LISA IX),
pps.~195--212, Monterey, CA, September 1995.

\bibitem{Tsutomu}
Shimomura, Tsutomu, and Markoff, J., {\it Takedown}, Hyperion, 1996.
\bibitem{venema}
Venema, Wietse, {\it TCP WRAPPER: Network Monitoring, Access Control and
Booby Traps}, UNIX Security III Symposium, pps.~85--92, Baltimore, MD,
September 1992.

\bibitem{Vixie95}
Vixie, Paul, {\it DNS and BIND Security Issues}, Fifth USENIX Security
Conference Proceedings, pps.~209--216, June 1995.

\bibitem{ssh}
Ylonen, Tatu, {\it SSH - Secure Login Connections Over the Internet}, 6th
USENIX Security Symposium, pps.~37--42, San Jose, CA, July 1996.

\bibitem{ylo}
{\raggedright\tt https://\allowbreak{}www.cs.hut.fi/\allowbreak{}ssh/\hskip 0pt plus 50pt{}crypto/}

\bibitem{release}
{\tt ftp://ftp.research.bell-labs.\allowbreak{}com/\allowbreak{}ches/\hfil{}\break{}crio.html}

\bibitem{mjrget}
{\tt https://www.clark.net/pub/mjr/pubs}

\end{thebibliography}

\vskip 0.2in
\section*{Appendix - the Stage Protocol}

This is the little protocol that \p{stage} and \p{unstage} use to control
\p{staged} and the remote directory. The commands and responses are ASCII
fields separated by a single blank and terminated with a zero byte.

\vskip 0.1in
\def\tab#1#2{\hbox{\hbox to 0.4in{#1\hfil}
\hbox{\tt #2\hfil} } }

\tab{send}{rm {\it fn}}
\tab{rcv}{OK}
\begin{quote}
Remove the given file or directory. Everything beneath the directory is
removed as well. Returns either ``OK'', ``ENOENT'' (not found), or a string
describing some other error.
\end{quote}

\tab{send}{st {\it fn}}
\tab{rcv}{{\it uid} {\it gid} {\it mode} {\it mtime} {\it size}}
\begin{quote}
Return the stat of a file or directory. The mode is octal, the other values
are decimal. ``ENOENT'' is returned if the file doesn't exist, and other
strings contain a displayable error message.
\end{quote}

\tab{send}{cs {\it fn}}
\tab{rcv}{{\it md5 checksum}}
\begin{quote}
Return the 32-hex-digit MD5 checksum, or an empty string if the file doesn't
exist.
\end{quote}

\tab{send}{pu {\it fn}}
\tab{send}{{\it user} {\it group} {\it mode} {\it mtime} {\it size}}
\tab{send}{({\it size} bytes)}
\tab{rcv}{OK}
\begin{quote}
Push a new file {\it fn}. It must not already exist. {\it User} and {\it
group} are alphabetic, and currently ignored. {\it Mode} is octal, and {\it
mtime} and {\it size} are decimal. The modification and access times are set
to {\it mtime}, if allowed. Returns ``OK'' or a printable error message.
\end{quote}

\tab{send}{md {\it fn}}
\tab{send}{{\it user} {\it group} {\it mode} {\it mtime} {\it size}}
\tab{rcv}{OK}
\begin{quote}
Create a directory with the given {\it mode} and {\it mtime} (if possible).
{\it User}, {\it group}, and {\it size} are ignored. Returns ``OK'' or a
printable error message.
\end{quote}

\tab{send}{ls {\it fn}}
\tab{rcv}{/{\it fn1}/{\it fn2}/.../{\it fnn}//}
\begin{quote}
Return a list of files in the given directory, separated by slashes and
terminated with a double slash. If {\it fn} isn't a directory, doesn't
exist, or is empty, ``//'' is returned.
\end{quote}

\tab{send}{ge {\it fn}}
\tab{rcv}{OK}
\tab{rcv}{size {\it bytes}}
\tab{rcv}{({\it size} bytes)}
\begin{quote}
Get a remote file {\it fn}. Returns ``OK'' or a printable error message. If
OK, return the size of the file in bytes, and the contents of the file.
\end{quote}

\tab{send}{ex}
\tab{rcv}{OK}
\begin{quote}
Exit.
\end{quote}
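For concreteness, a plausible session that refreshes one changed file and
exits might look like this. The file name, times, sizes, and checksum are
invented, and the delete-then-push step is implied by \f{pu}'s requirement
that the file not already exist:
\begin{verbatim}
send: st doc/faq.html
rcv:  13 13 644 844100000 10240
send: cs doc/faq.html
rcv:  0cc175b9c0f1b6a831c399e269772661
send: rm doc/faq.html
rcv:  OK
send: pu doc/faq.html
send: bin bin 644 852000000 10752
send: (10752 bytes)
rcv:  OK
send: ex
rcv:  OK
\end{verbatim}

\end{document}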