The following paper was originally published in the
       Proceedings of the Fifth USENIX UNIX Security Symposium
		   Salt Lake City, Utah, June 1995.



		A Domain and Type Enforcement
			UNIX Prototype

[footnote: UNIX is a registered trademark in the United States and
other countries, licensed exclusively through X/Open Company Ltd.]

		Lee Badger 
		Daniel F. Sterne 
		David L. Sherman 
		Kenneth M. Walker 
		Sheila A. Haghighat 
	
		Trusted Information Systems, Inc.  
		3060 Washington Road 
		Glenwood, Maryland 21738

Abstract

UNIX system security today often relies on correct operation of
numerous privileged subsystems and careful attention by expert system
administrators.  In the context of global and possibly hostile
networks, these traditional UNIX weaknesses raise a legitimate
question about whether UNIX systems are appropriate platforms for
processing and safeguarding important information resources.  Domain
and Type Enforcement (DTE) is an access control technology for
partitioning host operating systems such as UNIX into access control
domains.  Such partitioning has promise both to enforce organizational
security policies that protect special classes of information and to
generically strengthen operating systems against penetration attacks.
This paper reviews the primary DTE concepts, discusses their
application to IP networks and NFS, and then describes the design and
implementation of a DTE UNIX prototype system.

Introduction

As UNIX systems become a major part of the National Information
Infrastructure, UNIX security mechanisms are coming under increasing
pressure to resist attacks by highly motivated individuals, companies,
and governments.  Currently, UNIX security rests on protection bits,
the root user, and the setuid/setgid mechanism, which place a great
deal of security responsibility on privileged application programs and
expert system administration.  This has two important consequences.
The first is that UNIX systems often exhibit a ``weakest link''
phenomenon in which compromise of any privileged subsystem (e.g.,
fingerd, lpd, rdist) makes an entire host vulnerable.  The second is
that reliance on numerous privileged applications increases the
difficulty of implementing coordinated security policies that provide
uniform protection to data and processing resources.  These two
problems motivate a legitimate concern over whether UNIX systems are
appropriate platforms for processing and safeguarding important
information resources in global and possibly hostile networks.

UNIX (and other operating systems) can in theory be hardened against
threats inherent in such environments by adding an access control
layer that restricts privileged processes so that damage resulting
from compromise or error is limited.  This benefit, however, has not
been realized by mainstream UNIX systems even though a number of
access control mechanisms [4,2,6,8,18] have been available for years.
One reason may be that security enhancements often impose significant
costs resulting from more complex system administration, application
incompatibility (or unavailability), and additional user training.
This raises a central question for practical UNIX security: can
significant enhancements be added in a way that is understandable,
effective, and unobtrusive?

This paper presents our experiences with a new form of access control,
Domain and Type Enforcement (DTE) [1] and a prototype DTE UNIX system.
In recognition of the fact that access control techniques have not
been easily accepted by operating system vendors (or users), DTE has
been formulated specifically to address requirements of greatest
concern for both vendors and users, namely: flexibility, simplicity,
operating system interoperability, binary application compatibility,
and performance.  This paper reviews DTE, [footnote: DTE is described
in more detail in [1].] discusses how DTE can be applied to IP
networks and NFS and then discusses design and implementation issues
of the DTE UNIX kernel.  Finally, this paper reviews related work and
discusses our plans for further development of DTE over the next few
years.

DTE

DTE is an enhanced form of type enforcement, a table-oriented access
control mechanism originally proposed by Boebert and Kain [9] and
later refined in the LOCK system [21].  As with many access control
schemes, type enforcement views a system as a collection of active
entities (subjects) and a collection of passive entities (objects).
In type enforcement for UNIX, an access control attribute called a
domain is associated with each subject (process), and another
attribute called a type is associated with each object (file, message,
shared memory segment, etc.).  A global table, the Domain Definition
Table (DDT), represents allowed access modes between domains and types
(e.g., read, write, execute), and another table, the Domain
Interaction Table (DIT), represents allowed access modes between
domains (e.g., signal, create, destroy).  As a system runs, access
attempts are mediated using table lookups: access attempts for modes
not authorized in the tables are denied.
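
The table lookups described above can be illustrated with a minimal
sketch.  It is not the prototype's code: the domain names, type names,
and table contents are hypothetical, and the two tables are modeled as
plain dictionaries keyed by (domain, type) and (domain, domain) pairs.

```python
# Hypothetical Domain Definition Table: (domain, type) -> allowed modes.
DDT = {
    ("engineer_d", "design_t"): {"read", "write"},
    ("engineer_d", "unix_t"): {"read", "execute"},
    ("guest_d", "unix_t"): {"read"},
}

# Hypothetical Domain Interaction Table: (source, target) -> allowed modes.
DIT = {
    ("login_d", "engineer_d"): {"create"},
    ("engineer_d", "guest_d"): {"signal"},
}


def object_access_allowed(domain, obj_type, mode):
    """Allow a subject-to-object access only if the DDT lists the mode."""
    return mode in DDT.get((domain, obj_type), set())


def subject_access_allowed(src_domain, dst_domain, mode):
    """Allow a subject-to-subject access only if the DIT lists the mode."""
    return mode in DIT.get((src_domain, dst_domain), set())
```

Any (domain, type) or (domain, domain) pair absent from a table yields
an empty set of modes, so the default is denial.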

Although type enforcement is very flexible, the access control tables
can quickly become too complex, and type enforcement is difficult to
use in practice.  Additionally, the presence of type attributes on
files appears to require a new and incompatible file system format.
To address these issues, DTE enhances type enforcement in two ways:

DTE policies are specified in DTE Language (DTEL), a high-level
language suitable for expressing reusable access control
configurations that are compatible with current applications and
system configurations.

During system execution, DTE file security attributes are not stored
one-to-one with files on disk, but are instead maintained implicitly
in a form that capitalizes on the directory hierarchy to compactly
represent portions of a file hierarchy that have identical attributes.
Using implicit typing, DTE can therefore be applied to existing files
with no change to file system formats.

DTE is a configurable, kernel-level access control mechanism.  At each
system boot, a DTE UNIX system processes a DTEL specification and
establishes access controls during UNIX kernel initialization.  All
processes, including root processes, are subject to DTE controls.
DTEL currently provides four [footnote: For brevity we omit peripheral
DTEL statements and features and also restrict our attention here to
implemented features with which we have actual experience.] primary
statements for expressing a DTE configuration:

[type] Declares one or more object types to be available to other
parts of a DTEL specification.

[domain] Expressed as a list of tuples, defines a restricted execution
environment composed of three parts: 1) ``entry point'' programs,
identified by pathname, that a process must execute in order to enter
the domain (e.g., (/bin/login)), 2) access rights to types of objects
(e.g., (rwx->foot)), and 3) access rights to subjects in other domains
(e.g., (sigkill->userd)).  A DTEL domain controls a process's access
to files, a process's access via signals to processes running in other
domains, and a process's ability to create processes in other domains
by executing their entry point programs.  For backward binary
compatibility, the domain statement also provides an access designator
to force domain transitions on older programs that are not aware of
DTE: if a domain X has auto access rights to another domain Y, a
subject in X automatically creates a subject in Y when it executes,
via exec(), an entry point program of Y.

[initial_domain] Selects the domain of the first process.

[assign] Associates a type with one or more files.  An assign
statement may be recursive, in which case it applies to a directory
and everything below, and one assign statement may override another;
for instance, an assign statement for /tmp/foo may override a
recursive assign statement for /tmp.
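
As an illustration of the auto access designator described under the
domain statement, the following sketch computes the domain of a process
after exec().  The Domain record, its field names, and the sample
configuration are assumptions made for this example; they are not DTEL
syntax or the prototype's data structures, and explicitly requested
(non-auto) transitions are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """Illustrative stand-in for the three parts of a DTEL domain."""
    entry_points: frozenset = frozenset()              # program pathnames
    type_rights: dict = field(default_factory=dict)    # type -> modes
    domain_rights: dict = field(default_factory=dict)  # domain -> modes


def domain_after_exec(domains, current, program):
    """Return the domain a process runs in after exec() of program.

    If the current domain holds "auto" rights to a domain whose entry
    points include the program, the process moves to that domain;
    otherwise it stays where it is.
    """
    for target, modes in domains[current].domain_rights.items():
        if "auto" in modes and program in domains[target].entry_points:
            return target
    return current


# Hypothetical configuration: systemd holds auto rights to logind.
domains = {
    "systemd": Domain(domain_rights={"logind": {"auto"}}),
    "logind": Domain(entry_points=frozenset({"/bin/login"})),
}
```

With this configuration, a systemd process that executes /bin/login
continues in logind, while executing any other program leaves it in
systemd.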

An important goal for DTE is to superimpose useful security policies
on existing UNIX configurations while using implicit typing to
maintain backward compatibility with existing data formats and
applications.  Figure [removed] shows a DTEL specification of a
commercial policy designed to provide data protection and user
authorizations in an engineering organization.  To validate that our
example specification is not trivial, we have run it on our prototype
DTE system and found it to provide useful protection.  This
specification provides three types of protected user data, one type of
system data, three user domains, and two supporting system domains.
The user domains correspond to job descriptions, such as engineer or
accountant, and the system domains provide operating system support.
Additionally, this specification assigns type attributes to all files.

A DTE system running the specification of figure [removed] starts the
first process in the systemd domain, which is then inherited by all
other system processes except the login program.  The specification
uses the auto mechanism to run login in the logind domain even though
the existing getty program does not request the domain transition.
The logind domain has the authority to create the user domains
(engineerd, projectd, and accountingd), based on user authentications.
Each user login session is confined by one of the user domains
controlling access to protected data, which resides in three
directories under /projects.  Though simple, this sample specification
can be incrementally refined to add additional user domains,
distinguish between console and network user sessions, simultaneously
support additional organizational policies, and harden UNIX itself by
running its root daemons in tightly constrained domains.

DTE Networking

Since UNIX systems are usually networked, DTE systems must work
naturally while communicating both with other DTE systems and with
non-DTE systems.  In particular, multiple DTE systems must provide
mechanisms allowing coordinated protection of information among
themselves, and DTE systems must protect themselves from non-DTE
systems.  To accomplish this, DTE adds two attributes to network
communications carrying user data: 1) the type of the data written by
the sending process and 2) the domain of the process that sent the
data, the ``source domain.''  A receiving process can always view the
data's type, which the receiver must know to adequately protect the
data, or possibly to protect itself from the data.  Additionally, a
receiver can always view the sender's domain; a DTE server that
receives a request can therefore use the client's domain to decide
whether to perform the requested function.

To maintain compatibility with existing network protocols and
applications, DTE attributes are carried as IP options, [footnote: For
experimental purposes, we currently assume that network packets are
not stolen or modified.  We plan to take advantage of known and
emerging cryptographic techniques and protocols for communications
authentication [15], integrity, and confidentiality [10,11] as
appropriate.] with no change to packet contents.  DTE mediates
communications over standard datagram and stream-oriented services.
In each case, DTE imposes access control mediation both at send time
and receive time: to successfully send data of type t, a process's
domain must permit write access to t, and to successfully receive data
of type t, a process's domain must permit read access to t.  For
datagram protocols such as UDP, a single type labels the contents of
an entire packet.  For stream protocols such as TCP, different
portions of a stream may have different types of data; a sequence of
contiguous bytes having the same type is a substream.
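
The send-time and receive-time checks, and the notion of a substream,
can be sketched as follows.  This is an illustration only: the rights
table format (a string of mode letters per (domain, type) pair) and the
representation of a stream as (byte, type) pairs are assumptions made
for the example, not the prototype's kernel structures.

```python
from itertools import groupby


def may_send(rights, domain, data_type):
    """Sending data of a type requires write access to that type."""
    return "w" in rights.get((domain, data_type), "")


def may_receive(rights, domain, data_type):
    """Receiving data of a type requires read access to that type."""
    return "r" in rights.get((domain, data_type), "")


def substreams(typed_bytes):
    """Split a stream of (byte, type) pairs into maximal runs of
    contiguous bytes that share one type."""
    return [
        (data_type, bytes(b for b, _ in run))
        for data_type, run in groupby(typed_bytes, key=lambda p: p[1])
    ]


# Hypothetical rights and a stream carrying two types of data.
rights = {("user_d", "public_t"): "rw", ("user_d", "secret_t"): "r"}
stream = [(104, "public_t"), (105, "public_t"),
          (33, "secret_t"), (63, "public_t")]
```

Here substreams(stream) yields three substreams, because the second
public_t run is not contiguous with the first.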

These design choices give a high priority to compatibility and
interoperability.  Our datagram approach is not unusual, and
homogeneously typed datagrams work well for existing applications
since they are unaware of DTE and therefore only generate one type of
data.  Our stream approach, however, is less typical.  A simpler
approach would bind a security attribute to a stream socket and
therefore to all data communicated on it.  Typical UNIX service
interactions, however, make this approach problematic.  An important
example is inetd, which receives socket connections for services it
spawns: inetd must be able to connect to a socket and then hand the
descriptor to a child process that may run in a different domain.  The
use of substreams removes the need for inetd to run in an all-powerful
domain.  Programs like telnet and rlogin provide other examples: if a
user runs a program that produces output of multiple types, a single
connection can carry the output back to the client in multiple
substreams, but statically typed connections would force dynamic
creation of new TCP connections to send the data.  While multiple
connections could be used to transmit multiple types of data, this
would change application-layer protocols (like rcmd) and prevent DTE
network applications from interoperating with their non-DTE peers.

In addition to maintaining compatibility with UNIX network
abstractions and application-level protocols, it is also necessary to
define how DTE systems interoperate with non-DTE systems.  In order
for a DTE system to properly control network applications, all
communications must carry type and source domain attributes.  At the
same time, however, DTE applications must interoperate with
applications running on non-DTE systems that do not provide DTE
attributes.  To provide interoperability without weakening DTE, DTE
hosts associate a domain with every foreign non-DTE host and mediate
all network traffic with that host so that the effect of the mediation
is as though the host were actually running DTE and the process
sending (or receiving) from that host were running in the associated
domain.  Using DTEL, a DTE system can associate a single domain with
the ``universe'' of foreign non-DTE hosts, associate a different
domain with each class A, B, or C network, and finally associate
specific domains with individual non-DTE hosts that, for various reasons
(such as quality of administration), are more or less trustworthy than
their LAN.  This technique has performed well in our corporate LAN,
allowing us to appropriately ``trust'' specified non-DTE hosts.
Although we are using source-address ``authentication'' for
compatibility at present, our plans include moving to stronger
authentication, such as is envisioned for IP6, as the overall network
infrastructure evolves.
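
One way to picture the host-to-domain association is as a lookup in
which the most specific matching rule wins.  The sketch below assumes
that reading; the rule table, the domain names, and the use of prefix
lengths in place of class A, B, or C networks are assumptions for
illustration and do not reproduce DTEL syntax.  The addresses are
documentation addresses.

```python
import ipaddress

# Hypothetical rules: network -> domain assumed for non-DTE senders.
rules = {
    ipaddress.ip_network("0.0.0.0/0"): "universe_d",       # all hosts
    ipaddress.ip_network("192.0.2.0/24"): "lan_d",          # one network
    ipaddress.ip_network("192.0.2.7/32"): "trusted_host_d", # one host
}


def domain_for_host(rules, addr):
    """Return the domain of the most specific rule containing addr."""
    ip = ipaddress.ip_address(addr)
    matching = [net for net in rules if ip in net]
    best = max(matching, key=lambda net: net.prefixlen)
    return rules[best]
```

A packet from 192.0.2.7 is mediated as if sent from trusted_host_d, a
packet from another host on the same network as if sent from lan_d, and
any other IPv4 source as if sent from universe_d.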

Although our experience with DTE networking is still somewhat limited,
we have been able to run existing UNIX applications such as rsh,
rlogin, telnet, ping, sup, and mount in suitable DTE domains and we
have encountered no ``show stoppers.''  We have discovered, however,
that although TCP/IP hosts should drop IP options they do not
recognize, not all of them do: SunOS 4.1.1 on Sun 3 systems, in
particular, crashes when presented with an unrecognized option.  As a
result, we have added features to our systems that
prevent the sending of DTE attributes to hosts that are not known to
be currently running DTE.  We are now formulating the requirements of
a DTE protocol that would maintain timely information on the DTE
status of a machine as well as provide DTE policy negotiation
functions that ensure that different machines ``mean'' the same thing
by DTE attributes they exchange.  Although we only have experience to
date with UDP and TCP, our techniques appear to apply to raw IP, and
potentially also to multicast protocols such as ISIS [5] and PSYNC
[22].

DTE NFS

The ubiquitous use of NFS highlights the need for DTE both to support
NFS on DTE systems and to interoperate with non-DTE systems that
use NFS.  An integration of DTE and NFS for DTE-aware clients and
servers is relatively simple and involves sending and receiving DTE
attributes between DTE systems that then use the attributes for
mediation in the same way they use locally stored DTE attributes.  To
make DTE useful in the short term, however, interoperability with
non-DTE NFS clients and non-DTE NFS servers may be even more
important.

A significant benefit of implicit typing [1] in this regard is that
DTE client workstations locally associate types with all files, even
files provided over NFS by file servers that are not DTE-aware.  This
ability has allowed us to use DTE workstations to make selected
portions of our corporate file server available to selected groups of
users with a minimum of administrative effort.  As electronic commerce
increases the need for cooperation between organizations, we expect
this scenario to become more common.  Figure [removed] displays the
concept.  A guest user has an account only on a DTE system.  This
system mounts from an existing file server and applies the type
``proprietary data'' to some files on the imported file system and the
type ``non sensitive data'' to the others.  All guest user processes
running on the DTE system are restricted according to the local DTE
policy to access only the non-sensitive data.

DTE network features allow a DTE system to refuse communication with
selected non-DTE hosts and to prevent important types of data from
being exported to non-DTE hosts (regardless of which communication
service is used).  If communication with a non-DTE NFS server is
allowed, the client-side DTE/NFS subsystem associates types with
imported files based on their pathnames.  A premise of our work is
that access controls must be flexible: it is up to the system
administrator of a DTE system to determine whether a non-DTE host
should be trusted to properly maintain data of various types.
Although all the data received at the IP layer will be typed according
to the DTE domain associated with the non-DTE file server, the DTE/NFS
subsystem on the client system resides in the DTE UNIX kernel and is
trusted to override the default communications type with correct file
types as specified in the system's DTEL specification.

Initially, we added DTE only to the NFS client side, as described
above.  We are currently testing a DTE/NFS server that can serve
clients on both DTE and non-DTE systems.  When the client is on a DTE
system, all NFS requests are labeled by the client system with the
source domain of the requesting process.  The DTE/NFS server then uses
the source domain as a client credential to consult the system's DTEL
specification and determine whether the request is authorized.  In
addition, each IP packet that carries the contents of a file accessed
via DTE/NFS is labeled with the type associated with that file.  A
potential benefit of this approach is that both source domain and type
attributes are readily visible to routers and network firewalls and
could allow future versions of such devices to consult them when
making filtering and routing decisions.  An additional benefit is that
the NFS protocol need not be modified.  Although NFS client requests
sent by non-DTE systems lack source domain attributes, the DTE/NFS
server's IP subsystem attaches them (in accordance with the DTE
system's DTEL specification) before passing the requests to the
DTE/NFS subsystem for mediation.  From the non-DTE client's point of
view, the DTE/NFS server behaves like a non-DTE server, except that
access may be denied for some requests where, in the absence of DTE,
the request would have been granted.

The NFS protocol is designed so that NFS server systems may crash,
reboot, and resume NFS service without requiring clients to perform
new lookup operations on files that were open at the time of the
crash.  Each NFS request contains an NFS file handle that identifies
the file by file number, which allows a typical UNIX system to access
the file directly without performing a name translation.  Unlike the
permission bits and owner identifiers associated with a file, however,
the implicit DTE attributes are not stored within inodes but in a
separate attribute database organized by pathname instead of file
number.  If a newly rebooted DTE/NFS file server could not locate
security attribute information for an NFS request, it would have to
refuse the request, resulting in a stale file handle at the client
application.  To prevent this, the DTE/NFS prototype reconstructs
pathnames based on inode numbers by maintaining a cache of parent
inode numbers for non-directory files accessed via NFS, thereby
permitting it to find file attributes in the DTE attribute database.
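
The pathname reconstruction can be sketched on a toy file system.  The
sketch assumes that a directory's parent is found through its ``..''
entry, that a non-directory file's parent comes from the cache, and
that a component name is recovered by scanning the parent directory;
the dictionaries and inode numbers are hypothetical and stand in for
on-disk structures.

```python
ROOT_INO = 2  # hypothetical inode number of the root directory


def name_in(dirs, parent, child):
    """Scan a directory for the entry that names the child inode."""
    for name, ino in dirs[parent].items():
        if ino == child and name not in (".", ".."):
            return name
    raise KeyError(child)


def path_of(dirs, parent_cache, ino):
    """Rebuild an absolute pathname from an inode number.

    dirs maps a directory inode to its entries (name -> inode);
    parent_cache maps a non-directory inode to its parent directory.
    """
    parts = []
    while ino != ROOT_INO:
        parent = dirs[ino][".."] if ino in dirs else parent_cache[ino]
        parts.append(name_in(dirs, parent, ino))
        ino = parent
    return "/" + "/".join(reversed(parts))


# Toy file system: /projects/spec.txt
dirs = {
    2: {"..": 2, "projects": 10},
    10: {"..": 2, "spec.txt": 11},
}
parent_cache = {11: 10}  # recorded when the file was accessed via NFS
```

Once the pathname is rebuilt, it can be used as the key into a
pathname-organized attribute database.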

On our DTE/NFS prototype, the NFS daemon, like all other processes,
runs in its own domain and is constrained in accordance with the
system's DTEL specification.  On most systems, this domain will likely
be configured to give the daemon the ability to access and export many
types of information.  Nevertheless, it is not necessary to make all
types accessible to it.  If highly sensitive or critical types of
information are stored on a system, it may be highly desirable to
prevent them from being exported.  Standard NFS provides features for
limiting the exporting of files, but these features are
coarse-grained, dealing only with whole file systems and are available
only to a system administrator.  By making certain types of files
inaccessible to the NFS daemon, DTE provides a strong additional
mechanism that can be employed by administrators to prevent individual
files on arbitrary file systems from being exported.

Our experience with DTE/NFS servers is still very limited; however,
our initial results are encouraging: NFS clients on DTE or non-DTE
systems can be granted fine-grained restricted access to NFS-exported
file hierarchies without change to applications or to non-DTE system
configurations.  The DTE prototype system's security attribute
management strategy requires implementation of a new system cache and
secondary storage to store the cache across system reboots.  The
cache, however, requires little human administration and requires only
a small amount of additional I/O that only occurs in the context of
I/O already required by NFS.

DTE UNIX Prototype

To gain experience with DTE concepts, we have implemented a prototype
DTE UNIX system based on OSF/1 MK4.0.  Although our system is based on
a Mach microkernel, the DTE features are located in relatively high
layers of the UNIX server's architecture, require no knowledge of
microkernel interfaces, and are therefore reasonably portable to
kernelized UNIX systems.  We have also recently ported the DTE
prototype to run on TMach Version 0.2 [7], a high-assurance trusted
computing base designed to satisfy DoD security requirements as
specified in the Trusted Computer System Evaluation Criteria [20].
Even though TMach employs a TMach-specific file system format, the
integration required almost no change to the DTE implementation
because the integration points between the UNIX server and TMach are
generally at low layers in the UNIX architecture, whereas DTE is
mostly implemented in the upper layers of the UNIX ``kernel.''

Figure [removed] shows the prototype's architecture.  To enhance
portability, the majority of the DTE implementation is located in an
isolated subsystem consisting of lines of commented C code and lines
of commented lex and yacc code.  Other UNIX kernel subsystems call
into the DTE subsystem to request security services.  This part of the
integration consists of another lines of code, bringing the total DTE
integration to approximately lines of kernel-resident code.  The DTE
prototype's kernel provides new system calls for DTE-aware
applications to use for retrieving security attributes for display to
the user and for implementing security relevant functions.

In addition to kernel changes, we have implemented a DTE version of
the login program that authenticates users for specific roles
[17,3,26] and then confines user sessions to specific domains using
domain transitions authorized by the DTEL specification.  To allow
users to view DTE attributes for processes and files, we have
implemented DTE-aware versions of a number of UNIX utilities such as
ls and ps, and we have implemented a DTE-aware version of emacs that
displays type attributes of file buffers and allows users to
simultaneously view and manipulate labeled information in multiple
windows.

As the prototype boots, it reads its DTEL specification and confines
all processes, regardless of UNIX root privileges, to specified
domains.  DTE is active before single-user mode has been reached.
According to its DTEL specification, the prototype labels files,
network packets, and processes; determines domain interactions; and
mediates process access requests.  We have tested a number of policies
using the prototype, including a policy to partition the components of
a simulated command and control system, a policy to strengthen UNIX by
confining UNIX root processes in separate domains, and an enterprise
data protection policy (similar to that of figure [removed]).
Additionally, we use DTE client workstations to permit but safely
limit access by ``guest'' users who are authorized to see some but not
all TIS sensitive data.

The DTE prototype's design and implementation have given a high
priority to maintaining operating system interoperability and binary
application compatibility.  Three aspects of the DTE prototype are
central to achieving these goals: 1) preserving existing data formats
by employing implicit security attributes, 2) ensuring that implicit
attributes are recoverable in the presence of system shutdowns and
power failures, and 3) adding DTE networking support without change to
existing protocols.

Implicit Attributes

For entities that must be recreated at each system boot (such as
process structures or IP datagrams), the DTE prototype attaches
security attributes explicitly to each object.  Compatibility and
performance can be maintained with this strategy because modifications
need not affect secondary memory data formats or require additional
I/O.

Files, however, present a more difficult case both because security
attributes must be maintained on disk to survive system reboots and
because files are usually numerous.  To address these issues, the
prototype associates security attributes with files ``implicitly''
based on their locations within directory hierarchies.  For
portability, most of the prototype's functions for file security
attributes are implemented at the Virtual File System (VFS) layer and
build associations between vnodes [19] and security attributes.  Since
all currently accessed files are represented by vnodes, all files in
use have associated security attributes.  When the prototype boots, it
creates in kernel memory a tree of map nodes that describe how
security attributes are bound to the hierarchical file name space.
Although our current prototype simply keeps this tree entirely in
memory, it can in principle be paged to disk as necessary.

A sequence of map nodes proceeding from the root map node to a leaf
map node names an existing path in the hierarchical filesystem name
space.  Each map node optionally associates one or more security
attributes with the path component associated with it.  The prototype
currently maintains two kinds of security attributes bound to files:
type names and domain entry points.  To represent attributes
implicitly, a map node may also associate security attributes with
files whose pathnames merely include the map node as a prefix.  Such
map nodes represent ``implicit'' associations.  For each security
attribute, a map node provides the following options:

[implicit at] The attribute is bound to this path component.  In the
absence of higher-priority map nodes that conflict with this map node,
the attribute is also bound to all pathnames having this path
component as a prefix.

[implicit under] The attribute is not bound to this path component,
but, in the absence of conflicting higher priority map nodes, the
attribute is bound to all pathnames having this path component as a
prefix.

[explicit] The attribute is bound to this pathname only.

Informally, the prototype resolves map node conflicts by giving
priority to the map node that represents a longer path, interpreting
implicit under attributes to be ``longer'' than implicit at attributes
for the same path and always giving priority to explicit attributes.

Each path provided to a domain or assign statement potentially
generates a map node for every component of the path.  For example, a
path ``/a/b/c'' given in a DTEL statement generates three map nodes
(the root map node is automatically present).  Map nodes are shared,
however, so if a second DTEL statement specifies ``/a/b/c/d,'' only
one new map node is generated.  DTEL provides flags to set the initial
options of map nodes: the DTEL assign statement, which associates
types with files, takes a ``-r'' option to designate implicit at and a
``-u'' option to designate implicit under.  DTEL domain statements
automatically generate explicit associations for their entry point
attributes.  For example, the following DTEL statements generate the
map nodes displayed in figure [removed].

	assign       roott    /;			
	assign -u    unixt    /;			
	assign       criticalt    /dtpolicy;	
	domain  food  = (/usr/bin/login), ...;	

Figure [removed] shows five map nodes, one for each unique
component in the paths ``/usr/bin/login'' and ``/dtpolicy.''  Each map
node records the name of its path component and optionally records
attribute associations (in figure [removed], ``e'' for explicit, ``a''
for implicit at, and ``u'' for implicit under).  Figure [removed]
shows that the root map node is explicitly of type ``roott'' and that
all files under the root ``inherit'' the type ``unixt.''  This
inherited type is overridden, however, for the file ``/dtpolicy,''
which has an explicit type attribute of ``criticalt.''  The domain
``food'' has an entry point program, ``/usr/bin/login,'' and that file
therefore has an explicit domain attribute and it also inherits the
type ``unixt.''
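
The example can be reproduced with a small sketch of a map node tree.
The sketch is one reading of the informal priority rules given earlier
(explicit first, then implicit at on the same path, then the nearest
ancestor, with implicit under preferred over implicit at on the same
ancestor).  Class and function names are hypothetical, only type
attributes are modeled, and an assign without a flag is treated as
explicit, as the example suggests.

```python
class MapNode:
    """One path component plus its optional type associations."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        self.explicit = None  # bound to this pathname only
        self.at = None        # implicit at: this path and below
        self.under = None     # implicit under: below, not this path


def components(path):
    return [c for c in path.split("/") if c]


def add_path(root, path):
    """Return the map node for path, creating only the missing nodes."""
    node = root
    for comp in components(path):
        if comp not in node.children:
            node.children[comp] = MapNode(comp, node)
        node = node.children[comp]
    return node


def assign(root, type_name, path, flag=None):
    """Model of a DTEL assign: -r is implicit at, -u is implicit under."""
    node = add_path(root, path)
    if flag == "-r":
        node.at = type_name
    elif flag == "-u":
        node.under = type_name
    else:
        node.explicit = type_name


def type_of(root, path):
    """Find the governing type for a pathname."""
    node, exact = root, True
    for comp in components(path):
        if comp not in node.children:
            exact = False  # path continues below the deepest map node
            break
        node = node.children[comp]
    if exact:
        if node.explicit is not None:
            return node.explicit
        if node.at is not None:
            return node.at
        node = node.parent
    while node is not None:
        if node.under is not None:
            return node.under
        if node.at is not None:
            return node.at
        node = node.parent
    return None


def count_nodes(node):
    return 1 + sum(count_nodes(c) for c in node.children.values())


root = MapNode("/")
assign(root, "roott", "/")
assign(root, "unixt", "/", "-u")
assign(root, "criticalt", "/dtpolicy")
add_path(root, "/usr/bin/login")  # nodes created for the entry point
```

Under these assumptions the tree has five map nodes, ``/'' resolves to
roott, ``/dtpolicy'' to criticalt, and ``/usr/bin/login'' to unixt.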

Attributes represented by map nodes are related to files by
association with standard vnode structures that have been slightly
extended to interact with the map node tree.  At system
initialization, the root vnode is associated with the root map node.
Subsequently, all name resolution operations establish bindings so
that every vnode is related to a map node.  In the case that a map
node exists for a file represented by a vnode, a name resolution
operation attaches the vnode directly to the map node.  If a map node
does not exist, the name resolution mechanism attaches the vnode to
its parent vnode; since every resolution operation operates from a
known absolute or relative path, every new attachment is relative to a
known vnode, and all vnodes are eventually connected to the map node
tree through a chain of parent vnode pointers.  To maintain parent
vnode pointers, the DTE prototype references parent vnodes, resulting
in a somewhat increased kernel memory requirement for active vnodes.
Figure [removed] shows the vnode associations that result from process
access to the files ``/usr/george/papers/usenix'' and
``/usr/bin/login.''  Because the login program's pathname is fully
represented by map nodes, vnodes for the path attach directly.  For
the path to George's usenix paper, the first two vnodes of the path
connect directly to map nodes, and the rest point to the last map node
in the path.  Both files have the type ``unixt,'' which is provided by
the root map node.

By binding attribute values to vnode structures, the DTE prototype
ensures that attributes are always available before they are needed
even though the attributes may not be stored one-to-one on secondary
storage.  The DTE prototype retrieves attribute values of files using
a simple algorithm that follows vnode parent pointers up until the
first map node is reached and then optionally follows map nodes until
the ``governing'' map node is reached.
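
The binding and retrieval steps described above can be sketched in a
few lines.  This is a minimal illustration, not the prototype's kernel
code: the class and function names (MapNode, Vnode, lookup, resolve,
file_type, example_root) are hypothetical, and the sketch assumes that
every type on a map node applies recursively to everything beneath it.

```python
class MapNode:
    """One pathname component in the in-memory map node tree."""
    def __init__(self, name, parent=None, dte_type=None):
        self.name = name
        self.parent = parent
        self.dte_type = dte_type      # None means "inherit from parent map node"
        self.children = {}
        if parent is not None:
            parent.children[name] = self


class Vnode:
    """Simplified vnode: a parent reference plus an optional map node binding."""
    def __init__(self, name, parent=None, map_node=None):
        self.name = name
        self.parent = parent          # referenced parent vnode
        self.map_node = map_node      # set only if a map node exists for this file


def lookup(dir_vnode, name):
    """Resolve one component; attach to a map node if one exists for it."""
    child_map = None
    if dir_vnode.map_node is not None:
        child_map = dir_vnode.map_node.children.get(name)
    return Vnode(name, parent=dir_vnode, map_node=child_map)


def resolve(root_vnode, path):
    """Resolve an absolute path one component at a time."""
    vnode = root_vnode
    for component in filter(None, path.split("/")):
        vnode = lookup(vnode, component)
    return vnode


def file_type(vnode):
    """Follow parent vnodes to the first map node, then map nodes to the
    governing node (assumes the root map node always carries a type)."""
    while vnode.map_node is None:
        vnode = vnode.parent
    node = vnode.map_node
    while node.dte_type is None:
        node = node.parent
    return node.dte_type


def example_root():
    """Build the map node tree from the example in the text."""
    root_map = MapNode("/", dte_type="unixt")
    MapNode("dtpolicy", root_map, "criticalt")
    MapNode("login", MapNode("bin", MapNode("usr", root_map)))
    return Vnode("/", map_node=root_map)
```

Resolving ``/usr/george/papers/usenix'' in this sketch yields a vnode
with no map node of its own; its type is found by walking three parent
vnodes up to ``/usr'' and then one map node up to the root.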

Efficiency is a primary concern for the DTE prototype.  The overhead
of associating new vnodes with appropriate map nodes during name
resolution is negligible, requiring a small and constant number of
pointer manipulations.  The attribute retrieval operation is a more
likely cause of performance degradation, but we believe its cost is
also small.  In the DTE prototype, the UNIX kernel function iaccess()
and a handful of similar functions call DTE functions that retrieve
file security attributes.  Most UNIX access control functions funnel
down to the iaccess() function, which is called with great frequency
since every system call requesting an operation on a pathname must
call iaccess() at least once for every component of the path.  In the
worst case, each attribute retrieval could require a search to the
root map node.  Given the modest depth of typical UNIX pathnames and
the in-memory status of the map node tree, however, this search cost
appears small relative to other overheads of UNIX kernels.  At the
cost of additional complexity, various optimizations could be added to
short-circuit attribute retrieval searches as required.

Recovery Mechanisms

Although useful security configurations can be constructed that ``lock
down'' the mappings between areas of the hierarchical filesystem name
space and security attributes, resulting in a static tree of map
nodes, a more common case in our experience is to allow the map node
tree to evolve as files are moved and created to reflect the needs of
applications that use files.  For example, an application might create
a file of type ``foot'' in an area of the name space that inherits
``bart;'' such an event would add a DTEL assign statement, with its
map nodes, to the system configuration.  Similarly, a rename()
operation may require that the map node tree be edited so that the
rename operation doesn't inadvertently change the type of a file as a
side effect.  In general, the DTE prototype emulates the semantics of
one-to-one attribute storage even though the attributes are not in
fact maintained in that manner.

Given the criticality of accurate security attribute associations,
dynamism in the map node tree introduces the need to maintain
up-to-date associations even in the presence of system reboots or
crashes.  Writing map nodes to secondary storage poses an obvious risk
to performance; the DTE prototype addresses this using a combination
of alternate snapshot files and logging.  Every thirty seconds, the
map nodes are written to disk.[footnote: For large policies, the
mechanism could be enhanced to periodically write out only the changed
portion.]  Additionally, more timely information is kept in two
alternate log files: at system reboot, the most recent snapshot and
log file are read to reconstruct the most recent valid state.  The
batched writes of the policy impose little overhead since no program
waits for the writes to complete.  In contrast, the log files require
synchronous I/O and must be updated as little as possible.
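
A rough sketch of the snapshot-plus-log scheme follows.  It is an
illustration under stated assumptions, not the prototype's on-disk
format: the class AttributeStore, the file names, the JSON encoding,
and the generation counter are all hypothetical.  For brevity the
sketch also forces snapshots to disk, whereas the text describes
snapshot writes as batched and unwaited-for.

```python
import json
import os


def _write_sync(path, text, mode):
    """Write text and force it to disk before returning."""
    with open(path, mode) as f:
        f.write(text)
        f.flush()
        os.fsync(f.fileno())


class AttributeStore:
    """Path-to-type edits kept in memory, two snapshot slots, two logs."""

    def __init__(self, directory):
        self.dir = directory
        self.gen = 0        # generation of the most recent snapshot
        self.attrs = {}     # stand-in for the map node tree

    def _snap(self, slot):
        return os.path.join(self.dir, f"snapshot.{slot}")

    def _log(self, slot):
        return os.path.join(self.dir, f"log.{slot}")

    def snapshot(self):
        """Periodic dump; alternates slots so one valid copy always survives."""
        self.gen += 1
        slot = self.gen % 2
        body = json.dumps({"gen": self.gen, "attrs": self.attrs})
        _write_sync(self._snap(slot), body, "w")
        _write_sync(self._log(slot), "", "w")   # start this slot's log afresh

    def assign(self, path, dte_type):
        """Record one edit in memory and, synchronously, in the current log."""
        self.attrs[path] = dte_type
        rec = json.dumps({"gen": self.gen, "path": path, "type": dte_type})
        _write_sync(self._log(self.gen % 2), rec + "\n", "a")

    @classmethod
    def recover(cls, directory):
        """Load the newest readable snapshot, then replay its log."""
        store = cls(directory)
        for slot in (0, 1):
            try:
                with open(store._snap(slot)) as f:
                    snap = json.load(f)
            except (OSError, ValueError):
                continue                        # missing or torn snapshot
            if snap["gen"] >= store.gen:
                store.gen, store.attrs = snap["gen"], snap["attrs"]
        try:
            with open(store._log(store.gen % 2)) as f:
                lines = f.readlines()
        except OSError:
            lines = []
        for line in lines:
            try:
                rec = json.loads(line)
            except ValueError:
                break                           # torn final record
            if rec["gen"] == store.gen:         # skip stale generations
                store.attrs[rec["path"]] = rec["type"]
        return store
```

Tagging each log record with a generation lets recovery ignore
leftover records if a crash falls between writing a new snapshot and
emptying that slot's log.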

Two basic classes of operations affect the map node tree: create
operations and rename operations.  In each case, the DTE prototype
incurs no additional overhead if the operation does not produce an
edit of the map node tree.  If the operation creates a new object
(e.g., a new empty file at an unused pathname, or a rename to an
unused pathname), recovery is simple since the attributes can be
written first.  Maintenance of DTE recovery information in this case
requires one synchronous write operation in addition to the two
synchronous write operations performed by UNIX to create or rename a
file.  If an operation overwrites an existing object, however, the use
of implicit attributes complicates the recovery strategy: because
every file is always associated with attributes inherited from the
root directory, neither order of operations:

	replace a file first and then record the new attribute, or

	record the new attribute first and then replace the file,

prevents mislabeling if the system crashes between the two operations.
To address this, the DTE prototype records this information as a
sequence of optimized transactions that makes sparing use of
synchronous I/O and, most importantly, that never converts a
memory-speed operation to disk speed.

Both the create and rename VFS-layer operations can overwrite an
existing file as a side effect.  In the case of create, the UNIX VFS
layer knows if there is an existing file to overwrite and truncates it
for reuse with a new identity.  To prevent a crash from relabeling
existing file contents, the DTE prototype adds an fsync operation,
ensuring that the file is empty, and then writes the new attribute to
the log file, resulting in a worst-case scenario of two additional
synchronous I/O operations for file creation.
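
The ordering for create can be sketched at user level as follows.
This is only an analogy to the kernel mechanism: dte_create, the log
format, and the implied_type parameter (the type the path would
receive with no map edit) are hypothetical names introduced for
illustration.

```python
import os


def _log_sync(log_path, line):
    """Append one record to the attribute log and force it to disk."""
    with open(log_path, "a") as log:
        log.write(line + "\n")
        log.flush()
        os.fsync(log.fileno())


def dte_create(path, new_type, implied_type, log_path):
    """Create or truncate `path` and record `new_type` in a crash-safe order.

    Returns an open file descriptor for the (now empty) file.
    """
    needs_edit = new_type != implied_type
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if not os.path.exists(path):
        # New object: the attribute can safely be written first.
        if needs_edit:
            _log_sync(log_path, f"assign {new_type} {path}")
        return os.open(path, flags, 0o600)
    # Overwrite: make sure the old contents are gone on disk before the
    # new label is logged, so a crash cannot relabel the old data.
    fd = os.open(path, flags, 0o600)
    if needs_edit:
        os.fsync(fd)
        _log_sync(log_path, f"assign {new_type} {path}")
    return fd
```

When the new type equals the implied type, the sketch performs no
extra synchronous I/O, matching the no-edit case described earlier.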

A rename operation rename(``foo'', ``bar'') is essentially:

	unlink(``bar''); 
	link(``foo'', ``bar''); 
	unlink(``foo'');

If bar exists, an update to a log file must be made conditional on
successful completion of the rename operation or the log file update
may relabel the original bar.  The log file update cannot be written
after the rename operation because a system crash could prevent
writing of the update.  For this operation, the DTE system writes an
uncommitted transaction to the log file containing the file number of
the file to be moved and, on the next write to the log file,
piggy-backs the commit of the previous transaction.  During system
recovery, the last transaction can be verified through an examination
of on-disk file numbers.  This strategy holds the recovery I/O burden
to at most one synchronous I/O for every rename operation.
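
The rename transaction can be sketched as below, with inode numbers
standing in for the ``file numbers'' of the text.  The names
RenameLog, recover, and the commit_prev field are hypothetical, and
the JSON record format is an assumption made for illustration only.

```python
import json
import os


class RenameLog:
    """One synchronous log write per rename; commits ride on the next write."""

    def __init__(self, path):
        self.path = path
        self.prev_ok = False    # did the previously logged rename complete?

    def _append(self, record):
        # Piggy-back the commit of the previous transaction on this write.
        record["commit_prev"] = self.prev_ok
        self.prev_ok = False
        with open(self.path, "a") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def rename(self, src, dst, dte_type):
        ino = os.stat(src).st_ino
        # Uncommitted transaction: written before the rename takes place.
        self._append({"dst": dst, "ino": ino, "type": dte_type})
        os.rename(src, dst)
        self.prev_ok = True


def recover(log_path):
    """Rebuild labels; verify only the final record against the disk."""
    with open(log_path) as f:
        records = [json.loads(line) for line in f]
    labels = {}
    for i, rec in enumerate(records):
        if i + 1 < len(records):
            committed = records[i + 1]["commit_prev"]
        else:
            # Last transaction: did the moved file actually land at dst?
            try:
                committed = os.stat(rec["dst"]).st_ino == rec["ino"]
            except FileNotFoundError:
                committed = False
        if committed:
            labels[rec["dst"]] = rec["type"]
    return labels
```

If a crash occurs after the log write but before the rename, the
destination still holds its original inode number, so recovery
discards the record and the original file keeps its label.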

In general, the prototype design requires no additional disk access on
a per-system-call basis.  This approach promotes high performance
since most DTE-related overhead is in memory operations where data
structures can be optimized.  For recovery, however, it is necessary
to add disk writes during file creates that cause changes in the
attribute association database.  Depending on a system's
configuration, it could be that none, some, or all file creates would
cause attribute associations to change.

Network Implementation

In addition to associating attributes with files and processes and
performing access control over those entities, the DTE prototype also
inserts DTE attributes into IP datagrams and provides mediation of
network messages.  A fundamental goal of DTE network mediation is to
preserve interoperability with non-DTE systems: this requires using
existing IP, UDP, TCP, and NFS services and, as much as possible,
preserving application layer protocols such as rsh and rlogin.
Although we expect that it will be useful to add DTE awareness to some
network applications such as rcp and rdist, we believe that DTE
systems must first be useful in networks of non-DTE systems.

Our general scheme is to add DTE attributes in the IP option space;
these attributes are tokenized and currently consume [removed] bytes
of the [removed]-byte IP option space.  DTE networking support at
other layers is
carried in these attributes at the IP layer.  Due to the use of pipes
and sockets in UNIX, a UNIX process may cause numerous IP datagrams to
be generated and may not be aware of the network consequences of its
actions.  For the DTE prototype, each message is generated in the
context of a process's domain and carries the domain's identity as the
message's ``source domain.''  Additionally, each message carries a
type attribute; typically, each DTE domain has a default output type
that labels messages generated from normal UNIX system calls such as
write() and send().

For each standard UNIX system call that can generate a message, the
DTE kernel retrieves the calling process's domain and default output
type from the DTE policy database generated using DTEL.
Traditionally, UNIX systems employ a data structure, called an mbuf,
that allows buffers of data to be chained together in a manner that
facilitates the prepending and stripping of protocol headers in
different layers of a UNIX kernel's protocol stacks.  The DTE
prototype uses a slightly extended form of the typical mbuf structure
that provides header space for storing source domain and type
identifiers.  Standard UNIX system calls that send messages save these
attributes in extended mbuf chains; at the bottom of the protocol
stack, these attributes are extracted from the chains and encoded as
IP options on a per-datagram basis.  For received messages, the
mechanism works in reverse, extracting received IP options and
encoding them in mbuf chains for retrieval by receiving processes.
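
To make the per-datagram encoding concrete, the following sketch
packs a source domain and a type into a single IP option using the
standard type/length/value option layout.  The option number and the
16-bit token widths are assumptions chosen for illustration; the
prototype's actual wire format is not given here.  The 40-byte ceiling
is the standard IPv4 limit on option space.

```python
import struct

DTE_OPTION_TYPE = 0x9E     # hypothetical option number, not the prototype's
MAX_IP_OPTIONS = 40        # IPv4 allows at most 40 bytes of options


def encode_dte_option(domain_id, type_id):
    """Pack (source domain, type) tokens as one type/length/value option."""
    body = struct.pack("!HH", domain_id, type_id)
    option = struct.pack("!BB", DTE_OPTION_TYPE, 2 + len(body)) + body
    # Pad to a 32-bit boundary with End-of-Option-List bytes.
    option += b"\x00" * (-len(option) % 4)
    assert len(option) <= MAX_IP_OPTIONS
    return option


def decode_dte_option(options):
    """Scan an IP option area; return (domain_id, type_id) or None."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:               # End of Option List
            break
        if kind == 1:               # No-Operation, single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break                   # truncated option
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                   # malformed length
        if kind == DTE_OPTION_TYPE and length == 6:
            return struct.unpack("!HH", options[i + 2:i + 6])
        i += length
    return None
```

A receiver that does not recognize the option number can skip it by
its length byte, which is what allows labeled datagrams to pass
through hosts that are unaware of DTE.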

In addition to support for ordinary UNIX system calls, the DTE
prototype provides a number of analogous DTE-specific system calls
that allow processes to specify the type of data that they wish to
send; DTE access control prevents processes from generating data types
unless they have appropriate authorizations as specified in the DTEL
specification.

In general, the DTE prototype treats every IP datagram as
homogeneously typed; this simplifies access control over datagrams
since a process using the raw IP interface, for example, can be
allowed or denied access to a datagram based on its domain's access to
the datagram's type.  This strategy, although simple, does allow
several ambiguous situations: for example, if a protocol such as TCP
piggy-backs control information in packets that also carry user data,
should those packets have a protocol-specific type or a user type?
Currently, our approach is to label packets with user types when they
contain any user data and with protocol-specific types when they
contain only protocol data.  In the future, a natural extension to the
strategy may include a secondary ``subsystem'' label for use by
protocol subsystems that are trusted to accurately carry user data.
To minimize security mechanism, however, we are deferring secondary
packet labels until a definite need has been demonstrated.  In either
case, the use of homogeneously typed datagrams simplifies the
implementation of TCP substreams since TCP substreams are always made
up of complete IP packets.

UNIX system calls that write data onto a TCP connection enqueue onto a
single chain of mbufs associated with a TCP socket; the TCP sliding
window processing breaks the data stream into separate IP datagrams
based on a variety of criteria to optimize performance and guarantee
that receipt of all the data is acknowledged before it is forgotten on
the sending side.  On the sending side, the DTE prototype implements
TCP substreams by breaking the single mbuf chain into multiple chains
where all the data of each chain has the same type attribute.  The TCP
sliding window processing has been modified slightly to generate a new
datagram at chain boundaries.  On the receiving side, this mechanism
works in reverse to return substream type information that is then
used both to mediate receive operations by processes and to deliver
type information for use by DTE-aware processes.
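
The substream idea can be illustrated with a small model in which
each write is a (type, data) pair.  The function names and the fixed
segment size are hypothetical; real TCP segmentation depends on window
state and other factors that the sketch ignores.

```python
def build_chains(writes):
    """Group consecutive writes of the same type into one chain each."""
    chains = []
    for dte_type, data in writes:
        if not data:
            continue
        if chains and chains[-1][0] == dte_type:
            chains[-1] = (dte_type, chains[-1][1] + data)
        else:
            chains.append((dte_type, data))
    return chains


def segment(chains, mss):
    """Cut chains into datagrams of at most `mss` bytes.

    A datagram never spans a chain boundary, so each one carries data
    of exactly one type.
    """
    datagrams = []
    for dte_type, data in chains:
        for offset in range(0, len(data), mss):
            datagrams.append((dte_type, data[offset:offset + mss]))
    return datagrams


def reassemble(datagrams):
    """Receiver side: recover typed substreams from in-order datagrams."""
    return build_chains(datagrams)
```

Because a boundary always forces a new datagram, a type change can
produce a short segment; this is the price of keeping every datagram
homogeneously typed.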

A significant extension to the DTE prototype was required to implement
DTE/NFS servers.  Essentially, NFS file handles specify inode numbers
that have no direct relation to the map nodes that implement implicit
attributes for the prototype.  A means was therefore required for
mapping from inode numbers to map nodes.  For directories accessed via
NFS, the solution is simple since every directory contains a ``..''
entry: using the ``..'' entries, it is possible to reconstruct the
portion of a pathname required to establish attribute values.  The
prototype currently carries out this reconstruction at every NFS file
handle reception; however, temporarily raising the reference counts of
heavily used vnodes probably would increase performance and prevent
DTE overhead from being an NFS server bottleneck.

For files, the on-disk representations do not imply parents without an
exhaustive search of file system inodes.  To avoid this, the DTE
prototype stores (file-inode-number, parent-directory-inode-number)
pairs during NFS lookup operations in a cache.  These entries provide
a mechanism to reach the first directory that then allows pathnames to
be reconstructed as necessary.  To prevent any possibility of
introducing additional stale file handles at client applications, the
cache must be maintained on secondary storage.  For intentional
DTE/NFS server shutdowns, it is sufficient to write the cache out just
before shutdown.  To avoid stale file handles after DTE/NFS server
crashes,
the cache must be maintained during operation.  In this case also, the
cache contents can be batch written at timed intervals, resulting in a
minimal impact on performance.
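
Both cases, directories and plain files, can be illustrated with a
toy inode table.  Everything in this sketch is hypothetical (the Disk
and ParentCache classes, the inode numbering, the in-memory
dictionaries); it ignores hard links and omits the secondary-storage
persistence discussed above.

```python
ROOT_INO = 2


class Disk:
    """Toy inode table: directory inode -> {name: inode}."""

    def __init__(self):
        self.dirs = {ROOT_INO: {"..": ROOT_INO}}
        self.next_ino = ROOT_INO + 1

    def _alloc(self, parent, name):
        ino = self.next_ino
        self.next_ino += 1
        self.dirs[parent][name] = ino
        return ino

    def mkdir(self, parent, name):
        ino = self._alloc(parent, name)
        self.dirs[ino] = {"..": parent}
        return ino

    def create(self, parent, name):
        return self._alloc(parent, name)    # plain file: no ".." entry


def dir_path(disk, ino):
    """Rebuild a directory's pathname by following ".." entries."""
    parts = []
    while ino != ROOT_INO:
        parent = disk.dirs[ino][".."]
        name = next(n for n, i in disk.dirs[parent].items()
                    if i == ino and n != "..")
        parts.append(name)
        ino = parent
    return "/" + "/".join(reversed(parts))


class ParentCache:
    """(file inode -> parent directory inode) pairs seen by NFS lookups."""

    def __init__(self):
        self.parent_of = {}

    def nfs_lookup(self, disk, dir_ino, name):
        ino = disk.dirs[dir_ino][name]
        if ino not in disk.dirs:            # only files need a cache entry
            self.parent_of[ino] = dir_ino
        return ino

    def file_path(self, disk, file_ino):
        parent = self.parent_of[file_ino]
        name = next(n for n, i in disk.dirs[parent].items() if i == file_ino)
        return dir_path(disk, parent).rstrip("/") + "/" + name
```

Once a pathname has been rebuilt in this way, attribute values can be
found by the ordinary map node search.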

Related Work

The work most related to DTE and its UNIX implementation falls into
two general classes: access control systems and UNIX security
mechanisms.

DTE is most closely related to mandatory access control techniques
[4,9,6,18,8] and type-enforcing systems [9,21,25,24,27].  In general,
DTE policies are a proper superset of the DoD lattice model [4] and
its integrity variation [6]: DTE can be configured to provide a
lattice but can also enforce nonhierarchical security policies such as
assured pipelines [9] that drive information through policy-specified
pathways of arbitrary connectivity and complexity.  DTE can also be
configured to provide integrity categories as in [18] and to support
the transformation procedures and constrained data items of the
Clark/Wilson model [8].

Type enforcement was first proposed in [9] for the Secure Ada Target,
a system later renamed LOCK [25].  LOCK provides a Trusted Computing
Base (TCB) on top of which a UNIX emulation layer provides UNIX
services.  As a consequence, the type enforcement mechanism controls
UNIX emulations instead of individual UNIX applications and does not
distinguish among multiple applications running on a single UNIX
emulation.  This limitation also exists for a Mach-based LOCK
derivative [14], which adds type enforcement to the Mach port, task,
and virtual memory abstractions but provides no type enforcement
within the UNIX emulation layer.

In [24], type enforcement was added to Trusted XENIX as a TCB subset.
This system provides type enforcement at the UNIX system-call
interface and can individually control UNIX applications.  The TCB
subset architecture prohibited change to low-level disk formats and
mandated use of a separate runtime database to manipulate such
attributes.  This strategy is a precursor of the DTE runtime implicit
type concept.  Type enforcement has also been integrated into at least
one Internet firewall product, the SCC Sidewinder [footnote:
Sidewinder is a trademark of Secure Computing Corporation, Inc.]
system [23], but the authors are not aware of any published technical
details.

A number of UNIX security controls and tools have been developed.
Access Control Lists (ACLs) [13] provide greater flexibility in UNIX
discretionary access controls, and user-mode capabilities [16] also
allow finer-grained control over propagation of access rights, but
both mechanisms are discretionary in nature and provide little
protection against error-prone root programs.  A variety of trusted
UNIX systems have been implemented and evaluated against the Trusted
Computer System Evaluation Criteria [20].  These systems typically
provide MLS security but lack the flexibility of DTE.  Additionally,
tools such as COPS [12] check for system misconfigurations but do not
improve on the base UNIX security mechanisms themselves.

The Trusted Systems Interoperability Group (TSIG) has developed
Internet draft standards for NFS and other protocols that support
Multi-Level Secure (MLS) networking.  These standards communicate
significant amounts of information to represent security labels on
subjects and objects that may ``float'' up dynamically and to
represent process privileges that may be communicated across networks.
For DTE, all of the required security information is contained in the
relatively space-efficient type and domain identifiers carried in the
IP-layer traffic, avoiding most changes to higher-layer protocols.

Future Directions

We are actively exploring several directions for DTE.  The most
immediate and important one is the integration of DTE into Internet
firewalls.  Over the next two years, we will integrate DTE into
firewalls in three phases:

[DTE Firewalls] An integration of DTE into an Internet firewall and
selected hosts.  This integration will add defense-in-depth to the
firewall security perimeter.  The DTE firewall will direct traffic
from specified external hosts or of specified protocols only to flow
to internal DTE hosts that can contain any malicious effects.  Our
primary goal here is to allow more network services to be safely
imported into a LAN than is now prudent.

[Distributed DTE Firewalls] An integration of IP-layer encryption with
the DTE firewall.  This phase will connect multiple DTE enclaves
across the Internet.

[Domain and Type Authority Service] A DNS-like network service that
will distribute portions of DTEL policies.  Communicating DTE hosts
will authenticate to this service and use its DTE policy information
as a basis for establishing appropriate inter-host trust relations and
also for agreement on how data of specific types should be protected
by communicating hosts.

In order to accomplish these goals, we will soon begin investigating
how multiple hosts can exchange DTE information to negotiate network
DTE policies, how DTE mechanisms can most effectively use encryption
to protect DTE network attributes, how DTEL can be modularized to
reduce policy complexity, and how DTE policies can be dynamically and
safely extended or modified at runtime.

Conclusions

A central question in practical UNIX security is whether significant
enhancements can be added in a way that is understandable, effective,
and unobtrusive.  This is a difficult question because applications
and systems have evolved over time and now interact in subtle ways:
practical security enhancements must allow existing programs to
function properly while preventing unsafe interactions.  DTE is an
access control mechanism that uses a specification language to add
simplicity and uses implicit typing to maintain compatibility and
interoperability.  This paper reports on recent extensions to DTE to
provide greater security for IP-based networking and NFS services, and
on design considerations of a DTE UNIX prototype.  Our primary results
are positive and, although the DTE prototype is a research tool, we
have used it internally to provide guest users with safely restricted
access to our corporate data.

In sum, DTE has provided a useful research platform for building a
hardened, compartmentalized UNIX system.  In addition, DTE mechanisms
appear suitable for interoperating and enforcing policies within
networks of existing systems having no DTE controls.  This capability
is critical because any enhanced protection system must interoperate
with existing systems through an extended transition phase as access
controls are gradually adopted.

Bibliography

[1] L. Badger, D. F. Sterne, D. L. Sherman, K. M.  Walker, S. A.
Haghighat, ``Practical Domain and Type Enforcement for UNIX,'' 1995
IEEE Symposium on Security and Privacy, Oakland CA, May 1995.

[2] L. Badger, ``A Model for Specifying Multi-Granularity Integrity
Policies,'' 1989 IEEE Symposium on Security and Privacy, p. 269,
Oakland, CA, May 1989.

[3] R.W. Baldwin, ``Naming and Grouping Privileges to Simplify
Security Management in Large Databases,'' Proceedings of the 1990 IEEE
Symposium on Security and Privacy, p.  116, Oakland, CA, May 1990.

[4] D.E. Bell and L. LaPadula, ``Secure Computer System: Unified
Exposition and Multics Interpretation,'' (Technical Report No.
ESD-TR-75-306, Electronics Systems Division, AFSC, Hanscom AF Base,
Bedford MA, 1976).

[5] K.P. Birman, T. Joseph, K. Kane, F. Schmuck, ``The ISIS
Programming Manual and User's Guide,'' Department of Computer Science,
Cornell University, June 1988.

[6] K.J. Biba, ``Integrity Considerations for Secure Computer
Systems,'' USAF Electronic Systems Division, Bedford, MA,
ESD-TR-76-372, 1977.

[7] M. Branstad, H. Tajalli, F. Mayer, D. Dalva, ``Access Mediation in
a Message Passing Kernel,'' 1989 IEEE Symposium on Security and
Privacy, p. 66, Oakland, CA, May 1989.

[8] D.D. Clark and D.R. Wilson, ``A Comparison of Commercial and
Military Computer Security Policies,'' Proceedings of the 1987 IEEE
Symposium on Security and Privacy, Oakland, CA, p. 184, 1987.

[9] W.E. Boebert and R.Y. Kain, ``A Practical Alternative to
Hierarchical Integrity Policies,'' Proceedings of the 8th National
Computer Security Conference, Gaithersburg, MD, p. 18, 1985.

[10] J. Ioannidis, M. Blaze, ``The Architecture and Implementation of
Network-Layer Security Under Unix,'' Presented at the USENIX Summer
1994 Technical Conference, Boston MA.

[11] NBS, ``Data Encryption Standard,'' Jan. 1977.  Federal
Information Processing Standards Publication 46.

[12] D. Farmer, ``The COPS Security Checker System,'' Proceedings of
the Summer 1990 USENIX Conference, Anaheim, CA, p. 165.

[13] G. Fernandez, L. Allen, ``Extending the UNIX Protection Model
with Access Control Lists,'' Proceedings of the Summer 1988 USENIX
Conference, San Francisco, CA, 1988, p. 119.

[14] T. Fine and S. E. Minear, ``Assuring Distributed Trusted Mach,''
1993 IEEE Computer Society Symposium on Research in Security and
Privacy, Oakland, CA, p. 206, 1993.

[15] J. Kohl and C. Neuman, ``The Kerberos Network Authentication
Service (V5),'' RFC 1510, September 1993.

[16] D. Klein, ``A Capability Based Protection Mechanism Under Unix,''
Proceedings of the 1985 Winter USENIX Conference, Dallas, Texas, p.
152.

[17] C.E. Landwehr, C.L. Heitmeyer, and J. McLean, ``A Security Model
for Military Message Systems,'' ACM Transactions on Computer Systems,
Vol. 2, No. 3, August 1984, pp. 198-222.

[18] S.B. Lipner, ``Non-Discretionary Controls for Commercial
Applications,'' Proceedings of the 1982 IEEE Symposium on Security and
Privacy, Oakland, CA, p. 2, 1982.

[19] M. K. McKusick, ``The Virtual Filesystem Interface in 4.4BSD,''
USENIX Computing Systems, Vol 8, Winter 1995, p. 3.

[20] National Computer Security Center, ``Department of Defense
Trusted Computer System Evaluation Criteria,'' DoD 5200.28-STD, Dec.
1985.

[21] R. O'Brien and C. Rogers, ``Developing Applications on LOCK,''
Proc. 14th National Computer Security Conference, pages 147--156,
Washington, DC, October 1991.

[22] L.L. Peterson, N.C. Buchholz, R.D.  Schlichting, ``Preserving and
Using Context Information in Interprocess Communication,'' ACM
Transactions on Computer Systems, 7(3):217-246, Aug. 1989.

[23] Secure Computing Corporation, Sidewinder Press Release, October
10, 1994.

[24] D. Sterne, ``A TCB Subset for Integrity and Role-Based Access
Control,'' Proc. 15th National Computer Security Conference, pages
680--696, Baltimore, MD, 1992.

[25] O.S. Saydjari, J.M. Beckman, and J.R.  Leaman, ``LOCK Trek:
Navigating Uncharted Space,'' Proceedings of the 1989 IEEE Symposium
on Security and Privacy, Oakland, CA, p.  167, 1989.

[26] D. J. Thomsen, ``Role-based Application Design and Enforcement,''
Proc. of the Fourth IFIP Workshop on Database Security, Halifax,
England, September 1990.

[27] S. Wiseman, ``A Secure Capability Computer System,'' Proceedings
of the 1986 IEEE Symposium on Security and Privacy, Oakland, CA, p.
86, 1986.