USENIX 2003 Annual Technical Conference, FREENIX Track Paper
[USENIX Annual Conference '03 Tech Program Index]
Submitted to USENIX ATC Conference 2003, FREENIX Track
The TrustedBSD MAC Framework: Extensible Kernel Access Control for FreeBSD 5.0
Robert Watson Network Associates Laboratories Rockville, MD email@example.com http://www.nailabs.com/ - Wayne Morrison Network Associates Laboratories Rockville, MD firstname.lastname@example.org - Chris Vance Network Associates Laboratories Rockville, MD email@example.com http://www.nailabs.com/ - Brian Feldman FreeBSD Project Rockville, MD green@FreeBSD.org
We explore the requirements, design, and implementation of the TrustedBSD MAC Framework. The TrustedBSD MAC Framework, integrated into FreeBSD 5.0, provides a flexible framework for kernel access control extension, permitting extensions to be introduced more easily, and avoiding the need for direct modification of distributed kernel sources. We also consider the performance impact of the Framework on the FreeBSD 5.0 kernel in several test environments.
The TrustedBSD MAC Framework included in FreeBSD 5.0 provides a general facility for extending the kernel access control policy . By providing common security infrastructure services, such as kernel object labeling, and the ability to instrument kernel access control decisions, the Framework is capable of supporting a variety of policies implemented by different vendors. This paper explores the design, implementation and performance of the MAC Framework, as well as its impact on policy design and the FreeBSD kernel architecture.
In these three roles, FreeBSD is deployed in a wide variety of environments, ranging from electronic cheque processing and point of sale devices to web cluster deployment, firewall, and routing appliances. Each of these environments has different security requirements, often requiring flexibility beyond that provided for by the traditional UNIX security protections. Especially in embedded network environments, the requirements can range from simple operating system hardening to the introduction of mandatory and fine-grained security policies.
12]. Typical practice has been to maintain a distinction between the base OS product and the trusted variant in terms of maintenance, product identification, and price. In addition, there are third party vendors who develop and market trusted operating system extensions, often in close coordination with the OS vendor. Finally, there is a broad range of access control research across many operating systems and performed in many forms   --most frequently, this work is performed on open source operating systems due to ready access to operating system source code.
In order to successfully maintain a security extension product for an operating system, access to the operating system code is typically required, be it an open source system, or licensed from the closed source vendor. Product maintenance raises a number of challenges, not least the challenge of tracking the operating system vendor's primary product life cycle, which is frequently incompatible with the development cycle required for high assurance products. Many practical impediments also present themselves: security extensions have their fingers deep in the heart of the operating system, touching almost all elements of the kernel source code. Local security extensions invariably conflict with vendor-provided security patches, as well as vendor-provided feature improvements over the OS development cycle. In addition, security extensions frequently conflict with one another if deployed in parallel, leading not only to potentially inconsistent policy behavior, but also possible bypass of protections provided by one of the policies. Direct source code modification of the vendor operating system presents many challenges to security extension authors.
In the past, research has been performed on how to most easily extend operating system security policies, including into system-call interposition technologies such as LOMAC , Generic Software Wrappers , and systrace , extensible security mechanisms such as the General Framework for Access Control  and FLASK . Many of these extension technologies, especially these using system call wrapping techniques, fall down in the face of modern UNIX operating system kernels which support true kernel and user process parallelism in SMP environments, and fine-grained threading of user processes. Preventing races inherent to system call wrapping is difficult, and most system call wrapper security technologies are susceptible to at least one of a class of related vulnerabilities. Any successful extension technology for contemporary operating systems must be designed with the notion that SMP and threading are realities, and must be well-integrated into the kernel locking mechanisms.
Trusted operating systems typically provide two or more mandatory system policies: almost all provide MLS for user data protection, but many also make use of the Biba policy to protect the integrity of the Trusted Code Base (TCB). The TrustedBSD Project, in seeking to provide access to trusted operating system features, provides several MAC policies for use with FreeBSD; the challenges associated with this work include introducing the services securely, and without substantially impacting the performance and reliability of FreeBSD installations not taking advantage of these new features.
3 there are several high-level goals for the design:
The MAC Framework addresses a number of needs in access control and extension implementation:
Some entry points, such as access control checks, return error values; other notification entry points are assumed always to succeed. Frequently, a set of related entry point invocations will be made around complex operations: for example, access control checks are required to create a new object in the file system namespace. Likewise, when label modifications occur, a two-phase commit is performed by the Framework to confirm that all policies will permit the relabel, and then to notify all policies to perform the actual operation.
Entry points are currently found in the cross-file system VFS code, device file system, mount/umount code, protocol-independent socket calls, pipe IPC code, BPF packet sniffing code, IP fragment reassembly, IP socket send and receive code, network interface transmission and delivery, credential and process management code (including debugging, scheduling, signaling, and monitoring interfaces), kernel environmental variable management, kernel module management, per-architecture system calls, swap space management, and a variety of administrative interfaces such as time management, NFS service, sysctl(), and system accounting. These entry points permit policies to augment security decisions in a variety of forms.
While some system hardening models employ existing subject and object information (UNIX credential data, file permissions, ...) a number of important mandatory policies require additional subject and object labeling. For example, the MLS confidentiality policy makes decisions based on subject and object sensitivity labels: subjects are assigned clearances, and objects are assigned classifications. When policies require additional labels, the MAC Framework supports them through a policy-agnostic labeling primitive, which permits policies to tag kernel objects with information required for policy decision-making.
The label structure stored in kernel data structures is maintained by the MAC Framework: based on the life cycle of the data structure, the Framework provides per-object entry points for memory initialization, object allocation, and object destruction. The label structure consists an array of slots, each providing a union of a void * pointer and a long; slots are allocated to policies requiring label storage for use in a policy-specific manner. Policy writers might choose to store an integer value, allocate per-label memory, or make use of referenced structures relying on the initialization and destruction calls to maintain reference counts. Initialization calls will often be used for memory allocation, and in some cases a blocking disposition will be passed as an argument to the call indicating whether blocking allocation is permitted; in these cases the initialization call is permitted to return a memory allocation failure, which will abort the allocation of the object.
Label storage is currently provided in the following kernel objects: BPF descriptors, process credentials, devfs directory entries, network interfaces, IP fragment reassembly queues (IPQ), sockets, pipes, mbufs, file system mount points, processes, and vnodes. A blocking disposition is provided for mbuf, socket, and IPQ initialization; if a failure occurs during label allocation, the mbuf and socket allocator code will return a memory exhaustion failure to the consumer. Unlike most other kernel objects, memory to hold the mbuf label is not stored within the mbuf structure itself: instead, it is stored in an m_tag hung off the mbuf header m_tag chain. Tags to hold MAC data will be allocated only when policies requiring MAC labels on mbufs are present in the system. This permits improved network performance of the MAC Framework in scenarios where flexible access control is required, but where mbuf labeling is not.
Additional support for persistent label storage is provided by any file system supporting extended attributes, including UFS1 and UFS2; while policies can determine whether and how the attributes are bound to policy-specific labels, the Framework constructs transactions to read, write, and cache vnode labels on supporting file systems.
This flexibility supports a wide variety of behaviors required for many interesting and useful access control policies.
3] . Likewise, locally maintained security extensions are frequently deployed in combination with existing system security policies, forming cohesive (and ideally stronger) protection. As the MAC Framework is intended to assist vendors in combining security components to be deployed in a variety of environments, the MAC Framework supports the simultaneous loading of several modules. While it may not be possible to coherently (or even safely) compose all access control policies, the MAC Framework provides a simple composition model that has proven useful in existing shipped systems: rights intersection. This composition largely maintains and assumes independence between the active policies, composing their behaviors only for two classes of operations:
The same composition model is employed to combine results from the native UNIX access control model and any models added using the MAC Framework. Currently, the task of determining whether two policies may be safely composed is left to the system designer or administrator, a reasonable requirement for many of the deployed environments of interest.
Retrieve the label of current or arbitrary process; set the current process label. mac_get_pid(), mac_get_proc(), mac_set_proc().
Get and set file or pipe label by file descriptor. mac_get_fd(), mac_set_fd().
Get and set file label by path; optionally follow symbolic links. mac_get_file(), mac_set_file(), mac_get_link(), mac_set_link().
Execute a command and atomically modify the process label. mac_execve().
Policy-specific system call multiplexor mac_syscall().
Test for the presence of the MAC Framework or a specific policy mac_is_present().
Convert labels to and from human-readable text mac_from_text(), mac_to_text().
Allocate storage for a label appropriate to hold the specified label elements, or for a specific object based on system default label elements mac_prepare(), mac_prepare_file_label(), mac_prepare_ifnet_label(), mac_prepare_process_label().
Release storage associated with a label mac_free().
To support atomic change of label with execution events, mac_execve() provides an extension to the existing execve() system call accepting a requested target label. This is required to support the execve_secure() functionality used by the SEBSD port of FLASK/TE to FreeBSD from SELinux .
In addition, a general security policy entry point, mac_security() is provided so that policies may extend the set of system services without allocating new system call numbers.
For an initial pass at supporting automatic labeling at login, we extended the existing BSD login class database. The master.passwd(5) assigns one class to each user; a class may be shared by many users, and includes information such as the resource restrictions for the user, login and accounting properties, etc. We introduced two new fields:
The existing setusercontext(3) interface is extended to support a new flag LOGIN_SETMAC, indicating that the MAC label should be set as part of the login process. This flag is also implied by the LOGIN_SETALL flag used widely across programs setting user contexts. Process labeling tools, described in the next section, may then be used to update the process label subject to policy constraints. By instrumenting this one function and its relevant consumers, we were able to easily modify most key system daemons and applications to recognize the new process properties, including sendmail(8), cron(8), login(1), su(8), ftpd(8), inetd(8) and others, making the changes relatively low impact. In the future, we may divorce the label selection database from the class database for the purpose of improved management, but this would not require changes to the user context API.
In addition, the following utilities were also modified to inspect and set MAC labels:
Further extensions could easily be made to applications such as the KDE file system browser, Konqueror, to display and manage labels on file system objects.
A broad scope of policies may be implemented using the MAC Framework; the Framework is structured so that policy authors may select what performance, security, and functionality trade-offs they wish to make in policy design, augmenting the system policy in ways that reflect local requirements. This flexibility makes the MAC Framework a useful tool in a broad variety of environments, reflecting the variety of deployment scenarios in which FreeBSD is used.
In this section, we explore some of the issues associated with performance measurement of the MAC Framework. The Framework is currently integrated into the FreeBSD 5.0-CURRENT development branch--as a result, current performance measurements are used to guide the development process and explore the eventual impact, rather than representing final performance results. Substantial effort has not yet been invested in fine-grained performance tuning, although initial measurements suggest performance well within the bounds of acceptability.
For each test, we consider several kernel configurations:
The GENERIC kernel permits us to explore baseline performance as a control for other configurations; MAC tests the overhead to simply include extensible security support in the system with no policies. The three sample policies allow us to consider the overhead of entering a policy module for each entry point (MAC_NONE), the cost of unlabeled file system protections using a locked policy (MAC_BSDEXTENDED), and the cost of a fully labeled system integrity policy touching most aspects of system operation (MAC_BIBA).
These tests were run on a FreeBSD 5.0-CURRENT system from the trustedbsd_mac development branch from late March, 2003; tests were run on a single-processor 800MHz Intel PIII system with 128mb of memory and ATA 7200rpm 20gb hard disk. For file system related benchmarking, all writable file systems were recreated using the same geometry between tests since file system aging effects are not of interest for these tests; reboots occur between each test to flush storage-related caches and reset slab allocator and mbuf allocator state. All file systems use UFS2 for high performance meta-data storage.
The results of this test demonstrate a small but measurable performance change (0.1%) with MAC support. A slight relative increase in cost for the BSD/extended policy may be the result of acquiring a policy lock in order to process an access control decision; however, there is no statistically significant difference in performance between the various MAC policies and the base cost of the MAC Framework; UFS2 provides for high performance label access for the Biba policy.
In the m_pkthdr approach, all kernels pay a memory overhead for labeling support, although kernels without options MAC do not pay the label life cycle costs. With the m_tag approach, only kernels with options MAC pay the memory overhead, although we presupposed that there would be a higher cost for using tags for label storage due to greater administrative overhead in maintaining lists and allocating storage. To optimize the m_tag approach, we implemented lazy tag allocation: tags are only allocated to hold label data when a policy expresses interest in labeling mbuf headers.
We consider two tests from the netperf suite: UDP_RR and UDP_STREAM, which respectively test the per-transaction cost of a Request/Receive RPC, and raw network throughput. The request/response test measures the throughput of the system relative to synchronous one-byte packets between a client and a server, and is intended to measure the performance impact of a change in terms of number of packets transfered. The stream test uses a larger packet size and does not synchronously wait for a response before continuing, generally measuring the performance impact of a change in terms of data transferred.
In Figure 5, the performance cost per-packet is illustrated: the introduction of MAC support produces a measurable change; depending on the strategy for labeling mbufs, that change varies substantially. With inclusion of the label directly in the mbuf header, an 11.5% performance overhead is accepted for enabling MAC support. Adding the stub policy increases that cost to 12.2%; adding a complex labeled policy performing per-label memory allocation, such as Biba, increases that drop to 14.9% of the GENERIC packet throughput.
With lazy m_tag labeling, the performance trade-off is changed: the cost of introducing MAC is 4.8% (substantially less than using mbuf headers); with a stub policy implementing mbuf label entry points but not allocating labels, that cost increases to 8.5% (also less than mbuf headers). However, performance with a Biba performance is reduced by 17.1%, showing an increased cost for heavily labeled policies such as Biba.
The second set of trade-offs best fits the needs of the TrustedBSD Project: minimize overhead for MAC-disabled systems, and permit a performance/complexity cost decision by policy authors.
In Figure 6, a similar pattern emerges, where-in the introduction of MAC support results in a 6.1% throughput penalty. Use of a stub policy increases that cost to 7.7%; Biba labeling increases the cost to 10.3%. However, with lazy m_tag labeling, the base cost of MAC support is reduced to 3.7%; the stub policy increases this cost to 4.4%, and with Biba to 9.7%. The proportionally lower performance cost with this test derives from the reduced relative overhead resulting from reduced packet counts relative to data transfered.
Again, lazy m_tag allocation better meets our requirements by permitting better non-MAC performance with a more clear performance trade-off for complexity. Unlike the packet count testing, performance for the Biba policy actually increases with lazy m_tags relative to mbuf headers, a result that we attribute to differences in the caching and allocation policies for the UMA Slab Allocator versus the mbuf allocator in handling memory clearing for new allocations.
In the area of prior deployed systems, ``Trusted'' variants of most commercial UNIX platforms exist, including Trusted Solaris, and Trusted IRIX . In addition, there are a number of third party security extension products that exist for these systems, including Argus's PitBull product, which provides a product alternative with many of the same features . These products largely rely on Multi-Level Security (MLS)  and Biba integrity  to provide mandatory data confidentiality and TCB integrity; earlier TrustedBSD work has focused on implementing support in FreeBSD for these models using similar integration approaches . TrustedBSD MAC policy modules exist expressing both MLS and Biba in functionally similar forms.
In the area of access control extensibility, early work included the Generalized Framework for Access Control (GFAC), which proposes a separation of policy and enforcement ; this model is implemented in Linux in RSBAC .
The FLASK framework provides for similar types of separation of policy and enforcement, although with higher level labeling abstraction in the form of a security ID (SID) and a focus on Linux Security Modules (LSM) provides a set of kernel extension hooks to facilitate integration of systems such as SELinux without committing the Linux operating system to a particular model . LSM provides a void pointer for label storage in each supported kernel object, and has access control notions similar to the TrustedBSD MAC Framework. However, the semantics of the hooks are weaker, and the LSM framework does not provide for policy composition and persistent labeling, relying on policy modules to implement these services. Type Enforcement for policy representation . A prototype port of the SELinux FLASK and TE implementations has been made to layer on top of the TrustedBSD MAC Framework via the SEBSD policy module.
The authors would like also to thank the anonymous reviewers of this paper, as well as other contributors to the TrustedBSD Project, including Chris Faulhaber, Ilmar Habibulin, Adam Migus, and Thomas Moestl. The authors would also like to thank Angelos Keromytis and Erez Zadok for their shepherding of this paper.
This document was generated using the LaTeX2HTML translator Version 99.2beta8 (1.46)
The command line arguments were:
The translation was initiated by Robert Watson on 2003-04-08
Robert Watson 2003-04-08