Creating a CMC Model from the Implementation

Figure 1 shows a skeleton event-driven implementation of a routing protocol, very similar to the AODV protocol that was actually checked (see section 4). This implementation is used as a running example in the following discussion. The main() function of the implementation calls the initialization function (line 34), and then enters an event dispatch loop (line 35). Depending on the input event, it calls one of four event handlers defined in lines 1 through 26. Each handler processes an event: a user request for a route to a destination, a request from another node, a response from another node to one of its previous requests, and a timer event requiring the protocol to invalidate its old routes.

**Figure 1:** A simple routing protocol implementation.
$\begin{figure}\small { \begin{verbatim}1 /* event handlers */ 2 on_user_requ... ...n_recv_response(...) 41 on_timeout(...) 42 } 43 } 44\end{verbatim}} \end{figure}$

This subsection describes the three steps that the user must perform to apply CMC to a protocol. Steps 1 and 2 essentially provide a unit test scaffold, which would be required for most test environments, such as running the implementation in a simulator. Step 3 is the only additional requirement for CMC.

Step 1: Specifying Correctness Properties. Before any system can be tested, the user must specify some correctness properties. Some properties are domain independent; for example, the program should not access illegal memory and should not leak memory. Some domain-specific properties can be specified as assertions at particular points in the implementation (e.g., the example protocol should never return an invalid route in line 4). In many cases, a careful implementer will have placed these assertions in the code long before CMC is applied. Other properties are inherently global, such as the requirement that there are no loops in the routing table. Such properties are specified as Boolean functions (written in C) that access the datastructures of process contexts.

Step 2: Specifying the Environment. Next, the user must build a test environment that adequately represents the behavior of the actual environment in which the protocol is executed. For networking protocols, the environment model fakes an operating system and anything outside of the protocol that is necessary for it to function. (The decision as to which part of the system is to be checked and which is the environment is decided by the user.) The environment model is a collection of substitute API functions and data structures to emulate its state. The environment should be modelled in as little detail as possible - otherwise, many superfluous states will be generated to model environmental behavior irrelevant to checking the protocol.

Many functions can be replaced by simple ``stubs.'' For example, gettimeofday() might return a constant or contain a counter. In the example, the model requires a network for exchanging routing packets. A simple network could be modelled as an unordered queue of bounded length. The model would include its own versions of interface functions, such as the broadcast_request() function that sends packets to the network. The environment may also contain several processes. For example, a process that nondeterministically removes a packet from the network can be used to model a lossy network.

To represent nondeterminism in the environment, CMC provides a CMCChoose() function (similar to $VS\_toss$ in Verisoft[15]). CMCChoose takes an integer argument

and returns an integer in the range $(0 \ldots n-1)$ . CMCChoose arbitrarily selects one out of the

possibilities in the environment. An example, shown in Figure 2, is the malloc() implementation that can either allocate the requested memory from the CMC heap or fail by returning NULL. CMCChoose is used to make one of these two choices. CMC will attempt to try all possible return values for each call to CMCChoose. Since calls to CMCChoose appear in the environment code or in implementations of standard system functions (such as malloc() and select()), it is generally not necessary to modify the actual implementation.

Providing an environment model can be time consuming. It is therefore important to reduce the modelling effort required to apply CMC to a previously unchecked protocol. A first obvious step is to engineer the models so that they are as re-usable as possible, reducing the incremental effort of checking a new protocol. This is especially beneficial when related protocols are checked (which is a large part of the reason this paper checks three different implementations of the same protocol). Finding other ways to reduce the cost of environmental modelling is an interesting area for future work.

**Figure 2:** Implementation of `malloc()` in CMC. `malloc()` nondeterministically fails to allocate memory.
$\begin{figure}\small { \begin{verbatim}void* malloc(size_t n){ if(CMCChoose(2... ...ic failure// alloc n bytes from heap and return }\end{verbatim}} \end{figure}$

Step 3: Identifying the Initialization Functions and Event Handlers. Given an event driven system, the user should provide the initialization functions and the event handlers for each process in the system.

The user should also provide a guard function for each event handler, which is a Boolean function that determines when that event handler is enabled in a given state. For instance, the guard function for the on_recv_request() handler returns true only if a request is pending for this particular process in the network.