Because the experiments in this paper use the DEC Alpha assembly language, our
abstract machine is essentially a high-level formal description of the Alpha
architecture [19]. To see how this is done, consider the subset of
the Alpha instruction set shown in lang. (Actually, we use a
larger subset of the DEC Alpha assembly language in our experiments, but this
smaller subset will suffice for presentation purposes.) In this table, n
denotes an integer constant and refers to machine register i. All
instructions operate on 64-bit values. For simplicity we allow the use of only
11 temporary and caller-save machine registers (which, for the purpose of this
presentation, we rename
through
). The consequence of this
is that programs cannot write into reserved and callee-save registers
(according to the standard C calling convention for the DEC Alpha
architecture), and are thus trivially safe with respect to these registers.
To define how programs are executed, we define an abstract machine as a
state-transition function, the essential core of which is shown in
abs-mach. In this specification, the DEC Alpha program is a
vector of instructions, , and the current instruction is
,
where
is the program counter. The variable
denotes the state of
the machine registers and memory. The state-transition function maps a machine
state
into a new state
by executing the current
instruction
.
The notation (often abbreviated as
) refers to the value
of register
in state
.
The expression
denotes
the new state obtained from state
by incrementing the value of register
. So, for example, the Alpha ``
''
instruction is defined by abs-mach to have the following semantics:
[
([d s op], + 1)
]
where
is the current register and memory state. This
specification states that the ADDQ instruction updates register
with the sum of
and op, and also increments the program counter.
We use the ``circled'' operation
to denote two's-complement
addition on 64 bits. This operation is defined in terms of the usual integer
arithmetic operations as
To model the state of memory, we use a pseudo register, called , that
gives the content of each memory location. We write
for the
contents of memory address a, and
for the
new memory state resulted from writing register
to address a. Memory
operations work on 64-bits and the addresses involved must be aligned on an
8-byte boundary.
In the definition of the load and store instructions, there is a crucial
difference between the DEC Alpha processor and our abstract machine. The
difference is that our abstract machine performs the safety checks that are
shown in boxes in abs-mach. For example, consider the definition
of the `` '' instruction:
[
([d s n], + 1),; if;
]
The predicate
is true when it is safe to read the word at memory
address a, which for the DEC Alpha implies that a is aligned on an 8-byte
boundary. Similarly, the predicate
is true when the address a
denotes an aligned location that can be safely read or written. In essence,
these checks define what is meant by safety, and more specifically for this
example, memory safety. For the purpose of this paper, the predicates
and
are defined by the safety policy through the
precondition, as shown in the next subsection.
Mathematically, the abstract machine does not return errors when a
or
check fails. Instead, the execution blocks because there are no
transition rules covering the error cases. In this setting, a program is safe
if and only if it runs without blocking on the abstract machine. Of course, the
presence of these safety checks means that the abstract machine is not a
faithful abstraction of the DEC Alpha processor. However, the purpose of
certification is to prove that all safety checks always succeed. If we have a
valid safety proof for a program, we know that we can safely execute it on a
real DEC Alpha and get the same behavior as on our abstract machine, even
though the Alpha does not implement the safety checks.
There are other notable differences between our abstract machine and a real DEC Alpha. For example, to simplify the presentation in this paper, we have restricted all branches to be only forward. Allowing backward branches and loops introduces a number of complications, but is handled in a conceptually straightforward manner through the addition of explicit loop invariants. As it turns out, the packet filter examples we use in our experiments do not have any loops, and so it is not inconvenient to eliminate them here. In a later section we will briefly describe our experiments with looping programs, including a safe IP-header checksum routine.
: The Verification-Condition Generator.
Another interesting aspect of the abstract machine is the level of abstraction of our specification. We might try to be ambitious and make a complete specification of the DEC Alpha processor. However, this would be extremely complex and probably difficult to trust. And, as a practical matter, for specific tasks such as the ones we are considering, many details and features of the Alpha are irrelevant. This justifies working at a higher level of abstraction above the details of the pipeline, cache, timing, and interrupt behavior.
We can also consider encoding other kinds of safety checks into our abstract machine. For the sake of simplicity, we have specified only a notion of fine-grained memory safety. With some ingenuity, an abstract machine designer can define safety policies involving other kinds of safety, like control over resource usage or preservation of data-abstraction boundaries. Once a safety policy is defined, application writers are free to use it to create binaries that guarantee safety.