Check out the new USENIX Web site. next up previous
Next: Limited x86 Simulation in Up: Prototype Prover Previous: General Approach

Implementation Walk-Through

In order to make discussion of the implementation more concrete, Figures 1-4 give an example of a short real-world filter program that was used extensively in the development of the prototype prover. First, a short libpcap filter program shown in Figure 1 was used as a starting point. It was chosen to be a short example of how BPF is typically used. The tcpdump program was used as a front-end to libpcap's internal compiler and optimizer, which generated the BPF program shown in Figure 2.

Then the C program shown in Figure 3 was constructed by hand, attempting to be as faithful to the contents of the BPF program as possible. However, the C implementation uses abstractions for protocol headers and address information for the sake of readability and to show that programs that might be written by a real programmer compile down to analyzable code. The C program adds some run-time bounds checking that is not present in the BPF instructions - the BPF virtual machine does these checks implicitly, while a native program must do them explicitly.

Finally, the C program was compiled with the GNU C compiler into a native x86 program whose assembly listing is shown in Figure 4.

Figure 1: Original libpcap filter program syntax
ip host 127.0.0.1 and udp port 42

Figure 2: BPF program resulting from compilation
(000) ld       [0]
(001) jeq      #0x2000000       jt 2    jf 16
(002) ld       [16]
(003) jeq      #0x7f000001      jt 6    jf 4
(004) ld       [20]
(005) jeq      #0x7f000001      jt 6    jf 16
(006) ldb      [13]
(007) jeq      #0x11            jt 8    jf 16
(008) ldh      [10]
(009) jset     #0x1fff          jt 16   jf 10
(010) ldxb     4*([4]&0xf)
(011) ldh      [x + 4]
(012) jeq      #0x2a            jt 15   jf 13
(013) ldh      [x + 6]
(014) jeq      #0x2a            jt 15   jf 16
(015) ret      #1576
(016) ret      #0

Figure 3: Equivalent C source code
\begin{figure*}\begin{center}
\begin{verbatim}...

Figure 4: x86 program resulting from compilation (GCC 2.95.2, -O6)
.text
        .align 4
.globl f
        .type    f,@function
f:
        pushl %ebp
        movl %esp,%ebp
        pushl %esi
        movl 8(%ebp),%ecx
        movl 12(%ebp),%esi
        cmpl $3,%esi
        jbe .L8
        cmpl $2,(%ecx)
        jne .L8
        cmpl $23,%esi
        jbe .L8
        leal 4(%ecx),%edx
        movzbl 4(%ecx),%eax
        andl $15,%eax
        cmpl $4,%eax
        jle .L8
        cmpl $16777343,12(%edx)
        je .L18
        cmpl $16777343,16(%edx)
        jne .L8
.L18:
        testl $8191,6(%edx)
        jne .L8
        leal 0(,%eax,4),%edx
        leal 12(%edx),%eax
        cmpl %eax,%esi
        jb .L8
        leal 4(%ecx,%edx),%eax
        cmpw $10752,(%eax)
        je .L25
        cmpw $10752,2(%eax)
        jne .L8
.L25:
        movl $1576,%eax
        jmp .L26
        .p2align 4,,7
.L8:
        xorl %eax,%eax
.L26:
        popl %esi
        movl %ebp,%esp
        popl %ebp
        ret
.Lfe1:
        .size    f,.Lfe1-f

The first thing that the prover does is load the code of the program to be proven into a memory buffer. This requires the program to parse the i386 ELF binary object format, which is currently done in a reasonable but minimal fashion. Only the .text segment is loaded, which is assumed to contain exactly one C function, and no relocation is performed. The program loader is currently linked into the same binary as the prover, but is otherwise decoupled from it. In practice, the loader would be separated from the prover and would be an untrusted component (for example, a part of the user process).

Next, the prover initializes preconditions. The register pre- and post-conditions are listed in Figure 5, and are generic to the x86 C calling convention. The first four general purpose registers are initially set to ``undefined,'' which is a special value in the prover that indicates that these cannot be read from until changed to a defined value (to prevent unknown or unintended values from being available to the program). The last four general purpose registers are initially set to abstract variables representing their initial value. This allows the program to generate values as offsets from their initial values (such as stack pointer relative addresses, which need to be resolved to an address relative to the initial value of the stack pointer) and it also allows the prover to check as a postcondition that the initial value has been faithfully restored when the program returns.

The memory preconditions are listed in Figure 6, and are specific to the problem of a BPF filter function with the function signature seen in Figure 3. Different functions will require different memory preconditions. The current prover implementation allows these to be changed easily. Memory words are defined to contain each input parameter, and the input packet buffer that is passed to the program is also defined. The input packet buffer is currently defined as a fixed-length buffer in order to greatly simplify the prover's operation; it is much easier to reason about falling within a known bound than an unknown bound (though this would be a useful feature to add in the future). At run-time, the caller would need to copy packets into a buffer of 8192 bytes in order for this to be safe. The input parameters are all treated as read-only because there is no legitimate need for them to be modifiable. In order to have some available scratch space, twelve words of stack space is set aside for read/write local variables. This scratch space is easily increased by defining more such words in the prover, but arbitrary amounts of temporary space are not supported. There are no memory postconditions; values that are read-only are never writable and thus need no postcondition checking, and values that are read-write are considered mutable.

Figure 5: Register Pre/postconditions for the x86 C Calling Convention
Register      Precondition   Postcondition
%eax - %edx   (undefined)    n/c
%esp          sp_0           sp_0
%ebp          bp_0           bp_0
%esi          si_0           si_0
%edi          di_0           di_0

Figure 6: Memory Preconditions for a BPF-like Filter Program
\begin{figure*}\begin{tabular}{llllll}
Name & Base Variable & Offset & Size & Co...
...et ({\tt p}) & $p$ & 0 & 8192 & (unknown) & read \\
\end{tabular}\end{figure*}


next up previous
Next: Limited x86 Simulation in Up: Prototype Prover Previous: General Approach
Craig Metz 2000-05-08