The IP Security Architecture, as specified by the Internet Engineering Task Force (IETF), comprises a set of protocols that provide data integrity, confidentiality, replay protection, and authentication at the network layer. At the lowest level of the IPsec architecture reside the data encryption/authentication protocols, AH and ESP. These are the ``wire protocols,'' used for encapsulating the IP packets to be protected. They simply provide a format for the encapsulation; the details of the bit layout are not particularly important for the purposes of this paper. Outgoing packets are authenticated, encrypted, and encapsulated just before being transmitted, and incoming packets are decapsulated, verified, and decrypted immediately upon receipt. These protocols are typically implemented inside the kernel, for performance and security reasons.
IPsec was the first consumer of the OCF services. The original OpenBSD IPsec implementation has been described in earlier work. Here, we give a brief overview and then describe the modifications we had to make to it to allow use of the OCF.
In the OpenBSD kernel, IPsec is implemented as a pair of protocols sitting on top of IP. Thus, incoming IPsec packets destined to the local host are processed by the appropriate IPsec protocol through the protocol switch structure used for all protocols (e.g., TCP and UDP). The selection of the appropriate protocol is based on the protocol number in the IP header. The SA needed to process the packet is found in an in-kernel database using information retrieved from the packet itself. Once the packet has been correctly processed (decrypted, integrity-validated, etc.), it is re-queued for further processing by the IP module, accompanied by additional information (such as the fact that it was received under a specific SA) for use by higher-level protocols and the socket layer.
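The input path described above can be illustrated with a small user-space sketch. It is not the OpenBSD code: the table layout, the function names, and the choice to key the SA lookup on SPI, destination address, and protocol are assumptions made for illustration. Only the protocol numbers (TCP 6, UDP 17, ESP 50, AH 51) are the standard assigned values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PROTO_TCP = 6, PROTO_UDP = 17, PROTO_ESP = 50, PROTO_AH = 51 };

/* Hypothetical, heavily reduced view of a received packet. */
struct toy_packet { uint8_t proto; uint32_t spi; uint32_t dst; };

/* Hypothetical SA database entry, keyed on (SPI, destination, protocol). */
struct toy_sa_entry { uint32_t spi; uint32_t dst; uint8_t proto; int id; };

static const struct toy_sa_entry sadb[] = {
    { 0x1001, 0x0a000002, PROTO_ESP, 1 },
    { 0x2002, 0x0a000002, PROTO_AH,  2 },
};

/* Find the SA using only fields taken from the packet; -1 if none. */
static int sa_find(const struct toy_packet *p)
{
    for (size_t i = 0; i < sizeof sadb / sizeof sadb[0]; i++)
        if (sadb[i].spi == p->spi && sadb[i].dst == p->dst &&
            sadb[i].proto == p->proto)
            return sadb[i].id;
    return -1;
}

/* Input handlers: return the SA id used, 0 for plain traffic, -1 to drop. */
static int ipsec_in(const struct toy_packet *p) { return sa_find(p); }
static int plain_in(const struct toy_packet *p) { (void)p; return 0; }

/* Hypothetical protocol switch: one input routine per protocol number. */
struct toy_protosw { uint8_t proto; int (*input)(const struct toy_packet *); };

static const struct toy_protosw protosw[] = {
    { PROTO_TCP, plain_in }, { PROTO_UDP, plain_in },
    { PROTO_ESP, ipsec_in }, { PROTO_AH,  ipsec_in },
};

/* Dispatch on the protocol number found in the IP header. */
int toy_ip_input(const struct toy_packet *p)
{
    assert(p != NULL);
    for (size_t i = 0; i < sizeof protosw / sizeof protosw[0]; i++)
        if (protosw[i].proto == p->proto)
            return protosw[i].input(p);
    return -1;
}
```

In this sketch the returned SA id stands in for the additional information that accompanies the re-queued packet, so that later layers can tell under which SA it arrived.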
Outgoing packets require somewhat different processing. When a packet is handed to the IP module for transmission (in ip_output()), a lookup is made in the Security Policy Database (SPD) to determine whether that packet needs to be processed by IPsec. The decision is based on the source and destination addresses, the transport protocol, and the port numbers. If IPsec processing is needed, the lookup also specifies what type of SA(s) to use for it. If no suitable SA exists, the key-management daemon is notified to acquire one. Otherwise, the packet is processed by IPsec and passed to ip_output() again for transmission. The packet also carries an indication of what IPsec processing has already been applied to it, to avoid processing loops.
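The output decision can be sketched as a three-way verdict: send the packet unprotected, protect it with an existing SA, or ask the key-management daemon for a new SA. The following is a minimal illustration under stated assumptions; the structures, the mask-based matching, the first-match rule, and the treatment of an already-applied SA as a bypass are all hypothetical simplifications, not the OpenBSD SPD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum spd_verdict { SPD_BYPASS, SPD_PROTECT, SPD_ACQUIRE };

/* Hypothetical selector: the fields the policy decision is based on. */
struct selector {
    uint32_t src, dst;   /* IPv4 addresses, host byte order */
    uint8_t  proto;      /* transport protocol, 0 = any */
    uint16_t dport;      /* destination port, 0 = any */
};

struct policy {
    struct selector sel;
    uint32_t src_mask, dst_mask;
    int sa_id;           /* -1 when no suitable SA exists yet */
};

static int sel_match(const struct policy *p, const struct selector *pkt)
{
    return (pkt->src & p->src_mask) == (p->sel.src & p->src_mask) &&
           (pkt->dst & p->dst_mask) == (p->sel.dst & p->dst_mask) &&
           (p->sel.proto == 0 || p->sel.proto == pkt->proto) &&
           (p->sel.dport == 0 || p->sel.dport == pkt->dport);
}

/*
 * First matching policy wins.  applied[] lists the SA ids already applied
 * to this packet; an SA found there is not applied again, which is how
 * this sketch avoids processing loops.
 */
enum spd_verdict spd_lookup(const struct policy *spd, size_t n,
                            const struct selector *pkt,
                            const int *applied, size_t napplied,
                            int *sa_out)
{
    assert(sa_out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!sel_match(&spd[i], pkt))
            continue;
        if (spd[i].sa_id < 0)
            return SPD_ACQUIRE;          /* notify key management */
        for (size_t j = 0; j < napplied; j++)
            if (applied[j] == spd[i].sa_id)
                return SPD_BYPASS;       /* already processed */
        *sa_out = spd[i].sa_id;
        return SPD_PROTECT;
    }
    return SPD_BYPASS;                   /* no policy applies */
}
```

A caller in the position of ip_output() would run the lookup, and on SPD_PROTECT hand the packet to IPsec together with the selected SA; when the packet comes back, the same lookup with the updated applied[] list ends the cycle.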
In the original IPsec implementation, all cryptographic operations were done in-band with packet processing. This meant that considerable time was spent performing symmetric-key encryption in the kernel. To make use of the OCF, we split the input and output processing paths. For example, consider the case where ip_output() determines (by consulting the SPD) that a packet must be IPsec-protected. It then calls ipsp_process_packet(), which handles all IPsec outbound-packet processing. After handling encapsulation issues, this routine calls the appropriate ``wire protocol'' output routine. For ESP, the original esp_output() routine was split into esp_output() and esp_output_cb(). esp_output() does all the data marshaling and ESP header manipulation, constructs a crypto request, passes it to the OCF, and simply returns. Execution returns to ip_output() with an indication that the operation was successful.
Once the OCF processes the request, it calls esp_output_cb(), a pointer to which is included in the request itself. The callback routine completes the ESP protocol processing by checking for any errors in the crypto processing (re-queuing the request if the OCF indicated so), and calls ipsp_process_done(), the second part of the original ipsp_process_packet() routine. This routine completes IPsec book-keeping, and calls ip_output() with the new packet. ip_output() will then perform a new SPD lookup (making sure no IPsec loops occur, by examining the list of SAs that have been already applied to the packet). If necessary, the output processing cycle will occur again. Eventually, ip_output() will pass the packet to a network driver for actual transmission.
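The split into a submitting routine and a completion callback can be illustrated with a toy request queue. This is a sketch under stated assumptions, not the OCF interface: every name is hypothetical, the ``cipher'' is a single-byte XOR used only so the effect is observable, and the worker runs when toy_run() is called rather than in a separate thread or on hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QLEN 8  /* power of two, so the index arithmetic survives wraparound */

/* Hypothetical crypto request: carries its own callback and opaque data. */
struct toy_req {
    uint8_t *buf;
    size_t   len;
    uint8_t  key;                         /* toy XOR key, not real crypto */
    int      error;                       /* filled in by the worker */
    void   (*callback)(struct toy_req *);
    void    *opaque;
};

static struct toy_req *queue[QLEN];
static size_t q_head, q_tail;

/* Queue a request and return at once; -1 if the queue is full. */
int toy_dispatch(struct toy_req *r)
{
    if (q_tail - q_head == QLEN)
        return -1;
    queue[q_tail++ % QLEN] = r;
    return 0;
}

/* Stand-in for the framework worker: process requests, then call back. */
size_t toy_run(void)
{
    size_t done = 0;
    while (q_head != q_tail) {
        struct toy_req *r = queue[q_head++ % QLEN];
        for (size_t i = 0; i < r->len; i++)
            r->buf[i] ^= r->key;
        r->error = 0;
        r->callback(r);
        done++;
    }
    return done;
}

struct toy_pkt {
    uint8_t data[16];
    size_t  len;
    int     ready;                        /* set once processing completed */
    struct toy_req req;
};

/* Second half: check the result and mark the packet ready for output. */
static void toy_output_cb(struct toy_req *r)
{
    struct toy_pkt *p = r->opaque;
    assert(p != NULL);
    if (r->error == 0)
        p->ready = 1;
}

/* First half: build the request, hand it off, and return immediately. */
int toy_output(struct toy_pkt *p, uint8_t key)
{
    p->ready = 0;
    p->req = (struct toy_req){ .buf = p->data, .len = p->len, .key = key,
                               .error = 0, .callback = toy_output_cb,
                               .opaque = p };
    return toy_dispatch(&p->req);
}
```

After toy_output() returns, the packet is still untouched and not yet ready; only once toy_run() has processed the request does the callback fire and the packet become eligible for the equivalent of a second pass through ip_output().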
The cases for output AH and IPcomp processing are similar. Input processing is also similar: ipsec_common_input() is called by the network scheduler for all IPsec packets received. It locates the appropriate SA in the kernel SA database and calls esp_input(). Similar to the output case, esp_input() validates the ESP header fields, constructs a crypto request, passes it to the OCF and returns. Once the request is processed, the OCF will call esp_input_cb(), which will verify the packet integrity (by comparing the value on the packet with that computed by the accelerator), remove the ESP header, and pass the packet to ipsec_common_input_cb(). This routine performs further sanity and security checks on the decrypted packet, and re-queues it for further processing by the IP layer. AH and IPcomp input processing is similar, as is the case of IPsec over IPv6.
Input ESP and AH processing offers one example of the use of the opaque data passed with each crypto request, discussed in Section 3. All the cryptographic accelerators that support message authentication code (MAC) algorithms offer only a ``forward-compute'' mode. That is, the card can only compute the MAC on the packet, and it is up to the operating system to verify its validity by comparing it with the received value. Thus, we use the opaque data to store the MAC value from the packet and instruct the OCF to write the newly computed MAC value in the appropriate location in the packet; the operation is exactly the same as in the output case. In the callbacks, we simply do a byte-wise comparison of the computed value (stored in the packet) and the received value (stored as opaque data in the request itself).
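The forward-compute verification can be sketched in three steps: save the received MAC, let the accelerator overwrite it with its own result, and compare the two in the callback. The code below is an illustration only. All names are hypothetical, the MAC is assumed to sit at the end of the packet, and toy_mac() is a trivial keyed checksum standing in for the accelerator; it is not a secure MAC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 4

/* Stand-in for the accelerator: a toy keyed checksum, NOT a real MAC. */
static void toy_mac(const uint8_t *data, size_t len, uint8_t key,
                    uint8_t out[MAC_LEN])
{
    uint32_t h = key;
    for (size_t i = 0; i < len; i++)
        h = h * 31u + data[i];
    for (int i = 0; i < MAC_LEN; i++)
        out[i] = (uint8_t)(h >> (8 * i));
}

/* Hypothetical request; saved[] plays the role of the opaque data. */
struct mac_req {
    uint8_t *pkt;
    size_t   payload_len;        /* MAC occupies pkt[payload_len .. +MAC_LEN) */
    uint8_t  saved[MAC_LEN];     /* MAC exactly as received */
};

/* Before dispatch: remember the received MAC. */
void input_start(struct mac_req *r, uint8_t *pkt, size_t total_len)
{
    assert(total_len >= MAC_LEN);
    r->pkt = pkt;
    r->payload_len = total_len - MAC_LEN;
    memcpy(r->saved, pkt + r->payload_len, MAC_LEN);
}

/* Forward-compute: the result is written into the packet, as on output. */
void forward_compute(struct mac_req *r, uint8_t key)
{
    toy_mac(r->pkt, r->payload_len, key, r->pkt + r->payload_len);
}

/* Callback: byte-wise comparison of computed and received values. */
int input_cb(const struct mac_req *r)
{
    return memcmp(r->pkt + r->payload_len, r->saved, MAC_LEN) == 0;
}
```

Because the accelerator writes to the same location in both directions, the only input-specific work in this sketch is the copy in input_start() and the comparison in input_cb().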
While the code was not very complicated, this asynchronous processing model caused several minor headaches. One problem, for example, was communicating MTU information through arbitrarily many IPsec SAs to the TCP layer, so as to correctly fragment application data and avoid fragmentation at the IP layer. We could not simply update the appropriate data structures with the correct MTU value after the packet had been encapsulated once, since we could not ``peek'' inside the encryption. Fortunately, we keep a record of which SAs have been applied to a packet during input and output processing. Thus, on receipt of the appropriate ICMP message, or when the IP layer indicates that the packet is too large to be transmitted without fragmentation, the list of SAs is traversed and each SA is updated with the correct MTU value based on its position in the SA chain (i.e., the first SA on output will advertise a smaller MTU than the last one, the difference being the ESP headers and encryption padding). The next packet that tries to traverse the chain will encounter a correct MTU value.
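The MTU update along the SA chain can be sketched as a walk from the last SA applied on output back to the first, subtracting each SA's encapsulation overhead in turn. The structure and function names are hypothetical, and the sketch assumes a fixed per-SA overhead in bytes; in practice the padding depends on the cipher block size and the payload length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-SA record: header plus padding cost, and advertised MTU. */
struct toy_sa {
    unsigned overhead;   /* bytes added by this SA (assumed constant) */
    unsigned mtu;        /* largest packet this SA can accept */
};

/*
 * chain[0] is the first SA applied on output, chain[n-1] the last.
 * Walking backwards from the link MTU gives each SA the size it can
 * accept without the final packet exceeding the link MTU.
 * Returns -1 if the overheads leave no room for payload; SAs visited
 * before the failure keep their updated values.
 */
int update_chain_mtu(struct toy_sa *chain, size_t n, unsigned link_mtu)
{
    assert(chain != NULL || n == 0);
    unsigned mtu = link_mtu;
    for (size_t i = n; i-- > 0; ) {
        if (chain[i].overhead >= mtu)
            return -1;
        mtu -= chain[i].overhead;
        chain[i].mtu = mtu;
    }
    return 0;
}
```

With two SAs of overhead 36 and 24 bytes and a link MTU of 1500, the last SA ends up with 1476 and the first with 1440, matching the observation that the first SA on output advertises the smaller value.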