Similar to the underlying OCF, the /dev/crypto interface uses a session-based model, since the general case assumes that keys will be reused for a sequence of operations. After opening the /dev/crypto device and obtaining a file descriptor fd, the caller uses CIOCGSESSION to request a new session for a particular cryptographic operation, specifying all related parameters (e.g., keys). As in the OCF, a single session supports both a cipher and a MAC, since we are simply exporting the same functionality available to the kernel. CIOCGSESSION returns a session identifier that can be reused for subsequent operations. When the session is no longer needed, it can be revoked using CIOCFSESSION. Many sessions can be requested against a single file descriptor fd; all sessions follow that fd through fork() and exec() calls, and are not otherwise visible to other processes. The last close() on fd destroys all of its sessions.
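The ownership rules above can be illustrated with a small user-space model. This is only a sketch of the bookkeeping, not the kernel's code: the structure, the function names, and the fixed table size are all hypothetical, and no real ioctl() is issued. Each function is annotated with the request it stands in for.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SESSIONS 8          /* arbitrary cap chosen for this model */

/* Session table private to one open descriptor (hypothetical model). */
struct fd_sessions {
    uint32_t id[MAX_SESSIONS];
    int      live[MAX_SESSIONS];
    uint32_t next_id;
};

void fds_init(struct fd_sessions *t)
{
    memset(t, 0, sizeof(*t));
    t->next_id = 1;
}

/* Stands in for CIOCGSESSION: allocate a session, hand back its identifier. */
int fds_create(struct fd_sessions *t, uint32_t *ses)
{
    for (size_t i = 0; i < MAX_SESSIONS; i++) {
        if (!t->live[i]) {
            t->live[i] = 1;
            t->id[i] = t->next_id++;
            *ses = t->id[i];
            return 0;
        }
    }
    return -1;                  /* table full */
}

/* An identifier is usable only on the descriptor that created it,
 * and only while the session is live. */
int fds_valid(const struct fd_sessions *t, uint32_t ses)
{
    for (size_t i = 0; i < MAX_SESSIONS; i++)
        if (t->live[i] && t->id[i] == ses)
            return 1;
    return 0;
}

/* Stands in for CIOCFSESSION: revoke a single session. */
int fds_free(struct fd_sessions *t, uint32_t ses)
{
    for (size_t i = 0; i < MAX_SESSIONS; i++) {
        if (t->live[i] && t->id[i] == ses) {
            t->live[i] = 0;
            return 0;
        }
    }
    return -1;                  /* unknown or already revoked */
}

/* Stands in for the last close() on the descriptor: destroy every
 * remaining session and report how many were torn down. */
size_t fds_close(struct fd_sessions *t)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_SESSIONS; i++) {
        if (t->live[i]) {
            t->live[i] = 0;
            n++;
        }
    }
    return n;
}
```

Because each table belongs to exactly one descriptor, an identifier obtained through one descriptor means nothing to a process holding a different one, which is the isolation property the text describes.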
If the request cannot be satisfied using hardware accelerators, the kernel returns EINVAL, so the caller can fall back to a software implementation. We considered adding an ioctl() that describes the capabilities of the available hardware, allowing an application to determine from a list whether the needed algorithm is supported. However, numerous other variables exist (key sizes, block sizes, alignment) that might be difficult to describe. For the time being, we have punted on this issue. Instead, when first called, the OpenSSL engine enumerates all OCF-supported algorithms. It does so by issuing a CIOCGSESSION request for each algorithm it supports in software, and caches the results. If an algorithm is not provided by the OCF, the library uses its software implementation. In reality, the kernel also reports the algorithms that it implements in software as supported, and OpenSSL will use them as if they were implemented in hardware, unless a sysctl variable is set to prohibit this; prohibiting it is the default setting.
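The probe-and-cache step can be sketched as follows. This is not the OpenSSL engine's code: the cache structure, the function names, and the integer algorithm identifiers are hypothetical. The probe is passed in as a callback; in a real caller it would issue CIOCGSESSION for the algorithm and revoke the session again on success. The demo_probe() function is a stand-in that pretends the kernel supports algorithms 1 and 2 only.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_ALGS 16             /* arbitrary cap for this sketch */

/* Hypothetical probe: returns 0 if a session could be created for alg,
 * otherwise -1 with errno set (EINVAL when the kernel cannot satisfy it). */
typedef int (*probe_fn)(int alg);

/* Result of the one-time enumeration. */
struct alg_cache {
    int    alg[MAX_ALGS];
    int    in_kernel[MAX_ALGS];
    size_t n;
};

/* Probe each algorithm the library implements in software, once. */
void alg_cache_fill(struct alg_cache *c, const int *algs, size_t n,
                    probe_fn probe)
{
    c->n = 0;
    for (size_t i = 0; i < n && c->n < MAX_ALGS; i++) {
        c->alg[c->n] = algs[i];
        c->in_kernel[c->n] = (probe(algs[i]) == 0);
        c->n++;
    }
}

/* Later lookups consult the cache only; unknown algorithms fall back
 * to software. */
int alg_in_kernel(const struct alg_cache *c, int alg)
{
    for (size_t i = 0; i < c->n; i++)
        if (c->alg[i] == alg)
            return c->in_kernel[i];
    return 0;
}

/* Demonstration stand-in for the real probe. */
int demo_probe_calls;

int demo_probe(int alg)
{
    demo_probe_calls++;
    if (alg == 1 || alg == 2)
        return 0;
    errno = EINVAL;
    return -1;
}
```

Caching the answers means that the cost of a failed session request is paid once per algorithm rather than once per operation.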
Once a session is established, blocks can be encrypted or decrypted using the CIOCCRYPT ioctl(). On each call, the caller can specify a new IV, or MAC information to fold into the operation. Input and output buffers are specified via separate pointers, but they can point to the same buffer for in-place encryption. The data size provided by the caller must be a multiple of the block size of the algorithm being used. A data size limit of 262,140 bytes exists at the moment, to hide a similar limit found in some chipsets. In the future, we may support larger blocks by splitting operations into smaller chunks.
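A caller can check these two size constraints before issuing a request. The helpers below are a minimal sketch; the function names and the CRYPT_MAX_LEN macro are hypothetical, and only the 262,140-byte figure comes from the text.

```c
#include <stdbool.h>
#include <stddef.h>

/* Per-request data limit quoted in the text. */
#define CRYPT_MAX_LEN 262140u

/* True if len is a whole number of cipher blocks and within the limit. */
bool crypt_len_ok(size_t len, size_t blocksize)
{
    if (blocksize == 0)
        return false;
    return len % blocksize == 0 && len <= CRYPT_MAX_LEN;
}

/* Smallest multiple of blocksize that is >= len (blocksize must be > 0),
 * i.e. the length a caller would have to pad its data to. */
size_t round_up_to_block(size_t len, size_t blocksize)
{
    return (len + blocksize - 1) / blocksize * blocksize;
}

/* Largest request that is both block-aligned and within the limit
 * (blocksize must be > 0); a caller that splits a long buffer itself
 * would use chunks of at most this size. */
size_t max_chunk(size_t blocksize)
{
    return CRYPT_MAX_LEN / blocksize * blocksize;
}
```

Note that the limit is not itself a multiple of common block sizes: with 16-byte blocks the largest usable request is 262,128 bytes, and with 8-byte blocks it is 262,136 bytes.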
The user-land data blocks are copied into memory allocated inside the kernel address space. This data is formatted into uio blocks as mentioned in Section 3. The OCF is then called to perform the operation using the initialization information stored in the application's /dev/crypto session. If the operation is successful, the results are copied back to the application buffers. Obviously, the cost of these two copies is higher for larger block sizes, as we shall see in Section 5.4. In the future, we hope to use page flipping for larger blocks when the kernel memory subsystem supports this.
For asymmetric operations, no session is required. The CIOCKEY ioctl() is used in an atomic fashion for each individual operation. Five operations are provided, with CRK_MOD_EXP being the most important. Support for the others (CRK_MOD_EXP_CRT, CRK_DSA_SIGN, CRK_DSA_VERIFY, and CRK_DH_COMPUTE_KEY) has not yet been completed. Each operation has its own number of input and output parameters, which are always passed as packed byte arrays of big integers. The particular format we chose for these parameters makes it easy to interface to OpenSSL ``bignums,'' and to most of the early hardware we had access to.
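To make the idea of a packed big-integer parameter concrete, the sketch below packs a value into a byte array and records its length in bits. Everything here is an assumption made for illustration: the struct and function names are hypothetical, a 64-bit integer stands in for a real big integer, and the byte order (least significant byte first) is a choice made for this example, since the text does not specify the layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one big-integer parameter:
 * a byte array plus the number of significant bits. */
struct bigparam {
    unsigned char *p;
    unsigned int   nbits;
};

/* Pack v into buf with no padding, least significant byte first
 * (assumed order). Returns the number of bytes used, or -1 if buf
 * is too small. A value of zero occupies zero bytes. */
int pack_bigparam(uint64_t v, unsigned char *buf, size_t buflen,
                  struct bigparam *out)
{
    unsigned int nbits = 0;
    for (uint64_t t = v; t != 0; t >>= 1)
        nbits++;                        /* count significant bits */

    size_t nbytes = (nbits + 7) / 8;    /* round up to whole bytes */
    if (nbytes > buflen)
        return -1;

    for (size_t i = 0; i < nbytes; i++)
        buf[i] = (unsigned char)(v >> (8 * i));

    out->p = buf;
    out->nbits = nbits;
    return (int)nbytes;
}
```

For a modular exponentiation, a caller would prepare one such parameter for each input (base, exponent, modulus) and a buffer for the result.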
Presently, OpenBSD lacks cloning devices, so a cumbersome procedure must be followed to open /dev/crypto. After the initial open() call, the caller must use ioctl() to retrieve a replacement file descriptor (fd), then perform all operations against this replacement fd. The replacement fd is a unique per-process descriptor, while the initially-opened one would naturally be shared between all callers. Without this arrangement, the fork() and _exit() system calls would not exhibit the expected semantics with respect to file-descriptor inheritance and closing. Just as bad, all processes would be able to see and use each other's keys. When cloning devices are implemented in OpenBSD, we will change the user-level code (mostly OpenSSL) to no longer use this complicated procedure, but the kernel will retain it for backward compatibility. While writing this code, we ran into numerous strange and difficult resource-management issues for session teardown.
Applications using /dev/crypto must also call fcntl() with the F_SETFD command on the crypto descriptor to set the ``close-on-exec'' flag. Otherwise, child processes will inherit unwanted descriptors, which is both a security and a resource-exhaustion concern.
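Setting the flag takes two fcntl() calls if the descriptor's other flags are to be preserved. A minimal sketch follows; the helper name is hypothetical, while F_GETFD, F_SETFD, and FD_CLOEXEC are the standard POSIX names.

```c
#include <fcntl.h>

/* Set the close-on-exec flag on fd, keeping any other descriptor flags.
 * Returns 0 on success, or -1 on failure with errno set by fcntl(). */
int set_close_on_exec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1)
        return -1;
    return 0;
}
```

An application would call this on the descriptor it actually uses for crypto requests, immediately after obtaining it and before any fork() or exec().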