Buffer Cache Locking



In our first implementation of a DAFS server, we chose to use the buffer cache directly for block I/O. In an RDMA-based data transfer, the server sets up the RDMA in the context of the requesting RPC. Once issued, the RDMA proceeds asynchronously with respect to the RPC, which does not wait for RDMA completion. To serialize concurrent access to shared files in the face of this asynchrony, the vnode (vp) of a file needs to be locked for the duration of the RPC, whereas the data buffers (bp's) being transferred need to be locked for the full duration of the RDMA. Locking the vp (i.e., the entire file) for the duration of the RDMA would also work, but it would limit performance under sharing, since requests for non-overlapping regions of a file would have to serialize. Our decision to lock at a finer granularity than the vp for the duration of a transfer conflicts with two current FreeBSD buffer cache locking assumptions:

Locking a buffer in the cache requires a process to acquire an exclusive lock on that buffer. A buffer lock can only be released by the same process that locked it or by the kernel.
Before an asynchronous disk I/O (i.e., an asynchronous write or a readahead), lock ownership has to be transferred to the kernel so that the block can later be released by the kernel (in biodone()).
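
The two assumptions above can be illustrated with a small user-space model. This is only a sketch, not FreeBSD code: the names toy_buf_lock, toy_lock, toy_give_to_kernel, and toy_unlock and the owner ids are hypothetical, and the model assumes for simplicity that once ownership has been handed to the kernel only the kernel completion path may release the lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical owner ids; these are not FreeBSD definitions. */
#define OWNER_NONE     0
#define OWNER_KERNEL (-1)

struct toy_buf_lock {
    int owner;  /* OWNER_NONE, OWNER_KERNEL, or a positive process id */
};

/* Exclusive acquire: succeeds only if nobody holds the lock. */
static bool toy_lock(struct toy_buf_lock *l, int pid)
{
    if (pid <= 0 || l->owner != OWNER_NONE)
        return false;
    l->owner = pid;
    return true;
}

/* Hand the lock to the kernel before starting asynchronous disk I/O,
 * so that the completion path can release it later. */
static bool toy_give_to_kernel(struct toy_buf_lock *l, int pid)
{
    if (pid <= 0 || l->owner != pid)
        return false;
    l->owner = OWNER_KERNEL;
    return true;
}

/* Release: allowed only for the current owner. The I/O completion
 * path in this model passes OWNER_KERNEL as 'who'. */
static bool toy_unlock(struct toy_buf_lock *l, int who)
{
    if (l->owner == OWNER_NONE || l->owner != who)
        return false;
    l->owner = OWNER_NONE;
    return true;
}
```

In this model a process that locked a buffer and then started asynchronous disk I/O calls toy_give_to_kernel() first; afterwards the completion path releases the lock with toy_unlock(&lock, OWNER_KERNEL), while a release attempted by any other process id is rejected.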

A multithreaded event-driven kernel server that directly uses the buffer cache and does event processing in kernel process context faces problems in the following circumstances:

When a thread tries to lock a buffer that it has already locked (because a transfer is in progress on that buffer), expecting to block until the lock is released by some other thread.
When a buffer is released by a thread other than the one that locked it.
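
Both circumstances can be reproduced in a toy model of an event-driven server. The sketch below is user-space C, not kernel code; toy_buf, strict_lock, strict_unlock, handle_request, and handle_completion are hypothetical names, thread ids are assumed to be nonzero, and the error codes stand in for what would be a block or a panic in the kernel.

```c
#include <assert.h>

enum lock_result { LOCK_OK, LOCK_BUSY, LOCK_SELF_DEADLOCK, LOCK_NOT_OWNER };

struct toy_buf {
    int owner;  /* 0 when unlocked, else id of the locking thread */
};

/* Strict, non-recursive acquire with a single owning thread. */
static enum lock_result strict_lock(struct toy_buf *bp, int tid)
{
    if (bp->owner == tid)
        return LOCK_SELF_DEADLOCK;  /* would wait on a lock it holds itself */
    if (bp->owner != 0)
        return LOCK_BUSY;           /* held by another thread: caller blocks */
    bp->owner = tid;
    return LOCK_OK;
}

/* Strict release: only the locking thread may unlock. */
static enum lock_result strict_unlock(struct toy_buf *bp, int tid)
{
    if (bp->owner != tid)
        return LOCK_NOT_OWNER;
    bp->owner = 0;
    return LOCK_OK;
}

/* A server thread handles a request: it locks the buffer, issues the
 * (simulated) RDMA, and returns without waiting for completion. */
static enum lock_result handle_request(struct toy_buf *bp, int tid)
{
    return strict_lock(bp, tid);
}

/* Whichever thread polls the completion event tries to release. */
static enum lock_result handle_completion(struct toy_buf *bp, int tid)
{
    return strict_unlock(bp, tid);
}
```

If thread 1 issues a transfer on a buffer and then receives a second request for the same buffer, handle_request() reports LOCK_SELF_DEADLOCK; if thread 2 happens to poll the completion, handle_completion() reports LOCK_NOT_OWNER.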

Transferring lock ownership to the kernel during asynchronous network I/O does not help, since the lock is released by some kernel process (whichever happens to have polled for that particular event) rather than by the kernel itself. The solution presently used is for the kernel process that issued an RDMA operation to wait until the transfer is done and then release the lock itself. Waiting also keeps that process from trying to lock the same buffer again, which would cause a deadlock panic. A better solution is to enable recursive locking and to allow lock release by any of the server threads.
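
One way to picture the proposed relaxation is the following user-space sketch. It is an illustration under stated assumptions rather than the server's actual locking code: relaxed_buf_lock, relaxed_lock, relaxed_unlock, and is_server_thread are hypothetical names, and the set of server threads is assumed to be the fixed ids 1 through 4.

```c
#include <assert.h>
#include <stdbool.h>

struct relaxed_buf_lock {
    int owner;  /* 0 when free, else id of the thread that first locked it */
    int depth;  /* recursion count; 0 when free */
};

/* Hypothetical membership test: ids 1..4 are the server threads. */
static bool is_server_thread(int tid)
{
    return tid >= 1 && tid <= 4;
}

/* Recursive acquire: the owner may lock again; others are refused. */
static bool relaxed_lock(struct relaxed_buf_lock *l, int tid)
{
    if (l->depth > 0 && l->owner != tid)
        return false;   /* held by another thread: caller would block */
    l->owner = tid;
    l->depth++;         /* first acquire or recursive re-acquire */
    return true;
}

/* Release by the owner, or by any server thread when the owner is
 * itself a server thread. The lock is free once depth drops to 0. */
static bool relaxed_unlock(struct relaxed_buf_lock *l, int tid)
{
    if (l->depth == 0)
        return false;   /* nothing to release */
    if (tid != l->owner &&
        !(is_server_thread(tid) && is_server_thread(l->owner)))
        return false;   /* outsider may not release */
    if (--l->depth == 0)
        l->owner = 0;
    return true;
}
```

With this policy the thread that polls a completion event can release the buffer even though another server thread locked it, and a thread that locks a buffer twice no longer deadlocks. Note that in this sketch a recursive acquire succeeds immediately, so it does not by itself serialize two transfers issued by the same thread on the same buffer.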


Kostas Magoutis 2001-12-03